Class Topics on Data Engineering for Data Science

  • Presentation

    This course focuses on "data engineering" and its intersection with "data science". Students are expected to gain technical skills in several independent but related topics. The most relevant areas are databases and programming, the fundamental skills needed to work as a "data engineer" in academic and/or industrial projects. The course is included in the master's program because data collection, validation and processing skills are needed to produce data that can be "explored" with the knowledge acquired in the other curricular units.

  • Code

    ULHT6347-23271
  • Syllabus

    1. Introduction to Data Engineering
    2. Git & GitHub
      1. Introduction to version control systems
      2. Learning elementary work processes using the Git software and the GitHub online platform
    3. Databases & SQL
      1. Relational Databases
      2. SQL language
      3. SQL Injection
    4. Programming with PHP
      1. From the data extraction and data processing points of view
    5. Algorithmic complexity and efficiency
      1. Its importance when dealing with large amounts of data
    6. Programming with Python
      1. From the exploratory data analysis point of view
      2. Jupyter Notebook
      3. Production of packages for publishing and distributing software ("deployment")
    7. Linux
      1. Introduction to the GNU/Linux operating system
      2. File system navigation (commands)
      3. Commands for process control
    8. Advanced Data Engineering Tools
      1. Introductory presentation of advanced tools such as:
        • Hadoop
        • Cassandra DB
  • Objectives

    Students are expected to acquire technical skills related to:

    - Version control (Git & GitHub)

    - Relational databases (e.g. MySQL) and SQL

    - Programming with the PHP language, focused on interaction with relational databases

    - Programming with the Python language

    - Elementary notions of algorithmic complexity and efficiency

    - Linux

    It is also expected that the students improve their creativity and critical thinking skills.

  • Teaching methodologies and assessment

    Theoretical-practical classes with exposition of theory and presentation of practical examples.

    Exercises to be carried out during class, with the support and validation of the teacher.

    Exercises to do at home.

    Assessment: 3 mini-tests and a project

  • Office Hours

    Office hours are arranged by appointment on a case-by-case basis. Students should contact the teacher by e-mail (bcipriano@ulusofona.pt), explaining why they need to meet. Depending on the availability of the teacher and the student, support may be provided by e-mail, by videoconference, or in person if the teacher considers it justified.
