filmeu

Class Fundamentals of Data Engineering

  • Presentation


    This course covers data engineering from the perspective of a data scientist, from working with different data sources to structuring and provisioning processed data for modeling and visualization. Students acquire the foundational concepts of data engineering, enabling them to supply high-quality data to data science applications.

  • Code


    ULHT6634-24446
  • Syllabus


    The course syllabus is divided into the following sections:

    • S1 - Introduction 
      • What is a Data Engineer?
      • What does a Data Scientist need to know about data engineering?
      • Data engineering pipelines.
    • S2 - Relational Databases
      • SQL review. 
      • Relational concepts used in data models.
    • S3 - Data Modeling 
      • Data sources. 
      • Data Lake. 
      • Data Warehouse. 
      • Data Lakehouse. 
      • Data models.
    • S4 - Fundamentals of Big Data 
      • Hadoop. 
      • MapReduce.
    • S5 - Data Transformation
      • Transformations needed to store data for data science projects.
    • S6 - Data Visualization Tools
    • S7 - NoSQL Databases
    • S8 - Data Engineering Projects
  • Objectives


    LG1. Learn fundamental concepts of data engineering.

    LG2. Understand the data engineering lifecycle.

    LG3. Use SQL to transform and query data.

    LG4. Understand data modeling techniques for organizing and managing data.

    LG5. Build pipelines to collect, transform, analyze and visualize data from operational source systems. 

    LG6. Apply the principles covered in class to build a simple data pipeline and visualize its data.

  • Teaching methodologies and assessment


    Lectures are held in person and are primarily expository. Students are encouraged to participate actively by asking questions, which stimulates their interest in the subject matter. When appropriate, specific problems that students are already familiar with are analysed before the content is presented. Some topics emerge from analysing problems whose resolution naturally leads to those topics being questioned or formulated. Whenever possible, examples and counterexamples are provided to illustrate the content. At the end of most lectures, problems are set for students to work on independently, to consolidate their understanding of the concepts and techniques covered.

  • References


    • Lau, S., Gonzalez, J., Nolan, D. (2023). Learning Data Science. O'Reilly Media. Available at: https://learningds.org/intro.html
    • Kleppmann, M. (2017). Designing Data-Intensive Applications. O'Reilly Media.

     
