-
Presentation
This module introduces the foundational techniques and methodologies for analysing data from the interdisciplinary perspective of a Data Scientist. In the first part of the module, students learn about the diverse nature of data and the symbolic power of different data structures. This foundation leads to a second part in which students learn how to interrogate data, extract information from it, and justify their choices. In this part, students learn about statistical inference, hypothesis testing, Frequentist vs Bayesian approaches to data, correlation, and causation. Finally, in the third part of the module, students learn the basics of machine learning through the theory and practice of regression, classification, and dimensionality reduction methods.
-
Class from course
-
Degree | Semesters | ECTS
Bachelor | Semestral | 5
-
Year | Nature | Language
3 | Mandatory | Portuguese
-
Code
ULHT260-22513
-
Prerequisites and corequisites
Not applicable
-
Professional Internship
No
-
Syllabus
Introduction to Data Science and roles in the field
Data types and tabular data structures
Descriptive statistics: central tendency and dispersion
Univariate visualization: histograms, KDE, boxplots, outliers, skew
Bivariate analysis: scatter and pair plots, linear and non-linear relationships
Correlation: Pearson and Spearman, strength vs significance
Categorical data: contingency tables, proportions, grouped summaries
Exploratory Data Analysis (EDA) pipelines and data storytelling
Machine learning essentials: supervised vs unsupervised, features and labels, data splitting, cross-validation, sklearn pipelines
Linear regression: model fit, residuals, R², overfitting, interpretation
Logistic regression: binary classification, sigmoid, log-odds, ROC, accuracy, F1
Clustering: k-means, elbow method, intro to hierarchical clustering, applications
Model validation and evaluation
Statistical thinking. Review and integration of concepts
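
As an illustration of the correlation topic listed above, the following is a minimal sketch, written from scratch in the spirit of the module's reference text, that contrasts Pearson and Spearman correlation on a monotonic but non-linear relationship. The function names (`pearson`, `ranks`, `spearman`) and the `hours` and `score` data are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation: co-deviation scaled by both spreads."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: score grows as the cube of hours (monotonic, non-linear).
hours = [1, 2, 3, 4, 5, 6]
score = [h ** 3 for h in hours]

print("Pearson: ", round(pearson(hours, score), 3))
print("Spearman:", round(spearman(hours, score), 3))
```

Because the relationship is perfectly monotonic, the Spearman coefficient reaches 1, while the Pearson coefficient stays below 1 because the relationship is not linear. Neither coefficient says anything about statistical significance, which requires a separate test.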
-
Objectives
Characterize data types and structures in tabular datasets
Compute and interpret descriptive statistics
Create and interpret univariate and bivariate visualizations
Conduct structured exploratory data analysis and communicate findings
Build and evaluate baseline models including linear and logistic regression
Apply unsupervised methods such as k-means clustering
Select and interpret evaluation metrics, understanding bias-variance trade-offs
Connect statistical thinking concepts to machine learning practice
-
Teaching methodologies and assessment
The module integrates innovative teaching methods such as vibe-coding: hands-on programming sessions in which the lecturer and students explore solutions in real time, fostering intuition and creativity in code. Assessment focuses on the process rather than solely on the final product, valuing individual progress, experimentation, and problem-solving skills. The approach is dialogical: it encourages continuous interaction between lecturer and students and supports the development of critical thinking through debate and shared reflection.
-
References
Grus, J. (2019). Data science from scratch: First principles with Python (2nd ed.). O'Reilly Media.
-
Office Hours
-
Mobility
No