Paradigms, workflow and data preprocessing
Machine learning studies algorithms that improve automatically from data instead of being explicitly programmed. A task is typically framed as learning a mapping from input features to an output target, and the learned mapping is judged by how well it generalises to unseen data.
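As a minimal sketch of what "learning a mapping" means, the snippet below fits a straight line by ordinary least squares to a few made-up points and then predicts for an input that was not in the training data. The data values and the names fit_line and predict are illustrative assumptions, not part of the unit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance sums (unnormalised; the 1/n factors cancel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: input feature -> output target
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [52.0, 61.0, 68.0, 79.0]

# "Learning": the parameters come from the data, not from hand-written rules
slope, intercept = fit_line(train_x, train_y)


def predict(x):
    """Apply the learned mapping to any input, seen or unseen."""
    return slope * x + intercept


# Generalisation: x = 5.0 never appeared in the training data
unseen_x = 5.0
prediction = predict(unseen_x)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

Whether such a prediction is any good can only be checked against held-out data, which is why the workflow keeps a separate test set.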
Supervised learning uses labelled examples for regression and classification, unsupervised learning discovers structure in unlabelled data through clustering and dimensionality reduction, and reinforcement learning learns a policy from the rewards an agent receives while interacting with an environment.
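The difference between the first two paradigms can be sketched on the same toy numbers: once with labels (a nearest-neighbour classifier) and once without (a two-cluster k-means). This is only an illustration under assumed data; the function names are hypothetical, and reinforcement learning is left out because it needs an environment to interact with.

```python
# Supervised: each example carries a label, and the goal is to predict
# the label of a new point.
labelled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]


def nearest_neighbour_label(x, examples):
    """Return the label of the training example closest to x (1-NN)."""
    _, label = min(examples, key=lambda ex: abs(ex[0] - x))
    return label


# Unsupervised: the same features with the labels thrown away; the goal
# is to find groups in the data.
def two_means_1d(values, iterations=10):
    """Tiny k-means with k=2 on 1-D data, initialised at the min and max."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centre
        groups = [[], []]
        for v in values:
            idx = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[idx].append(v)
        # Update step: move each centre to the mean of its group
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


predicted = nearest_neighbour_label(2.0, labelled)
centres, groups = two_means_1d([x for x, _ in labelled])
print(predicted)
print(centres, groups)
```

The classifier can name its output because it was given names to learn from; the clustering only reports that two groups exist and leaves their interpretation to the analyst.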
A project flows through data collection, cleaning, feature engineering, a train/test split, model training, evaluation and deployment. Before training, numerical features are scaled and categorical features are encoded, because many algorithms expect numeric inputs on comparable scales.
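The splitting and preprocessing steps might look like the following standard-library sketch. The records, column meanings and the 25% test fraction are made-up assumptions. One design choice is worth noting: the scaling statistics and the category list are computed from the training rows only, so nothing about the test rows leaks into preprocessing.

```python
import random
from statistics import mean, pstdev

# Hypothetical records: (age, city, label)
rows = [
    (22, "Paris", 0), (35, "Rome", 1), (47, "Paris", 1), (51, "Oslo", 0),
    (29, "Rome", 0), (63, "Oslo", 1), (38, "Paris", 1), (44, "Rome", 0),
]

# 1. Train/test split: shuffle with a fixed seed, hold out 25% for testing
rng = random.Random(0)
shuffled = rows[:]
rng.shuffle(shuffled)
n_test = len(shuffled) // 4
test_rows, train_rows = shuffled[:n_test], shuffled[n_test:]

# 2. Scaling: standardisation statistics come from the training rows only
train_ages = [r[0] for r in train_rows]
age_mean, age_std = mean(train_ages), pstdev(train_ages)

# 3. Encoding: the one-hot vocabulary also comes from the training rows only
cities = sorted({r[1] for r in train_rows})


def preprocess(row):
    """Turn one raw record into (feature_vector, label)."""
    age, city, label = row
    scaled_age = (age - age_mean) / age_std
    # A city never seen in training becomes an all-zero vector
    one_hot = [1.0 if city == c else 0.0 for c in cities]
    return [scaled_age] + one_hot, label


train = [preprocess(r) for r in train_rows]
test = [preprocess(r) for r in test_rows]

print(len(train), len(test))
print(train[0])
```

After this step every feature vector is purely numeric, and the age column has zero mean and unit variance on the training set.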
This unit defined machine learning, separated the three learning paradigms, and walked through the standard workflow with the preprocessing and data-splitting steps that precede modelling.