Paradigms and Tasks
- Supervised Learning
  - Regression
  - Classification
- Unsupervised Learning
  - Anomaly Detection
  - Clustering
  - Density Estimation
  - Dimension Reduction
- Reinforcement Learning
Supervised Learning
In supervised learning, the available columns are separated into input columns and a single output column. The goal of supervised learning is to predict the output, given the inputs.
By convention, we use \(x\) for input, and \(y\) for output. Beyond that, there are several additional names given to the input and output.
- Input: Features, Predictors, Independent Variables
- Output: Target, Response, Dependent Variable
In machine learning, and in CS 307, there is a preference for using features and target.[1]
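To make the split between features and target concrete, here is a minimal sketch in plain Python. The toy housing rows, the column names (`sqft`, `bedrooms`, `price`), and the helper `split_features_target` are hypothetical and exist only for illustration; in practice a data frame library would usually handle this step.

```python
# Hypothetical toy dataset: each dict is one row, each key is one column.
rows = [
    {"sqft": 850, "bedrooms": 2, "price": 120_000},
    {"sqft": 1200, "bedrooms": 3, "price": 185_000},
    {"sqft": 1500, "bedrooms": 3, "price": 210_000},
]


def split_features_target(rows, target_name):
    """Separate the columns into features (x) and a single target (y)."""
    # Every column except the target is treated as a feature.
    feature_names = [name for name in rows[0] if name != target_name]
    X = [[row[name] for name in feature_names] for row in rows]
    y = [row[target_name] for row in rows]
    return feature_names, X, y


feature_names, X, y = split_features_target(rows, target_name="price")
print(feature_names)  # names of the feature columns
print(X)              # one list of feature values per row
print(y)              # one target value per row
```

Here `X` holds the inputs and `y` holds the output, matching the \(x\) and \(y\) convention above.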
What distinguishes regression from classification? The (statistical) data type of the target.
Regression
The regression task is used to predict a numeric target.
Classification
The classification task is used to predict a categorical target.
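As a rough illustration of how the target's data type determines the task, the sketch below guesses the task from the Python type of the target values. The function `infer_task` is a hypothetical helper, and the heuristic is an assumption that can fail: categories stored as integer codes (for example 0 and 1) would be mistaken for a numeric target, because what matters is the statistical data type, not how the values happen to be stored.

```python
from numbers import Number


def infer_task(y):
    """Guess the supervised learning task from the target values (heuristic)."""
    # bool is a subclass of int in Python, so exclude it explicitly:
    # True/False labels are categorical, not numeric.
    is_numeric = all(
        isinstance(value, Number) and not isinstance(value, bool) for value in y
    )
    return "regression" if is_numeric else "classification"


print(infer_task([120_000.0, 185_000.0, 210_000.0]))  # numeric target
print(infer_task(["spam", "not spam", "spam"]))        # categorical target
```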
Footnotes
[1] “Predictors” for input would quickly become confusing given that prediction is such a core concept in machine learning. “Independent” and “dependent” variables are simply bad terminology, as “independent” has a very specific meaning in statistics and machine learning.