2. Linear Regression

Objectives

  • Understand the basic concepts of linear regression.

  • Apply linear regression to real data.

  • Evaluate the performance of a linear regression model.

  • Interpret the results of a linear regression.

Expected time to complete: 4 hours

Linear regression is a fundamental model for regression problems, where we aim to predict a continuous value (a quantitative response) from one or more explanatory variables. It assumes a linear relationship between the input variable(s) \(x\) (or \(\mathbf{x}\) when there are several) and the single output variable \(y\), so it can also be used to understand how the variables are related. In this chapter, we will learn the basic concepts of linear regression, apply it to real data to make predictions, evaluate the performance of the fitted model, and interpret the results.
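
In symbols, the assumed linear relationship can be sketched as follows. The names \(b\) (intercept, or bias), \(w\) and \(w_j\) (slopes, or weights), \(D\) (the number of input variables) and \(\epsilon\) (an error term) are notation assumed here for illustration and are not taken from the text above.

\[
y = b + w x + \epsilon \quad \text{(one input variable)}, \qquad
y = b + \sum_{j=1}^{D} w_j x_j + \epsilon = b + \mathbf{w}^\top \mathbf{x} + \epsilon \quad \text{(several input variables)}.
\]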

We will mainly use the Advertising dataset to explain the key ideas underlying linear regression. It records the sales revenue of a single product in a single market alongside the advertising spend on three channels (TV, radio, and newspaper). The goal is to predict sales revenue from the advertising spend.

Ingredients

  • Input: features of data samples

  • Output: target values of data samples

  • Model: fit a line (or plane/hyperplane) to the training data, then predict the target of a test sample as the value on the fitted line (or plane/hyperplane) at that sample's input

    • Hyperparameter(s): None

    • Parameters: the intercept(s) and slope(s) of the fitted line (or plane/hyperplane), also known as the bias(es) and weight(s), respectively

  • Loss function: the sum of squared vertical distances (residuals) between the training data points and the fitted line (or plane/hyperplane), which the fit minimises

  • Learning algorithm: closed-form analytical solution based on linear algebra
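
The ingredients above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a definitive implementation: the data are synthetic (not the Advertising data), and the names `fit_linear_regression`, `predict`, `true_w` and `true_b` are hypothetical helpers introduced only for this example. The loss is taken to be the sum of squared residuals, the usual choice that admits a direct linear-algebra solution.

```python
import numpy as np

# Synthetic data for illustration only (NOT the Advertising data):
# 50 samples with 2 features, generated from a known linear rule plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 2))
true_w = np.array([2.0, -1.0])   # assumed "true" slopes (weights)
true_b = 3.0                     # assumed "true" intercept (bias)
y = X @ true_w + true_b + rng.normal(0, 0.1, size=50)


def fit_linear_regression(X, y):
    """Return (bias, weights) minimising the sum of squared residuals."""
    # Prepend a column of ones so the bias is learned as an extra weight.
    X1 = np.column_stack([np.ones(len(X)), X])
    # Solve the least-squares problem directly; no iterative training loop.
    theta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return theta[0], theta[1:]


def predict(X, b, w):
    """Read off the value on the fitted plane for each input row."""
    return X @ w + b


b, w = fit_linear_regression(X, y)
y_hat = predict(X, b, w)
loss = np.sum((y - y_hat) ** 2)  # sum of squared residuals on the training data
print("bias:", b, "weights:", w, "loss:", loss)
```

Because the synthetic noise is small, the recovered bias and weights should land close to `true_b` and `true_w`. Note that there is no hyperparameter to tune, which matches the list above.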

System transparency

System logic

  • Condition to produce a given output: to produce an output \(y\), locate this \(y\) value on the fitted line (or plane/hyperplane) and read off the corresponding input \(x\) (or \(\mathbf{x}\)) value
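
For a single input variable, this lookup amounts to inverting the fitted line. The sketch below assumes a hypothetical fitted intercept and slope (the numbers are made up for illustration, not estimated from any dataset), and `input_for_output` is a name introduced only for this example. With several input variables, many different \(\mathbf{x}\) lie on the same output level, so the input is no longer unique.

```python
def input_for_output(y_target, intercept, slope):
    """Invert y = intercept + slope * x for a single input variable."""
    if slope == 0:
        # A flat line gives the same output for every input, so no unique x exists.
        raise ValueError("slope is zero: the output does not depend on the input")
    return (y_target - intercept) / slope


# Hypothetical fitted parameters, chosen only for illustration.
intercept, slope = 7.0, 0.05
x_needed = input_for_output(12.0, intercept, slope)
print(x_needed)
```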