Regression Technique

In statistics and machine learning, regression analysis is a procedure for assessing relationships between variables. It comprises several methods for modeling and analyzing multiple variables, where the goal is to examine the association between a dependent (i.e. response) variable and one or more independent (i.e. predictor) variables. Specifically, regression analysis can identify which of the independent variables are closely associated with the dependent variable, and can reveal the forms of these relationships.

Regression is used for numeric prediction and forecasting. Numeric prediction is the task of predicting continuous (i.e. ordered) values for a given input. For example, we may want to predict the salary of an IT professional with 5 years of work experience, or the price of a new house. Regression essentially tries to determine a function that represents the data with the least possible error. Regression analysis is a good choice when all of the independent variables are continuous-valued as well.

We discuss three types of regression analysis here:

1. Simple Linear Regression

2. Multiple Linear Regression

3. Polynomial Regression

They are described below, one by one.

 
1. Simple Linear Regression

Simple linear regression, also called straight-line regression, involves a dependent variable, y, and a single independent variable, x. This is the simplest form of regression, and it models y as a linear function of x. That is,

y = w0 + w1x    ———> (1)

where w0 and w1 are regression coefficients specifying the Y-intercept and slope of the line, respectively.

These coefficients can be solved for using the method of least squares, which estimates the best-fitting straight line as the one that minimizes the error between the actual data and the estimate of the line.
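As an illustrative sketch, the following Python code (using NumPy, with a small made-up dataset of years of experience versus salary; none of these numbers come from the text) computes the least-squares estimates of w0 and w1 in closed form:

import numpy as np

# Hypothetical data: years of experience (x) and salary in $1000s (y)
x = np.array([3.0, 8.0, 9.0, 13.0, 3.0, 6.0, 11.0, 21.0, 1.0, 16.0])
y = np.array([30.0, 57.0, 64.0, 72.0, 36.0, 43.0, 59.0, 90.0, 20.0, 83.0])

# Closed-form least-squares estimates for equation (1), y = w0 + w1*x:
# the slope w1 is the covariance of x and y divided by the variance of x,
# and the intercept w0 makes the line pass through the sample means.
w1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w0 = y.mean() - w1 * x.mean()

print(f"Fitted line: y = {w0:.2f} + {w1:.2f} x")
# Predict the salary for 5 years of work experience
print(f"Prediction at x = 5: {w0 + w1 * 5:.2f}")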

 
2. Multiple Linear Regression

Multiple linear regression is an extension of straight-line regression that involves more than one independent variable. An example of a multiple linear regression model based on two independent variables, x1 and x2, is given by

y = w0 + w1x1 + w2x2    ———> (2)

The method of least squares can be applied here to solve for w0, w1, and w2.
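As a minimal sketch of solving equation (2) by least squares, the NumPy code below (with made-up values for x1, x2, and y) builds a design matrix whose leading column of ones corresponds to the intercept w0:

import numpy as np

# Hypothetical data: two predictors (x1, x2) per row, and a response y
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0],
              [5.0, 2.5]])
y = np.array([5.1, 5.9, 8.2, 11.0, 12.3])

# Prepend a column of ones so the intercept w0 is estimated as well
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution: the w that minimizes ||A w - y||^2
w, *_ = np.linalg.lstsq(A, y, rcond=None)
w0, w1, w2 = w
print(f"y = {w0:.2f} + {w1:.2f} x1 + {w2:.2f} x2")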

 
3. Polynomial Regression 

Polynomial regression is useful when the relationship between the dependent variable and a single independent variable is nonlinear. The relationship can then be modeled by adding polynomial terms to the basic linear model. By applying transformations to the variables, we can convert the nonlinear model into a linear one. Let us consider a cubic polynomial relationship given by

y = w0 + w1x + w2x^2 + w3x^3   ———> (3)

To convert this equation to linear form, we define new variables x1, x2, x3 such that:

x1 = x      x2 = x^2      x3 = x^3   ———> (4)

Equation (3) can then be converted to linear form by applying the above assignments, resulting in the equation 

y = w0 + w1x1 + w2x2 + w3x3    ———> (5)

which is easily solved by the method of least squares.
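To make the transformation concrete, the sketch below (again NumPy, with hypothetical data not taken from the text) builds one design-matrix column per new variable in equation (4) and then solves the linear model (5) by least squares:

import numpy as np

# Hypothetical data following a roughly cubic trend
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-7.8, -0.9, 1.1, 1.2, 3.1, 13.0])

# Transformation (4): x1 = x, x2 = x^2, x3 = x^3, one column each,
# plus a leading column of ones for the intercept w0
A = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])

# Solve the now-linear model (5) by least squares
w, *_ = np.linalg.lstsq(A, y, rcond=None)
w0, w1, w2, w3 = w
print(f"y = {w0:.2f} + {w1:.2f} x + {w2:.2f} x^2 + {w3:.2f} x^3")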

Note that polynomial regression is a special case of multiple linear regression. That is, adding higher-order terms such as x^2, x^3, and so on, which are simple functions of the single variable x, can be considered equivalent to adding new independent variables.