Regression Technique

In statistics and machine learning, regression analysis is a procedure for assessing relationships between variables. It comprises several methods for modeling and analyzing multiple variables, where the goal is to examine the association between a dependent (i.e. response) variable and one or more independent (i.e. predictor) variables. Specifically, regression analysis can identify which of the independent variables are closely associated with the dependent variable, and can reveal the forms of these relationships.

Regression is used for numeric prediction and forecasting. Numeric prediction is the task of predicting continuous (i.e. ordered) values for a given input. For example, we may want to predict the salary of an IT professional with 5 years of work experience, or the price of a new house. Regression essentially tries to determine a function that represents the data with the least possible error. Regression analysis is a good choice when all of the independent variables are continuous-valued as well.

We discuss three types of regression analysis here:

1. Simple Linear Regression

2. Multiple Linear Regression

3. Polynomial Regression

They are described below, one by one.

 
1. Simple Linear Regression

Simple linear regression, also called straight-line regression, involves a dependent variable, y, and a single independent variable, x. This is the simplest form of regression, and it models y as a linear function of x. That is,

y = w0 + w1x    ———> (1)

where w0 and w1 are regression coefficients specifying the Y-intercept and slope of the line, respectively.

These coefficients can be solved for using the method of least squares, which estimates the best-fitting straight line as the one that minimizes the error between the actual data and the estimate of the line.
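As an illustrative sketch, the following Python code (using NumPy, with a small made-up dataset of years of experience versus salary; none of these numbers come from the text) computes the least-squares estimates of w0 and w1 in closed form:

import numpy as np

# Hypothetical data: years of experience (x) and salary in $1000s (y)
x = np.array([3.0, 8.0, 9.0, 13.0, 3.0, 6.0, 11.0, 21.0, 1.0, 16.0])
y = np.array([30.0, 57.0, 64.0, 72.0, 36.0, 43.0, 59.0, 90.0, 20.0, 83.0])

# Closed-form least-squares estimates for equation (1), y = w0 + w1*x:
# the slope w1 is the covariance of x and y divided by the variance of x,
# and the intercept w0 makes the line pass through the sample means.
w1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w0 = y.mean() - w1 * x.mean()

print(f"Fitted line: y = {w0:.2f} + {w1:.2f} x")
# Predict the salary for 5 years of work experience
print(f"Prediction at x = 5: {w0 + w1 * 5:.2f}")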

 
2. Multiple Linear Regression

Multiple linear regression is an extension of straight-line regression that involves more than one independent variable. An example of a multiple linear regression model based on two independent variables, x1 and x2, is given by

y = w0 + w1x1 + w2x2    ———> (2)

The method of least squares can be applied here to solve for w0, w1, and w2.
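As a minimal sketch of solving equation (2) by least squares, the NumPy code below (with made-up values for x1, x2, and y) builds a design matrix whose leading column of ones corresponds to the intercept w0:

import numpy as np

# Hypothetical data: two predictors (x1, x2) per row, and a response y
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0],
              [5.0, 2.5]])
y = np.array([5.1, 5.9, 8.2, 11.0, 12.3])

# Prepend a column of ones so the intercept w0 is estimated as well
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution: the w that minimizes ||A w - y||^2
w, *_ = np.linalg.lstsq(A, y, rcond=None)
w0, w1, w2 = w
print(f"y = {w0:.2f} + {w1:.2f} x1 + {w2:.2f} x2")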

 
3. Polynomial Regression 

Polynomial regression is useful when the relationship between the dependent variable and a single independent variable is nonlinear. The relationship can then be modeled by adding polynomial terms to the basic linear model. By applying transformations to the variables, we can convert the nonlinear model into a linear one. Let us consider a cubic polynomial relationship given by

y = w0 + w1x + w2x^2 + w3x^3   ———> (3)

To convert this equation to linear form, we define new variables x1, x2, x3 such that:

x1 = x      x2 = x^2      x3 = x^3   ———> (4)

Equation (3) can then be converted to linear form by applying the above assignments, resulting in the equation 

y = w0 + w1x1 + w2x2 + w3x3    ———> (5)

which is easily solved by the method of least squares.
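To make the transformation concrete, the sketch below (again NumPy, with hypothetical data not taken from the text) builds one design-matrix column per new variable in equation (4) and then solves the linear model (5) by least squares:

import numpy as np

# Hypothetical data following a roughly cubic trend
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-7.8, -0.9, 1.1, 1.2, 3.1, 13.0])

# Transformation (4): x1 = x, x2 = x^2, x3 = x^3, one column each,
# plus a leading column of ones for the intercept w0
A = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])

# Solve the now-linear model (5) by least squares
w, *_ = np.linalg.lstsq(A, y, rcond=None)
w0, w1, w2, w3 = w
print(f"y = {w0:.2f} + {w1:.2f} x + {w2:.2f} x^2 + {w3:.2f} x^3")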

Note that polynomial regression is a special case of multiple linear regression. That is, adding higher-order terms such as x^2, x^3, and so on, which are simple functions of the single variable x, can be considered equivalent to adding new independent variables.