# Importance of Regression

Regression is a statistical method for modelling the relationship between variables. It helps us understand how one variable changes as another changes, and how one variable can be used to predict another. This post explores regression in more detail and why it is used.

Regression models are made up of two kinds of variable: dependent and independent. The dependent variable is the output (y) we want to predict. This could be a mortgage repayment, or the number of children in an elementary school classroom.

The independent variables measure what that outcome depends on. These could be a home's distance from your place of work, or how population changes as you get closer to a city centre. A good example of this is a basic rent formula for students:

rent = k * room size + β*(number of people) + ε

In this example, rent is the dependent variable, and room size and the number of people are the independent variables. k and β are coefficients: k is how much the rent changes for each extra unit of room size, and β is how much it changes for each extra person, with the other variable held fixed. ε is the error term. It represents any variation in rents that can't be accounted for by our formula.

Distance to the city centre is a good example of what ends up in ε. Rents tend to be highest at the city centre and fall as we move further away, but distance is not in the formula, so its effect shows up as unexplained variation. Adding distance as a third independent variable, with its own coefficient, would move that effect out of ε and into the model, and we would expect that coefficient to be negative. The formula also has no constant term. Adding one (often written α) would capture a baseline rent that does not depend on room size or the number of people.
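To make the rent formula concrete, here is a minimal sketch of how k and β could be estimated by least squares. The listings are made up for illustration, and `fit_rent_model` is a hypothetical helper, not something from a library. It solves the two normal equations for a model with two independent variables and no constant term, matching the formula above.

```python
# Made-up listings for illustration: (room size in m², number of people, monthly rent).
listings = [
    (10, 1, 255),
    (12, 2, 337),
    (15, 1, 352),
    (18, 3, 506),
    (20, 2, 501),
    (25, 4, 699),
]


def fit_rent_model(rows):
    """Least-squares estimates of k and beta in rent = k*size + beta*people + error."""
    # Sums that make up the normal equations for a two-variable, no-intercept model.
    s_ss = sum(size * size for size, _, _ in rows)
    s_sp = sum(size * people for size, people, _ in rows)
    s_pp = sum(people * people for _, people, _ in rows)
    s_sr = sum(size * rent for size, _, rent in rows)
    s_pr = sum(people * rent for _, people, rent in rows)

    det = s_ss * s_pp - s_sp * s_sp
    if det == 0:
        raise ValueError("room size and number of people are perfectly collinear")

    # Cramer's rule on the 2x2 system.
    k = (s_pp * s_sr - s_sp * s_pr) / det
    beta = (s_ss * s_pr - s_sp * s_sr) / det
    return k, beta


k, beta = fit_rent_model(listings)

# The residuals are the part of each rent the fitted formula does not explain (ε).
residuals = [rent - (k * size + beta * people) for size, people, rent in listings]

print(f"k = {k:.2f}, beta = {beta:.2f}")
print("residuals:", [round(r, 1) for r in residuals])
```

For real data a library routine such as NumPy's `numpy.linalg.lstsq` would be the more usual choice. The hand-written version is only meant to show what the fit is doing.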

Regression is also used when we want to find out how much one variable affects another. For example, if you have a classroom full of students whose exam results range from C's to straight A's, you might ask whether a student's results in this class are related to how well they do in other classes.

It's possible that results in other classes depend partly on results in this one, but they are also affected by chance and by other variables, so we would use regression to see whether there is a relation between the two and how strong it is.
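As a rough illustration of checking for a relation between two factors, the sketch below fits a straight line to exam scores in two classes. The scores are invented and `simple_regression` is a hypothetical helper. The slope estimates how much the second score moves per point of the first, and R² estimates the share of variation the line explains, with the rest left to chance or other variables.

```python
from statistics import mean

# Invented scores for eight students in two classes.
class_a = [55, 60, 65, 70, 75, 80, 85, 90]
class_b = [58, 59, 68, 66, 77, 79, 83, 92]


def simple_regression(x, y):
    """Fit y = intercept + slope * x by least squares and return (slope, intercept, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # For a single predictor, R² equals the squared correlation between x and y.
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return slope, intercept, r_squared


slope, intercept, r_squared = simple_regression(class_a, class_b)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}, R² = {r_squared:.3f}")
```

A slope near zero or a low R² would suggest that scores in one class say little about scores in the other. A strong fit still would not show that one causes the other.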

Regression can be messy and complex to use, but it is useful for understanding how changes in the independent variables affect the predicted output. We use regression models when we want to know what the relationship between our independent and dependent variables looks like, how strong it is, and whether it changes over time.
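The idea of seeing how a change in a variable affects the predicted output can be sketched with the student rent formula. The coefficient values here (`K` and `BETA`) are assumptions chosen for illustration, not estimates from real data.

```python
# Assumed coefficients for illustration only.
K = 20.0     # rent per unit of room size
BETA = 50.0  # rent per additional person


def predict_rent(room_size, people):
    """Predicted rent from the formula rent = k * room size + beta * people (error term omitted)."""
    return K * room_size + BETA * people


baseline = predict_rent(room_size=20, people=2)
bigger_room = predict_rent(room_size=25, people=2)

# With people held fixed, the change in prediction is k times the change in room size.
print("baseline:", baseline)
print("change from 5 extra units of room size:", bigger_room - baseline)
```

In a linear model like this one, the effect of a change in one variable does not depend on the values of the others, which is what makes the coefficients easy to read.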
