linear regression

Algebra

(noun)

An approach to modeling the linear relationship between a dependent variable $y$ and an independent variable $x$.

Related Terms

  • curve fitting
  • least squares approximation
  • outlier
  • mathematical model
Finance

(noun)

In statistics, linear regression is an approach to modeling the relationship between a scalar dependent variable $y$ and one or more explanatory variables denoted $X$.

Related Terms

  • Ordinary least squares regression
  • independent
Statistics

(noun)

An approach to modeling the relationship between a scalar dependent variable $y$ and one or more explanatory variables denoted $X$.

Examples of linear regression in the following topics:

  • The Equation of a Line

    • In statistics, linear regression can be used to fit a predictive model to an observed data set of $y$ and $x$ values.
    • In statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable.
    • Simple linear regression fits a straight line through the set of $n$ points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.
    • Linear regression was the first type of regression analysis to be studied rigorously, and to be used extensively in practical applications.
    • If the goal is prediction, or forecasting, linear regression can be used to fit a predictive model to an observed data set of $y$ and $X$ values.
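The least-squares fit described in the excerpt above, minimizing the sum of squared vertical distances between the data points and the line, can be sketched in a few lines of Python (the helper name `fit_line` and the sample data are illustrative, not from the text):

```python
# Minimal sketch of simple linear regression via least squares.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
print(slope, intercept)
```

The closed-form solution used here is exactly the simple-linear-regression least squares estimator with a single explanatory variable.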
  • Introduction to inference for linear regression

    • In this section we discuss uncertainty in the estimates of the slope and y-intercept for a regression line.
    • However, in the case of regression, we will identify standard errors using statistical software.
    • This video introduces consideration of the uncertainty associated with the parameter estimates in linear regression.
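The standard error of the slope, one of the quantities the excerpt above says statistical software reports, can also be computed by hand as a sketch (the function name and data below are illustrative, not from the text):

```python
import math

# Sketch: the standard error of the slope in simple linear regression,
# a key ingredient for inference about the slope estimate.
def slope_standard_error(xs, ys, slope, intercept):
    n = len(xs)
    mean_x = sum(xs) / n
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)   # sum of squared residuals
    s2 = sse / (n - 2)                    # estimate of the residual variance
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return math.sqrt(s2 / sxx)

# Points lying exactly on y = 1 + 2x give zero residuals, hence SE = 0.
se = slope_standard_error([0, 1, 2, 3], [1, 3, 5, 7], slope=2, intercept=1)
print(se)  # 0.0
```

In practice the software also divides the slope estimate by this standard error to produce the t-statistic used for hypothesis tests.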
  • Evaluating Model Utility

    • Multiple regression is beneficial in some respects, since it can show the relationships between more than just two variables; however, it should not always be taken at face value.
    • It is easy to throw a big data set at a multiple regression and get an impressive-looking output.
    • But many people are skeptical of the usefulness of multiple regression, especially for variable selection, and you should view the results with caution.
    • You should examine the linear regression of the dependent variable on each independent variable, one at a time, examine the linear regressions between each pair of independent variables, and consider what you know about the subject matter.
    • You should probably treat multiple regression as a way of suggesting patterns in your data, rather than rigorous hypothesis testing.
  • Slope and Intercept

    • A simple example is the equation for the regression line: $\hat{y} = b_0 + b_1 x$.
    • Linear regression is an approach to modeling the relationship between a scalar dependent variable $y$ and one or more explanatory (independent) variables denoted $X$.
    • The case of one explanatory variable is called simple linear regression.
    • For more than one explanatory variable, it is called multiple linear regression.
    • (This term should be distinguished from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable).
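Once a slope and intercept have been estimated, prediction is just evaluating the line at a new $x$ value (the coefficient values below are made up for illustration):

```python
# Sketch: predicting y-hat from a fitted slope and intercept.
# b0 (intercept) and b1 (slope) are made-up example values.
b0, b1 = 0.15, 1.94

def predict(x):
    return b0 + b1 * x

print(predict(5))  # 0.15 + 1.94 * 5 = 9.85
```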
  • Polynomial Regression

    • The goal of polynomial regression is to model a non-linear relationship between the independent and dependent variables.
    • Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function $E(y\ | \ x)$ is linear in the unknown parameters that are estimated from the data.
    • For this reason, polynomial regression is considered to be a special case of multiple linear regression.
    • This is similar to the goal of non-parametric regression, which aims to capture non-linear regression relationships.
    • Explain how the linear and nonlinear aspects of polynomial regression make it a special case of multiple linear regression.
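The "special case of multiple linear regression" point above can be made concrete: the powers $1, x, x^2$ act as the multiple predictors, and because the model is linear in the unknown coefficients, ordinary least squares applies directly (a sketch with made-up data that follow $y = x^2$ exactly; `numpy` is assumed available):

```python
import numpy as np

# Sketch: polynomial regression as multiple linear regression.
# The design matrix holds the powers of x as separate predictor columns.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x ** 2  # data lying exactly on y = x^2

X = np.vander(x, 3, increasing=True)            # columns: 1, x, x^2
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares
print(coeffs)  # approximately [0, 0, 1]
```

The fitted curve is nonlinear in $x$, yet the estimation problem solved by `lstsq` is an ordinary linear one, which is exactly the distinction the excerpt draws.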
  • Checking the Model and Assumptions

    • These assumptions are similar to those of standard linear regression models.
    • Linearity.
    • Fortunately, slight deviations from linearity will not greatly affect a multiple regression model.
    • When homoscedasticity is violated, the residuals appear clustered for some predicted values and spread apart for others along the linear regression line, and the mean squared error for the model will be incorrect.
    • Paraphrase the assumptions made by multiple regression models: linearity, homoscedasticity, normality, multicollinearity, and sample size.
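One quick check for the multicollinearity assumption mentioned above is the correlation between a pair of predictors: a very high $|r|$ suggests they are close to linearly dependent (the predictor names and data below are hypothetical; `numpy` is assumed available):

```python
import numpy as np

# Sketch: detecting near-multicollinearity between two predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # nearly 2 * x1

r = np.corrcoef(x1, x2)[0, 1]
print(r)  # very close to 1, flagging near-linear dependence
```

More formal diagnostics (such as variance inflation factors) build on the same idea, but a pairwise correlation is often the first thing examined.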
  • Estimating and Making Inferences About the Slope

    • The purpose of a multiple regression is to find an equation that best predicts the $Y$ variable as a linear function of the $X$ variables.
    • You use multiple regression when you have three or more measurement variables.
    • When the purpose of multiple regression is prediction, the important result is an equation containing partial regression coefficients (slopes).
    • A graphical representation of a best fit line for simple linear regression.
  • Predictions and Probabilistic Models

    • Best-practice advice here is that a linear-in-variables and linear-in-parameters relationship should not be chosen simply for computational convenience, but that all available knowledge should be deployed in constructing a regression model.
    • A scatterplot shows a linear relationship between a quantitative explanatory variable $x$ and a quantitative response variable $y$.
    • A good rule of thumb when using the linear regression method is to look at the scatter plot of the data.
    • This graph is a visual example of why it is important that the data have a linear relationship.
    • Each of these four data sets has the same linear regression line and therefore the same correlation, 0.816.
  • Regression Analysis for Forecast Improvement

    • One can forecast based on linear relationships.
    • Regression analysis is a causal/econometric forecasting method.
    • These methods include both parametric (linear or non-linear) and non-parametric techniques.
    • The predictors are linearly independent, i.e., it is not possible to express any predictor as a linear combination of the others.
    • Familiar methods, such as linear regression and ordinary least squares regression, are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data.
  • Multiple Regression Models

    • Multiple regression is used to find an equation that best predicts the $Y$ variable as a linear function of the multiple $X$ variables.
    • You use multiple regression when you have three or more measurement variables.
    • The purpose of a multiple regression is to find an equation that best predicts the $Y$ variable as a linear function of the $X$ variables.
    • Multiple regression is a statistical way to try to control for this; it can answer questions like, "If sand particle size (and every other measured variable) were the same, would the regression of beetle density on wave exposure be significant?"
    • As you are doing a multiple regression, there is also a null hypothesis for each $X$ variable, meaning that adding that $X$ variable to the multiple regression does not improve the fit of the multiple regression equation any more than expected by chance.
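Fitting such an equation, with $Y$ predicted as a linear function of several $X$ variables, can be sketched by ordinary least squares (the variable names and the "true" coefficients 2.0, 0.5, and -1.5 are made up for illustration; `numpy` is assumed available):

```python
import numpy as np

# Sketch: multiple regression by ordinary least squares.
# The coefficients on X1 and X2 are the partial regression coefficients.
rng = np.random.default_rng(0)
X1 = rng.uniform(0, 10, 50)
X2 = rng.uniform(0, 10, 50)
Y = 2.0 + 0.5 * X1 - 1.5 * X2  # noiseless, so OLS recovers the coefficients

A = np.column_stack([np.ones_like(X1), X1, X2])  # design matrix with intercept
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(b)  # approximately [2.0, 0.5, -1.5]
```

With real (noisy) data the null hypothesis for each $X$ variable would be tested by comparing its estimated coefficient to its standard error, as the excerpt describes.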

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.