Introduction

Regression analysis models the relationship between a dependent variable (Y) and one or more independent variables (X). It's one of the most widely used tools in business analytics for prediction and understanding relationships.


Simple Linear Regression

Y = β₀ + β₁X + ε

β₀ = intercept (value of Y when X = 0), β₁ = slope (change in Y per one-unit change in X), ε = error term

Example

Sales = 100 + 5×Advertising
Base sales (with zero advertising) = ₹100; each additional ₹1 of advertising is associated with ₹5 more in sales
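
To make the example concrete, here is a minimal sketch of fitting a simple linear regression by ordinary least squares in plain Python. The advertising values are made up for illustration, and the sales figures are generated directly from the example equation with no error term, so the fit should recover the intercept and slope exactly. The function name fit_simple_ols is an assumption, not a library function.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical advertising spend in rupees (illustrative values only).
advertising = [0.0, 10.0, 20.0, 30.0, 40.0]
# Sales built from the example equation: Sales = 100 + 5 × Advertising.
sales = [100 + 5 * a for a in advertising]

intercept, slope = fit_simple_ols(advertising, sales)
print(intercept, slope)  # prints: 100.0 5.0
```

With real data the points would not fall exactly on a line, and the estimates would only approximate the underlying intercept and slope.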


Multiple Regression

Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ε

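Multiple regression lets you estimate the effect of each variable while holding the others constant. Coefficients can also indicate the relative importance of variables, but comparing them is only meaningful when the variables are measured on comparable scales.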
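
As a rough sketch of the multiple regression equation in practice, the snippet below fits two predictors with NumPy's least-squares solver. Everything about the data is an assumption made for illustration: the second predictor (price), the "true" coefficients 100, 5 and -2, and the noise level are all invented, and the data are simulated rather than real.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 200

# Simulated predictors (hypothetical ranges).
advertising = rng.uniform(0, 100, n)
price = rng.uniform(10, 50, n)

# Simulated outcome: assumed coefficients 100, 5, -2 plus random error.
noise = rng.normal(0, 1.0, n)
sales = 100 + 5 * advertising - 2 * price + noise

# Design matrix: a column of ones for the intercept, then one column per X.
X = np.column_stack([np.ones(n), advertising, price])

# Least-squares solution for [β₀, β₁, β₂].
coef, _, _, _ = np.linalg.lstsq(X, sales, rcond=None)
b0, b1, b2 = coef

print("intercept:", round(b0, 2))
print("advertising:", round(b1, 2))
print("price:", round(b2, 2))
```

The estimates should land close to the assumed coefficients. Each one is read as the change in sales per unit change in that predictor with the other predictor held fixed.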


Key Assumptions

Assumption             Description
Linearity              The relationship between X and Y is linear
Independence           Errors are independent of one another
Homoscedasticity       Error variance is constant
Normality              Errors are normally distributed
No multicollinearity   Independent variables are not highly correlated with each other
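
One common way to check the multicollinearity assumption is the variance inflation factor (VIF), which the lesson itself does not cover. It is offered here only as a sketch. Each predictor is regressed on the others, and VIF = 1 / (1 − R²) from that regression. Values above roughly 5 to 10 are often treated as a warning sign, though the cutoff is a rule of thumb. The helper names and the simulated predictors x1, x2, x3 are assumptions for illustration.

```python
import numpy as np


def r_squared(X, y):
    """R² of a least-squares fit of y on X (X must include an intercept column)."""
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot


def vif(predictors):
    """VIF for each column of a 2-D array of predictors (no intercept column)."""
    n, k = predictors.shape
    values = []
    for j in range(k):
        # Regress predictor j on all the other predictors.
        others = np.delete(predictors, j, axis=1)
        X = np.column_stack([np.ones(n), others])
        values.append(1.0 / (1.0 - r_squared(X, predictors[:, j])))
    return np.array(values)


rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)  # nearly a copy of x1
x3 = rng.normal(size=300)                  # unrelated to x1 and x2

vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 1))
```

In this simulated setup x1 and x2 should show very large VIFs because they carry almost the same information, while x3 should stay close to 1.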

Interpreting Results

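  • R²: Proportion of the variance in Y explained by the model (0 to 1; higher means a closer fit, not necessarily a better model)
  • p-value: Probability of seeing an effect at least this large if the true coefficient were zero; values below 0.05 are conventionally treated as statistically significant
  • Coefficients: Expected change in Y per one-unit change in X, holding other variables constant

Warning: Correlation ≠ causation. A high R² doesn't prove causality.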
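
The R² definition above can be computed by hand as 1 − (residual sum of squares / total sum of squares). The sketch below does this in plain Python for a small data set. The numbers are made up for illustration and are not taken from any real sales record.

```python
# Hypothetical observations (illustrative values only).
advertising = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [152.0, 198.0, 251.0, 303.0, 346.0]

n = len(advertising)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept for a single predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
sxx = sum((x - mean_x) ** 2 for x in advertising)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fitted values from the estimated line.
predicted = [intercept + slope * x for x in advertising]

# Variation left unexplained by the line vs. total variation in sales.
ss_res = sum((y - p) ** 2 for y, p in zip(sales, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in sales)

r_squared = 1 - ss_res / ss_tot
print(round(slope, 2), round(intercept, 2), round(r_squared, 4))
```

Here R² comes out very close to 1 because the made-up points lie almost on a straight line. That says the line fits these points well, not that advertising causes sales.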

Conclusion

Key Takeaways

  • Regression models the relationship between Y and X
  • Ordinary least squares (OLS) minimizes the sum of squared errors
  • Check assumptions before interpreting
  • R² shows the proportion of variance explained
  • Correlation ≠ causation