The present report was commissioned by the chief analyst of the Diligent Consulting Group in response to a request from the Loving Organic Foods company. The request was to explore the factors that may motivate customers to increase their spending on organic foods. Preliminary analysis revealed that age might be one of the crucial demographic factors affecting expenditures on organic foods. The present report uses linear regression analysis to assess whether age is a significant predictor of the amount customers spend on organic foods.
Linear Regression Analysis
Linear regression is a method for predicting one (dependent) variable from the values of another (independent) variable. It is a well-established and reliable statistical technique for making predictions (Alexander et al., 2017). Linear regression models are used in a wide variety of areas, including business, biology, medicine, and behavioral studies (Alexander et al., 2017). The central value of the method is that it provides an easily comprehensible way to make predictions without requiring much training. There are two general types of regression analysis: simple linear regression, which includes only one independent variable, and multiple linear regression, which includes two or more independent variables.
The present report uses simple linear regression analysis with “Age” as the independent variable and “Annual Amount Spent on Organic Food” as the dependent variable. The analysis uses a dataset of 124 observations, which is a sufficient sample for a simple regression with one predictor. Linear regression can be performed with various statistical software packages, such as R, SPSS, and Minitab. The present paper uses Excel’s Analysis ToolPak add-in to perform the analysis. The following model was assessed:
y = a + bx, where:
- y = Annual Amount Spent on Organic Food;
- x = Age;
- a = intercept;
- b = regression coefficient.
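The model above can be estimated by ordinary least squares. The following is a minimal Python sketch of that calculation, not the Excel procedure used in this report. The function name `fit_simple_regression` and the five data points are hypothetical and serve only to illustrate the method; they are not taken from the report's dataset.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope b: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept a: the fitted line passes through the point of means.
    a = y_bar - b * x_bar
    return a, b

# Hypothetical illustration data, NOT the report's 124 observations.
ages = [25, 35, 45, 55, 65]
spending = [10500, 10700, 10900, 11100, 11300]

a, b = fit_simple_regression(ages, spending)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Because the illustrative points lie exactly on a straight line, the fitted intercept and slope reproduce that line; real survey data would scatter around the fitted line.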
Table 1 below provides the output of Excel’s regression analysis of annual expenditures on organic food against the age of the participants.
Table 1. Regression analysis output.
|Adjusted R Square||0.00511641|
|Coefficients||Standard Error||t Stat||P-value||Lower 95%||Upper 95%||Lower 95.0%||Upper 95.0%|
Interpretation of the Coefficient of Determination
The coefficient of determination (R-squared) shows how much of the variability in the dependent variable the proposed model can explain (Alexander et al., 2017). In other words, the R-squared coefficient shows how well the model fits the observed data. The coefficient ranges between 0 and 1, where 1 indicates a perfect fit and 0 indicates no fit.
According to Table 1, the R-squared coefficient is approximately equal to 0.013. This implies that the model can explain only about 1.3% of the variation in the dependent variable, which is the annual amount spent on organic food. Thus, the predictive ability of the simple regression model is very low.
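As a consistency check, R-squared can be recovered from the adjusted R-squared reported in Table 1 by inverting the standard adjustment formula. The sketch below assumes one predictor (k = 1) and the 124 observations stated earlier; the function name is hypothetical.

```python
def r_squared_from_adjusted(adj_r2, n, k=1):
    """Invert adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - adj_r2) * (n - k - 1) / (n - 1)

adj_r2 = 0.00511641   # Adjusted R Square from Table 1
n = 124               # number of observations in the dataset

r2 = r_squared_from_adjusted(adj_r2, n)
print(f"R-squared = {r2:.4f} ({r2:.2%} of variance explained)")
```

Under these assumptions the result is roughly 0.013, which agrees with the value read from the regression output.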
Interpretation of the Coefficient Estimate for the Age Variable
The coefficient for the independent variable shows by how much the value of the dependent variable changes with every one-unit increase in the independent variable. In this case, the coefficient shows by how much the predicted expenditures on organic food increase if the age of the respondent increases by one year. The analysis revealed that each additional year of age is associated with an increase of $26.29 in annual expenditures, which is not a notable change.
Interpretation of Statistical Significance
The statistical significance of a coefficient is quantified using the p-value. If the p-value is above the preset level of statistical significance (alpha), the coefficient is considered statistically insignificant. In other words, a large p-value indicates that the independent variable is not a significant predictor of the dependent variable. According to Alexander et al. (2017), common significance levels are 0.1, 0.05, and 0.01. Table 1 shows that the p-value of the Age coefficient is p = 0.203776. The value is above all the commonly used significance levels, which implies that the coefficient is statistically insignificant. Thus, according to the simple linear regression model, the Age variable is not a reliable predictor of the annual amount spent on organic food.
The output provided in Table 1 can be used to construct a linear regression equation for making predictions. The regression equation is as follows:
y = 9778.28 + 26.29x
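The decision rule described above can be sketched in a few lines of Python. The p-value is the one reported for the Age coefficient, and the alpha levels are the common ones named in the text; the variable names are illustrative only.

```python
p_value = 0.203776            # p-value of the Age coefficient (Table 1)
alphas = [0.10, 0.05, 0.01]   # common significance levels

for alpha in alphas:
    # A coefficient is significant at a given level only if p < alpha.
    verdict = "significant" if p_value < alpha else "not significant"
    print(f"alpha = {alpha:.2f}: {verdict}")

# True if the coefficient passes at least one of the levels checked.
any_significant = any(p_value < alpha for alpha in alphas)
```

Since the p-value exceeds even the most lenient level of 0.10, the check fails at every level.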
Regression Equation Utilization
The regression equation above can be used to predict the dependent variable by substituting any value for x. In other words, the annual amount spent on organic food can be estimated by multiplying the age of a person by 26.29 and adding 9778.28. However, when making such predictions, the goodness of fit (coefficient of determination) and the p-value for the Age coefficient should be taken into consideration. For this model, the predictions would be unreliable, as the coefficient of the Age variable is insignificant and the predictive ability of the model is low.
Estimation for an Average Consumer
Assuming that the discussed model provides reliable results, estimations for an average customer can be made. In this case, an average customer can be found by taking the mean age of all participants in the sample. This can be done with the AVERAGE() function in Excel. The calculations showed that the average customer is 48 years old. Thus, the annual amount spent on organic food can be calculated in the following way:
y = 9778.28 + 26.29 × 48 = 11,040.20
The calculations show that an average customer is predicted to spend $11,040.20 per year on organic food. However, this estimate is unreliable due to the low R-squared coefficient and the high p-value for the Age coefficient.
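The substitution can also be sketched in Python. The intercept and slope are the values reported in the text; the function name `predict_spending` is hypothetical, and the caveats about the model's low reliability still apply.

```python
INTERCEPT = 9778.28  # a, intercept reported in the regression output
SLOPE = 26.29        # b, coefficient of the Age variable

def predict_spending(age):
    """Predicted annual spending on organic food for a given age."""
    return INTERCEPT + SLOPE * age

average_age = 48  # mean age of the sample, as stated in the text
estimate = predict_spending(average_age)
print(f"Predicted annual spending at age {average_age}: ${estimate:,.2f}")

# The slope is the change in the prediction per additional year of age.
one_year_change = predict_spending(31) - predict_spending(30)
```

The same function can be applied to any other age, and the difference between two predictions one year apart equals the Age coefficient.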
Alexander, H., Illowsky, B., & Dean, S. (2017). Introductory business statistics. OpenStax.