This memo presents the results of the correlation and regression analyses of a sample. The variables investigated are the weight and height of 20 females.
Sample Identification and Data Collection
The data for the sample were collected through an online survey.
The number of observations in the sample is 20.
Variables: weight and height.
I asked my female friends and acquaintances to report their height and weight. Once I had received 20 responses, I converted the data to common units (some respondents gave their height in centimeters, others in feet) and built a table.
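As a rough illustration of the unit-conversion step, the sketch below converts heights reported in centimeters to decimal feet. The function name `to_feet` and the three responses are assumptions made for this example only; they are not the survey data.

```python
# One foot is defined as exactly 30.48 cm.
CM_PER_FOOT = 30.48

def to_feet(value, unit):
    """Convert a height given in 'cm' or 'ft' to decimal feet."""
    if unit == "cm":
        return value / CM_PER_FOOT
    if unit == "ft":
        return value
    raise ValueError(f"unsupported unit: {unit}")

# Hypothetical responses for illustration (NOT the survey data): (height, unit)
responses = [(165.0, "cm"), (5.4, "ft"), (170.0, "cm")]

# Bring every response to the same unit, rounded to two decimals.
heights_ft = [round(to_feet(value, unit), 2) for value, unit in responses]
print(heights_ft)
```

Note that the code uses a decimal point, whereas the memo writes decimals with a comma.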
The scatter plot
The data also allow us to determine the mean height and weight for the sample.
The mean height is 5,35 ft.
The mean weight is 58,1 kg.
Now we can compare our results with the findings of the Centers for Disease Control and Prevention (CDC). According to the CDC, the average height of adult women aged 20 and older in 2003 to 2006 was 63,8 inches, or 5,32 ft, and their average weight was 164,7 pounds, or 74,7 kg. The mean height in my sample (5,35 ft) is close to the CDC figure, but the mean weight (58,1 kg) is considerably lower. I think the weights differ because the females in my 20-case sample were all 20 to 25 years old. The difference may therefore reflect the fact that younger women tend to weigh less than older women, although this assumption needs further investigation and separate research.
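The unit conversions behind this comparison can be checked with a few lines of Python. This is only a sketch: the variable names are assumptions, and the figures are the ones quoted in this memo.

```python
INCHES_PER_FOOT = 12
KG_PER_POUND = 0.45359237  # exact definition of the avoirdupois pound

# CDC figures as quoted in the memo
cdc_height_ft = 63.8 / INCHES_PER_FOOT
cdc_weight_kg = 164.7 * KG_PER_POUND

# Sample means as reported in the memo
sample_height_ft = 5.35
sample_weight_kg = 58.1

print(f"CDC height: {cdc_height_ft:.3f} ft")
print(f"CDC weight: {cdc_weight_kg:.1f} kg")
print(f"height gap (sample - CDC): {sample_height_ft - cdc_height_ft:.3f} ft")
print(f"weight gap (sample - CDC): {sample_weight_kg - cdc_weight_kg:.1f} kg")
```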
The correlation between sample height and weight
Using Microsoft Excel functions, I found a positive correlation between height and weight. According to my data, the correlation coefficient between the two variables is 0,84953, which indicates a strong positive linear relationship. A positive correlation means that the two variables move in tandem: when one increases, the other tends to increase as well, and vice versa. In our case this means that, on average, taller people weigh more than shorter people.
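For readers who want to see what such a spreadsheet function computes, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The helper name `pearson_r` and the five data points are assumptions chosen for illustration; they are not the 20 survey responses, so the result will not equal 0,84953.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Placeholder values for illustration only; NOT the survey responses.
heights_ft = [5.0, 5.2, 5.4, 5.6, 5.8]
weights_kg = [50.0, 52.0, 60.0, 63.0, 70.0]

r = pearson_r(heights_ft, weights_kg)
print(f"r = {r:.3f}")
```

A value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.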
The regression analysis
I used the statistical program Statistica 7 to run the regression analysis.
The resulting regression equation is
Y = -102,240 + 29,962*X, where Y is weight in kilograms, X is height in feet.
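The intercept and slope of such an equation come from an ordinary least-squares fit. The sketch below shows the textbook formulas for a single predictor; it is not a description of how Statistica computes them internally. The function name `fit_line` and the data are assumptions for illustration only, not the survey responses.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Placeholder values for illustration only; NOT the survey responses.
heights_ft = [5.0, 5.2, 5.4, 5.6, 5.8]
weights_kg = [50.0, 52.0, 60.0, 63.0, 70.0]

intercept, slope = fit_line(heights_ft, weights_kg)
print(f"Y = {intercept:.3f} + {slope:.3f} * X")
```

Because a least-squares line passes through the point of means, the memo's equation can be sanity-checked against the sample means: -102,240 + 29,962*5,35 is approximately 58,06 kg, which agrees with the reported mean weight of 58,1 kg.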
Using this equation, I predicted the average weight for females who are:
a) 4,3 ft tall:
Y = -102,240 + 29,962*4,3 = 26,60 kg;
b) 5,7 ft tall:
Y = -102,240 + 29,962*5,7 = 68,54 kg;
c) 6,2 ft tall:
Y = -102,240 + 29,962*6,2 = 83,52 kg;
d) 8,9 ft tall:
Y = -102,240 + 29,962*8,9 = 164,42 kg.
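The four predictions can be recomputed directly from the regression equation. In this sketch the coefficients are taken from the memo, while the function name `predict_weight_kg` is an assumption.

```python
# Coefficients of the regression equation reported in the memo
INTERCEPT = -102.240  # kg
SLOPE = 29.962        # kg per foot of height

def predict_weight_kg(height_ft):
    """Predicted weight in kg for a height given in decimal feet."""
    return INTERCEPT + SLOPE * height_ft

for height_ft in (4.3, 5.7, 6.2, 8.9):
    print(f"{height_ft} ft -> {predict_weight_kg(height_ft):.2f} kg")
```

A height such as 8,9 ft lies far outside any plausible range of observed heights, so the corresponding prediction is an extrapolation and should be treated with caution.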
The results of the predictions are presented in a table:
The coefficient of determination R² in our regression is 0,72, or 72%. This means that 72% of the variation in the dependent variable (Y, weight in our case) can be explained by the variation in the independent variable (X, height in our case), and the remaining 28% is due to other factors.
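In a simple linear regression with one independent variable, R² equals the square of the Pearson correlation coefficient, so the two reported figures can be cross-checked. A minimal sketch, using the correlation value from this memo (the variable names are assumptions):

```python
# Correlation coefficient reported in the memo
r = 0.84953

# With a single predictor, the coefficient of determination is r squared.
r_squared = r ** 2
unexplained = 1 - r_squared

print(f"R^2 = {r_squared:.4f}")
print(f"unexplained share = {unexplained:.4f}")
```

Rounded to two decimals, this agrees with the 72% and 28% quoted in the text.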
To conclude, correlation and regression analyses are widely used techniques in statistics and econometrics. While conducting this research I learnt how to choose a sample and how to compare the sample results with a national statistical survey. In carrying out the regression analysis I also learnt how to use a regression equation for predictions and what minimum requirements (normal distribution, linear independence, homoscedasticity) should be met for a regression analysis to be valid.