Interpreting diagrams for OCR GCSE Maths
This page covers the following topics:
2. Line of best fit
Scatter graphs can be used to investigate the correlation between the two variables they represent. The variables can have a positive, negative or no correlation, depending on the direction of the pattern the scatter points follow. Two variables are positively correlated if they increase together, and negatively correlated if one decreases as the other increases. Even when two variables are correlated, it does not necessarily mean that one causes the other, as there may be a third variable that affects both. Thus, correlation does not imply causation.
The line of best fit is a straight line drawn as close as possible to all of the scatter points, with roughly equal numbers of points on either side of it. It does not have to pass through any of the points. Its gradient should follow the overall trend of the scatter points. The line of best fit can then be used to make predictions about the values of the variables.
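At GCSE the line of best fit is usually drawn by eye. As a rough illustration only, the sketch below uses a different technique, the least-squares method, to calculate a line of best fit. The function name `least_squares_line` and the data values are made up for this example and do not come from the page.

```python
def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line always passes through the mean point
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Made-up illustrative data: hours studied and percentage grade
hours = [1, 2, 3, 4, 5]
grades = [23, 25, 31, 33, 38]

gradient, intercept = least_squares_line(hours, grades)
print(round(gradient, 2), round(intercept, 2))
```

A line drawn by eye through the same points should have a similar gradient and intercept, although it will rarely match the calculated line exactly.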
A line of best fit can be used to predict the value of one variable given the other. The stronger the correlation between the two variables, the more reliable these predictions will be.
The line of best fit of a scatter plot can be used for interpolation. Interpolation is the process of estimating the value of the dependent variable for a value of the independent variable that has no scatter point of its own but lies within the range of the recorded data.
Extrapolation is the same process applied to a value of the independent variable that lies outside the recorded range. Since there is no data to show that the trend continues beyond this range, estimates made by extrapolation are less reliable than those made by interpolation.
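The difference between the two comes down to whether the x-value lies inside the recorded range. A minimal sketch of that check, where the function name `estimate_kind` and the recorded values are assumptions made for illustration:

```python
def estimate_kind(x, recorded_xs):
    """Say whether an estimate at x is interpolation or extrapolation."""
    if min(recorded_xs) <= x <= max(recorded_xs):
        return "interpolation"
    return "extrapolation"


# Made-up recorded x-values (for example, hours studied)
recorded_hours = [1, 2, 3, 4, 5]

kind_inside = estimate_kind(3.5, recorded_hours)   # within 1 to 5
kind_outside = estimate_kind(9, recorded_hours)    # beyond the largest value
print(kind_inside, kind_outside)
```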
Can predictions be made for the values of two variables which are very weakly correlated?
Predictions can be made for these variables. However, since their correlation is very weak, the predictions will not be very reliable.
The line of best fit between the number of hours spent studying by a student in a week and their percentage grade is given by the equation y = 4x + 18. Explain what this says about the correlation between the two variables.
The positive gradient of the line of best fit means that the two variables are positively correlated. The gradient of 4 means that for every additional hour the student spends studying, their predicted grade increases by 4 percentage points. The y-intercept of 18 means that a student who does no studying in a week is predicted to get a grade of 18%.
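The meaning of the gradient and the y-intercept can be checked by substituting values into y = 4x + 18. A short sketch, where the function name `predicted_grade` is chosen only for this illustration:

```python
def predicted_grade(hours):
    """Line of best fit from the question: y = 4x + 18."""
    return 4 * hours + 18


# y-intercept: the predicted grade when no hours are studied
grade_with_no_study = predicted_grade(0)

# Gradient: the change in predicted grade for one extra hour of study
rise_per_hour = predicted_grade(6) - predicted_grade(5)

print(grade_with_no_study, rise_per_hour)
```

Any two x-values that are one hour apart give the same difference, because the gradient of a straight line is constant.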
Describe what you would expect the relationship between the number of ice creams sold and the number of flip-flops sold to be.
The two variables are expected to be positively correlated. However, this is unlikely to be because an increase in the sales of one causes the sales of the other to increase. The positive correlation is better explained by the fact that in summer, the demand for both ice cream and flip-flops increases. Thus, although the two variables are positively correlated, one does not cause the other: a third variable, the season, affects both of them and causes them to be correlated.
Find the predicted value of y for when x is 10, given that the equation of the line of best fit for the scatter plot for the two variables is y = 6x + 27.
y = 6(10) + 27 = 87.
Using the given scatter plot, find an estimate for the value of y when x is 10.
The equation of the line of best fit must be found first. The line passes through the points (0, 5) and (12, 10), so gradient = (10 − 5)/(12 − 0) = 5/12. Since the y-intercept is 5, the equation of the line of best fit is y = (5/12)x + 5. When x = 10, y = (5/12)(10) + 5 = 55/6 ≈ 9.17.
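The working above can be checked with exact fractions. The sketch below assumes the two points on the line are (0, 5) and (12, 10), as implied by the gradient calculation. The function name `line_through` is made up for this illustration.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (gradient, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    gradient = Fraction(y2 - y1, x2 - x1)
    # Rearranging y = mx + c at the first point gives c = y1 - m * x1
    intercept = y1 - gradient * x1
    return gradient, intercept


gradient, intercept = line_through((0, 5), (12, 10))
estimate = gradient * 10 + intercept

print(gradient, intercept, estimate)
```

Using `Fraction` rather than decimals keeps the answer in the same exact form as the written working.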