Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows?
Regressing y on x, the least squares regression line is y = a + bx:
y = a + bx
y - a = bx
(y - a)/b = x
x = -(a/b) + (1/b)y
So why, for the case of regressing x on y, is the least squares regression line NOT x = -(a/b) + (1/b)y?
Thanks for the help.
I think it's because your "x on y" regression line is simply a rearrangement of the equation of your "y on x" regression line.
The y on x and x on y lines should yield totally different values because your independent and dependent variables are different in each case.
I think it's like that? lol.
Originally posted by Snoopyies: Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows? [...]
y = bx + a
x = -(a/b) + (1/b)y
x = cy + d, where c = 1/b and d = -(a/b)
Originally posted by absol: I think it's because your "x on y" regression line is simply a rearrangement of the equation of your "y on x" regression line. [...]
In the textbook, the linear regression questions are usually given as a table with two rows of X and Y data. The questions then ask for the regression lines for the case of regressing Y on X and the case of regressing X on Y, i.e. in the first case Y is the dependent variable and X is the independent variable, and in the second case X is the dependent variable and Y is the independent variable. The X and Y data are the same in both cases. So:
1. For the case of regressing y on x, the least squares regression line is
y = a + bx
2. For the case of regressing x on y, the least squares regression line is
x = c + dy
Rearranging y = a + bx to express x in terms of y gives
x = -(a/b) + (1/b)y
So why is this incorrect, i.e. why is c not equal to -(a/b) and d not equal to 1/b?
Originally posted by Forbiddensinner: y = bx + a, so x = cy + d, where c = 1/b and d = -(a/b).
They are not the same when I do the questions with actual numbers from the textbook.
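For example, here is a minimal sketch in Python (using statistics.linear_regression, available from Python 3.10). The data below are made up, since the textbook's actual numbers are not quoted in this thread:

    from statistics import correlation, linear_regression

    # Made-up data (not the textbook's numbers), deliberately scattered.
    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]

    b, a = linear_regression(x, y)   # regress y on x:  y = a + b*x
    d, c = linear_regression(y, x)   # regress x on y:  x = c + d*y

    print(a, b)               # about 0.6 and 0.8
    print(c, d)               # about 0.6 and 0.8 (fitted directly)
    print(-a / b, 1 / b)      # about -0.75 and 1.25 (mere rearrangement)
    print(correlation(x, y))  # r = 0.8, not +/- 1

The directly fitted x on y line (c, d) and the rearranged y on x line (-a/b, 1/b) clearly disagree; the rearrangement only matches the fitted line when r = 1 or -1.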
Originally posted by Snoopyies: Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows? [...]
Hi,
It is good to ask why :)
If that were the case, then both lines would coincide.
Regression lines only coincide when r = 1 or -1 and this is rarely the case.
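To see this in formula terms: with the standard results, the gradient of the y on x line is b = r(sy/sx) and the gradient of the x on y line is d = r(sx/sy), where sx and sy are the standard deviations of the data. Multiplying, bd = r^2. The rearranged line x = -(a/b) + (1/b)y has gradient 1/b, and d = 1/b exactly when bd = 1, i.e. when r^2 = 1, i.e. when r = 1 or -1.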
Thanks!
Cheers,
Wen Shih
Hi,
Regress y on x:
1. Use GC, do a scatter plot.
2. Note that gradient = b and vertical intercept = a.
3. Residual error ei = observed value of each yi - each predicted y value = yi - (a + bxi).
4. Using the least squares method to minimise the sum of the squared residual errors, with differentiation and substitution, the formulas for a and b are obtained.
5. When the sum of squared residual errors is not zero, a =/= -cb and b =/= 1/d, i.e. r =/= 1 and r =/= -1.
6. When the sum of squared residual errors is zero, a = -cb and b = 1/d, i.e. r = 1 or r = -1. This means that all the observed y values are equal to the predicted y values.

Regress x on y:
1. Use GC, do a scatter plot.
2. Note that gradient = d and vertical intercept = c.
3. Residual error ei = observed value of each xi - each predicted x value = xi - (c + dyi).
4. Using the least squares method to minimise the sum of the squared residual errors, with differentiation and substitution, the formulas for c and d are obtained.
5. When the sum of squared residual errors is not zero, c =/= -(a/b) and d =/= 1/b, i.e. r =/= 1 and r =/= -1.
6. When the sum of squared residual errors is zero, c = -(a/b) and d = 1/b, i.e. r = 1 or r = -1. This means that all the observed x values are equal to the predicted x values.
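As a quick numerical illustration of steps 3 to 5, here is a minimal Python sketch. It uses the same made-up data as the earlier example, since the textbook's numbers are not quoted in the thread:

    def fit(u, v):
        # Least squares fit of v = intercept + gradient*u,
        # i.e. the formulas obtained in step 4.
        n = len(u)
        u_bar = sum(u) / n
        v_bar = sum(v) / n
        gradient = (sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
                    / sum((ui - u_bar) ** 2 for ui in u))
        intercept = v_bar - gradient * u_bar
        return intercept, gradient

    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]          # made-up data, not from the textbook

    a, b = fit(x, y)             # regress y on x:  y = a + b*x
    c, d = fit(y, x)             # regress x on y:  x = c + d*y

    # Step 3: the two sums of squared residual errors.
    sse_y = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sse_x = sum((xi - (c + d * yi)) ** 2 for xi, yi in zip(x, y))
    print(sse_y, sse_x)          # both > 0 here, so (step 5) d =/= 1/b

Both sums come out as 3.6 for this data, not zero, which is why the two fitted lines differ.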
So to TS:
(a) c is not equal to -(a/b) and d is not equal to 1/b because the sums of squared residual errors for regressing y on x and regressing x on y are different and not equal to zero. So the least squares regression lines for the two cases are different (i.e. r is not equal to 1 or -1).
(b) c is equal to -(a/b) and d is equal to 1/b when the sums of squared residual errors for regressing y on x and regressing x on y are both equal to zero. So the least squares regression lines for the two cases are the same (i.e. r is equal to 1 or -1). Only in this case can we take y = a + bx, rearrange to express x in terms of y, and get x = -(a/b) + (1/b)y = c + dy.
PS: For Step 4, TS, you may refer to Appendix 6 of your textbook. It uses partial differentiation with respect to a and b (and likewise c and d for the other regression). Since we want to minimise the sum of squared residual errors, we set the four partial derivatives equal to zero (giving what are known as the normal equations). Solving these equations gives the formulas for a, b, c and d. To prove that it is a minimum, we can check the second derivatives with respect to a, b, c and d.
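For reference, a standard sketch of that derivation for the y on x case: minimising S = Σ(yi - a - bxi)^2, the conditions dS/da = 0 and dS/db = 0 give the normal equations

Σyi = na + bΣxi
Σxiyi = aΣxi + bΣxi^2

Solving these gives b = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)^2 and a = ȳ - b·x̄. Swapping the roles of x and y gives the formulas for c and d, and nothing in these formulas forces d to equal 1/b.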
Dear Lee012lee,
Thanks for your in-depth explanations!
Cheers,
Wen Shih