Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows?
Regressing y on x, the least squares regression line is y = a + bx:
y = a + bx
y - a = bx
(y - a)/b = x
x = -(a/b) + (1/b)y
So why, for the case of regressing x on y, is the least squares regression line NOT x = -(a/b) + (1/b)y?
Thanks for the help.
I think it's because your "x on y" regression line is simply a rearrangement of the equation of your "y on x" regression line.
The y on x and x on y lines should yield totally different values because your independent and dependent variables are different in each case.
I think it's like that? lol.
Originally posted by Snoopyies: Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows? [...]
y = bx + a
x = -(a/b) + (1/b)y
x = cy + d, where c = 1/b and d = -(a/b)
Originally posted by absol: I think it's because your "x on y" regression line is simply a rearrangement of the equation of your "y on x" regression line. [...]
In the textbook, the linear regression questions are usually given as a table with two rows of X and Y data. The questions then ask for the regression lines for the case of regressing Y on X and the case of regressing X on Y, i.e. in the first case Y is the dependent variable and X is the independent variable, and in the second case X is the dependent variable and Y is the independent variable. The X and Y data are the same in both cases. So:
1. For the case of regressing y on x, the least squares regression line is
y = a + bx
2. For the case of regressing x on y, the least squares regression line is
x = c + dy
Rearranging y = a + bx to express x in terms of y gives
x = -(a/b) + (1/b)y
So why is this incorrect, i.e. why is c not equal to -(a/b) and d not equal to 1/b?
Originally posted by Forbiddensinner: y = bx + a, so x = cy + d, where c = 1/b and d = -(a/b).
They are not the same when I do the questions with actual numbers from the textbook.
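For example, here is a minimal sketch in Python (using statistics.linear_regression, available from Python 3.10). The data below are made up, since the textbook's actual numbers are not quoted in this thread:

    from statistics import correlation, linear_regression

    # Made-up data (not the textbook's numbers), deliberately scattered.
    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]

    b, a = linear_regression(x, y)   # regress y on x:  y = a + b*x
    d, c = linear_regression(y, x)   # regress x on y:  x = c + d*y

    print(a, b)               # about 0.6 and 0.8
    print(c, d)               # about 0.6 and 0.8 (fitted directly)
    print(-a / b, 1 / b)      # about -0.75 and 1.25 (mere rearrangement)
    print(correlation(x, y))  # r = 0.8, not +/- 1

The directly fitted x on y line (c, d) and the rearranged y on x line (-a/b, 1/b) clearly disagree; the rearrangement only matches the fitted line when r = 1 or -1.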
Originally posted by Snoopyies: Why are the least squares regression lines for regressing y on x and regressing x on y NOT related as follows? [...]
Hi,
It is good to ask why :)
If that were the case, then both lines would coincide.
Regression lines only coincide when r = 1 or -1 and this is rarely the case.
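To see this in formula terms: with the standard results, the gradient of the y on x line is b = r(sy/sx) and the gradient of the x on y line is d = r(sx/sy), where sx and sy are the standard deviations of the data. Multiplying, bd = r^2. The rearranged line x = -(a/b) + (1/b)y has gradient 1/b, and d = 1/b exactly when bd = 1, i.e. when r^2 = 1, i.e. when r = 1 or -1.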
Thanks!
Cheers,
Wen Shih
Hi,
Regress y on x:
1. Use GC, do a scatter plot.
2. Note that gradient = b and vertical intercept = a.
3. Residual error ei = observed value of each yi - each predicted y value = yi - (a + bxi).
4. Using the least squares method to minimise the sum of the squared residual errors, with differentiation and substitution, the formulas for a and b are obtained.
5. When the sum of squared residual errors is not zero, a =/= -cb and b =/= 1/d, i.e. r =/= 1 and r =/= -1.
6. When the sum of squared residual errors is zero, a = -cb and b = 1/d, i.e. r = 1 or r = -1. This means that all the observed y values are equal to the predicted y values.

Regress x on y:
1. Use GC, do a scatter plot.
2. Note that gradient = d and vertical intercept = c.
3. Residual error ei = observed value of each xi - each predicted x value = xi - (c + dyi).
4. Using the least squares method to minimise the sum of the squared residual errors, with differentiation and substitution, the formulas for c and d are obtained.
5. When the sum of squared residual errors is not zero, c =/= -(a/b) and d =/= 1/b, i.e. r =/= 1 and r =/= -1.
6. When the sum of squared residual errors is zero, c = -(a/b) and d = 1/b, i.e. r = 1 or r = -1. This means that all the observed x values are equal to the predicted x values.
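As a quick numerical illustration of steps 3 to 5, here is a minimal Python sketch. It uses the same made-up data as the earlier example, since the textbook's numbers are not quoted in the thread:

    def fit(u, v):
        # Least squares fit of v = intercept + gradient*u,
        # i.e. the formulas obtained in step 4.
        n = len(u)
        u_bar = sum(u) / n
        v_bar = sum(v) / n
        gradient = (sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
                    / sum((ui - u_bar) ** 2 for ui in u))
        intercept = v_bar - gradient * u_bar
        return intercept, gradient

    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]          # made-up data, not from the textbook

    a, b = fit(x, y)             # regress y on x:  y = a + b*x
    c, d = fit(y, x)             # regress x on y:  x = c + d*y

    # Step 3: the two sums of squared residual errors.
    sse_y = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sse_x = sum((xi - (c + d * yi)) ** 2 for xi, yi in zip(x, y))
    print(sse_y, sse_x)          # both > 0 here, so (step 5) d =/= 1/b

Both sums come out as 3.6 for this data, not zero, which is why the two fitted lines differ.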
So to TS:
(a) c is not equal to -(a/b) and d is not equal to 1/b because the sums of squared residual errors for regressing y on x and regressing x on y are different and not equal to zero. So the least squares regression lines for the two cases are different (i.e. r is not equal to 1 or -1).
(b) c is equal to -(a/b) and d is equal to 1/b when the sums of squared residual errors for regressing y on x and regressing x on y are both equal to zero. So the least squares regression lines for the two cases are the same (i.e. r is equal to 1 or -1). Only in this case can we take y = a + bx, rearrange to express x in terms of y, and get x = -(a/b) + (1/b)y = c + dy.
PS: For Step 4, TS, you may refer to Appendix 6 of your textbook. It uses partial differentiation with respect to a and b (and likewise c and d for the other regression). Since we want to minimise the sum of squared residual errors, we set the four partial derivatives equal to zero (giving what are known as the normal equations). Solving these equations gives the formulas for a, b, c and d. To prove that it is a minimum, we can check the second derivatives with respect to a, b, c and d.
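For reference, a standard sketch of that derivation for the y on x case: minimising S = Σ(yi - a - bxi)^2, the conditions dS/da = 0 and dS/db = 0 give the normal equations

Σyi = na + bΣxi
Σxiyi = aΣxi + bΣxi^2

Solving these gives b = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)^2 and a = ȳ - b·x̄. Swapping the roles of x and y gives the formulas for c and d, and nothing in these formulas forces d to equal 1/b.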
Dear Lee012lee,
Thanks for your in-depth explanations!
Cheers,
Wen Shih