Originally posted by Gedanken:That's where the stopping criterion of getting to less than a 1-percentile difference comes in. If there are gaps, the questions will keep coming until a stable measure is reached. That being the case, the examinee might get the first question wrong and be bumped into the lower half, but by virtue of getting more difficult questions correct, move to the upper half later.
Of course, the caveat here is that the actual content of the test needs to be carefully constructed in such a way that the items in fact measure a continuum of levels on a clearly-defined ability.
but it doesn't help if the gap in continuum is too big right? + it'll be that much harder and tiring to get the higher scores if you got the first few ones wrong.
hmmz and I guess it only allows for testing on ability in general but not the whole gamut of topics unless you do multiple tests for each topic
Originally posted by Gedanken:It's quite simple, really. IRT is an alternative to Classical Test Theory, which is how most tests are currently scored. The problem with Classical Test Theory is that a person's ability level is determined by the difficulty of the questions he answers correctly, but the difficulty of the questions is conversely based upon how many people get the answers right. It's a vicious cycle that never gives you a real answer about either the person's level of ability or the item difficulty. Also, classically designed exams ask a whole lot of questions that are either too easy or too difficult for examinees, and that's just a waste of time.
Item Response Theory takes the data and generates a mathematical model that simultaneously estimates both ability levels and item difficulty, repeatedly adjusting the parameter estimates and comparing the model to the data till you get a proper fit between the mathematical model and the data itself. The result is a measure of the difficulty of each item in the exam, extrapolated to a model of the distribution of ability levels across the general population.
The repetition part's always been a bugger, but with the amount of computing power that we have available now, it's become fairly easy. The model above only takes into account two parameters, namely item difficulty and ability levels. You can also throw in other parameters such as the probability of getting an answer correct purely by guessing.
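To make that fitting loop concrete, here's a minimal sketch in Python. It assumes the simplest one-parameter (Rasch) version of the model and uses plain alternating gradient steps; real IRT software uses far more careful estimation, and every function name and number below is my own illustration, not a standard API.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) model: chance of a correct answer given
    ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def fit_rasch(responses, n_iter=200, lr=0.05):
    """Jointly estimate abilities and difficulties from a 0/1
    response matrix (rows = people, columns = items) by
    alternating gradient steps on the log-likelihood."""
    n_people, n_items = len(responses), len(responses[0])
    theta = [0.0] * n_people  # ability estimates
    b = [0.0] * n_items       # difficulty estimates
    for _ in range(n_iter):
        # update each person's ability, holding difficulties fixed
        for i in range(n_people):
            grad = sum(responses[i][j] - p_correct(theta[i], b[j])
                       for j in range(n_items))
            theta[i] += lr * grad
        # update each item's difficulty, holding abilities fixed
        for j in range(n_items):
            grad = sum(p_correct(theta[i], b[j]) - responses[i][j]
                       for i in range(n_people))
            b[j] += lr * grad
        # the scale is only identified up to a shift, so anchor mean
        # difficulty at zero (shifting both b and theta leaves every
        # theta - b difference, hence the likelihood, unchanged)
        shift = sum(b) / n_items
        b = [x - shift for x in b]
        theta = [x - shift for x in theta]
    return theta, b
```

Feed it a small response matrix and the fitted abilities line up with the raw totals, while the same fit simultaneously orders the items by difficulty, which is the whole point of the simultaneous estimation.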
Anyway, put IRT together with Computerised Adaptive Testing and you have an examination system that only asks you the questions most appropriate to your ability level. It starts with a mid-level-difficulty question; if you get that right, the next item's difficulty goes up to the three-quarter mark, while it goes to the first-quarter mark if you get it wrong. The difference between the level of the question asked and the level of the next question keeps halving until there's less than a 1-percentile difference. The SAT and the US Military's ASVAB tests are based upon such a system.
In my thesis, I ran simulations of an adaptive model on exam data and correlated it to the actual results, and found that in 20 questions, I had a .97 correlation with the actual results that were based upon asking 100 questions, and with 25 questions, the correlation was .99. The upshot of this is that exams can be significantly shortened, and with everybody being asked different questions, students stealing exam papers will not be a problem.
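A miniature version of that kind of simulation is easy to run. Everything below is invented for illustration (200 simulated examinees, an assumed logistic answer model, difficulties on a 0-100 percentile scale), so the correlation it produces is merely indicative, not the .97/.99 from the thesis:

```python
import math
import random

random.seed(42)

def p_correct(ability, difficulty):
    """Assumed logistic response model on a 0-100 percentile scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty) / 10.0))

def adaptive_estimate(ability, start=50.0, min_step=1.0):
    """Ability estimate from a short interval-halving adaptive test
    (about five questions with these settings)."""
    level, step = start, start / 2
    while step >= min_step:
        if random.random() < p_correct(ability, level):
            level += step
        else:
            level -= step
        step /= 2
    return level

def full_test_score(ability, n_items=100):
    """Score on a conventional 100-item test, difficulties 1..100."""
    return sum(random.random() < p_correct(ability, d)
               for d in range(1, n_items + 1))

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

abilities = [random.uniform(10, 90) for _ in range(200)]
short_scores = [adaptive_estimate(a) for a in abilities]  # ~5 questions each
full_scores = [full_test_score(a) for a in abilities]     # 100 questions each
r = pearson(short_scores, full_scores)
```

Even this crude setup produces a strong correlation between the five-question estimates and the hundred-question scores, which is the shortening effect being described.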
And to think I hated maths.
Aaaaahhhh... I understand now! Ta!
I believe the NCLEX-RN exam in the US, a licensing exam that ALL RNs must pass before being allowed to practice nursing there, uses IRT. The first few questions are pretty tough; if you get those wrong, you'll have to answer more, simpler questions, but if you get those right, you'll answer fewer, more difficult ones.
One more question though... if you get someone who's 'smarter' scoring 95%, and another person who had to answer more questions also scoring 95%, would that really be an equitable 95%?
I get the bit about the correlation being .99... but there's just this nagging thought in my head that a 'smart and quick' 95% just somehow isn't the same as a 'slow and steady' 95%.
Originally posted by tare:and u call tat really simple???!!! o.O
Aiyah, it's not that difficult lah. Think of it as the mathematical equivalent of being measured for an alteration: the clothing item is the mathematical model, and your body is the actual data.
My supervisor got quite annoyed with me, though. I got the basic text and came back a month later to check that I understood the theory correctly - he did IRT for his PhD thesis as well, and it took him almost a year to get a grasp of it.
Originally posted by tare:
and u call tat really simple???!!! o.O
i went poly instead of jc cos i dun wanna take maths!!!!!
I dropped 'AO' Maths in JC...
Originally posted by Gedanken:Aiyah, it's not that difficult lah. Think of it as the mathematical equivalent of being measured for an alteration: the clothing item is the mathematical model, and your body is the actual data.
My supervisor got quite annoyed with me, though. I got the basic text and came back a month later to check that I understood the theory correctly - he did IRT for his PhD thesis as well, and it took him almost a year to get a grasp of it.
hmmz does that mean since the data is expanding constantly, the mathematical model needs to be altered and expanded constantly too?
Originally posted by hisoka:but it doesn't help if the gap in continuum is too big right? + it'll be that much harder and tiring to get the higher scores if you got the first few ones wrong.
hmmz and I guess it only allows for testing on ability in general but not the whole gamut of topics unless you do multiple tests for each topic
Good observation. If the answers bounce all over the place, it will result in a longer test. Unfortunately, that's an issue of how the test items were designed, as well as how the intended content is structured. If proper item reliability tests are conducted on the data before the IRT model is constructed, the problematic items should be discarded, but there's only so much you can do without having to rejig the content itself.
Originally posted by Rhonda:One more question though... if you get someone who's 'smarter' scoring 95%, and another person who had to answer more questions also scoring 95%, would that really be an equitable 95%?
Well, "smarter" doesn't quite fit into the picture. In the purest sense, all we're testing is the person's ability to do some specific thing or handle some specific subject. As an analogy, you get two people to stand at a line and throw basketballs into a hoop. One person may be faster and more agile on the court, but all we're testing is how accurate their shooting is, not how well they play basketball overall.
Originally posted by Rhonda:
I get the bit about the correlation being .99... but there's just this nagging thought in my head that a 'smart and quick' 95% just somehow isn't the same as a 'slow and steady' 95%.
That becomes a question of how the assessor interprets the results of the test. If one person takes 20 questions to get to the 95th percentile while the other takes 50 questions, that's something the assessor can take into consideration. Also, the test can be designed in such a way that if the results are bouncing all over the place, it'll stop after a certain number of questions and produce a result that effectively states, "Requires closer examination - something's not quite right here".
Originally posted by hisoka:hmmz does that mean since the data is expanding constantly, the mathematical model needs to be altered and expanded constantly too?
Yup, but that's the same as conventional tests - in any case, the normative database should be kept up to date.
The meaning of "No Regrets" varies depending on where, when and why it is used.
Example:
If it is used before the event, it means: be brave, just do it.
If it is used after the event, it means there are no "what if" questions.
Originally posted by Gedanken:Yup, but that's the same as conventional tests - in any case, the normative database should be kept up to date.
nvm probably the reference was too obscure >.<
Originally posted by Rhonda:Et tu, jetta? My maths is so koyak, it's embarrassing!
I always led the class from the bottom up. I just couldn't grasp math theories or applications. I know I'm dyslexic when it comes to math. Couldn't even help my kids as they got into middle school. How embarrassing is that?
Originally posted by Gedanken:It's quite simple, really. IRT is an alternative to Classical Test Theory which is how most tests are currently scored. [...] And to think I hated maths.
AIEEEEEEEEEEEEEEEEEEEEEEE
IRT! I didn't expect to see that in this thread. This stuff has been giving me nightmares.
Regrets... quite a bit but then... I live and move on with it.
Should have studied harder and gotten my degree earlier.
Should not have bought so many insurance policies, since they only benefit the loved ones I leave behind.
Should have been more decisive in affairs of the heart.
Should have learned to save more.
Should have maintained my six-pack.
Originally posted by jetta:I always led the class from the bottom up. I just couldn't grasp math theories or applications. I know I'm dyslexic when it comes to math. Couldn't even help my kids as they got into middle school. How embarrassing is that?
Maths and Mandarin I was always first in class... from the bottom.
Thankfully, I had other subjects where I was first... at the top. Phew!
Originally posted by X-men:Should have maintained my six-pack.
HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA!!!
regrets...i have a few.....
and...i did it....MY WAY!!!!!!!!!!!!!!
*cough cough...like my singing?*
i regret being with someone whom i thought would last. But it was not to be.
Originally posted by Rhonda:Maths and Mandarin I was always first in class... from the bottom.
Thankfully, I had other subjects where I was first... at the top. Phew!
I was the President of that club in my school as well.
Originally posted by Rhonda:I dropped 'AO' Maths in JC...
It dropped me.
Ok i take back the no regrets post.
I regret drinking that big cup of iced teh tarik yesterday. I ended up hyper the whole night, and now I'm awake at 5.30 in the morning and can't sleep.
Originally posted by jetta:It dropped me.
I dropped it before it could drop me. So I might be bodoh in Maths, but at least I can still pretend a bit, hold my head up high, and hope that not too many people knew ...
I wasn't alone though. One of my JC buddies dropped 'AO' Maths too, because the two of us 'liak bo kew'! She's now a lawyer, so she didn't end up shabby without Maths!
In my time, there was no AO Maths option for me... It was more like you take the A level, and if you do badly, you get an AO-level grade...
Originally posted by av98m:
AIEEEEEEEEEEEEEEEEEEEEEEE
IRT! I didn't expect to see that in this thread. This stuff has been giving me nightmares.
MUAAHAHAHAHAHAHA!!!!
Misery, pestilence and chaos - my work here is done.
I hate maths. God knows why I'm in my current job crunching numbers daily.
*boo hoo hoo*
Originally posted by SaTaN_Ga|:I hate maths. God knows why I'm in my current job crunching numbers daily.
*boo hoo hoo*
Never mind... jetta and I love you! *pat pat!* Welcome to the sisterhood!