Thread:Comments from Brian (1)

Cumulative calibration

Congratulations Akash on this.

Can we get our hands on an existing specification for a peer assessment system that could be used as a starting point - perhaps a detailed description of the Moodle one? I have used it and found it fairly good. The one thing I think could be usefully added is some measure of competence of reviewers.

Off the top of my head I can think of 3 measurements that would be needed to be kept for each reviewer.

1. A measure of their tendency to err on the high or low side. This could be as simple as a percentage or a number between -100 and +100 (when grading out of 100).

2. A measure of the variability of their assessments. I'm not a statistician so I'm not sure how this might be expressed, but the idea is to quantify how much their inaccuracy varies from one assessment to the next. A reviewer with a high error tendency but low variability is easier to adjust for than one with high variability.

3. Finally, some measure of our confidence in the two measurements above. Opportunities will constantly arise to modify a learner's calibration scores. As the calibration scores are tweaked, each new piece of information carries less weight. A way to handle that might be to give each type of calibration a particular score. E.g. I might be calibrated against 3 of my peers assessing the same work. That might have a confidence score of 5. A tutor, perhaps at random, might grade the same piece of work and my ability could be calibrated against that with a score of 10. I might have my score compared to another student who has already been calibrated by a tutor and be calibrated accordingly, and this might have a value of 3. After these 3 calibrations, the confidence of my calibration might be scored at 5+10+3 = 18. If another tutor calibrates me, the existing calibration would be weighted against the new one in a ratio of 18 to 10. Feedback on reviews might also be included. This is not well thought out but I hope you get the idea.
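To make the three measurements concrete, here is a minimal sketch in Python. It is only one possible reading of the idea: the names `Calibration`, `measure` and `update` are made up for illustration, the error tendency is taken to be the mean signed error, the variability is taken to be the standard deviation of the errors, and the confidence is simply the running total of the scores attached to each calibration. All the marks and biases in the example are invented.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Calibration:
    bias: float = 0.0        # mean signed error; positive means grading too high
    spread: float = 0.0      # standard deviation of the errors
    confidence: float = 0.0  # running total of calibration scores so far


def measure(reviewer_marks, reference_marks):
    """Return (bias, spread) of a reviewer's marks against reference marks."""
    errors = [r - ref for r, ref in zip(reviewer_marks, reference_marks)]
    return mean(errors), pstdev(errors)


def update(cal, new_bias, new_spread, score):
    """Blend a new calibration into the existing one.

    The existing values count in proportion to the accumulated confidence
    and the new values in proportion to `score`. Averaging the spreads this
    way is a crude assumption, not a statistically rigorous pooling.
    """
    if score <= 0:
        raise ValueError("score must be positive")
    total = cal.confidence + score
    bias = (cal.bias * cal.confidence + new_bias * score) / total
    spread = (cal.spread * cal.confidence + new_spread * score) / total
    return Calibration(bias, spread, total)


# Hypothetical history: peers (score 5), a tutor (10), a calibrated student (3).
cal = Calibration()
for bias, spread, score in [(12.0, 4.0, 5), (6.0, 4.0, 10), (14.0, 4.0, 3)]:
    cal = update(cal, bias, spread, score)
print(cal.confidence)  # 5 + 10 + 3

# A second tutor calibration is weighted against the accumulated confidence.
cal = update(cal, 2.0, 4.0, 10)
print(cal.bias, cal.confidence)
```

Because the accumulated confidence only grows, each later calibration moves the stored bias less than the one before it, which is the "each new piece of information carries less weight" behaviour described above.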

In terms of the algorithm (and I think I've mentioned this before) I think it would be efficient if a tutor could grade an assignment and learners who assessed the same piece of work would be calibrated from this (added confidence 5?). Then their grades of other assignments would be compared with other learners and their ability calibrated (confidence 3?) - this could then be repeated and the next calibration might have a confidence addition of 1. I hope you get the idea.
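A rough sketch of how that chain of calibrations might run, again in Python. Everything here is an assumption made for illustration: the function name `propagate`, the shape of the input dictionaries, the use of mean signed error as the calibration, and the rule that a reviewer is calibrated only once (at the highest confidence available to them).

```python
def propagate(reviews, tutor_marks, scores=(5, 3, 1)):
    """Spread calibration outwards from tutor-graded work.

    reviews:     {reviewer: {assignment: mark}}
    tutor_marks: {assignment: mark} for the work a tutor has graded
    scores:      confidence added at each successive round
    Returns {reviewer: (bias, confidence)}.
    """
    calibrated = {}
    reference = dict(tutor_marks)  # assignment -> mark we currently trust
    for score in scores:
        # Calibrate reviewers who marked something with a trusted mark.
        newly = {}
        for reviewer, marks in reviews.items():
            if reviewer in calibrated:
                continue
            errors = [m - reference[a] for a, m in marks.items() if a in reference]
            if errors:
                newly[reviewer] = (sum(errors) / len(errors), score)
        if not newly:
            break
        calibrated.update(newly)
        # Their bias-corrected marks become trusted marks for other work.
        corrected = {}
        for reviewer, (bias, _) in newly.items():
            for a, m in reviews[reviewer].items():
                if a not in reference:
                    corrected.setdefault(a, []).append(m - bias)
        for a, values in corrected.items():
            reference[a] = sum(values) / len(values)
    return calibrated


# Invented data: the tutor graded only assignment A.
reviews = {
    "ann": {"A": 70, "B": 60},
    "bob": {"B": 45, "C": 80},
    "cat": {"C": 70},
}
print(propagate(reviews, {"A": 60}))
# {'ann': (10.0, 5), 'bob': (-5.0, 3), 'cat': (-15.0, 1)}
```

Since each reviewer is calibrated at most once and the number of rounds is fixed by `scores`, this particular sketch cannot loop forever. A version that lets later rounds revise earlier calibrations would need its own stopping rule.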

The idea of assigning a confidence measure to calibration would work best over many assignments so it may be necessary to have a mechanism for transferring between courses.

You would have to be sure that the algorithm did not end up doing silly things like getting into a recursive loop, particularly with positive (negative?) feedback.

I can immediately think of a silly outcome where the learner could end up with a score greater than a tutor. Maybe tutors have a score of 80 and learners move asymptotically towards that score. (Perhaps we will be able to prove that some learners can be more reliable than tutors. Now there's a challenge: have some measure of the ability of tutors built in as well.)
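One simple way to get that asymptotic behaviour, sketched in Python. The formula and the `halfway` parameter are assumptions chosen for illustration; any curve that flattens out below the tutor score would do.

```python
TUTOR_SCORE = 80.0  # ceiling taken from the suggestion above
HALFWAY = 20.0      # hypothetical: evidence needed to reach half the ceiling


def learner_score(evidence, ceiling=TUTOR_SCORE, halfway=HALFWAY):
    """Map accumulated evidence to a score that approaches `ceiling`.

    The score rises quickly at first and then flattens, staying below the
    ceiling for any realistic amount of evidence.
    """
    if evidence < 0:
        raise ValueError("evidence must be non-negative")
    return ceiling * evidence / (evidence + halfway)


for evidence in (0, 20, 60, 180):
    print(evidence, learner_score(evidence))
```

With these numbers a learner reaches 40 after 20 points of evidence and 60 after 60 points, and each further point adds less than the last.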

Apologies for the stream of consciousness. I do believe that peer assessment will prove to be the most powerful tool in our arsenal eventually for cutting the cost of accredited education. I may be wrong.

Brian