Peer Evaluation

Introduction
For courses with a large number of learners, manual grading by instructors is not feasible. Some types of activities can be graded automatically, but this is not possible for tasks without a clearly defined deliverable. Peer evaluation is a scalable solution that could be used in most circumstances.

Plans for the first prototype version
For each activity, learners will have the option to opt in for peer evaluation when submitting. Those who opt in will be assigned roughly three to ten peers to evaluate (a tentative number; an alternate approach could be a karma system). They will be given rubrics against which to score each activity. A part of the grade (say 20 %) will be based on the peer grading itself: half for participation and half for accuracy (tentative). Based on the above distribution:
 * Peer grading of the submitted work itself will make up 80 % of each activity's grade.
 * Now, suppose there are 8 rubric categories of 10 points each.
 * Consider the first category, e.g. Depth in understanding, which is worth 10 points.
 * It receives 3 grades on a scale of 1-10 (assuming each submission is evaluated by 3 peers). A sketch of this arithmetic follows below.
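
To make the arithmetic concrete, here is a minimal sketch of how the 80 % activity component might be computed from per-category peer grades. The function name, the use of a plain mean, and the single example category are illustrative assumptions, not a settled design:

    def activity_component(peer_grades):
        """peer_grades: dict of rubric category -> list of peer scores (1-10)."""
        earned = sum(sum(s) / len(s) for s in peer_grades.values())
        possible = 10 * len(peer_grades)        # e.g. 8 categories x 10 points
        return earned / possible                # fraction of activity points earned

    grades = {"Depth in understanding": [7, 8, 6]}   # ...plus the other categories
    overall = 0.80 * activity_component(grades)      # 80 % from the peer-graded work
    # the remaining 20 %: 10 % for participating in grading, 10 % for accuracy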

Now we need to use these grades to judge the accuracy of the evaluations:

Using the mean score may appear to be a decent measure, but some learners may tend to give everyone very high or very low scores, and averages are sensitive to extreme values. We need a statistical check on whether the grades available for a particular submission diverge widely, and a mechanism to correct them if they do. A naive approach: if a grade differs from the average by more than, say, 1.5 points, that grade is not counted. In unlikely extreme cases where no grade lies within 1.5 points of the average (e.g. grades of 10, 10 and 1 give a mean of 7, with max - mean = 3 and mean - min = 6), the median mark could be used instead. Under this method, graders whose scores fall within 1.5 points of the final score would be given the points for accuracy (say 10 % of the total grade).
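
A rough sketch of this rule, assuming three peer grades per category and the tentative 1.5-point threshold (the function names are illustrative only):

    from statistics import mean, median

    THRESHOLD = 1.5   # tentative cut-off from the plan above

    def consensus_score(grades):
        """Drop grades more than 1.5 points from the mean; if none survive,
        fall back to the median (e.g. [10, 10, 1]: mean 7, nothing within 1.5)."""
        avg = mean(grades)
        kept = [g for g in grades if abs(g - avg) <= THRESHOLD]
        return mean(kept) if kept else median(grades)

    def earns_accuracy_points(grade, final_score):
        """Graders within 1.5 points of the final score get the accuracy credit."""
        return abs(grade - final_score) <= THRESHOLD

    print(consensus_score([7, 8, 6]))    # 7.0 - ordinary case, plain mean
    print(consensus_score([10, 10, 1]))  # 10  - extreme case, median fallback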

Prototype question examples

 * 1) /Copyright MCQ e-learning activity/ - Example based on the MCQ activity from OCL4Ed to trial a custom evaluation rubric with weightings and objective responses.
 * 2) /Learning reflection example/ - Example using an alternate rubric approach. Learning reflections are personal, so harder to define specific criteria for evaluation.
 * 3) /Creative commons remix activity/ - Another example with a custom assessment rubric.

Challenges

 * Peer evaluation may tempt learners to rate their friends highly and give low grades to others; they may form small groups and try to grade only within them. Some learners may also tend to give everyone the same grade. These effects can be significantly reduced by randomly assigning which peers each learner grades (see the assignment sketch after this list). In the prototype model we will also mitigate this by discarding the very high or very low grades.


 * Peer evaluation strictly requires deadlines to be met, both for submitting the activities (so that they are available for assessment by others) and for the evaluation itself.


 * There may be cases where learners do not review the posts assigned to them. Reserving part of the grade for the evaluation itself should greatly reduce this.
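
For the random-assignment idea in the first challenge, a minimal sketch, assuming three reviews per learner and a simple rotation over a shuffled list so that nobody reviews their own work:

    import random

    def assign_reviewers(learners, k=3):
        """Shuffle once, then have each learner review the next k learners'
        submissions in the shuffled order; no one is assigned their own work,
        and every submission receives exactly k reviews."""
        pool = list(learners)
        random.shuffle(pool)
        n = len(pool)
        assert n > k, "need more learners than reviews per learner"
        return {pool[i]: [pool[(i + j) % n] for j in range(1, k + 1)]
                for i in range(n)}

    print(assign_reviewers(["ana", "ben", "chen", "dia", "eli"]))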

Next Steps and Ideas Being Explored

 * A feedback and rating system for the evaluations themselves.
 * Rather than the plan suggested above, a calibration algorithm that adjusts grades based on a sample of gradings by instructors. This could be done via a sample evaluation task before learners evaluate the real assignments (see the sketch after this list).
 * Measurement of the reliability of a student's grades as a course progresses, and perhaps universally across all courses offered.
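
One possible shape for the calibration idea, a sketch assuming a simple additive-bias model: each learner first grades samples the instructor has already scored, and their later grades are shifted by the measured bias. All names and the bias model itself are assumptions:

    from statistics import mean

    def learner_bias(learner_scores, instructor_scores):
        """Average amount by which this learner over- or under-grades
        on the instructor-graded calibration samples."""
        return mean(l - i for l, i in zip(learner_scores, instructor_scores))

    def calibrate(raw_grade, bias, lo=1, hi=10):
        """Shift a raw peer grade by the learner's bias, clamped to the scale."""
        return min(hi, max(lo, raw_grade - bias))

    bias = learner_bias([9, 8, 9], [7, 7, 8])   # this learner grades ~1.3 high
    print(calibrate(8, bias))                   # roughly 6.7 after adjustment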

Additional resource links

 * Coursera - How peer assessments work.
 * Generally, peer assessments are not allowed until the evaluation phase begins after the submission deadline. (At OERu, once a learner has opted in for peer assessment, they could be added to the pool for evaluation. Perhaps the peer evaluation tool could provide an option for a date-bound start of the evaluation, in addition to allowing evaluations as soon as the learner has opted in.)
 * I like the optional "learn to evaluate" feature, where peer ratings of an example assignment are compared with a teacher-evaluated assignment.
 * Reflections from Chuck Severance on Coursera Rubric - A useful approach for setting up rubrics.