This paper reports on a web-based peer-review process that I used in an upper-division face-to-face mathematics course for the first time. There were 20 students: 17 mathematics majors and 3 computer science majors. I have been a professor of Computer Science for 15 years, but my Ph.D. is in mathematics. I have taught such math-intensive courses as Discrete Mathematics, Analysis of Algorithms, and Automata Theory, but the students were always computer science majors. This was my first time teaching a course that was for mathematics majors. Prior to this class I had implemented peer reviews in several computer science classes, so some of the comments in this paper are based on a comparison with that experience. I will first describe how the peer reviews were implemented and then present the results, including statistics such as the number of reviews submitted and the average scores received (and given) by the students over 12 assignments. In addition, the students were asked to respond anonymously to an online survey about how they felt about the peer-review process. These results will also be presented.
There are several reasons for implementing peer reviews. The primary reason is that they give students direct and timely access to the creative contributions of their classmates. This is a twofold issue: (a) the students learn from each other, and (b) they get to compare themselves with their classmates. The peer-review process puts a student in the role of "critical reviewer," which reinforces his or her understanding of the grading criteria. Peer reviews also encourage class participation, foster an atmosphere of collaboration, and give students a wide audience for their work.
This course and the peer reviews were structured around weekly assignments. While the students were working on the current week's assignment they were expected to do a peer review of the prior week's assignment. Students were asked to review all of their classmates. Therefore, in this class of 20 students, each student was asked to review 19 other students each week. Although this was the goal, many reviews were not done because (a) I told them the reviews were optional (but could be used to satisfy "class participation" requirements), and (b) the reviews turned out to be extremely difficult to do (my choice of assignments got better as the semester progressed but the first three assignments required multiple mathematical proofs, which turned out to be too much for most of the class).
Students were required to post their assignments (solutions) on their own web pages. This may seem a little unusual for a mathematics class, but as a Computer Science professor I thought this a skill every student should have and that there was no better time to learn it than now! The school provided a web server, and usernames and passwords were distributed. Many of the students had very little understanding of web pages or web servers. After a few tutoring sessions most of them got the hang of elementary HTML and FTP. Some students posted their first web pages ever. However, two students never got it completely figured out, so I had them submit their assignments to me (in electronic format) and I did the posting for them.
The first step in posting an assignment is to get it into electronic format. A few students resisted this and pushed for submitting handwritten solutions, thereby opting out of having any peer reviews done of their work. It is not uncommon for mathematics majors to work exclusively with paper and pencil, but I held the line and insisted that all assignments be in electronic format. The students conformed. However, there were still major issues about how to get sophisticated mathematical symbols or drawings into electronic format. To lessen this concern I steered away from problems that required sketches, and I told them it was fine to write out mathematical symbols such as "summation of k squared from k = 1 to n" since reviewers would know the context (i.e., the problems from the textbook). Some students used MS Word with Equation Editor or MathType. This was a labor-intensive process but produced professional-looking documents. LaTeX files and PDF files were also used to create very professional documents. Using HTML was fairly effective, but not as professional looking. Simple text files with the symbols written out were considered acceptable, and in my opinion preferable, but I could not halt the march toward professional-looking documents. Subsequently, students complained about the amount of effort they were putting into electronic formats. Feeling their pain, I gave them a week off from assignments after the third assignment and reduced the required postings to only one problem per week.
A nice side effect is that many students now have examples of original professional documents to add to portfolios, attach to resumes, or post on home pages.
Course Web Site
After getting their homework into electronic format and posting it on their web pages, students were told to submit their URLs to the course web site, which featured a course-management system that I developed. The course web site then created a page with a list of links, one for each student. After completing and posting their assignments, students logged on to the course web site and clicked on the links to evaluate the solutions submitted by their peers. Each student completed a peer review by submitting a score (1 to 10) and a comment. The course web site kept track of all these entries and provided a way for students to submit reviews and to see the reviews they received from their peers.
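The bookkeeping such a system needs is modest. The sketch below is hypothetical (the class and method names are mine, not the actual system's), but it captures the behavior described: students post URLs, reviewers submit a score from 1 to 10 with a comment, and a student can see the reviews received without the reviewers' names attached.

```python
# Minimal sketch (hypothetical names) of the peer-review bookkeeping:
# student URLs, plus (reviewer, reviewee, assignment) -> (score, comment).
from dataclasses import dataclass, field

@dataclass
class PeerReviewSite:
    urls: dict = field(default_factory=dict)      # student -> posted URL
    reviews: dict = field(default_factory=dict)   # (reviewer, reviewee, assignment) -> (score, comment)

    def post_url(self, student, url):
        self.urls[student] = url

    def links_page(self):
        """One link per student, as on the course web site."""
        return [(s, u) for s, u in sorted(self.urls.items())]

    def submit_review(self, reviewer, reviewee, assignment, score, comment):
        if not 1 <= score <= 10:
            raise ValueError("score must be between 1 and 10")
        if reviewer == reviewee:
            raise ValueError("students do not review their own work")
        self.reviews[(reviewer, reviewee, assignment)] = (score, comment)

    def reviews_received(self, student):
        """Reviews shown to a student, with reviewer names withheld."""
        return [(a, s, c) for (r, e, a), (s, c) in self.reviews.items() if e == student]
```

Note that `reviews_received` deliberately drops the reviewer's name, matching the one-way anonymity described below: students knew whom they were reviewing but not who reviewed them.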
Students knew whom they were reviewing, since that was obvious when they visited another student's web page, but they did not know who reviewed their own solutions. Students were told they could appeal to me if they felt they got unfair or inaccurate reviews. Although this had happened a few times in prior computer science classes, it did not happen at all in this class. I think this was the case because the mathematical content of the assignments left very little room for opinions or subjective judgments. In prior classes there was an occasional comment that could have been viewed as rude or inflammatory. In each of these cases it was clear to me that the comment was the result of poor judgment as opposed to malice. Although rude comments were extremely rare, I found that I had to spend a lot of time reviewing the comments to make sure they were appropriate.
Peer Review Scores
Completing a "peer review" in this context entailed reviewing another student's homework and then submitting a score from 1 to 10 with a supporting comment. The score of "1" was reserved for assignments that were "missing." I felt that the threat of getting a "1" was a strong incentive for students to post their homework on time. Several scores of "1" could bring down a student's peer-review average significantly. I doubt that this is the best way to handle missing and late assignments but it is a simple, clear, and direct way to discourage late work.
The students were told to review their classmates' work and try to make helpful evaluations, such as pointing out errors or better solutions. They were also given a rubric that was intended to help them distinguish between three categories of potential problems: accuracy, completeness, and presentation. Students were provided with the following table:
| Score | Description |
|-------|-------------|
| 1 | Could not find project. |
|   | Grossly incomplete and/or grossly inaccurate. |
|   | Inaccurate and/or incomplete. |
|   | Complete, but poor presentation or somewhat inaccurate/incomplete. |
|   | Complete, with good presentation (explanations, graphs, charts, etc.). |
|   | Complete, accurate, excellent presentation. |
| 10 | The best of all the projects you reviewed. (Give this score to only one project.) |
There was also an information page that expanded on each table element, giving some examples of each case. Students were also told that for each assignment they should only give one classmate a score of "10." This was intended to discourage them from submitting all 10s. The system did not prevent them, however, from entering more than one 10, but by observing the scores I could see that the majority of the students abided by this rule. It appeared that the students accepted the idea that there was one "best" homework in the bunch.
Although doing peer reviews was not required, the students were told it would be considered part of their "participation" grade, which was set at 30% of the final grade. They could participate in the class in many other ways, they were told, such as classroom discussions, helping other students, taking part in the online forum, etc. Although participation in the peer reviews was sufficient to ensure a good participation grade, it was not necessary.
All the students were open to receiving reviews, but two students chose not to give any reviews. Three other students gave fewer than five reviews each over 12 assignments, so their participation was negligible. On the other hand, five students gave more than 50% of all the reviews. Fortunately, these five students were among the top students (as measured by the review scores they received from their peers, which was consistent with my own assessment). It appears that the students with a better command of the subject embraced the peer-review process (more about this below).
It became clear after the first assignment that reviewing mathematical proofs, the major emphasis of this course, was very difficult. As the teacher I found myself struggling to follow the various rabbit trails that popped up here and there in the jungle of mathematical proofs submitted by the students. Although mathematics is purported to be a very precise science, it was often the case that there were 20 distinct proofs submitted for a given assignment, many of which were correct. As expected, the students also had trouble following the proofs of their peers. One thing became clear to all of us: At this level of mathematical sophistication, analyzing another person's mathematical proof was a very difficult and time-consuming process. A few times during the semester students expressed appreciation for those students whose postings included extensive explanations of each mathematical step.
Because the reviews were so difficult, I reduced the peer-review process to a single problem per week after the third week. Despite the difficulties, I believe doing peer reviews is a great exercise for the students. This was a new experience for these students, even though most undergraduate mathematics is learned by reading a textbook, which is itself an exercise in evaluating someone else's mathematical reasoning. The key difference is that a textbook proof is assumed to be correct.
Potential for Cheating
With homework assignments posted on publicly accessible web pages there was an obvious potential for cheating. However, there was a natural force working against copying homework: During the peer-review process the students could see each other's work. If there were any discrepancies the students were very likely to catch and report them. There were a lot of "eyes" on these assignments. That is, a similar detail that might get past the teacher was unlikely to get past all the peers. I did not find any evidence of copying. All the submitted work looked painfully original, and the students were encouraged to read and learn from the solutions submitted by their peers. They were told they could modify their own work if they gave credit to the source. This worked fine, as a few students pointed out things they had learned from other students. A final note in this regard: It seemed to me that the mathematics students were much more resistant to the idea of "shared" work than computer science students. That is, they seemed to consider the mere act of looking at someone else's work as tantamount to cheating, whereas typical computer science students are more likely to look at someone else's code and say "Thanks! I can use that!" It is a badge of honor in computer science to have others copying and using your code.
Late or Missing Assignments
One of the most difficult assessment tasks involves a student who appears to know the subject fairly well, as evidenced by in-class exams, but who does not submit the assignments on time or at all. I do not believe that in-class exams provide the most accurate assessment of student "knowledge." I do not think that an exam can capture the subtle and multifaceted forms of learning that go on when a student does assignments on time, participates in related classroom discussions, and meets with the professor from time to time. When a student gets credit for a course I think it should indicate that he or she had a rich experience with the subject matter, having done more than pass an exam or two, but mathematics is one of those subjects where exam performance is often treated as the only outcome that matters.
One way to motivate a student to participate is to apply the threat of a reduced grade, but this puts the teacher in the role of "enforcer," which can lead to a very negative student-teacher relationship, especially if the student knows the subject matter fairly well and feels that participation is a waste of time. The peer reviews address this issue by putting students on notice that not only does the teacher expect timely work but so does the rest of the class. A late assignment will get several scores of "1" submitted, as opposed to just the teacher's feedback, thereby distributing some of the burden of enforcement. In fact, in this peer-review environment the teacher can take on the role of "coach."
There were 715 reviews in the system after 12 assignments. There could have been as many as 4,560 if every student had reviewed the work of every other student on all 12 assignments (20 x 19 x 12). There were only 715 because the process was optional and the students found the reviews difficult to do. Also, some of the students complained about having to do all the reviews in the computer lab because they did not have fast Internet connections at home. The computer labs were readily available, but they required a special trip for some students.
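The participation figures work out as follows (a quick check of the arithmetic above):

```python
# Reviews possible vs. reviews actually submitted over the semester.
students = 20
peers = students - 1          # each student could review 19 classmates
assignments = 12
possible = students * peers * assignments   # 20 * 19 * 12
submitted = 715
rate = submitted / possible
print(possible, round(100 * rate, 1))       # prints: 4560 15.7
```

So roughly one in six of the possible reviews was actually submitted, which is consistent with the process being optional and labor-intensive.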
Average Review Score
The average review score received by each student over 12 assignments is shown in Figure 1. The students are ranked from 1 to 20 based on this score, and all the charts that follow refer to the students by this ranking. I found the ranking to be accurate. That is, I was hard pressed to disagree with the rankings based on my overall assessment of their performance, which included a semester's worth of assignments, exams, quizzes, emails, office visits, and several problem-solving sessions. I knew each student very well at the end of the semester. After looking closely at each ranking I was surprised by how accurate they appeared to me. For example, the top student (rank=1) was clearly ahead of the rest of the class. This was clear not only from the content of the submitted work but also from the careful explanations provided with each proof. The rankings made fine distinctions. For example, students with high rank submitted excellent work that was always on time. The salient feature of the higher-ranking students was that not only were they excellent students in their own right but there was also strong evidence that they worked in study groups outside the class. Their high average review scores reflected the fact that their peers recognized the quality and consistency of their work. Students whose ranking fell near the middle of the pack were characterized by good but occasionally wrong work and/or some late or incomplete assignments. In my estimation the errors in their work were the kinds of things that would probably have been caught if they had worked together in a study group. As I scrolled down the rankings I could not find any contradictions to my own comparative evaluation of their work. The rather simple and crude reviews done by amateurs produced an assessment tool that appeared to be valid.
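The ranking itself is simple to compute. The data below are illustrative, not the actual class scores; the point is the mechanism, and in particular how a few scores of "1" for missing postings pull an average down sharply.

```python
# Rank students by the average review score they received.
# The names and score lists here are made up for illustration.
from statistics import mean

scores_received = {
    "student_a": [9, 10, 8, 9],
    "student_b": [7, 8, 1, 6],   # a "1" for a missing posting drags the average down
    "student_c": [8, 8, 9, 7],
}

averages = {s: mean(v) for s, v in scores_received.items()}
ranking = sorted(scores_received, key=lambda s: averages[s], reverse=True)
# ranking: student_a (9.0), student_c (8.0), student_b (5.5)
```

A weighted or trimmed average would dampen the effect of the "1" penalty, but the plain mean is what produced the rankings discussed here.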
Number of Reviews Received
Figure 2 shows the number of reviews received by each student. The distribution is very flat, indicating that each student received about the same number of reviews. However, there is a slight downward trend indicating that the students with higher ranking received slightly more reviews than students with lower ranking. This might be explained by the fact that the higher-ranking students were rarely late with assignments, and it is also possible that the higher-quality work "attracted" more reviews. Although students were originally asked to review all their classmates, the process was optional and hence students could pick and choose from the list of classmates. Although they were instructed to submit reviews first for the students with the fewest reviews (the web page provided this information), the system did not enforce this constraint. It appears that as the semester progressed the students showed a preference for reviewing those classmates who had better submissions and an avoidance of those classmates whose submissions were often inaccurate, late, or incomplete.
Student comments ranged from a perfunctory "good job" to detailed feedback, such as the following:
Your answers for 14 and 19 indicate that your proof works if you choose a specific constant as your epsilon, what if for number (14) you chose the value of epsilon to be 3? We could then find an N to satisfy your requirement making this a Cauchy sequence. What is so special about the value 1/2? If your proof is based on assigning a value to epsilon, you need to tell us why you picked 1/2 as your value. Also you should be able to use two non-consecutive terms such as m = n+2 instead of just m = n+1 as the definition of a Cauchy sequence does not restrict m,n to be two consecutive terms but rather it states for ALL m,n > N, not just n and n+1.
Some of the reviews were much more elaborate than this sample, so it was clear to me that some students were putting in monumental amounts of effort. The other thing that stood out from a review of the comments was that the students were very free with praise and appreciation. For example, it was not unusual to see a review such as this:
Excellent. Like where you showed sequences were periodic by using 2nπ.
The variety and level of detail clearly provided more feedback for each student than I could have given.
Average Score Given
It appeared that students with lower ranking were giving out slightly higher scores, on average, than students with higher ranking. Perhaps students with weaker knowledge of the subject were unable to evaluate the work of their peers effectively. As was already mentioned, it was difficult to follow all the different proofs, so it may be that some of the weaker students "punted" at times and gave out 9s and 10s.
Number of Reviews Given
Five of the top six ranking students gave over 50% of the reviews, while five students, primarily from the lower rankings, submitted very few reviews. This is consistent with the fact that evaluating the proofs was very difficult. It seems that only the best students could embrace the task of reviewing their peers on a regular basis.
Survey on Peer Reviews
To get a better picture of how the students felt about the peer-review process, I asked them to fill out an anonymous online survey (see Figure 3). The survey was in the form of ten statements, and the respondents were asked to disagree or agree on a scale of 1 to 5. Fourteen out of 20 students responded. From Figure 3 we see a few interesting things. The chart includes the average numerical result as well as the distribution of responses for each statement.
Statement 1: I learned a lot from the peer reviews.
It was gratifying that 8 of 14 students felt they learned a lot from the process.
Statement 2: The peer reviews helped me see how my work compared to others in the class.
Thirteen of 14 students agreed (checked either "agree" or "strongly agree"). This is to be expected, but I think the consistent agreement with this statement also implies that students highly valued the chance to see the other students' work.
Statement 3: The peer reviews helped me understand that there are many ways to solve the same problem.
Thirteen of 14 students agreed. This is an important lesson. It is clear that the peer reviews drove home the fact that a correct proof can be presented in many ways. Statements 2 and 3 are things that the teacher gets a large dose of in any class. The peer reviews give the student access to the global views of the class that are often reserved exclusively for the teacher.
Statement 4: The peer reviews helped me learn a lot about web technology.
Half of the respondents felt they learned a lot about web technology. This is fairly close to what I expected, though I thought there would be more agreement since most of the students needed help posting web pages. In some cases I may have underestimated their prior web knowledge and overestimated the degree to which they valued such knowledge.
Statement 5: The peer reviews were relatively easy to do.
Nine of 14 disagreed (checked "disagree" or "strongly disagree") with this statement. This is no surprise given the difficulty of the subject matter, but it is a little surprising that 4 students felt that the reviews were easy to do. These may have been the better students or they may be students who did not put much effort into the reviews.
Statement 6: The peer reviews helped me get to know my classmates better.
This statement did not evoke a clear response, but six students disagreed with it. That surprised me. It is hard to understand how students could not get to know their peers better after evaluating several of their homeworks. I have long been struck by how impersonal computer science and math classes have seemed, and as a teacher I have consciously avoided anything non-technical. When I saw the results of my first peer review class, however, I saw a lot of semi-personal comments, that is, comments addressed to other students that were friendly, encouraging, and leaning a little bit on the personal side. It appeared to me that the peer-review process was helping the students develop relationships. In most mathematics and computer science classes the students come and go with very little classroom interaction or group activity, so there is a tendency in the technical subject areas for students to be isolated. The peer-review process gets them all communicating, even if it is mostly on a technical level. Given my experience with my previous classes, I expected more students to agree with Statement 6. That they didn't might be explained by the fact that a few of the students knew each other quite well before the class started, so seeing many samples of their friends' work did not help them get to know them any better. It could also be a direct consequence of the fact that five students gave very few reviews (there is no way to know which of these students were among the respondents).
Statement 7: The peer reviews that I received were very useful to me.
I was not surprised by the lukewarm agreement with Statement 7. Although the review process is very effective in getting everyone to evaluate each other's work, it was only the best students in the class who could make very useful comments.
Statement 8: The peer reviews that I received were reasonably fair and accurate.
It is significant that 12 of 14 students agreed that the reviews were reasonably fair and accurate, despite the fact that the reviewers were amateurs and some of the reviewers got the same problems wrong on their own assignment. This is a little surprising.
Statement 9: The peer reviews motivated me to do better work.
Ten of 14 felt motivated to do better work. This is consistent with the "audience" effect: people are motivated to produce higher quality work when they know it will be viewed by a larger audience. Exposing their work to the rest of the class was a significant motivation to do better work.
Statement 10: The peer reviews motivated me to get my assignments done on time.
This statement got a lukewarm agreement. This surprised me a little because I thought most of the students were making an extra effort to be on time to avoid the peer scores of "1." But it appears that this peer pressure did not affect them as strongly as I thought it would.
I believe the peer reviews in this mathematics class were very effective. That is, they facilitated the learning process by demonstrating many ways to solve the same problem, providing hints and tips on how to solve the problems, and providing a lot of personalized feedback. It is clear that the peer reviews were difficult to do, but this is partly the result of poorly chosen assignments (it was my first time teaching this course). The peer reviews motivated students to do better work and, to a lesser extent, motivated them to be on time with their assignments. The students gained lots of experience in evaluating mathematical reasoning in a realistic setting (i.e., not the textbook). It also appears that students appreciated the chance to see how they were doing with respect to their peers. The results of the peer reviews gave me valuable information, which added an extra dimension to my own assessment of their work. Despite the fact that the reviewers were amateurs, the net effect of all the reviews, as represented by the average review scores received, appeared to be an accurate assessment tool.
If I were again to teach this class using peer reviews, I would spend much more time on selecting the specific problems for peer review. That is, there still would be a set of problems due each week but not all problems--and most likely only a single problem--would be subjected to peer review. This makes the peer reviews easier and avoids overwhelming the average student. However, I am not sure that I would completely suppress a student's willingness to post all the solutions. Some students are willing to contribute significant amounts of work for the benefit of the whole class, and I would not want to lose this energy. Finally, I would spend more time facilitating responses to the reviews. When a student receives a review I would like him or her to act on it. For example, it would not be hard to extend the system to allow the student to respond to a review with "agreed, updated my homework accordingly" or "disagree and here's why." This pushes the dialogue one step further. On the other hand I would not want to encourage "flame wars," so I do not think it necessary to communicate these responses to the original reviewers.
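As a sketch of the proposed extension (the function and field names are my assumptions, not part of the actual system), a review record could accept a single response from the reviewee, stored for the instructor but not forwarded to the original reviewer:

```python
# Hypothetical sketch of the "respond to a review" extension described above.
# The reviewee may attach one response per review; it is stored but, to avoid
# flame wars, never sent back to the original reviewer.
ALLOWED_VERDICTS = ("agreed, updated my homework accordingly", "disagree")

def respond_to_review(review, verdict, explanation=""):
    """Attach the reviewee's one-time response to a review record (a dict)."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"verdict must be one of {ALLOWED_VERDICTS}")
    if "response" in review:
        raise ValueError("this review already has a response")
    review["response"] = {"verdict": verdict, "explanation": explanation}
    return review
```

Restricting the verdict to a short fixed list keeps the dialogue structured, while the free-text explanation carries the "here's why" part.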
For those teachers who are considering online peer reviews in their classes I would suggest that the first issue is web access (students need high-speed Internet at home or in a lab) and the second issue is web skills (for the teacher and students). Many students have extensive web skills and have no problem posting simple things such as writing exercises on a web site, but there are many who may require a lot of coaching (and this is true for many teachers as well). I would contact the school's IT department and/or faculty development office to see what support they can provide. The IT department should be able to field student questions about web browsers and how to post web pages. After these practical issues are resolved I would focus on the assignments and make sure they are straightforward, but not so simple that the student reviewers end up reading the same answer over and over. This happened on one of my assignments in a prior computer science class and I was a little surprised at the high level of dissatisfaction. The students did not like the idea of wasting their time on a peer-review process that had no meat to it.
Among the things that I learned from applying peer reviews are:
- Students greatly appreciate the opportunity to see what everyone else is doing and thereby compare their work.
- Some students will apply themselves to the role of "critical reviewer" with a passion.
- Most students see the wide audience provided by the peer-review process as a golden opportunity to demonstrate, or show off, their skills and abilities. (Be prepared to see some outstanding work!)
- The instructor must be online every day for a process as aggressive as the one described in this paper (weekly peer reviews). Although I found the students to be very professional in their approach to peer reviews, I also found that I had to watch the reviews very closely for such things as insulting comments and inaccurate information. I have managed to avoid these pitfalls in the past by responding immediately when I saw anything inappropriate. Oftentimes a student would alert me and then I would act quickly to rectify the situation, so this is not a process that can be put on automatic. But, what I like about the process is that it uses my knowledge and experience in a very effective way. That is, most of the mundane issues, such as missing assignments, incomplete work, etc., were being handled very nicely by the peer-review process, whereas I took on a supervisory role and chimed in with instructions, guidance, and information when and where I deemed it appropriate.
To manage a peer-review process like this I had to develop my own web site, but web platforms such as Blackboard, eCollege, and ConnectWeb are developing the flexibility to handle such activities. A more comprehensive approach to systematic online peer reviews, called Calibrated Peer Review, was developed at UCLA over the past few years (see http://cpr.molsci.ucla.edu). Currently this is a free service so you can sign up on their web site and follow the instructions. Finally, I would be happy to help (time permitting) any teacher who wants to pursue setting up a system similar to mine.
I would like to thank the editors of Exchanges for all their helpful comments and suggestions for improving the paper's content. I would also like to thank Professor Carol Holder, CSUCI, for her technical comments, reference material, and overall support for the development of the peer-review method described in this paper. Thanks also to Professor Harley Baker (CSUCI--Psychology), for his comments concerning the statistical results discussed here, and to Professor Paul Rivera (CSUCI -- Economics) and Professor Robert Bleicher (CSUCI -- Education) for their helpful comments about the applicability of this peer-review method to writing exercises in the Humanities. Finally I would like to thank Cheryl Dwyer Wolfe for her insightful comments.
Posted September 3, 2003
©2003 by William J. Wolfe.