Evaluating Teaching in the CSU: A Guide for Campus Discussions
In November 1996, U.S. News & World Report (special education issue) reported a curious statistic from a survey of college and university professors. When asked to evaluate their own teaching, 94% of the professors thought they were "better than average at their job." This statistic suggests two interesting things about the evaluation of teaching. First, faculty in higher education may have a fuzzy idea about how to evaluate teaching. Second, we in the academy probably suffer from the same self-serving bias that afflicts other professionals, a bias that we might dub the Lake Wobegon Effect (after Garrison Keillor's fabled community, where all the women are strong, all the men are good looking, and all the children are above average).
To help us sift through our misconceptions, biases, and a blizzard of theoretical work and research that has already been done on the evaluation of teaching, Evaluating Teaching in Higher Education: A Vision for the Future makes a useful contribution. This monograph has two big virtues. First, it is short. Nine essays ranging from 6 to 16 pages each cover a range of issues that allow a faculty member to quickly and easily become current on evaluating teaching. Second, the book develops the fundamentals of evaluation--how to evaluate, why to evaluate, and who decides how and why. Some themes receive greater emphasis in some of the essays, but the book is balanced and the closing essay is written to bring them all together.
The first four essays are issue essays; the second four essays are method essays. The issue essays lay out in broad brush strokes the practical, empirical, and ethical issues involved in the evaluation of teaching. The opening essay is a posthumous piece by one of the deans of teaching evaluation, Robert Menges. "Shortcomings" is the first word of his title and it sets the tone for much of the book. Menges argues that most of our evaluation is empirical and quantitative with little consideration given to theoretical or qualitative issues. Most specifically, he points out that we have fairly rigorously developed measures of teaching behavior, but we have no idea if those behaviors match the intentions of the teacher before entering the classroom. He also points out that we have no measures of those intentions, and we certainly do not know why teachers abandon behaviors they know are educationally the most effective when they are in the pressure cooker of the classroom. The other shortcoming he points out is the lack of comparative information on the advantages of technology vs. face-to-face instruction: what is better for which students, subjects, and classroom situations?
In the second essay John Ory provides a metaphor that could provide the theme for the entire book. He traces the etymology of the word assessment to the Latin word assidere which means "to sit beside." Too often faculty shudder at the thought of assessment of their teaching in the same way students shudder at the thought of assessment of their learning through examinations. But Ory suggests that "sitting beside" conjures up images of reflecting, sharing, helping, and building, which are more appropriate to the academic enterprise. He traces a valuable short history of assessment over the last 30 years, pointing out that it has become more systematic but needs more "sitting beside." This essay should have been given to each of the other contributors before they wrote. The editor and closing essayist might then have had an easier time connecting the issue essays and method essays in the book.
In the third essay, Lawrence Braskamp charts the future of assessment, pointing out the need for more experimental work that recognizes the dual roles of teaching and learning. Assessment can take a student focus, emphasizing the learning that teaching produces. Or, assessment can take a teacher focus, emphasizing the behavior of the professor as a unique truth seeker in an academy of scholars. Many of our academic institutions have formal programs to do one or the other, but an effective program for the assessment of teaching really needs both.
In the final issue essay, Randall Bass confronts the issue of technology in the assessment of teaching. Technology may be forcing a redefinition of teaching, making it less and less an individual activity and more and more a communal and ecological activity. Faculty must play many roles, and one of them is certainly to design the learning environment by deciding what technology to use and how to use it. As such, the virtue of technology is to allow the teacher to match in assessment the complexity of the learning environment. The student faces a blinding array of information that technology may be needed to sort and organize for effective learning. How should the teacher's management of the technology become a part of the evaluation of teaching?
The issues raised in the first four essays are addressed by some specific assessment methods in the next four essays. The editor, however, characterizes the method essays as snapshots to address the themes and issues of the book. That is, essays five through eight offer just four possible assessment techniques among many and they offer them in brief--although these are the longer articles in the book.
Robert Stake and Edith Cisneros-Cohernour, in essay five, criticize current formal evaluation of college teaching as simplistic and inconsequential, focused as it is on what individual instructors do in the classroom. The alternative they recommend is a "community of practice" approach. They appeal to personnel evaluation procedures that distinguish between selection evaluation, for hiring a faculty member, and placement evaluation, for assessing someone who is already a faculty member and hopes to remain so. Too often, selection criteria are inappropriately applied in placement situations, when placement evaluation should instead focus on the responsibilities of instruction and the immediate work context. Each instructor needs to be evaluated on the contribution made to the maintenance and improvement of all instructional programs in the department. There is both individual and collective accountability, and outcomes can best be assessed through individual and collective peer evaluations.
Daniel Bernstein, Jessica Jonson, and Karen Smith describe the American Association for Higher Education (AAHE) Peer Review Model in essay number six. This model was pilot tested at 12 universities where faculty pairs exchanged written information and discussed course content, classroom practices, and student learning. Not only were faculty uniformly positive about these dialogs, but assessment showed a positive, albeit uneven, impact on student achievement, student attitudes, and faculty practices. The authors recommend that peer review, in the form of consultation and interaction about teaching and learning, be as valued as the other forms of intellectual work (such as discovery research) that go on at a university.
In the seventh essay, John Centra advocates the teaching portfolio, which would include products of good teaching (student work), materials developed by the teacher, and assessments from others. He argues for the need for standardized procedures for evaluating portfolios, pointing out that when such procedures have been developed there is high reliability across evaluators. He stresses that the best people to serve as evaluators are well-informed colleagues.
The last method essay, by Michael Theall and Jennifer Franklin, addresses the widely used and suspiciously regarded student rating system. There is an abundance of research here--reliability studies, validity studies, and evaluation of good and bad practices. Despite all of the research on this form of assessment, the authors argue that student-rating methods have not kept up with the changing times. For example, standardized student rating systems do not work well with new instructional practices, such as active and collaborative learning. Nor are they appropriate for some nontraditional student populations, such as on-line students. And they haven't kept pace with technological developments in the classroom and the faculty member's need for classroom assessment, formative evaluation, and portfolio development. Because student-rating systems have the potential to do more harm than good, it is critical that the entire system be valid, reliable, accurate, efficient, and accepted.
Trav Johnson and Katherine Ryan bring the book to a close with a call for a multifaceted approach to evaluating teaching. They see four issues at the heart of improving evaluation: defining faculty roles, understanding teachers and teaching, meeting the many demands placed on teaching evaluations, and the effective use of evaluations. Their message is clear, simple, and probably able to be implemented, if we believe it has value and if we are willing to put in a lot of hard work. For 21 years, the New Directions for Teaching and Learning Series of Jossey-Bass has provided scholarly and useful summaries of research and opinion, summaries that can be used to guide the hard work of campus discussions and actions. This book is no exception.