Students' Gender Bias in Teaching Evaluations
N. Punyanunt-Carter & S. Carter
www.hlrcjournal.com, Open Access

Abstract: The goal of this study was to investigate whether there is gender bias in student evaluations. Researchers administered a modified version of the teacher evaluation forms to 58 students (male=30; female=28) in a basic introductory communications class. Half the class was instructed to fill out the survey about a male professor, and the other half a female professor. Researchers broke down the evaluation results question by question in order to give a detailed account of the findings. Results revealed that there is certainly some gender bias at work when students evaluate their instructors. It was also found that gender bias does not significantly affect the evaluations. The results align with other findings in the available literature, which point to some sort of pattern regarding gender bias in evaluations, but it still seems to be inconsequential.


Introduction
At the beginning of every semester, students at a typical university sit down in their first class and wait for their professor to step into the room. From the moment they begin the first lecture, professors are being watched and judged by their students. At the end of every semester, the professor faces the mass judgment of an entire class in the form of an evaluation. This standard evaluation is handed out in every classroom on the college campus. It is meant to give the students a voice by allowing them to vocalize what they like or do not like about a specific professor. Almost every journal article we found stressed the importance of instructor evaluations regarding promotion, tenure, and salary. Because these evaluations are so important, what happens when they are filled out by students with a bias against an instructor? By bias, we mean some preconceived factor that would affect the evaluation of a teacher in an untrue way. An example of an obvious bias would be class size, where the student receives more attention in a smaller class, therefore affecting the instructor's evaluation.
The overall goal of this project was to see if there are any gender biases in student evaluations of professors. As such, the following paper works to answer the question: What is the impact of gender bias in student evaluations of teachers? We will discuss previous research and thoughts on the subject, our own research and results, as well as our own thoughts on the problem of gender bias in these evaluations. We hope to help the reader understand and recognize the problem, and to prevent it from happening in our classrooms and on other college campuses.
Regarding gender bias in student evaluations of teachers, most of the available studies were published before the year 2000. A study conducted in 1973 looked to find some sort of correlation between the ratings students give instructors and things such as the student or teacher's demographics (Granzin & Painter, 1973). The interesting thing about this study is the fact that after finding such a small correlation between student gender, teacher gender, and the ratings given out, the researchers actually threw out sex as a variable altogether (1973).
A study conducted in 1975 found male students evaluated their female teachers less favorably than their male teachers (Ferber & Huber, 1975). Female students were found to rate all instructors higher than males did, also rating female instructors higher than they did male instructors. Therefore, the researchers found the students to show a bias toward their own sex. Additionally, the study found that positive past experiences with women instructors greatly reduced the preference students had for male instructors (1975).
Yet another study found that sex bias in student ratings does not generally occur (see Wilson & Doyle, 1976). However, in this study there were some students who participated but did not reveal their gender. In this category, the mean ratings for female instructors were slightly lower than the mean ratings for male instructors. Still, the researchers found this to be statistically insignificant due to the low number of people in the category (1976).
Another researcher used a modified questionnaire in a study to test the idea that teaching evaluations "are influenced by zones of acceptance based on sex stereotypes" (Martin, 1984, p. 488), meaning students are influenced on evaluations by stereotypical authoritative gender roles (i.e., women should act as women and men should act as men). The researcher believed that low instructor evaluations were partly as a result of a sort of revenge on teachers who upset students by rejecting the zones of acceptance (1984). This hypothesis only held up in the evaluation of female social science instructors. Male students rated these teachers higher only when they combined feminine traits with masculine traits in their teaching style (1984).
In a 1985 study, it was found that generally for male students, gender plays a very small role in the evaluations of teachers (Tieman & Rankin-Ullock, 1985). Departments at a southern university were split into two separate fields: traditional and nontraditional. The traditional field included male and female faculty members who were teaching in stereotypical fields, such as men teaching math or women teaching English. The nontraditional fields included male and female instructors who were teaching in fields not generally taught by their particular gender, such as women teaching biology or men teaching nursing. Males were actually discovered to show a slight favoritism toward female faculty in both traditional and nontraditional fields; the same study found female students to show favoritism toward the underdog in the field (1985). Female students actually rated men teaching in stereotypically female-dominated fields higher than the women instructors. The same was the case for stereotypically male fields: women were given higher ratings on evaluations than the male instructors (1985). Arubayi (1987) also used previous research to form his own conclusions. He found that previous studies had concluded there was no relationship between the sex of the evaluator and faculty rating. The personality of the evaluator was found to play a bigger role in the way a student evaluates (Arubayi, 1987, p. 270).
In a 1989 study, Dukes and Victoria hypothesized that the professor's gender and the student's gender would interact statistically on the results of an instructor evaluation. They also predicted that students evaluate instructors of the opposite sex more highly, an effect that is heightened by effective teaching. The findings of the study reported that female instructors were not evaluated lower than male instructors for the same performance (1989). Enthusiasm was found to be the thing that causes higher cross-sex evaluation ratings from students (1989).
Two articles used data mining to infer the researchers' conclusions. A 1997 article found that students evaluate male and female instructors differently because they have different expectations for the way male and female instructors should behave, specifically in areas such as likeability and competence (Anderson & Miller, 1997). Male and female professors were overall rated as equals, but when the instructor adhered "to the gender appropriate model" (Anderson & Miller, 1997, p. 218), they were found to receive higher ratings on the evaluations. The second article that utilized data mining discusses evaluations in general, not just those completed in universities. The researchers found that women are evaluated less favorably than men when they are highly qualified. Interestingly, women were evaluated more favorably than men when both the man and the woman were not well qualified. "This implies a different reward system for males and females-one that rewards success and competence in males, and failure and incompetence for females" (Nieva & Gutek, 1980). Other research focused not on the actual paper evaluations of the instructor, but rather on the student's perception of the instructor's professional status and education credentials in relation to gender. It was shown that students are more likely to perceive a male instructor as higher in status and credentials than a female instructor (Miller & Chamberlin, 2000). Another study involving gender and student-teacher perception found that females with male instructors reported a significantly less favorable overall impression of their instructors (Cromie, Pyke, Silverthorn, Jones, & Piccinin, 2003). This was in comparison to female students with female professors and males with either female or male professors (2003).
A study published in 2000 compared evaluations for both male and female instructors by male and female students (Centra & Gaubatz, 2000). The researchers used the Student Instructional Report II, an evaluation form that is used in universities, and had courses fill them out three times over three semesters. Data was collected from 741 courses, all of which had at least 10 females and 10 males in each. Multivariate analysis of variance (MANOVA) was used to analyze the data (Centra & Gaubatz, 2000).
Results from the study showed female instructors received higher evaluation ratings from female students six out of eight times, while male instructors received equal ratings from males and females (Centra & Gaubatz, 2000). Female instructors were also viewed as better organized, better communicators, more interactive, and better at giving quality exams and feedback (2000). Female instructors were found to teach in a different style than males, encouraging discussions and lecturing less (2000).
Centra and Gaubatz viewed this difference in teaching style as potentially causing the higher ratings among female instructors by female students. Likewise, males were found to view male instructors as better organized and more systematic (2000). Cross-gender biases were found to be very small. The authors concluded that, although there might be some gender biases, particularly with females evaluating female instructors, the effects were extremely minimal and most likely caused by differences in teaching style (Centra & Gaubatz, 2000).
The research for this article was based on Centra and Gaubatz (2000) because it involves an evaluation that every student has filled out multiple times, used in a way that differs from its normal usage. The authors of this article also recognized the flexibility this provided to tweak the survey so that it targets and answers the proposed research questions. It must also be acknowledged that most of the articles on this topic are from the 1970s and 1980s, while Centra and Gaubatz is more recent, providing for better validation of results.

Research Questions
The researchers of this paper were most interested in seeing just how much a professor's gender influences the student's perceptions of the teacher based on the student's own gender. As such, the authors came up with three basic questions to be answered by the research and analyses conducted. The questions are listed below:

RQ1: Is there a gender bias when students evaluate teachers?
Based on the literature review, the researchers hypothesize that females are more likely to judge their female professors kindly; the same going for male students and male professors. The researchers think there is a sort of gender solidarity that makes it easier for a female to give a female and a male to give a male a good evaluation overall.
RQ2: Do students view professors of their same gender as more effective, clear, knowledgeable, likeable, etc.?
The researchers also hypothesize students will find professors of their same gender as overall better teachers than professors of their opposite gender. They might also find the professor's teaching style to be more effective and find the teacher to be overall respectful and encouraging. It is the opinion of the researchers that students find teachers of their same gender to be easier to learn from and easier to like.
RQ3: How likely is a student to choose a teacher of their same gender from the very beginning, during class selection?
If there is gender bias in these teacher evaluations, where does the bias actually begin? To answer this, the researchers question the possibility that the bias might actually begin in the registration process, when students are choosing the classes they will take. The researchers of this study think that if it is possible for there to be any sort of gender bias, it must begin when they choose between taking a female professor or a male professor's class. If this inference is correct, then the bias problem might be much bigger than just evaluations.

Methods
The data for this project was gathered using a very straightforward survey method. The researchers used a slightly modified version of the teacher evaluation forms that students fill out at the end of each semester. The survey was filled out by students in a basic introductory communications class. Half the class was instructed to fill out the survey about a male professor, and the other half a female professor. Altogether, 30 males and 28 females completed the evaluation. Of those, 17 males and 15 females evaluated a female professor, and 13 males and 13 females evaluated a male professor. All students who took the survey remained anonymous to the research group.
The survey had questions regarding the professor's overall effectiveness, clarity, and level of knowledge. The survey also asked about the ways in which the instructor conducts the class and treats the students. All of these questions were taken straight from the Texas Tech teacher evaluation form.
A few modifications were made to the survey to better fit the researchers' needs and answer the research questions. The biggest change made to the survey was adding a place for the student to enter the professor's gender as well as the student's own gender. This was added to allow the researchers to know the gender of the student and the gender of the teacher so that they could compare the two.
The researchers also added questions at the end of the survey regarding the professor's likeability and how much the student felt he or she learns from the teacher. The researchers added these questions because they were curious to know: if an instructor is well liked, but not necessarily someone you learn a lot from, will he or she still receive a good evaluation? In other words, the researchers wanted to find out if evaluations are based off of good teaching, or just how much the instructor is liked.
Lastly, a question was asked regarding whether or not a student is likely to choose an instructor of his or her same sex when choosing courses. This question was added to see if some students solely choose classes based on some sort of gender bias.
Students chose from five different responses: strongly agree, agree, neutral, disagree, and strongly disagree. All results were given numerical values, entered into a spreadsheet in Excel, and analyzed by the group.
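The coding step described above can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the 5-to-1 numeric mapping, the function names `code_responses` and `percent_agree`, and the sample responses are all assumptions, since the paper does not state which number was assigned to each response.

```python
# Hypothetical coding of the five response options; the paper does not say
# which numbers were used, so this 5-to-1 mapping is an assumption.
LIKERT_CODES = {
    "strongly agree": 5,
    "agree": 4,
    "neutral": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def code_responses(responses):
    """Convert response labels to their numeric codes."""
    return [LIKERT_CODES[r.strip().lower()] for r in responses]


def percent_agree(responses):
    """Percentage of responses that are 'agree' or 'strongly agree'."""
    codes = code_responses(responses)
    agreeing = sum(1 for c in codes if c >= 4)
    return 100 * agreeing / len(codes)


# Made-up responses from four students, for illustration only.
sample = ["Agree", "Strongly agree", "Neutral", "Disagree"]
print(code_responses(sample))   # [4, 5, 3, 2]
print(percent_agree(sample))    # 50.0
```

Collapsing "agree" and "strongly agree" into a single percentage mirrors how the results in the next section are reported.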

Analysis and Results
For an easier presentation of the analysis and results of this study, the researchers will simply break the evaluation down, question by question, and review the findings for each one. This gives a detailed accounting of exactly what was found.
The first question asked about the instructor's effectiveness in the classroom. For the female professor, 82% of males and 67% of females agreed or strongly agreed that the instructor was effective. For the male professor, 69% of females and 69% of males agreed or strongly agreed that the instructor was effective.
Question 2 asked if the instructor stimulated student learning. Eighty-two percent of males and 60% of females agreed or strongly agreed that the female professor stimulated learning. Seventy-seven percent of males and 69% of females agreed or strongly agreed that the male professor stimulated learning.
Questions 3 and 4 regarded the fairness and respectfulness of the instructor. Both of these questions garnered very similar results from the students. Overwhelmingly, females and males alike viewed both the female and the male instructors as being fair and respectful. According to the findings in this study, 82% of males and 87% of females surveyed either agreed or strongly agreed that the female instructor treated students fairly. Ninety-two percent of males and 85% of females either agreed or strongly agreed that the male instructor treated students fairly. Likewise, 88% of males and 93% of females agreed or strongly agreed that the female instructor treated students with respect. Ninety-two percent of males and 100% of females agreed or strongly agreed that the male instructor had respect for the students.
Question 5 of the evaluation asked if the instructor welcomed and encouraged questions and comments in the lecture. Ninety-four percent of males and 87% of females either agreed or strongly agreed that the female instructor encouraged student input in the classroom. As for the male instructor, 100% of males and 77% of females found him to welcome student questions and comments.
Question 6 asked about the clarity of the instructor. Seventy-six percent of males and 69% of females agreed or strongly agreed that the female instructor was easy to understand. Eighty-five percent of males found the male instructor to be clear when teaching. However, only 31% of females agreed the male instructor was clear. Additionally, only 2 of the 13 females surveyed strongly agreed that the male instructor was easy to understand.
Question 7 asked about the instructor's knowledge of the subject being taught. Ninety-four percent of males and 87% of females found the female instructor to be knowledgeable on the subject. Overwhelmingly, 100% of males found the male instructor to be knowledgeable, while 85% of females agreed.
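One way to check whether a gap like the clarity result for the male instructor (85% of males versus 31% of females) exceeds what chance would produce in groups this small is Fisher's exact test. The sketch below is an illustration only; the authors do not report running any significance test. The counts are back-calculated from the reported percentages and the 13 males and 13 females who evaluated the male instructor (11 of 13 is about 85%; 4 of 13 is about 31%), and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Rows: male students, female students; columns: agreed, did not agree.
# Counts are inferred from the reported percentages, not taken from raw data.
p_value = fisher_exact_two_sided(11, 2, 4, 9)
print(f"two-sided p = {p_value:.4f}")
```

Because the survey compares the two groups on ten questions, a single small p-value from a check like this should be read cautiously.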
Question 8 asked how much the student likes the professor. Seventy-one percent of males and 73% of females were found to like the female instructor either a lot or an average amount. Seventy-seven percent of males and 77% of females liked the male instructor they evaluated either a lot or an average amount.
Question 9 dealt with how much the student felt he or she learned from the evaluated instructor. Seventy-six percent of males and 53% of females felt they learned either a lot or an average amount from the female instructor evaluated. Fifty-four percent of males and 54% of females indicated that they had learned a lot or an average amount from the male instructor they evaluated. Additionally, 38% of males and 38% of females felt they had learned none or only some from the male instructor, a number that differs a bit from the rest of the survey findings.
The last question asked about the likelihood of a student specifically taking a course based on the teacher's gender. Seventy-seven percent of students, male and female combined, chose neutral, stating that it would not matter the gender of the teacher when choosing classes. Seventeen percent of males said they were not likely or not at all likely to choose a male instructor, and only 7% of females said they were not likely or not at all likely to choose a female instructor. No one surveyed said they were always likely to choose an instructor of their same sex when choosing classes.

Implications
The study conducted shows a number of things about teacher evaluation practices. The researchers believe, based on the results shown here, there is very little gender bias in the student evaluations of teachers. However, there are a few patterns that emerged in the research results. Many times over, the male student evaluated the female instructor higher than the female student, or the female student evaluated the male instructor higher than the male student. Male students actually rated the female instructor higher than female students did 6 of 10 times. This result shows a slight, possible cross-sex bias.
When rating the male instructor, males and females tended to agree more often than when rating the female instructor. For the results regarding the male professor, male and female students were within 10 percentage points 7 of 10 times. However, when it came to clarity, although 85% of female students found the male professor to be knowledgeable of the subject matter, only 31% found him to be clear. Additionally, only 54% of females felt they had learned something from the male instructor. This suggests that female students are less likely than male students to rate an instructor highly on an evaluation when it comes to teaching style and ability.
For the most part, likeability did not play a part in the student's evaluation of the teacher. Generally, even when a student indicated they did not like the professor they were evaluating, they still gave him/her a good score in other areas. We take this to mean that students do not have any sort of likeability bias towards an instructor. Even if the student does not like the professor, he or she can still give that professor an objective evaluation.
Because no student surveyed said they always choose an instructor of their same sex, any gender bias in the evaluations appears to be entirely unintentional. Otherwise, the bias would begin from the very beginning when classes are chosen. If anything, the 17% of males who indicated they were unlikely to choose a male instructor suggests a bias toward female instructors, as stated before.
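The two tallies discussed above (how often male students rated the female instructor higher than female students did, and how often the two groups fell within 10 percentage points on the male instructor) can be recomputed from the percentages given in the Analysis and Results section. The sketch below is a hedged illustration: the function names are hypothetical, the first-question figures of 69% and 69% are read as referring to the male professor, and the 88% and 93% respect figures are read as referring to the female professor. The tenth question (course selection) is reported only for the combined sample, so it is omitted, and the second tally covers nine questions rather than the ten the text counts.

```python
# Percent agreeing (or liking / learning "a lot or an average amount") for
# questions 1-9, as reported in the Analysis and Results section.
# Each entry is (male students, female students).
FEMALE_INSTRUCTOR = [(82, 67), (82, 60), (82, 87), (88, 93), (94, 87),
                     (76, 69), (94, 87), (71, 73), (76, 53)]
MALE_INSTRUCTOR = [(69, 69), (77, 69), (92, 85), (92, 100), (100, 77),
                   (85, 31), (100, 85), (77, 77), (54, 54)]


def count_males_higher(pairs):
    """Number of questions on which male students gave the higher percentage."""
    return sum(1 for males, females in pairs if males > females)


def count_within(pairs, points=10):
    """Number of questions on which the two groups differ by at most `points`."""
    return sum(1 for males, females in pairs if abs(males - females) <= points)


print(count_males_higher(FEMALE_INSTRUCTOR))  # 6
print(count_within(MALE_INSTRUCTOR))          # 6
```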
The original study, "Is there gender bias in the student evaluations of teachers?", had slightly different results than this study. While the original found female students to give female professors better evaluations, the present study found males to typically evaluate female professors higher. The original study also saw cross-gender biases to be few and far between (Centra & Gaubatz, 2000), while the researchers in this study found them to be prevalent. Although there are definite differences, both studies do conclude that the gender biases that do exist in student evaluations of instructors are not significant enough to actually affect the purpose of the evaluation itself.

Limitations
As with any research project, there were a few limitations to this study which, in turn, could have influenced our results. First of all, the research project was conducted during one semester, which necessarily implies time constraints. The survey was not passed out to enough students, especially given that the students who did have access to the survey were split into two groups, one evaluating a male instructor and the other a female instructor.

Secondly, the survey distributed, as mentioned in the Methods section, featured a five-point scale. This scale included choices for strongly agree, agree, neutral, disagree, and strongly disagree, from left to right. However, on the last three questions, the order of the choices was reversed, with the negative choices appearing on the left and the positive choices on the right. This might have led to some confusion when students got to the last questions, possibly skewing some of their responses.
Another problem exists while the student is taking the survey. There is no actual way to determine if the student was simply rushing through the evaluation or actually taking the time to fill it out correctly. This is, however, a problem with any sort of survey research.
Lastly, there were quite a few neutral responses. When a student answers neutral, no inference can be made about what the student thinks of his or her instructor. However, simply taking the neutral choice off of the evaluations would force a student to choose a response that they might not otherwise choose, again skewing the data. The solution to this issue is to pass the survey out to more people. This will increase the amount of data and therefore reduce the weight of neutral responses.

Future Research
Future research on the topic of gender bias in teacher evaluation could include a much broader sample of students filling out the evaluations. It must be noted that, given the limited amount of recent research addressing this topic compared to the number of studies published before the year 2000, as well as the limited size of the study sample used in this case, the research conducted here can serve as a pilot or preliminary study. Further research with larger samples can yield more accurate results, analyses, and findings. The evaluations could also be done in multiple departments of a single university to better gauge the thoughts of the entire campus, not just one group. Additionally, a multiple-university study could be conducted to compare different schools as well as broaden the sample to an even larger scale.
Other options for future research could include in-depth interviews with students. Most of the research we have come across has been quantitative. Doing interviews gives the researcher a chance to find out not only if there is gender bias, but also why it is there and how to get rid of it. By hearing the thoughts of actual students, the researcher can give insight that quantitative research sometimes lacks.

Conclusion
While there is certainly some gender bias at work when students evaluate their instructors, the researchers have found that it might not be enough to truly affect the evaluations in a strong way. Additionally, most researchers studying the subject have found some sort of pattern regarding gender bias in evaluations, but still agree that it is inconsequential.
In conclusion, it is important to remember that evaluations are not meant to judge the instructor as a person or as a member of a gender, race, or other group. They are meant to evaluate the instructor's effectiveness in the classroom. As Tieman and Rankin-Ullock (1985) stated, "It is impossible to say whether student evaluations reflect actual performance differences by faculty or only the perceptions of students" (p. 189). This seems to be the recurring theme in the studies on gender bias in evaluations. It is hard to know when a student is judging the teachings of the instructor and when he/she is simply judging the teacher as a human being. By keeping biases, gender biases included, out of evaluations, the student and instructor will reap the benefits. Students will have better instructors and the instructors will have earned their position fairly and without bias.