Cuyamaca College Library
Inter-rater Reliability: Definition
So how do we determine whether two observers are being consistent in their observations? We have chosen to implement a test for inter-rater reliability. Inter-rater reliability is the extent to which two or more individuals (raters) agree in their assessments.
If you have watched the Olympics or any sport that uses judges, the scores depend on the judges, as observers, maintaining a high degree of consistency among themselves. If even one judge is erratic in scoring, the entire system is jeopardized and a participant may be denied their rightful prize. Another example is the degree to which a husband's responses to marital satisfaction questions are related to (correlated with) his wife's responses. A third example is the degree to which a parent's assessment of their child's achievement is related to (correlated with) the child's own assessment of that achievement.
Outside the world of sport, marriage, and children, inter-rater reliability has some far more important connotations and can directly influence the Library. In our assessment, the CC Library will examine how librarian assessments of student information competency (IC) skills are related to students' own assessments of their achievement. For example, CC Library has designed a "self-efficacy survey" called the Reference Card Survey (RCS) to assess SLOs 1, 2, and 3. Our RCS was used to report students' sense of their own competence. Student self-efficacy involves students rating their perception of their own achievement in library learning outcomes. The Reference Card Survey also includes a librarian assessment of student performance. (Ren, Wen-hua. "Library Instruction and College Students' Self-efficacy in Electronic Information Searching." Journal of Academic Librarianship 26 (Sept. 2000): 323-328.)
Data Collection Procedures at Cuyamaca College Library
The data will be used to evaluate the quality of library reference services. The Reference Card Survey consists of two forms: Form A, which focuses on the student's experience during a reference desk interview, and Form B, which focuses on the librarian's experience during the same interview. The student and the librarian each complete their respective form immediately after the reference desk interview.
For the ratings of the students and the librarians to be considered "credible," and thus usable to draw conclusions, we must first test the degree to which the students and librarians agree about the student's experience with the reference librarian. When we test the reliability of ratings, we often compute the inter-rater reliability coefficient. It is generally accepted that an inter-rater reliability coefficient of .75 or higher suggests that the ratings are reliable. The closer to 1.0 (a perfect match), the more reliable the ratings are. For example, if we have the following data:
In this first case, the reliability of the ratings is considered low (.61) even though the student and librarian mean ratings (3.9 on a 5-point scale) are the same.
In this second case, the reliability of the ratings is considered high (.92) even though the student and librarian mean rating (3.9) is the same as in the first case. The difference is the level of agreement. You can see that in the second set of data the student and librarian ratings match in almost every row, and that is what we are looking for in our data. It does not matter whether the student "Strongly Agrees" or "Strongly Disagrees," only that the librarian and the student agree on the result of the interaction and thus provide similar ratings.
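The idea above can be sketched in code. The following is a minimal illustration, not the Library's actual procedure: it treats the inter-rater reliability coefficient as the Pearson correlation between the student's and the librarian's ratings. The rating lists are hypothetical, constructed so that both raters have the same mean (3.9 on a 5-point scale) while one pairing agrees row by row far more than the other, mirroring the two cases described.

```python
# Hedged sketch: inter-rater reliability computed as the Pearson
# correlation between two raters' scores. All ratings below are
# hypothetical illustrations, not actual RCS data.

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

student        = [5, 3, 4, 4, 5, 3, 4, 4, 3, 4]   # hypothetical Form A ratings
librarian_low  = [4, 3, 5, 3, 5, 3, 4, 5, 3, 4]   # same mean, weaker row-by-row agreement
librarian_high = [5, 3, 4, 4, 4, 3, 4, 4, 3, 5]   # same mean, stronger row-by-row agreement

# All three lists have the same mean rating of 3.9.
print(mean(student), mean(librarian_low), mean(librarian_high))

print(round(pearson(student, librarian_low), 2))   # falls below the .75 threshold
print(round(pearson(student, librarian_high), 2))  # exceeds the .75 threshold
```

Note that identical means say nothing about agreement: only the row-by-row correspondence of the two rating columns drives the coefficient, which is exactly the point of the two cases above.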