Creation of Reliable Relevance Judgments in Information Retrieval Systems Evaluation Experimentation through Crowdsourcing: A Review
Table 3
Statistics for calculating the interrater agreement.

Method: Joint probability of agreement (percentage agreement) [20]
Description: The simplest measure, computed by dividing the number of items on which the assessors assign the same rating by the total number of rated items.

Method: Cohen's kappa
Description: A statistical measure of interrater agreement between two raters. It is more robust than percentage agreement because it corrects for the agreement expected to occur by chance between the two assessors.
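To make the two statistics in Table 3 concrete, the following is a minimal sketch (function names and the example labels are illustrative, not from the source) that computes percentage agreement and chance-corrected agreement for two assessors judging the same set of documents:

```python
from collections import Counter

def percent_agreement(a, b):
    """Joint probability of agreement: fraction of items both raters label identically."""
    assert len(a) == len(b) and len(a) > 0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(a)
    p_o = percent_agreement(a, b)                       # observed agreement
    ca, cb = Counter(a), Counter(b)                     # per-rater label marginals
    # Expected agreement if each rater labeled at random with their observed marginals
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical relevance judgments ("rel"/"non") from two crowd assessors
r1 = ["rel", "rel", "non", "rel", "non", "non"]
r2 = ["rel", "non", "non", "rel", "non", "rel"]
print(round(percent_agreement(r1, r2), 3))  # 0.667
print(round(cohens_kappa(r1, r2), 3))       # 0.333
```

Here the raters agree on 4 of 6 items (0.667), but because half of that agreement is expected by chance given the label frequencies, kappa drops to 0.333, illustrating why kappa is the more robust of the two measures.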