Review Article

Creation of Reliable Relevance Judgments in Information Retrieval Systems Evaluation Experimentation through Crowdsourcing: A Review

Table 1

User-based evaluation methods.

User-based method: Description

Human in the lab: This method involves human experimentation in a laboratory setting to evaluate user-system interaction

Side-by-side panels: In this method, the top-ranked answers generated by two IR systems for the same search query are collected and presented side by side to users. A human assessor then makes a simple judgment as to which side retrieves the better results (a small aggregation sketch follows this table)

A/B testing: A/B testing exposes a number of preselected users of a website to a specific modification and analyses their reactions to determine whether the change is positive or negative (a clickthrough-rate comparison along these lines is sketched after this table)

Using clickthrough data: Clickthrough data are used to observe how frequently users click on the documents retrieved for a given query

Crowdsourcing: Crowdsourcing is defined as the outsourcing of tasks formerly performed in-house by the employees of a company or institution to a large, heterogeneous mass of potential workers, in the form of an open call over the Internet
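
The side-by-side panel method ultimately reduces to collecting per-query preference votes from assessors and aggregating them. The following Python sketch makes that concrete under stated assumptions: the query identifiers, assessor votes, and the strict-majority rule are illustrative choices, not data or procedures taken from the reviewed studies.

```python
"""Minimal sketch of aggregating side-by-side preference judgments.
All queries and votes below are invented for illustration."""

from collections import Counter

# Each assessor views the results of systems "A" and "B" side by side for a
# query and records which side returned the better results, or a tie.
judgments = {
    "q1": ["A", "A", "B", "A", "tie"],
    "q2": ["B", "B", "B", "A", "B"],
    "q3": ["tie", "A", "B", "B", "B"],
}

def preferred_side(votes):
    """Return the side preferred by a strict majority of non-tie votes, else 'tie'."""
    counts = Counter(v for v in votes if v != "tie")
    if not counts:
        return "tie"
    (top, top_n), *rest = counts.most_common()
    if rest and rest[0][1] == top_n:  # both sides received the same number of votes
        return "tie"
    return top

for query, votes in judgments.items():
    print(query, "->", preferred_side(votes))

# How many queries each side wins outright across the panel.
overall = Counter(preferred_side(v) for v in judgments.values())
print("Per-query wins:", dict(overall))
```

More elaborate aggregation schemes (weighting assessors by reliability, for instance) follow the same pattern; the majority vote is simply the most transparent starting point.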
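The A/B testing and clickthrough-data methods can likewise be operationalised with very little machinery. The sketch below simulates impressions for two hypothetical ranking variants, computes the observed clickthrough rate (CTR) of each, and applies a two-proportion z-test to judge whether the difference is larger than chance. The variant names, click probabilities, and sample sizes are assumptions made for illustration only.

```python
"""Minimal A/B-testing sketch: compare clickthrough rates of two ranking
variants on synthetic data. Values are illustrative assumptions."""

import random
from math import sqrt
from statistics import NormalDist

random.seed(42)

# Variant "A" is the current ranker, "B" a modified one (assumed true CTRs).
TRUE_CTR = {"A": 0.30, "B": 0.33}
IMPRESSIONS_PER_VARIANT = 5000

# Simulate whether each impression results in a click.
clicks = {v: sum(random.random() < p for _ in range(IMPRESSIONS_PER_VARIANT))
          for v, p in TRUE_CTR.items()}

# Observed clickthrough rate per variant.
ctr = {v: clicks[v] / IMPRESSIONS_PER_VARIANT for v in clicks}

# Two-proportion z-test: is the CTR difference larger than chance?
p_pool = (clicks["A"] + clicks["B"]) / (2 * IMPRESSIONS_PER_VARIANT)
se = sqrt(p_pool * (1 - p_pool) * (2 / IMPRESSIONS_PER_VARIANT))
z = (ctr["B"] - ctr["A"]) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

print(f"CTR A: {ctr['A']:.3f}, CTR B: {ctr['B']:.3f}")
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

In a real deployment the clicks would come from logged user interactions rather than simulation, and effects such as position bias in clickthrough data need to be accounted for before a difference in CTR is read as a genuine improvement in retrieval quality.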