2010
DOI: 10.1561/1500000009

Test Collection Based Evaluation of Information Retrieval Systems

Cited by 333 publications (206 citation statements)
References 205 publications
“…Such aggregates are learned using learning to rank methods. Online learning to rank methods learn from user interactions such as clicks [4,6,10,12]. Dueling Bandit Gradient Descent [16,DBGD] uses interleaved comparison methods [1,3,6,7,10] to infer preferences and then learns by following a gradient that is meant to lead to an optimal ranker.…”
Section: Introduction
confidence: 99%
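The Dueling Bandit Gradient Descent procedure described in the quote above can be illustrated with a minimal sketch. The `compare` callback here is an assumption standing in for an interleaved comparison of the two rankers' result lists (in practice inferred from user clicks); `delta` and `alpha` are the exploration and learning-rate parameters:

```python
import random

def dbgd_step(w, compare, delta=1.0, alpha=0.1):
    """One Dueling Bandit Gradient Descent step (sketch).

    w       -- current ranker weight vector (list of floats)
    compare -- callback: compare(w, w2) returns True if the
               perturbed ranker w2 wins the interleaved comparison
               (an assumption of this sketch)
    delta   -- exploration step size for the candidate ranker
    alpha   -- learning rate for the update step
    """
    # Sample a random direction u on the unit sphere.
    u = [random.gauss(0.0, 1.0) for _ in w]
    norm = sum(x * x for x in u) ** 0.5 or 1.0
    u = [x / norm for x in u]
    # Candidate ranker: w perturbed by delta along u.
    w2 = [wi + delta * ui for wi, ui in zip(w, u)]
    # If the candidate wins the comparison, follow the
    # (estimated) gradient by taking a small step along u.
    if compare(w, w2):
        w = [wi + alpha * ui for wi, ui in zip(w, u)]
    return w
```

With a toy oracle that prefers whichever ranker is closer to a hidden optimum, repeated calls drive the weights toward that optimum, mirroring how DBGD is meant to converge on an optimal ranker from pairwise preference feedback.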
“…While it would in theory be possible to provide the ground truth for the relevance of each document to the test queries in step 1, this would in practice require infeasible amounts of human input. In practice, the human input for the relevance judgements is provided in step 6, where relevance judgements are only made on documents returned by at least one algorithm, usually involving a technique such as pooling to further reduce the number of relevance judgements to be made [17]. For applications that in practice involve the processing and analysis of large amounts of data, running benchmarks of the algorithms on representative amounts of data has advantages.…”
Section: Challenges in Benchmarking on Big Data
confidence: 99%
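The pooling step mentioned in the quote above can be sketched simply: each participating system contributes its top-ranked documents, and only the union of those contributions is judged for relevance. This is a minimal illustration, not the exact procedure of any particular evaluation campaign; the `depth` parameter (the pool cutoff) is an assumption:

```python
def build_pool(runs, depth=100):
    """Form a judgment pool from ranked result lists (sketch).

    runs  -- list of ranked document-ID lists, one per system
    depth -- pool depth: how many top documents each system
             contributes (assumed parameter)

    Returns the set of document IDs requiring relevance
    judgements: the union of each run's top-`depth` documents.
    Documents outside the pool are left unjudged.
    """
    pooled = set()
    for ranking in runs:
        pooled.update(ranking[:depth])
    return pooled
```

Because documents retrieved by several systems are judged only once, and documents retrieved by no system are not judged at all, the pool is typically far smaller than the full collection, which is exactly the cost reduction the quoted passage refers to.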
“…Ma et al [123] use "live" WWW. Sanderson [150] surveys the most general issues of Cranfield style-based evaluation. Shen et al [156] use TREC collections.…”
Section: Concluding Remarks and Suggestions
confidence: 99%