MS MARCO V2 Document

The two-click reproduction matrix below provides commands for reproducing experimental results reported in the following paper. Rows correspond to experimental conditions in the paper; additional conditions are provided for comparison purposes.

Xueguang Ma, Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. Document Expansion Baselines and Learned Sparse Lexical Representations for MS MARCO V1 and V2. Proceedings of the 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), July 2022.

Each condition is evaluated on three query sets: the TREC 2021 Deep Learning Track topics, scored by AP, nDCG@10, RR@100, R@100, and R@1K, and the MS MARCO dev and dev2 query sets, scored by RR@100 and R@1K. The commands below generate and evaluate the runs for each condition.

BM25 doc (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-slim \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-doc-default.dl21.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-doc-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-slim \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-doc-default.dev.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-doc-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-slim \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-doc-default.dev2.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-doc-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-doc-default.dev2.txt
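
These runs can also be generated programmatically through Pyserini's Python API. Below is a minimal sketch for the TREC 2021 run above; it assumes the prebuilt index and topic names resolve the same way they do on the command line, and it writes a TREC-format run file that the trec_eval commands accept unchanged.

from pyserini.search import get_topics
from pyserini.search.lucene import LuceneSearcher

# Downloads (if necessary) and opens the same prebuilt index as the CLI.
searcher = LuceneSearcher.from_prebuilt_index('msmarco-v2-doc-slim')

# dl21 topics ship with Pyserini; each entry maps qid -> {'title': query}.
topics = get_topics('dl21')

with open('run.msmarco-v2-doc.bm25-doc-default.dl21.txt', 'w') as out:
    for qid, topic in topics.items():
        hits = searcher.search(topic['title'], k=1000)
        for rank, hit in enumerate(hits, start=1):
            out.write(f'{qid} Q0 {hit.docid} {rank} {hit.score:.6f} pyserini\n')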

BM25 doc segmented (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-slim \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-doc-segmented-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-slim \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-doc-segmented-default.dev.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-doc-segmented-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-slim \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-doc-segmented-default.dev2.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-doc-segmented-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-doc-segmented-default.dev2.txt
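
The segmented conditions retrieve passage-length segments and then aggregate them into a document ranking: --hits 10000 fetches segments, and --max-passage with --max-passage-hits 1000 keeps each document's best-scoring segment (MaxP) and returns up to 1,000 documents. A toy sketch of that aggregation, assuming the docid#segment naming convention of the segmented corpus:

from collections import defaultdict

def maxp(segment_hits, k=1000):
    # Keep each document's best segment score, then rank documents by it.
    best = defaultdict(lambda: float('-inf'))
    for seg_id, score in segment_hits:
        docid = seg_id.split('#')[0]
        best[docid] = max(best[docid], score)
    return sorted(best.items(), key=lambda x: x[1], reverse=True)[:k]

# 10,000 segment hits in, at most 1,000 unique documents out:
print(maxp([('msmarco_doc_00_0#0', 7.1), ('msmarco_doc_00_0#3', 6.2),
            ('msmarco_doc_00_1#2', 6.8)], k=1000))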

BM25+RM3 doc (default parameters)
RM3 pseudo-relevance feedback requires stored document vectors, which is why these commands use the full index rather than the slim one.
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-full \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-full \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-rm3-doc-default.dev.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-doc-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-full \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-rm3-doc-default.dev2.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-doc-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-doc-default.dev2.txt
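
RM3 can also be enabled through the Python API. A minimal sketch, using Pyserini's documented default feedback parameters (the example query is arbitrary):

from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v2-doc-full')
# Defaults: 10 expansion terms drawn from 10 feedback documents, with the
# original query weighted at 0.5.
searcher.set_rm3(fb_terms=10, fb_docs=10, original_query_weight=0.5)

hits = searcher.search('when did the manhattan project begin', k=1000)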

BM25+RM3 doc segmented (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-full \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-full \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-full \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev2.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-doc-segmented-default.dev2.txt

BM25 w/ doc2query-T5 doc (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5 \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5 \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5 \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev2.txt \
  --bm25
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-d2q-t5-doc-default.dev2.txt
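
The d2q-t5 indexes are built over documents expanded with queries predicted by a sequence-to-sequence model. A minimal sketch of that expansion step, assuming the castorini/doc2query-t5-base-msmarco checkpoint on Hugging Face and top-k sampling; the document text here is a made-up stand-in:

from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = 'castorini/doc2query-t5-base-msmarco'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

doc = 'The Manhattan Project was a research and development undertaking ...'
inputs = tokenizer(doc, return_tensors='pt', truncation=True, max_length=512)

# Sample several predicted queries; these are appended to the document text
# before indexing, and BM25 then runs unchanged over the expanded corpus.
outputs = model.generate(inputs.input_ids, max_length=64, do_sample=True,
                         top_k=10, num_return_sequences=5)
for ids in outputs:
    print(tokenizer.decode(ids, skip_special_tokens=True))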

BM25 w/ doc2query-T5 doc segmented (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5 \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5 \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5 \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev2.txt \
  --bm25 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-d2q-t5-doc-segmented-default.dev2.txt

BM25+RM3 w/ doc2query-T5 doc (default parameters)
The docvectors index variants store the document vectors that RM3 feedback requires.
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5-docvectors \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5-docvectors \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-d2q-t5-docvectors \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev2.txt \
  --bm25 --rm3
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-default.dev2.txt

BM25+RM3 w/ doc2query-T5 doc segmented (default parameters)
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5-docvectors \
  --topics dl21 \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5-docvectors \
  --topics msmarco-v2-doc-dev \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-d2q-t5-docvectors \
  --topics msmarco-v2-doc-dev2 \
  --output run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev2.txt \
  --bm25 --rm3 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev2.txt

uniCOIL (noexp) zero-shot, with pre-encoded queries
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics dl21-unicoil-noexp \
  --output run.msmarco-v2-doc.unicoil-noexp.dl21.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.unicoil-noexp.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.unicoil-noexp.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.unicoil-noexp.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.unicoil-noexp.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.unicoil-noexp.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics msmarco-v2-doc-dev-unicoil-noexp \
  --output run.msmarco-v2-doc.unicoil-noexp.dev.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-noexp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-noexp.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics msmarco-v2-doc-dev2-unicoil-noexp \
  --output run.msmarco-v2-doc.unicoil-noexp.dev2.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-noexp.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-noexp.dev2.txt
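
The uniCOIL runs use impact search (--impact): queries and documents are represented as bags of terms with learned integer weights, and a document's score is the sum, over matching terms, of query weight times document weight. A toy sketch of the scoring rule, with made-up weights:

def impact_score(query_weights, doc_weights):
    # Sum of query weight * document weight over shared terms.
    return sum(w * doc_weights.get(term, 0)
               for term, w in query_weights.items())

query = {'lobster': 3, 'roll': 2}
doc = {'lobster': 5, 'roll': 1, 'maine': 4}
print(impact_score(query, doc))  # 3*5 + 2*1 = 17

The -unicoil-noexp and -unicoil topic variants above contain these query weights precomputed, so no model inference happens at search time.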

uniCOIL (with doc2query-T5 expansions) zero-shot, with pre-encoded queries
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics dl21-unicoil \
  --output run.msmarco-v2-doc.unicoil.dl21.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.unicoil.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.unicoil.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.unicoil.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.unicoil.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.unicoil.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics msmarco-v2-doc-dev-unicoil \
  --output run.msmarco-v2-doc.unicoil.dev.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics msmarco-v2-doc-dev2-unicoil \
  --output run.msmarco-v2-doc.unicoil.dev2.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil.dev2.txt

uniCOIL (noexp) zero-shot, with on-the-fly query encoding
Here --encoder runs the queries through the uniCOIL model at search time instead of reading pre-encoded topics.
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics dl21 --encoder castorini/unicoil-noexp-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.unicoil-noexp-otf.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics msmarco-v2-doc-dev --encoder castorini/unicoil-noexp-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-noexp-otf.dev.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-noexp-otf.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-noexp-otf.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-noexp-0shot \
  --topics msmarco-v2-doc-dev2 --encoder castorini/unicoil-noexp-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-noexp-otf.dev2.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-noexp-otf.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-noexp-otf.dev2.txt

uniCOIL (with doc2query-T5 expansions) zero-shot, with on-the-fly query encoding
Command to generate run on TREC 2021 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics dl21 --encoder castorini/unicoil-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-otf.dl21.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl21-doc run.msmarco-v2-doc.unicoil-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl21-doc run.msmarco-v2-doc.unicoil-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank dl21-doc run.msmarco-v2-doc.unicoil-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.100 dl21-doc run.msmarco-v2-doc.unicoil-otf.dl21.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl21-doc run.msmarco-v2-doc.unicoil-otf.dl21.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics msmarco-v2-doc-dev --encoder castorini/unicoil-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-otf.dev.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-otf.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev run.msmarco-v2-doc.unicoil-otf.dev.txt
Command to generate run on dev2 queries:
python -m pyserini.search.lucene \
  --threads 16 --batch-size 128 \
  --index msmarco-v2-doc-segmented-unicoil-0shot \
  --topics msmarco-v2-doc-dev2 --encoder castorini/unicoil-msmarco-passage \
  --output run.msmarco-v2-doc.unicoil-otf.dev2.txt \
  --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-otf.dev2.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-v2-doc-dev2 run.msmarco-v2-doc.unicoil-otf.dev2.txt
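
On-the-fly encoding is also exposed in the Python API. A minimal sketch following the pattern in Pyserini's documentation (the example query is arbitrary):

from pyserini.search.lucene import LuceneImpactSearcher

# Passing the query encoder name mirrors the --encoder flag above:
# queries are encoded by the uniCOIL model at search time.
searcher = LuceneImpactSearcher.from_prebuilt_index(
    'msmarco-v2-doc-segmented-unicoil-0shot',
    'castorini/unicoil-msmarco-passage')

hits = searcher.search('what is a lobster roll', k=10)
for rank, hit in enumerate(hits, start=1):
    print(f'{rank:2} {hit.docid:40} {hit.score:.5f}')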