[1] (1a) BM25 doc (k1=0.9, b=0.4)
DL19: MAP@100 0.2434, nDCG@10 0.5176, R@1K 0.6966
DL20: MAP@100 0.3793, nDCG@10 0.5286, R@1K 0.8085
dev: RR@100 0.2299, R@1K 0.8856
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-default.dev.txt
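The same run can be produced from Python instead of the CLI. Below is a minimal sketch using Pyserini's LuceneSearcher; it assumes the prebuilt index name (msmarco-v1-doc-slim) and topic key (dl19-doc) resolve exactly as in the commands above, and it writes a TREC-format run file that the trec_eval commands can score.

# Sketch: BM25 document retrieval via the Pyserini Python API (assumes the
# prebuilt index and topic names from the CLI commands above).
from pyserini.search import get_topics
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-doc-slim')
searcher.set_bm25(k1=0.9, b=0.4)  # default document-ranking parameters

topics = get_topics('dl19-doc')
with open('run.msmarco-v1-doc.bm25-doc-default.dl19.txt', 'w') as out:
    for qid, topic in topics.items():
        hits = searcher.search(topic['title'], k=1000)
        for rank, hit in enumerate(hits, start=1):
            # six-column TREC run format expected by trec_eval
            out.write(f'{qid} Q0 {hit.docid} {rank} {hit.score:.6f} pyserini-bm25\n')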

[1] (1b) BM25 doc segmented (k1=0.9, b=0.4)
DL19: MAP@100 0.2449, nDCG@10 0.5302, R@1K 0.6871
DL20: MAP@100 0.3586, nDCG@10 0.5281, R@1K 0.7755
dev: RR@100 0.2684, R@1K 0.9178
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-segmented-default.dev.txt
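In the "doc segmented" conditions, the index contains passage-length segments of each document: --hits 10000 retrieves segments, --max-passage collapses them to their parent documents by keeping each document's best-scoring segment (MaxP), and --max-passage-hits 1000 truncates the final per-query ranking. The following is an illustrative sketch of that aggregation, assuming segment ids of the form docid#segment as in the segmented MS MARCO corpus; it is not Pyserini's internal implementation.

# Illustrative MaxP aggregation: collapse ranked segments to documents,
# keeping each document's best-scoring segment, then truncate the ranking.
def max_passage(segment_hits, max_docs=1000):
    best = {}
    for segment_id, score in segment_hits:    # (segment_id, score) pairs, best first
        docid = segment_id.split('#')[0]      # parent document id, e.g. 'D123456#3' -> 'D123456'
        if docid not in best or score > best[docid]:
            best[docid] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:max_docs]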

[1] (1c) BM25+RM3 doc (k1=0.9, b=0.4)
DL19: MAP@100 0.2774, nDCG@10 0.5170, R@1K 0.7503
DL20: MAP@100 0.4014, nDCG@10 0.5225, R@1K 0.8257
dev: RR@100 0.1622, R@1K 0.8791
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-default.dev.txt
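The RM3 conditions use the "full" variants of the prebuilt indexes because pseudo-relevance feedback needs stored document vectors, which the "slim" indexes omit. A minimal Python sketch, assuming Pyserini's default RM3 feedback parameters:

# Sketch: BM25 + RM3 over the docvector-bearing index (Pyserini defaults for feedback).
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-doc-full')
searcher.set_bm25(k1=0.9, b=0.4)
searcher.set_rm3()  # requires an index built with stored docvectors
hits = searcher.search('example query text', k=1000)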

[1] (1d) BM25+RM3 doc segmented (k1=0.9, b=0.4)
DL19: MAP@100 0.2884, nDCG@10 0.5764, R@1K 0.7384
DL20: MAP@100 0.3774, nDCG@10 0.5179, R@1K 0.8041
dev: RR@100 0.2412, R@1K 0.9355
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-segmented-default.dev.txt

BM25+Rocchio doc (k1=0.9, b=0.4)
DL19: MAP@100 0.2811, nDCG@10 0.5256, R@1K 0.7546
DL20: MAP@100 0.4089, nDCG@10 0.5192, R@1K 0.8273
dev: RR@100 0.1624, R@1K 0.8789
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt \
--bm25 --rocchio --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-default.dev.txt
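Rocchio feedback is enabled the same way and has the same docvector requirement; a short sketch, assuming Pyserini's default Rocchio weights:

# Sketch: Rocchio pseudo-relevance feedback as a drop-in alternative to RM3.
from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher.from_prebuilt_index('msmarco-v1-doc-full')
searcher.set_bm25(k1=0.9, b=0.4)
searcher.set_rocchio()  # default Rocchio parameters
hits = searcher.search('example query text', k=1000)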

BM25+Rocchio doc segmented (k1=0.9, b=0.4)
DL19: MAP@100 0.2889, nDCG@10 0.5570, R@1K 0.7423
DL20: MAP@100 0.3830, nDCG@10 0.5226, R@1K 0.8102
dev: RR@100 0.2449, R@1K 0.9351
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt \
--bm25 --rocchio --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-segmented-default.dev.txt

BM25 doc (k1=4.46, b=0.82)
DL19: MAP@100 0.2336, nDCG@10 0.5233, R@1K 0.6757
DL20: MAP@100 0.3581, nDCG@10 0.5061, R@1K 0.7776
dev: RR@100 0.2768, R@1K 0.9357
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-slim \
--output run.msmarco-v1-doc.bm25-doc-tuned.dev.txt \
--bm25 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-tuned.dev.txt

BM25 doc segmented (k1=2.16, b=0.61)
DL19: MAP@100 0.2398, nDCG@10 0.5389, R@1K 0.6565
DL20: MAP@100 0.3458, nDCG@10 0.5213, R@1K 0.7725
dev: RR@100 0.2756, R@1K 0.9311
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-slim \
--output run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt \
--bm25 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-doc-segmented-tuned.dev.txt

BM25+RM3 doc (k1=4.46, b=0.82)
DL19: MAP@100 0.2643, nDCG@10 0.5526, R@1K 0.7189
DL20: MAP@100 0.3619, nDCG@10 0.5238, R@1K 0.8180
dev: RR@100 0.2231, R@1K 0.9305
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt \
--bm25 --rm3 --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-tuned.dev.txt

BM25+RM3 doc segmented (k1=2.16, b=0.61)
DL19: MAP@100 0.2658, nDCG@10 0.5405, R@1K 0.7030
DL20: MAP@100 0.3472, nDCG@10 0.4979, R@1K 0.8049
dev: RR@100 0.2443, R@1K 0.9363
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt \
--bm25 --rm3 --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-doc-segmented-tuned.dev.txt

BM25+Rocchio doc (k1=4.46, b=0.82)
DL19: MAP@100 0.2657, nDCG@10 0.5584, R@1K 0.7299
DL20: MAP@100 0.3628, nDCG@10 0.5199, R@1K 0.8217
dev: RR@100 0.2242, R@1K 0.9316
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt \
--bm25 --rocchio --k1 4.46 --b 0.82
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-tuned.dev.txt

BM25+Rocchio doc segmented (k1=2.16, b=0.61)
DL19: MAP@100 0.2677, nDCG@10 0.5424, R@1K 0.7115
DL20: MAP@100 0.3521, nDCG@10 0.4997, R@1K 0.8042
dev: RR@100 0.2476, R@1K 0.9395
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-full \
--output run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt \
--bm25 --rocchio --k1 2.16 --b 0.61 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rocchio-doc-segmented-tuned.dev.txt

[1] (2a) BM25 w/ doc2query-T5 doc (k1=0.9, b=0.4)
DL19: MAP@100 0.2700, nDCG@10 0.5968, R@1K 0.7190
DL20: MAP@100 0.4230, nDCG@10 0.5885, R@1K 0.8403
dev: RR@100 0.2880, R@1K 0.9259
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt \
--bm25 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-default.dev.txt
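The doc2query-T5 conditions search indexes whose documents were expanded with model-generated queries before indexing; the retrieval commands themselves only swap in the d2q-t5 index names. As a rough illustration of how such expansions are produced (not the exact pipeline used to build these indexes), the sketch below samples a few synthetic queries for a passage with the castorini/doc2query-t5-base-msmarco checkpoint from Hugging Face, which is assumed here, and appends them to the passage text.

# Illustrative doc2query-T5 expansion of a single passage (assumed checkpoint name).
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = 'castorini/doc2query-t5-base-msmarco'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

passage = 'The presence of communication amid scientific minds was equally important to the success of the Manhattan Project.'
inputs = tokenizer(passage, return_tensors='pt', truncation=True, max_length=512)
outputs = model.generate(inputs.input_ids, max_length=64, do_sample=True, top_k=10, num_return_sequences=3)
expansions = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
expanded_passage = passage + ' ' + ' '.join(expansions)  # this expanded text is what gets indexed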

[1] (2b) BM25 w/ doc2query-T5 doc segmented (k1=0.9, b=0.4)
DL19: MAP@100 0.2798, nDCG@10 0.6119, R@1K 0.7165
DL20: MAP@100 0.4150, nDCG@10 0.5957, R@1K 0.8046
dev: RR@100 0.3179, R@1K 0.9490
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt \
--bm25 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-default.dev.txt

[1] (2c) BM25+RM3 w/ doc2query-T5 doc (k1=0.9, b=0.4)
DL19: MAP@100 0.3045, nDCG@10 0.5897, R@1K 0.7738
DL20: MAP@100 0.4229, nDCG@10 0.5407, R@1K 0.8596
dev: RR@100 0.1831, R@1K 0.9128
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-default.dev.txt

[1] (2d) BM25+RM3 w/ doc2query-T5 doc segmented (k1=0.9, b=0.4)
DL19: MAP@100 0.3021, nDCG@10 0.6297, R@1K 0.7481
DL20: MAP@100 0.4268, nDCG@10 0.5850, R@1K 0.8270
dev: RR@100 0.2818, R@1K 0.9547
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt \
--bm25 --rm3 --k1 0.9 --b 0.4 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-default.dev.txt

BM25 w/ doc2query-T5 doc (k1=4.68, b=0.87)
DL19: MAP@100 0.2620, nDCG@10 0.5972, R@1K 0.6867
DL20: MAP@100 0.4099, nDCG@10 0.5852, R@1K 0.8105
dev: RR@100 0.3269, R@1K 0.9553
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt \
--bm25 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-tuned.dev.txt

BM25 w/ doc2query-T5 doc segmented (k1=2.56, b=0.59)
DL19: MAP@100 0.2658, nDCG@10 0.6273, R@1K 0.6707
DL20: MAP@100 0.4047, nDCG@10 0.5943, R@1K 0.7968
dev: RR@100 0.3209, R@1K 0.9530
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-d2q-t5 \
--output run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt \
--bm25 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-d2q-t5-doc-segmented-tuned.dev.txt

BM25+RM3 w/ doc2query-T5 doc (k1=4.68, b=0.87)
DL19: MAP@100 0.2814, nDCG@10 0.6080, R@1K 0.7177
DL20: MAP@100 0.4104, nDCG@10 0.5743, R@1K 0.8240
dev: RR@100 0.2621, R@1K 0.9524
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt \
--bm25 --rm3 --k1 4.68 --b 0.87
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-tuned.dev.txt

BM25+RM3 w/ doc2query-T5 doc segmented (k1=2.56, b=0.59)
DL19: MAP@100 0.2893, nDCG@10 0.6239, R@1K 0.7066
DL20: MAP@100 0.4025, nDCG@10 0.5724, R@1K 0.8172
dev: RR@100 0.2985, R@1K 0.9567
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--topics dl19-doc \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--topics dl20 \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index msmarco-v1-doc-segmented-d2q-t5-docvectors \
--output run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt \
--bm25 --rm3 --k1 2.56 --b 0.59 --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.bm25-rm3-d2q-t5-doc-segmented-tuned.dev.txt

[1] (3a) uniCOIL (noexp): pre-encoded queries
DL19: MAP@100 0.2665, nDCG@10 0.6349, R@1K 0.6391
DL20: MAP@100 0.3698, nDCG@10 0.5893, R@1K 0.7623
dev: RR@100 0.3409, R@1K 0.9420
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics dl19-doc-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dl19.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.unicoil-noexp.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.unicoil-noexp.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics dl20-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dl20.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.unicoil-noexp.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.unicoil-noexp.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics msmarco-doc-dev-unicoil-noexp \
--output run.msmarco-v1-doc.unicoil-noexp.dev.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.unicoil-noexp.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.unicoil-noexp.dev.txt

[1] (3b) uniCOIL (w/ doc2query-T5): pre-encoded queries
DL19: MAP@100 0.2789, nDCG@10 0.6396, R@1K 0.6652
DL20: MAP@100 0.3882, nDCG@10 0.6033, R@1K 0.7869
dev: RR@100 0.3531, R@1K 0.9546
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics dl19-doc-unicoil \
--output run.msmarco-v1-doc.unicoil.dl19.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.unicoil.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.unicoil.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics dl20-unicoil \
--output run.msmarco-v1-doc.unicoil.dl20.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.unicoil.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.unicoil.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics msmarco-doc-dev-unicoil \
--output run.msmarco-v1-doc.unicoil.dev.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.unicoil.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.unicoil.dev.txt

uniCOIL (noexp): on-the-fly query inference
DL19: MAP@100 0.2661, nDCG@10 0.6347, R@1K 0.6385
DL20: MAP@100 0.3698, nDCG@10 0.5906, R@1K 0.7621
dev: RR@100 0.3410, R@1K 0.9420
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics dl19-doc --encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-otf.dl19.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics dl20 --encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-otf.dl20.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.unicoil-noexp-otf.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil-noexp \
--topics msmarco-doc-dev --encoder castorini/unicoil-noexp-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-noexp-otf.dev.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.unicoil-noexp-otf.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.unicoil-noexp-otf.dev.txt

uniCOIL (w/ doc2query-T5): on-the-fly query inference
DL19: MAP@100 0.2789, nDCG@10 0.6396, R@1K 0.6654
DL20: MAP@100 0.3881, nDCG@10 0.6030, R@1K 0.7866
dev: RR@100 0.3532, R@1K 0.9546
Command to generate run on TREC 2019 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics dl19-doc --encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-otf.dl19.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl19-doc run.msmarco-v1-doc.unicoil-otf.dl19.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl19-doc run.msmarco-v1-doc.unicoil-otf.dl19.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl19-doc run.msmarco-v1-doc.unicoil-otf.dl19.txt
Command to generate run on TREC 2020 queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics dl20 --encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-otf.dl20.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m map dl20-doc run.msmarco-v1-doc.unicoil-otf.dl20.txt
python -m pyserini.eval.trec_eval -c -m ndcg_cut.10 dl20-doc run.msmarco-v1-doc.unicoil-otf.dl20.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 dl20-doc run.msmarco-v1-doc.unicoil-otf.dl20.txt
Command to generate run on dev queries:
python -m pyserini.search.lucene \
--index msmarco-v1-doc-segmented-unicoil \
--topics msmarco-doc-dev --encoder castorini/unicoil-msmarco-passage \
--output run.msmarco-v1-doc.unicoil-otf.dev.txt \
--batch 36 --threads 12 --impact --hits 10000 --max-passage-hits 1000 --max-passage
Evaluation commands:
python -m pyserini.eval.trec_eval -c -M 100 -m recip_rank msmarco-doc-dev run.msmarco-v1-doc.unicoil-otf.dev.txt
python -m pyserini.eval.trec_eval -c -m recall.1000 msmarco-doc-dev run.msmarco-v1-doc.unicoil-otf.dev.txt
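The on-the-fly uniCOIL conditions encode each query with the uniCOIL model at search time rather than reading pre-encoded topics. A minimal Python sketch using LuceneImpactSearcher, assuming the same prebuilt index and query-encoder names as in the commands above; the hits are document segments, so the MaxP aggregation sketched earlier would still be applied to produce a document ranking.

# Sketch: uniCOIL impact search with on-the-fly query encoding.
from pyserini.search.lucene import LuceneImpactSearcher

searcher = LuceneImpactSearcher.from_prebuilt_index(
    'msmarco-v1-doc-segmented-unicoil',    # impact index over document segments
    'castorini/unicoil-msmarco-passage')   # query encoder applied at search time
hits = searcher.search('example query text', k=10)
for rank, hit in enumerate(hits, start=1):
    print(f'{rank:3} {hit.docid:25} {hit.score:.4f}')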