May.la

Dense Passage Retrieval (DPR)

Links

  • Training
    • SBERT:
      • Doc: https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco
      • Impl: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/ms_marco/train_bi-encoder_mnrl.py
    • BEIR:
      • Doc: https://github.com/beir-cellar/beir/wiki/Examples-and-tutorials#beers-dense-retrieval-training
      • Impl: https://github.com/beir-cellar/beir/tree/main/examples/retrieval/training
    • Facebook-Research: https://github.com/facebookresearch/DPR
  • Metrics
    • Doc: https://docs.haystack.deepset.ai/docs/evaluation#metrics-retrieval
    • Impl: https://github.com/beir-cellar/beir/blob/main/beir/retrieval/custom_metrics.py
  • BM25
    • Impl: https://github.com/beir-cellar/beir/tree/main/beir/retrieval/search/lexical
  • Videos
    • Dense Retrieval - Knowledge Distillation (Sebastian Hofstätter): https://www.youtube.com/watch?v=EJ_7Gx6amt8
    • Crash Course IR - Evaluation: https://www.youtube.com/watch?v=EiDltQZ713I
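
Example: retrieval metrics

The Metrics links above cover rank-based retrieval metrics such as recall and mean reciprocal rank (MRR). As a rough illustration of what these measure, here is a minimal pure-Python sketch. It is not the Haystack or BEIR implementation. The function names, the `runs`/`qrels` dictionaries, and the document IDs are all hypothetical, and the sketch assumes each ranked list contains no duplicate IDs.

```python
from typing import Dict, List, Set


def recall_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    # Assumes ranked_ids has no duplicates, otherwise hits would be overcounted.
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """1 / rank of the first relevant document in the top-k, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def evaluate(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int
) -> Dict[str, float]:
    """Average both metrics over all queries that have relevance judgments."""
    query_ids = [qid for qid in runs if qid in qrels]
    if not query_ids:
        return {"recall": 0.0, "mrr": 0.0}
    n = len(query_ids)
    recall = sum(recall_at_k(runs[q], qrels[q], k) for q in query_ids) / n
    mrr = sum(reciprocal_rank(runs[q], qrels[q], k) for q in query_ids) / n
    return {"recall": recall, "mrr": mrr}


# Hypothetical toy data: ranked retriever output and gold relevant documents.
runs = {"q1": ["d3", "d1", "d7"], "q2": ["d2", "d9", "d4"]}
qrels = {"q1": {"d1"}, "q2": {"d4", "d5"}}

scores = evaluate(runs, qrels, k=3)
print(scores)
```

Note that libraries differ in how they define recall (for example, counting a query as a hit if any relevant document is retrieved versus counting the share of relevant documents retrieved), so numbers from this sketch are not directly comparable with a specific library's output.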
Last modified April 30, 2023: improve dpr (082b6e1)