Challenges in Web-Scale Information Retrieval: From Keywords to Embeddings


by datasets... June 27th, 2025

Explore the evolution of web-scale information retrieval, detailing the limitations of keyword matching, advancements in embedding-based retrieval, and emerging challenges for ANN algorithms on web data.

Table of Links

Abstract and 1 Introduction

2 Background and Related Work

2.1 Web Scale Information Retrieval

2.2 Existing Datasets

3 MS MARCO Web Search Dataset and 3.1 Document Preparation

3.2 Query Selection and Labeling

3.3 Dataset Analysis

3.4 New Challenges Raised by MS MARCO Web Search

4 Benchmark Results and 4.1 Environment Setup

4.2 Baseline Methods

4.3 Evaluation Metrics

4.4 Evaluation of Embedding Models and 4.5 Evaluation of ANN Algorithms

4.6 Evaluation of End-to-end Performance

5 Potential Biases and Limitations

6 Future Work and Conclusions, and References

2 BACKGROUND AND RELATED WORK

2.1 Web Scale Information Retrieval

In traditional information retrieval, user queries and documents ...


Copyright of this story belongs solely to hackernoon.com.