# CH11 Reranker

## Reranker

The Reranker is a key component of modern two-stage retrieval systems. Such systems are designed to search large datasets both efficiently and accurately, and the Reranker's main job is to re-rank the documents returned by the first stage, the Retriever.

### Summary

The Reranker works in the second stage of the search system and aims to improve the accuracy of the initial search results. After the Retriever quickly extracts relevant candidate documents from a large document collection, the Reranker analyzes those candidates in more depth to determine the final ranking.

### How it works

1. Receive the initial search results from the Retriever.
2. Pair the query with each candidate document.
3. Evaluate the relevance of each query-document pair using a complex model (mainly transformer-based).
4. Re-order the documents according to the evaluation results.
5. Output the final re-sorted result.

### Technical features

#### Architecture

* Mainly uses transformer-based models such as BERT and RoBERTa
* Adopts a cross-encoder structure

#### Input format

* Generally of the form `[CLS] Query [SEP] Document [SEP]`

#### Learning method

1. Pointwise: predict the relevance score of an individual query-document pair
2. Pairwise: compare the relative relevance of two documents
3. Listwise: optimize the entire ranking list at once

### Difference from Retriever

| Characteristic         | Retriever                                   | Reranker                        |
| ---------------------- | ------------------------------------------- | ------------------------------- |
| Purpose                | Quick search for related documents          | Accurate ranking                |
| Processing method      | Simple similarity calculation               | Complex semantic analysis       |
| Model structure        | Single encoder                              | Cross-encoder                   |
| Operational complexity | Low                                         | High                            |
| Priority               | Speed                                       | Accuracy                        |
| Input form             | Query and document processed individually   | Query-document pairs processed  |
| Output                 | Large set of candidate documents            | Precise ranking and scores      |
| Scalability            | High                                        | Limited                         |

### Pros and cons

#### Advantages

* Significantly improves search accuracy
* Can model complex semantic relationships
* Complements the limitations of the first-stage search

#### Disadvantages

* Higher computational cost
* Longer processing time
* Difficult to apply directly to large datasets
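### Example: sketch of a two-stage pipeline

The five steps listed under "How it works" can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the names `CORPUS`, `retrieve`, `build_pair`, `toy_pair_score`, and `rerank` are hypothetical, and `toy_pair_score` is only a stand-in for a transformer cross-encoder (it rewards shared words and shared word order, nothing more). The point is the shape of the flow: a cheap first stage that scores query and documents independently, then a costlier second stage that scores each query-document pair jointly.

```python
# Toy corpus (hypothetical data for illustration only).
CORPUS = [
    "results from a search can feed a reranker",
    "a reranker reorders search results by relevance",
    "cooking pasta requires boiling water",
    "search engines index documents",
]


def tokenize(text):
    """Lowercase whitespace tokenizer; real systems use subword tokenizers."""
    return text.lower().split()


def retrieve(query, corpus, k):
    """Stage 1: cheap score (count of shared words), documents handled independently."""
    q_terms = set(tokenize(query))
    scored = [(len(q_terms & set(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by index
    return [i for score, i in scored[:k] if score > 0]


def build_pair(query, doc):
    """Step 2: join query and document into the single input a cross-encoder sees."""
    return f"[CLS] {query} [SEP] {doc} [SEP]"


def toy_pair_score(pair_text):
    """Step 3 stand-in: NOT a real cross-encoder.

    Scores the pair jointly: fraction of query words found in the document,
    plus one point for each query bigram that also appears in the document.
    """
    body = pair_text[len("[CLS] "):-len(" [SEP]")]
    query, doc = body.split(" [SEP] ", 1)
    q, d = tokenize(query), tokenize(doc)
    unigram = len(set(q) & set(d)) / max(len(set(q)), 1)
    shared_bigrams = set(zip(q, q[1:])) & set(zip(d, d[1:]))
    return unigram + len(shared_bigrams)


def rerank(query, corpus, candidate_ids):
    """Steps 2-5: pair, score, re-order, and return (doc_id, score) tuples."""
    scored = [(toy_pair_score(build_pair(query, corpus[i])), i) for i in candidate_ids]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored]


query = "reranker search results"
candidates = retrieve(query, CORPUS, k=3)    # step 1: initial candidates
ranked = rerank(query, CORPUS, candidates)   # steps 2-5: final ranking
for doc_id, score in ranked:
    print(f"{score:.2f}  {CORPUS[doc_id]}")
```

In this toy run the first stage returns documents 0, 1, and 3, with 0 and 1 tied on word overlap. The second stage moves document 1 ahead of document 0 because it also matches the query's word order, which is the kind of pair-level signal the first stage cannot see. Note that the second stage only ever scores the `k` candidates, which is how the pipeline keeps the expensive model away from the full collection.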