# CH11 Reranker

## Reranker <a href="#reranker" id="reranker"></a>

The Reranker is a key component of modern two-stage retrieval systems. Such systems are designed to search large datasets both efficiently and accurately, and the Reranker's main job is to re-rank the documents returned by the first stage, the Retriever.

### Summary <a href="#id-1" id="id-1"></a>

The Reranker operates in the second stage of the search system and aims to improve the accuracy of the initial search results. After the Retriever quickly extracts relevant candidate documents from a large document collection, the Reranker analyzes those candidates in more detail to determine the final ranking.

### How it works <a href="#id-2" id="id-2"></a>

1. Receive the initial search results from the Retriever.
2. Pair the query with each candidate document.
3. Score the relevance of each query-document pair with a more expensive model (usually transformer-based).
4. Reorder the documents according to their scores.
5. Output the final re-ranked results.
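
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: `toy_score` is a hypothetical stand-in for a transformer-based model (it only measures word overlap), and the names `rerank` and `top_n` are assumptions made for this example.

```python
import re
from typing import Callable, List, Tuple


def toy_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a cross-encoder.

    Returns the fraction of query words that also occur in the document.
    A real Reranker would run a transformer model on the pair instead.
    """
    query_terms = set(re.findall(r"\w+", query.lower()))
    doc_terms = set(re.findall(r"\w+", document.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    # Step 2: pair the query with each candidate document.
    pairs = [(query, doc) for doc in candidates]
    # Step 3: score every query-document pair.
    scores = [score_fn(q, d) for q, d in pairs]
    # Step 4: reorder the documents by score, highest first.
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    # Step 5: return the final re-ranked results.
    return ranked[:top_n]


# Step 1: candidates as they might arrive from the Retriever (made-up examples).
candidates = [
    "Bananas are rich in potassium.",
    "A reranker reorders retrieved documents by relevance.",
    "The retriever returns candidate documents quickly.",
]
results = rerank("reranker documents relevance", candidates, top_n=2)
for document, score in results:
    print(f"{score:.2f}  {document}")
```

Passing the scorer as `score_fn` keeps the pipeline logic independent of the model, so the toy function could later be swapped for a real cross-encoder without changing `rerank`.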

### Technical features <a href="#id-3" id="id-3"></a>

#### Architecture <a href="#id-4" id="id-4"></a>

* Mainly uses transformer-based models such as BERT and RoBERTa
* Adopts a cross-encoder structure, in which the query and the document are fed to the model together

#### Input format <a href="#id-5" id="id-5"></a>

* Typically takes the form `[CLS] Query [SEP] Document [SEP]` (BERT-style special tokens)
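
To make the input format concrete, the sketch below assembles such a sequence as a plain string. It is only an illustration under stated assumptions: the function name `build_cross_encoder_input` and the `max_doc_tokens` parameter are hypothetical, and whitespace splitting stands in for the subword tokenization a real model would use. In practice the model's tokenizer normally inserts the special tokens itself.

```python
def build_cross_encoder_input(query: str, document: str, max_doc_tokens: int = 256) -> str:
    """Join a query and a document into one BERT-style input sequence.

    The document is truncated (by whitespace tokens, as a simplification)
    so that the combined sequence stays within a length budget.
    """
    doc_tokens = document.split()[:max_doc_tokens]
    return f"[CLS] {query} [SEP] {' '.join(doc_tokens)} [SEP]"


sequence = build_cross_encoder_input(
    "what is a reranker",
    "A reranker reorders retrieved documents",
)
print(sequence)
```

Because the query and the document share one sequence, the model can attend across both texts at once, which is what distinguishes a cross-encoder from encoding each text on its own.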

#### Learning method <a href="#id-6" id="id-6"></a>

1. Pointwise: predicts a relevance score for each individual query-document pair
2. Pairwise: compares the relative relevance of two documents for the same query
3. Listwise: optimizes the entire ranked list at once
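
The three approaches differ mainly in the loss they minimize. The sketch below shows one common choice for each, written in plain Python for readability: binary cross-entropy for pointwise, a RankNet-style logistic loss for pairwise, and softmax cross-entropy over the candidate list for listwise. These particular losses are illustrative assumptions; other formulations exist, and the function names are made up for this example.

```python
import math
from typing import List


def pointwise_loss(score: float, label: int) -> float:
    """Binary cross-entropy for a single query-document pair (label is 0 or 1)."""
    prob = 1.0 / (1.0 + math.exp(-score))
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def pairwise_loss(score_pos: float, score_neg: float) -> float:
    """RankNet-style logistic loss.

    Small when the relevant document scores higher than the irrelevant one.
    """
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))


def listwise_loss(scores: List[float], relevant_index: int) -> float:
    """Softmax cross-entropy over the whole candidate list.

    Assumes exactly one relevant document, located at relevant_index.
    """
    max_score = max(scores)  # subtract the maximum for numerical stability
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[relevant_index]


print(pointwise_loss(2.0, 1))            # one pair, relevant label
print(pairwise_loss(2.0, 0.5))           # relevant vs. irrelevant document
print(listwise_loss([2.0, 0.5, -1.0], 0))  # whole list, first document relevant
```

In all three cases the loss shrinks as the relevant document's score rises relative to the others; what changes is how much of the ranking each training example sees.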

### Difference from Retriever <a href="#retriever" id="retriever"></a>

| Characteristic           | Retriever                                   | Reranker                                    |
| ------------------------ | ------------------------------------------- | ------------------------------------------- |
| Purpose                  | Quickly find related documents              | Rank candidates accurately                  |
| Processing method        | Simple similarity calculation               | Complex semantic analysis                   |
| Model structure          | Bi-encoder                                  | Cross-encoder                               |
| Computational complexity | Low                                         | High                                        |
| Priority                 | Speed                                       | Accuracy                                    |
| Input form               | Query and document are processed separately | Query and document are processed as a pair  |
| Output                   | Large set of candidate documents            | Precise ranking and relevance scores        |
| Scalability              | High                                        | Limited                                     |
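
The Retriever column of the table can be illustrated with a small sketch. Because the query and the documents are processed separately, document vectors can be computed once in advance, and each search reduces to a cheap similarity calculation. The code is a toy under stated assumptions: `embed` is a hypothetical bag-of-words encoder standing in for a neural one, and the documents are made up.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'encoder'; a stand-in for a neural encoder."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


documents = [
    "the cat sat on the mat",
    "rerankers reorder search results",
    "dense retrievers encode queries and documents separately",
]

# Offline step: every document is encoded once, independently of any query.
doc_vectors = [embed(doc) for doc in documents]


def retrieve(query: str, top_k: int = 2) -> List[str]:
    # Online step: encode only the query, then compare against stored vectors.
    query_vector = embed(query)
    scored = [(cosine(query_vector, vec), idx) for idx, vec in enumerate(doc_vectors)]
    scored.sort(reverse=True)
    return [documents[idx] for _, idx in scored[:top_k]]


top_documents = retrieve("how do rerankers reorder results", top_k=1)
print(top_documents)
```

A Reranker cannot precompute anything in this way, since its score depends on the query and the document together. That is the source of the speed and scalability gap shown in the table.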

### Pros and cons <a href="#id-7" id="id-7"></a>

#### Advantages <a href="#id-8" id="id-8"></a>

* Significantly improves search accuracy
* Can model complex semantic relationships between the query and the document
* Compensates for the limitations of first-stage retrieval

#### Disadvantages <a href="#id-9" id="id-9"></a>

* Higher computational cost
* Longer processing time
* Difficult to apply directly to large datasets, because every query-document pair requires its own model pass
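
A rough count of model passes shows why the Reranker is applied only to a short candidate list. The numbers below are hypothetical and chosen purely for illustration; the function name is also an assumption.

```python
def cross_encoder_passes(num_queries: int, num_documents: int) -> int:
    """Each query-document pair needs one forward pass through the model."""
    return num_queries * num_documents


corpus_size = 1_000_000  # hypothetical corpus size
top_k = 100              # hypothetical number of candidates kept by the Retriever

direct = cross_encoder_passes(1, corpus_size)  # reranking the whole corpus
two_stage = cross_encoder_passes(1, top_k)     # reranking only the candidates
reduction = direct // two_stage

print(f"direct: {direct} passes, two-stage: {two_stage} passes")
print(f"the two-stage setup needs {reduction}x fewer passes per query")
```

The cost of the second stage therefore depends on the number of candidates rather than on the size of the corpus, which is the trade-off the two-stage design relies on.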

