To retrieve information relevant to a given query, large-scale web search engines train an encoder to embed the query and then pass the embedding to an approximate nearest neighbor search (ANNS) pipeline. Learned representations are typically rigid, high-dimensional vectors that are used as-is throughout the ANNS pipeline. Because their dimensionality must be high enough to accurately capture tail queries and data points, they can make retrieval computationally expensive.
Semantic search on learned representations is an integral part of retrieval pipelines. At a minimum, a semantic search approach trains a neural network to embed queries and a large number (N) of data points in a d-dimensional vector space. Existing semantic search algorithms use the same learned representation, unchanged, at every step of ANNS; these are rigid representations (RRs). That is, while ANNS indices expose a wide range of parameters for searching the design space to maximize the accuracy-compute trade-off, the dimensionality of the input representation is customarily assumed to be fixed.
Different stages of ANNS can use adaptive representations of varying capacities to achieve significantly better accuracy-compute trade-offs than rigid representations allow: stages of ANNS that can tolerate more approximate computation should use a lower-capacity representation of the same data point. The researchers propose AdANNS, a novel ANNS design framework that exploits the adaptability afforded by matryoshka representations.
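The key property being exploited is that any prefix of a matryoshka representation is itself a usable, lower-capacity embedding. The sketch below illustrates this with numpy; the 256-dimensional vector and the 32-dimensional prefix are illustrative stand-ins, not values from the paper.

```python
import numpy as np

# A matryoshka representation packs information coarse-to-fine, so that
# any prefix of the full vector is itself a valid (lower-capacity) embedding.
# The embedding here is a random stand-in; dims are illustrative.
rng = np.random.default_rng(0)
full = rng.standard_normal(256)

def prefix(vec, d):
    """Truncate to the first d dims and re-normalize to unit length."""
    p = vec[:d]
    return p / np.linalg.norm(p)

# A cheap ANNS stage (e.g., coarse cluster assignment) can use a 32-d
# prefix, while an expensive stage (e.g., re-ranking) uses all 256 dims.
coarse = prefix(full, 32)
fine = prefix(full, 256)
assert coarse.shape == (32,) and fine.shape == (256,)
```

A rigid representation, by contrast, would force every stage to pay for all 256 dimensions.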
Researchers demonstrate state-of-the-art accuracy-compute trade-offs with AdANNS-based versions of key ANNS building blocks such as search data structures (AdANNS-IVF) and quantization (AdANNS-OPQ). AdANNS-IVF, for instance, achieves up to 1.5% higher accuracy than rigid-representation-based IVF on ImageNet retrieval at the same compute budget, and matches its accuracy while running up to 90x faster on the same dataset. AdANNS-OPQ, a 32-byte variant of OPQ built on matryoshka representations, matches the accuracy of the 64-byte OPQ baseline on Natural Questions. They also demonstrate that the benefits of AdANNS carry over to state-of-the-art composite ANNS indices that combine search structures and quantization. Finally, they show that ANNS indices constructed without adaptation on matryoshka representations can still be searched in a compute-aware manner with AdANNS.
Visit https://github.com/RAIVNLab/AdANNS to get the source code.
- Improved accuracy-compute trade-offs are achieved by using AdANNS to develop new search data structures and quantization techniques.
- AdANNS-IVF can be deployed up to 90x faster than traditional IVF while increasing accuracy by up to 1.5%.
- AdANNS-OPQ matches the accuracy of the gold-standard baseline at a fraction of the cost.
- The AdANNS-powered search data structure (AdANNS-IVF) and quantization (AdANNS-OPQ) significantly outperform state-of-the-art alternatives regarding the accuracy-compute trade-off.
- In addition to enabling compute-aware elastic search during inference, AdANNS generalizes to state-of-the-art composite ANNS indices.
AdANNS – Adaptive ANNS
AdANNS is a framework for improving the accuracy-compute trade-off of semantic search components by exploiting the inherent flexibility of matryoshka representations. A typical ANNS pipeline has two main parts: (a) a search data structure that indexes and stores data points; and (b) a distance computation method that estimates the (approximate) distance between a query and a set of data points.
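To ground these two parts, here is a minimal IVF-style sketch: data points are bucketed by nearest centroid (the search structure), and at query time only a few probed buckets are scored exactly (the distance computation). This is an illustrative toy, not the paper's implementation; random data stands in for learned embeddings, and sampled points stand in for k-means centroids.

```python
import numpy as np

# Minimal IVF sketch. (a) search structure: points bucketed by nearest
# centroid; (b) distance computation: exact inner products on probed buckets.
rng = np.random.default_rng(1)
N, d, n_clusters, n_probe = 1000, 64, 16, 4
data = rng.standard_normal((N, d))
centroids = data[rng.choice(N, n_clusters, replace=False)]  # stand-in for k-means

# Index construction: assign each point to its best centroid (inner product).
assign = np.argmax(data @ centroids.T, axis=1)
buckets = {c: np.where(assign == c)[0] for c in range(n_clusters)}

def ivf_search(query, k=5):
    # Probe the n_probe closest buckets, then score their members exactly.
    probed = np.argsort(-(query @ centroids.T))[:n_probe]
    cand = np.concatenate([buckets[c] for c in probed])
    scores = data[cand] @ query
    return cand[np.argsort(-scores)[:k]]

hits = ivf_search(rng.standard_normal(d))
assert len(hits) == 5
```

In a rigid-representation pipeline, both the bucketing and the exact scoring above would use the same d-dimensional vectors; AdANNS relaxes exactly that constraint.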
In this study, the researchers demonstrate that AdANNS can improve both ANNS subsystems, and they quantify the improvements in terms of the accuracy-compute trade-off. Specifically, they introduce AdANNS-IVF, an AdANNS-based index structure analogous to the widely used IVF structure and the related ScaNN structure. They also introduce representation adaptivity into OPQ, the de facto standard quantization scheme, via AdANNS-OPQ. The researchers further demonstrate hybrid methods: AdANNS-IVFOPQ, an AdANNS variant of IVFOPQ, and AdANNS-DiskANN, a variant of DiskANN. Experiments show that AdANNS-IVF is substantially more accuracy-compute optimal than IVF indices built on RRs, and that AdANNS-OPQ is as accurate as OPQ on RRs while being significantly cheaper.
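The core move in AdANNS-IVF can be sketched by decoupling the representation capacity used for clustering from the capacity used for within-cluster scoring. In the toy below, a cheap d_c-dimensional matryoshka prefix builds and probes the coarse index, while a richer d_s-dimensional prefix scores the surviving candidates. All dimensions and data are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# AdANNS-IVF in miniature: cluster with a low-capacity prefix (d_c),
# score survivors with a higher-capacity prefix (d_s <= d_full).
rng = np.random.default_rng(2)
N, d_full, d_c, d_s, n_clusters, n_probe = 1000, 128, 16, 64, 16, 4
data = rng.standard_normal((N, d_full))

def prefix(x, d):
    """First d dims of a matryoshka vector, re-normalized."""
    p = x[..., :d]
    return p / np.linalg.norm(p, axis=-1, keepdims=True)

# Build the coarse index with the cheap d_c-dim prefix.
centroids = prefix(data[rng.choice(N, n_clusters, replace=False)], d_c)
assign = np.argmax(prefix(data, d_c) @ centroids.T, axis=1)

def adanns_ivf_search(query, k=5):
    # Probe buckets using the cheap prefix of the query...
    probed = np.argsort(-(prefix(query, d_c) @ centroids.T))[:n_probe]
    cand = np.where(np.isin(assign, probed))[0]
    # ...then score candidates with the richer d_s-dim prefix.
    scores = prefix(data[cand], d_s) @ prefix(query, d_s)
    return cand[np.argsort(-scores)[:k]]

res = adanns_ivf_search(rng.standard_normal(d_full))
assert len(res) == 5
```

Sweeping d_c and d_s independently is what opens up the larger accuracy-compute design space that rigid representations (where d_c = d_s = d_full) cannot reach.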
AdANNS is designed around search structures that can accommodate various large-scale use cases, each with its own resource requirements for training and inference. In practice, however, users often cannot exhaustively search this design space because of index construction and storage costs.
AdANNS was proposed by a group of researchers from the University of Washington, Google Research, and Harvard University to enhance the accuracy-compute trade-off by utilizing adaptive representations across the stages of ANNS pipelines. Whereas traditional ANNS building blocks employ the same inflexible representation throughout, AdANNS takes advantage of the inherent flexibility of matryoshka representations to construct superior building blocks. For the two primary ANNS building blocks, search data structures (AdANNS-IVF) and quantization (AdANNS-OPQ), AdANNS achieves SOTA accuracy-compute trade-offs. Finally, by combining AdANNS-based building blocks, improved real-world composite ANNS indices can be constructed, allowing for compute-aware elastic search and reducing costs by as much as 8x compared to strong baselines.