This Microsoft AI Paper Introduces an Integrated DiskANN System: Cost-Effective, Low-Latency Vector Search Using Azure Cosmos DB

by Brenden Burgess


The ability to search over high-dimensional vector representations has become a basic requirement for modern data systems. These vector representations, generated by deep learning models, encapsulate the semantic and contextual meaning of data. This allows systems to retrieve results based not on exact matches but on relevance and similarity. Such semantic capabilities are essential in large-scale applications such as web search, AI assistants, and content recommendation, where users and agents need to access information by meaning rather than through structured queries alone.
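To make the idea concrete, here is a minimal sketch of similarity-based retrieval: documents and queries are compared through their embedding vectors rather than through exact term matches. The four-dimensional vectors below are made up purely for illustration; production embedding models emit hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embedding vectors, independent of their magnitudes."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional embeddings; real models emit hundreds of dimensions (e.g. 768).
query = np.array([0.9, 0.1, 0.0, 0.3])
doc_a = np.array([0.8, 0.2, 0.1, 0.4])   # semantically close to the query
doc_b = np.array([0.0, 0.9, 0.8, 0.0])   # unrelated content

# Rank documents by relevance rather than by exact keyword match.
for name, doc in [("doc_a", doc_a), ("doc_b", doc_b)]:
    print(name, round(cosine_similarity(query, doc), 3))
```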

One of the main problems in vector-based retrieval is the high cost and complexity of operating separate systems for transactional data and vector indices. Traditionally, vector databases are optimized solely for semantic search performance, but they force users to duplicate data out of their primary databases, introducing latency, storage overhead, and the risk of inconsistency. Developers are also left responsible for keeping two distinct systems synchronized, which can limit scalability, flexibility, and data integrity when updates arrive quickly.

Some popular vector search tools, such as Zilliz and Pinecone, operate as standalone services that offer efficient search. However, these platforms rely on segment-based or fully in-memory architectures. They often require repeated index rebuilds and can suffer from latency spikes and heavy memory use. This makes them inefficient in scenarios involving large-scale or constantly changing data. The problem worsens when it comes to updates, filtered queries, or multi-tenant management, because these systems lack deep integration with transactional operations and structured indexing.

Microsoft researchers have introduced an approach that integrates vector indexing directly into the NoSQL engine of Azure Cosmos DB. They used DiskANN, a graph-based indexing library already known for its performance in large-scale semantic search, and redesigned it to operate inside the Cosmos DB infrastructure. This design eliminates the need for a separate vector database. Cosmos DB's built-in capabilities, such as high availability, elasticity, multi-tenancy, and automatic partitioning, are fully reused, making the solution both cost-effective and scalable. Each collection maintains a single vector index per partition, which stays synchronized with the main document data through the existing Bw-Tree index structure.
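As a rough illustration of what this integration looks like from the application side, the sketch below creates a container whose indexing policy declares a DiskANN vector index, using the public Azure Cosmos DB Python SDK. The account endpoint, key, database and container names, and the /embedding and /tenantId paths are placeholder assumptions, and the exact policy fields may differ across SDK versions.

```python
from azure.cosmos import CosmosClient, PartitionKey

# Placeholder credentials; a real account endpoint and key are required.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
database = client.create_database_if_not_exists("vectordemo")

# Declare which document path holds the embedding and how vectors are compared.
vector_embedding_policy = {
    "vectorEmbeddings": [{
        "path": "/embedding",
        "dataType": "float32",
        "dimensions": 768,          # matches the 768-dimensional setup in the paper
        "distanceFunction": "cosine",
    }]
}

# Ask Cosmos DB to maintain a DiskANN index over that path.
indexing_policy = {
    "vectorIndexes": [{"path": "/embedding", "type": "diskANN"}]
}

container = database.create_container_if_not_exists(
    id="docs",
    partition_key=PartitionKey(path="/tenantId"),
    indexing_policy=indexing_policy,
    vector_embedding_policy=vector_embedding_policy,
)
```

Because the vector index lives inside the container's own indexing policy, documents and their embeddings are written, partitioned, and replicated together rather than copied into a second system.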

The rewritten DiskANN library uses Rust and introduces asynchronous operations to ensure compatibility with database environments. It lets the database fetch or update only the vector components it needs, such as quantized versions or neighbor lists, reducing memory use. Inserts and vector queries are handled with a hybrid approach, with most of the computation happening in the quantized space. This design supports paginated search and filter-aware traversal, meaning queries can efficiently handle complex predicates and scale to billions of vectors. The methodology also includes a sharded indexing mode, allowing separate indices keyed on defined attributes such as tenant ID or time period.
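A hedged sketch of such a filtered vector query is shown below, using Cosmos DB's public VectorDistance query function with a parameterized predicate. The container layout and the tenantId filter are assumptions carried over from the previous sketch, not details from the paper.

```python
from azure.cosmos import CosmosClient

# Reuses the placeholder account and container from the previous sketch.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("vectordemo").get_container_client("docs")

query_vector = [0.0] * 768  # stand-in for a real 768-dimensional query embedding

# Nearest neighbors restricted by a predicate; with filter-aware traversal,
# the filter is applied during the index walk rather than as post-processing.
results = container.query_items(
    query="""
        SELECT TOP 10 c.id, VectorDistance(c.embedding, @q) AS score
        FROM c
        WHERE c.tenantId = @tenant
        ORDER BY VectorDistance(c.embedding, @q)
    """,
    parameters=[
        {"name": "@q", "value": query_vector},
        {"name": "@tenant", "value": "tenant-42"},
    ],
    partition_key="tenant-42",
)

for item in results:
    print(item["id"], item["score"])
```

The SDK pages results with continuation tokens under the hood, which lines up with the paginated search the paper describes, though this is Cosmos DB's public query surface rather than the paper's internal API.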

In experiments, the system demonstrated strong performance. For a dataset of 10 million 768-dimensional vectors, query latency stayed under 20 milliseconds (P50) and the system achieved a recall@10 of 94.64%. Compared with enterprise-level offerings, Azure Cosmos DB delivered query costs 15× lower than Zilliz and 41× lower than Pinecone. Cost-effectiveness held even as the index grew from 100,000 to 10 million vectors, with less than a 2× increase in latency or request units (RUs). During ingestion, Cosmos DB charged about $162.5 for 10 million vector inserts, lower than Pinecone and DataStax, though higher than Zilliz. In addition, recall remained stable even under heavy update cycles, with in-place deletions considerably improving accuracy under shifting data distributions.

The study presents a compelling approach to unifying vector search with transactional databases. Microsoft's research team has designed a system that simplifies operations and achieves considerable gains in cost, latency, and scalability. By embedding vector search in Cosmos DB, they offer a practical model for integrating semantic capabilities directly into operational workloads.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don't forget to join our 90K+ ML SubReddit.


Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.
