Zvec is Alibaba's open-source in-process vector database: a lightweight, low-latency local vector search engine that needs no separate service. It is suited to embedding in applications, agents, RAG pipelines, and edge, desktop, or private-deployment systems that need vector retrieval, full-text retrieval, hybrid retrieval, and structured filtering. For pre-sales, the keywords are: no standalone service, quick to adopt, local deployment, low latency, hybrid search, multi-language SDKs, and Apache 2.0.

1. Project Overview

| Item | Information |
| --- | --- |
| GitHub | alibaba/zvec |
| Official website / documentation | zvec.org |
| Open-source license | Apache License 2.0 |
| Primary language | C |
| Latest release | v0.5.1, released 2026-06-24 |
| GitHub popularity | About 12.5k stars, 748 forks (as of 2026-06-27) |
| Official positioning | Lightweight, lightning-fast, in-process vector database |
| Official SDKs | Python, Node.js, Go, Rust, Dart/Flutter |
| Supported platforms | Linux x86_64/ARM64, macOS ARM64, Windows x86_64 |

In one sentence:

Zvec is like a small, high-performance vector database kernel that can be embedded directly into business applications. It performs local vector retrieval, text retrieval, and hybrid search without separately deploying a service such as Milvus, Elasticsearch, or Qdrant.

2. What Does It Mainly Do?

2.1 Vector Similarity Retrieval

Zvec can store embeddings of objects such as documents, images, products, knowledge fragments, and user behaviors, and then return the Top-K most similar results for a query vector.

Typical capabilities:

-Dense vectors: for example, large-model embeddings and image embeddings.

-Sparse vectors: SPLADE-style or BM25-style sparse semantic representations.

-Single-vector and multi-vector queries.

-Index types such as HNSW, HNSW-RaBitQ, DiskANN, IVF, and Flat.

-Similarity metrics such as 'COSINE', 'L2', and 'IP'.
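
To make the three metric names concrete, here is a minimal pure-Python sketch of exhaustive (Flat-style) Top-K search. It does not call Zvec; the helper names (`dot`, `cosine`, `l2`, `top_k`) and the toy vectors are illustrative assumptions. It shows that COSINE and IP rank by descending score, L2 ranks by ascending distance, and the metrics can disagree on the same data.

```python
import math


def dot(a, b):
    """Inner product (IP) of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: inner product of the normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def l2(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, docs, k, metric="COSINE"):
    """Score every document against the query and return the best k IDs."""
    if metric == "L2":
        # Distances sort ascending.
        scored = sorted((l2(query, vec), doc_id) for doc_id, vec in docs.items())
    else:
        # Similarities sort descending.
        fn = cosine if metric == "COSINE" else dot
        scored = sorted(
            ((fn(query, vec), doc_id) for doc_id, vec in docs.items()),
            reverse=True,
        )
    return [doc_id for _, doc_id in scored[:k]]


# Toy data: "c" is a long vector pointing in a different direction.
docs = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 6.0]}
query = [1.0, 0.2]

by_cosine = top_k(query, docs, 2, "COSINE")
by_ip = top_k(query, docs, 2, "IP")
by_l2 = top_k(query, docs, 2, "L2")
```

With these toy vectors, IP places "c" first because inner product grows with vector magnitude, while COSINE and L2 both place "a" first. This is one reason the metric should match how the embedding model was trained.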

Suggested pre-sales statement:

  • **If the customer already has an embedding model, Zvec can serve as a lightweight vector retrieval layer that quickly adds vector recall to existing applications.**

2.2 Full-Text Search (FTS)

Starting with 'v0.5.0', Zvec supports native full-text search. It can build full-text indexes on string fields and return results ranked by BM25 relevance.

Capabilities:

-Natural language queries.

-Exact phrase matching.

-Boolean expressions, such as requiring or excluding a word.

-Chinese word segmentation with Jieba, suited to Chinese knowledge bases, articles, product titles, and similar text.

-A Collection can be full-text only; vector fields are not required.
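
For readers unfamiliar with BM25 ranking, the following is a minimal pure-Python sketch of the standard Okapi BM25 formula. It is not Zvec's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the pre-tokenized toy documents are assumptions made for illustration (in practice a tokenizer such as Jieba would produce the tokens).

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


docs = {
    "d1": ["vector", "search", "engine"],
    "d2": ["full", "text", "search", "with", "bm25"],
    "d3": ["cooking", "recipes"],
}
scores = bm25_scores(["vector", "search"], docs)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here "d1" matches both query terms and ranks first, "d2" matches only "search", and "d3" matches nothing and scores zero.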

Pre-sales value:

  • **Many customers' searches are neither purely semantic nor purely keyword-based, but both. Zvec can now cover keyword search and vector search in one local database, reducing dependence on external search engines.**

2.3 Hybrid Retrieval

Zvec supports the combination of vector retrieval, full-text retrieval, and scalar filtering. Common forms include:

-Vector + structured filtering: for example, "semantically similar documents, but only those published after 2024".

-Dense vector + sparse vector: combines semantic recall with keyword weights.

-Multi-route query + reranker: use 'WeightedReRanker' or 'RrfReRanker' to fuse results from multiple vector fields.

-FTS + application-layer reranking: when both keywords and semantics matter, recall each separately and then merge or rerank the results.

Note:

The official documentation indicates that full-text search and vector search are mutually exclusive within a single 'Query' route. To merge keyword and vector results, use multiple query routes with reranking, or merge at the application layer.
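
One common way to merge a keyword result list and a vector result list at the application layer is Reciprocal Rank Fusion (RRF). The sketch below is plain Python and does not call Zvec; the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the document IDs are assumptions for illustration, and it is not a description of how 'RrfReRanker' is implemented internally.

```python
def rrf_fuse(ranked_lists, k=60, topn=10):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in,
    so only rank positions matter and raw scores need not be comparable.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:topn]


# Hypothetical results from two separate queries against the same data.
keyword_hits = ["doc_3", "doc_1", "doc_7"]  # e.g. from a full-text query
vector_hits = ["doc_1", "doc_5", "doc_3"]   # e.g. from a vector query

fused = rrf_fuse([keyword_hits, vector_hits], topn=3)
```

Documents that appear in both lists ("doc_1" and "doc_3") rise to the top, which is usually the behavior wanted when keywords and semantics both matter.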

2.4 Local Persistence and In-Process Operation

Zvec is an in-process database and does not require starting a separate server.

Features:

-A Collection is persisted to a directory on disk.

-A write-ahead log (WAL) ensures durability.

-Multiple processes can read the same Collection at the same time.

-Writes are exclusive to a single process.

-A Collection directory can be moved and reopened as long as the path is correct.

Pre-sales value:

Suitable for private, offline, and on-premises deployments, edge devices, desktop tools, and notebook or CLI tools, without requiring customers to maintain an additional database service cluster.

3. Applicable Scenarios

3.1 RAG Knowledge Base Retrieval

This is the most intuitive scenario. The knowledge base documents are split into chunks, each chunk is embedded, and the embeddings are stored in Zvec. When a user asks a question, vector search recalls the relevant chunks and passes them to the large model to generate the answer.
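
The chunking step of that pipeline can be sketched in a few lines of plain Python. This is only one simple strategy (fixed-size character windows with overlap); the function name `chunk_text` and the sizes are illustrative assumptions, and real pipelines often split on sentence or heading boundaries instead.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap.

    The overlap keeps a sentence that straddles a boundary visible
    in both neighboring chunks.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


# A 500-character stand-in for a document.
text = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(text, size=200, overlap=50)
```

Each chunk would then be passed to the embedding model and written to the Collection together with scalar fields such as source or document type.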

Applicable customers:

-Q&A over internal enterprise knowledge bases.

-After-sales knowledge base, product manual, and system documentation retrieval.

-Privately deployed intelligent customer service or intelligent assistants.

-Customers who require simple deployment and data localization.

Why it fits:

-Runs locally with no additional services.

-Supports structured filtering, such as by department, time, document type, or permission label.

-Supports combining Chinese full-text search with vector search.

3.2 Agent Long-Term Memory and Local Memory

Agents need to store memories such as user preferences, historical tasks, tool execution results, and document fragments. Zvec can serve as a local memory index.

Applicable customers:

-AI agent platforms.

-Personal knowledge management tools.

-Enterprise copilots.

-Local desktop AI assistants.

Why it fits:

-Runs in-process, so the call chain is short and responses are fast.

-Needs no separate database service, so it is easy to ship with the application.

-Supports access from multiple languages, such as Node.js, Python, Go, and Rust.

3.3 Similar-Item Recommendation for Products, Content, and Images

After objects such as products, images, short videos, and articles are vectorized, features such as "similar products", "similar images", and "related recommendations" can be built on similarity retrieval.

Applicable customers:

-E-commerce search and recommendation.

-Related recommendations on content platforms.

-Image material libraries and design asset libraries.

-Industry knowledge content recommendation.

Why it fits:

-HNSW suits low-latency online recall.

-HNSW-RaBitQ can reduce memory usage on x86_64.

-DiskANN can be used in very-large-scale scenarios where the memory budget is limited and higher latency is acceptable.

3.4 Enterprise Search: Keyword + Semantic

Enterprise search often has to meet several needs at once:

-Exact matching when the user enters keywords.

-Semantic recall when the user describes a need in natural language.

-Filtering by business fields, such as time, source, permission, category, and status.

Zvec's FTS, vector retrieval, and filtering capabilities can cover such lightweight search scenarios.

Applicable customers:

-Enterprise document search.

-Rules and regulations retrieval.

-Contract, bidding, and project information search.

-Local knowledge base search tools.

3.5 Edge, Desktop, and Mobile AI Applications

Zvec supports Flutter/Dart and runs in-process, which makes it suitable for embedding in on-device applications.

Applicable customers:

-Offline AI assistants.

-Local search on mobile devices.

-Data retrieval on industrial edge devices.

-Desktop knowledge base software.

Pre-sales highlights:

If the customer does not want to upload data to the cloud, or the network is unstable, the value of Zvec's local embedded database is easier to explain.

4. Less Suitable Scenarios

4.1 Full Distributed Database Cluster Capabilities Are Required

Zvec is not positioned to replace service-based or distributed databases such as Milvus, Elasticsearch, and Qdrant. It is more like an embedded database kernel.

If the customer requires:

-Multi-node distributed writes.

-Built-in high availability.

-Horizontal scaling.

-Multi-tenant service governance.

-An operations console and cluster monitoring.

Then Zvec may not be the first choice, unless the upper-layer product supplies these capabilities itself.

4.2 High Concurrency Multiple Write Scenarios

The official documentation states that multiple processes can read concurrently, but writes are exclusive to a single process.

Therefore, if a customer needs multiple service instances to write to the same vector store at the same time, the architecture must be evaluated carefully. The usual options are to coordinate writes at the upper layer or to choose a service-based vector database.
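
As an illustration of upper-layer write coordination, the sketch below uses an exclusive lock file so that only one process at a time enters the write path. It is plain Python and does not call Zvec; the lock file name `writer.lock` and the helper `single_writer` are hypothetical, and this is a sketch of the idea rather than a production-grade lock (a crashed process would leave a stale lock file that must be cleaned up).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def single_writer(lock_path):
    """Hold an exclusive lock file for the duration of a write batch."""
    # O_EXCL makes creation fail if the lock file already exists,
    # i.e. if another writer currently holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        # Record the owner PID to help diagnose stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "writer.lock")

with single_writer(lock_path):
    # Inside this block the application would open the Collection
    # and perform its inserts. A second writer is rejected:
    try:
        with single_writer(lock_path):
            second_writer_entered = True
    except FileExistsError:
        second_writer_entered = False

lock_released = not os.path.exists(lock_path)
```

A common alternative design is to route all writes through one dedicated writer process or queue and let every other instance open the Collection for reading only.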

4.3 Cross-Collection Query

Zvec does not support Join, Union, or retrieval across multiple Collections. Collection boundaries must be designed in advance during data modeling.

5. Core Capabilities

| Capability | Description | Pre-sales value |
| --- | --- | --- |
| In-process operation | No separate server required | Lower deployment and O&M complexity |
| Vector search | Dense and sparse vectors | Supports RAG, recommendation, and similarity search |
| Multiple index types | Flat, HNSW, HNSW-RaBitQ, DiskANN, IVF | Trade-offs based on scale, latency, and memory |
| FTS full-text search | BM25, phrase, Boolean, Jieba | Better fit for Chinese search and keyword recall |
| Hybrid search | Vector + filtering + multi-vector reranking | Closer to enterprise search and knowledge base scenarios |
| WAL persistence | Data is unlikely to be lost after a crash or power failure | Higher production reliability |
| Dynamic schema | Fields and vectors can be added or removed | Low cost of business iteration |
| Multi-language SDKs | Python, Node.js, Go, Rust, Dart/Flutter | Easier to embed in existing technology stacks |
| Apache 2.0 | Business-friendly license | Easy for enterprise customization and integration |
| Zvec Studio | Visual browsing and debugging | Easier demos and troubleshooting |

6. How to Choose the Index Type

| Index | Suitable scenario | Advantages | Caveats |
| --- | --- | --- | --- |
| Flat | Small scale, prototypes, accuracy benchmarks | 100% recall, zero configuration | Linear scan; slow on large data volumes |
| HNSW | General-purpose online production retrieval | Low latency, high recall, mature | High memory footprint |
| HNSW-RaBitQ | Large-scale, high-dimensional vectors on x86_64 | HNSW quality with lower memory | x86_64 only; requires AVX2/AVX-512; ARM not supported |
| DiskANN | Very large scale with limited memory | Index body resides on disk, significantly reducing memory | Linux + libaio; QPS and latency are worse than in-memory indexes |
| IVF | Data with a clustering structure; requires tuning | Better memory efficiency; suitable for large scale | Parameter-sensitive; clustering adds build cost |
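
When discussing memory with a customer, a quick lower bound helps frame the choice between in-memory and disk-based indexes. The sketch below only computes the size of the raw FP32 vectors (4 bytes per dimension); it deliberately excludes index structures such as graph links, whose overhead depends on the index type and parameters and is not estimated here. The function name `raw_vector_bytes` and the example sizes are illustrative assumptions.

```python
def raw_vector_bytes(num_vectors, dimension, bytes_per_value=4):
    """Bytes needed to hold the raw vectors alone (FP32 by default)."""
    return num_vectors * dimension * bytes_per_value


# Rough sizing for 768-dimensional FP32 embeddings at three scales.
for n in (1_000_000, 10_000_000, 100_000_000):
    gib = raw_vector_bytes(n, 768) / 2**30
    print(f"{n:>11,} vectors x 768 dims: {gib:,.1f} GiB before index overhead")
```

If the raw vectors alone approach the memory budget, that is a signal to evaluate a compressed or disk-based index and to confirm the result with measurements on real data.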

Pre-sales advice:

-Early demos and PoCs: start with Flat or HNSW, which are easy to explain and debug.

-Online RAG or knowledge base: prefer HNSW.

-Large data volume with tight memory: evaluate HNSW-RaBitQ or DiskANN.

-Billions of vectors where higher latency is acceptable: evaluate DiskANN.

-Sensitive to result correctness: first establish an accuracy baseline with Flat, then switch to an ANN index and compare recall.
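
Comparing an ANN index against a Flat baseline comes down to computing recall@k: the fraction of the exact Top-K that the approximate index also returns. The sketch below is plain Python; the function name `recall_at_k` and the ID lists are illustrative, and in a real PoC the two lists would come from querying a Flat-indexed and an ANN-indexed Collection with the same query vectors.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k results that the ANN search also returned."""
    exact = set(exact_ids[:k])
    if not exact:
        return 0.0
    return len(exact & set(approx_ids[:k])) / len(exact)


# Hypothetical results for one query.
exact_top5 = ["d1", "d2", "d3", "d4", "d5"]    # from the Flat baseline
approx_top5 = ["d1", "d3", "d2", "d9", "d5"]   # from an ANN index

recall = recall_at_k(exact_top5, approx_top5, k=5)
```

Order within the Top-K does not affect recall; averaging this value over a few hundred representative queries gives a number that can be tracked while tuning index parameters.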

7. How to Use

7.1 Installation

Python:

pip install zvec

Node.js:

npm install @zvec/zvec

Flutter/Dart:

flutter pub add zvec

Go and Rust also have official SDKs for system-level or high-performance service integration.

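7.2 Create a Collection and Write Vectors

import zvec

# Schema: one scalar field for filtering plus one dense vector field.
schema = zvec.CollectionSchema(
    name="knowledge_base",
    fields=[
        # INT32 scalar field with an inverted index, used later in range filters.
        zvec.FieldSchema(
            name="publish_year",
            data_type=zvec.DataType.INT32,
            index_param=zvec.InvertIndexParam(enable_range_optimization=True),
        ),
    ],
    vectors=[
        # 768-dimensional FP32 vector indexed with HNSW under the cosine metric.
        zvec.VectorSchema(
            name="embedding",
            data_type=zvec.DataType.VECTOR_FP32,
            dimension=768,
            index_param=zvec.HnswIndexParam(metric_type=zvec.MetricType.COSINE),
        ),
    ],
)

# Create the Collection in a local directory and open it.
collection = zvec.create_and_open(
    path="./knowledge_base_zvec",
    schema=schema,
)

# Insert one document: an ID, its vector, and its scalar field value.
collection.insert([
    zvec.Doc(
        id="doc_1",
        vectors={"embedding": [0.1] * 768},
        fields={"publish_year": 2025},
    )
])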

collection.optimize()

7.3 Vector Retrieval

result = collection.query(
    queries=zvec.Query(
        field_name="embedding",
        vector=[0.3] * 768,
    ),
    topk=10,
)

7.4 Vector Retrieval with Conditional Filtering

result = collection.query(
    queries=zvec.Query(
        field_name="embedding",
        vector=[0.3] * 768,
    ),
    filter="publish_year > 2023",
    topk=10,
)

7.5 Create a Full-Text Search Field

import zvec

schema = zvec.CollectionSchema(
    name="article_collection",
    fields=[
        zvec.FieldSchema(
            name="category",
            data_type=zvec.DataType.STRING,
            nullable=False,
        ),
        zvec.FieldSchema(
            name="content",
            data_type=zvec.DataType.STRING,
            nullable=False,
            index_param=zvec.FtsIndexParam(
                tokenizer_name="jieba",
            ),
        ),
    ],
)

7.6 Full-Text Search

from zvec.model.param.query import Fts, Query

result = collection.query(
    queries=Query(
        field_name="content",
        fts=Fts(match_string="机器学习"),
    ),
    topk=5,
)

Advanced Query:

result = collection.query(
    queries=Query(
        field_name="content",
        fts=Fts(query_string='+学习 -神经网络 "向量搜索"'),
    ),
    topk=5,
)

7.7 Multi-Vector Retrieval and Reranking

result = collection.query(
    topk=5,
    queries=[
        zvec.Query(field_name="dense_embedding", vector=[0.1] * 768),
        zvec.Query(field_name="sparse_embedding", vector={1: 0.1, 37: 0.43}),
    ],
    reranker=zvec.WeightedReRanker(
        topn=3,
        metric=zvec.MetricType.IP,
        weights={
            "dense_embedding": 1.2,
            "sparse_embedding": 1.0,
        },
    ),
)
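
To give an intuition for weighted fusion, the sketch below shows one plausible scheme in plain Python: min-max normalize the scores of each route, multiply by the route weight, and sum per document. This is an assumption made for illustration and is not a description of how 'WeightedReRanker' works internally; the function name `weighted_fuse` and the scores are hypothetical.

```python
def weighted_fuse(route_scores, weights, topn):
    """Fuse per-route scores: min-max normalize, apply weights, sum per document."""
    fused = {}
    for route, scores in route_scores.items():
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
        for doc_id, score in scores.items():
            normalized = (score - lo) / span
            fused[doc_id] = fused.get(doc_id, 0.0) + weights[route] * normalized
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))[:topn]


# Hypothetical raw scores from the two query routes; their scales differ.
route_scores = {
    "dense_embedding": {"doc_1": 0.9, "doc_2": 0.5, "doc_3": 0.1},
    "sparse_embedding": {"doc_2": 4.0, "doc_3": 3.0, "doc_4": 1.0},
}
weights = {"dense_embedding": 1.2, "sparse_embedding": 1.0}

top3 = weighted_fuse(route_scores, weights, topn=3)
```

In this toy case "doc_2" wins because it scores reasonably on both routes, even though "doc_1" is the single best dense match. Normalizing first matters because dense and sparse scores are usually on different scales.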

8. Pre-Sales Talking Points

8.1 Elevator Pitch

Zvec is an open-source embedded vector database for adding vector retrieval and full-text retrieval directly to existing applications. It needs no separately deployed service and supports local persistence, multi-language SDKs, indexes such as HNSW and DiskANN, and Chinese full-text retrieval. It is especially suited to RAG, agent memory, enterprise knowledge bases, on-device AI, and lightweight private-deployment scenarios.

8.2 Customer Value Points

-Simple deployment: a single SDK runs without operating a database service.

-Private and secure: data is stored locally, which suits customers sensitive to data leaving their domain.

-Controllable cost: no additional servers or clusters are required, which suits lightweight services and edge devices.

-Complete retrieval capabilities: vector, full-text, structured filtering, and multi-vector fusion.

-Chinese-friendly: FTS supports Jieba word segmentation.

-Open technology: Apache 2.0, easy to integrate, customize, and productize.

8.3 Differences from Traditional Vector Databases

| Dimension | Zvec | Service-based / distributed vector database |
| --- | --- | --- |
| Deployment form | In-process SDK | Standalone service or cluster |
| O&M complexity | Low | Medium to high |
| Scale-out | Depends on upper-layer application design | Usually built in |
| High availability | Depends on the upper-layer system | Usually has a supporting solution |
| Local / on-device | Good fit | Usually heavier |
| Suitable scenarios | Embedded, local, lightweight private deployment | Large-scale online services, multi-tenant platforms |

8.4 FAQ

Q: Can it replace Milvus or Elasticsearch?

A: Not completely. Zvec is more like an embedded vector retrieval kernel, suited to local, lightweight, in-app integration. If the customer needs distributed clusters, high availability, and multi-tenant governance, a service-based database still needs to be evaluated.

Q: Does it support Chinese search?

A: Yes. FTS can use Jieba word segmentation, which suits Chinese and mixed Chinese-English text.

Q: Will data be lost?

A: Zvec supports a write-ahead log (WAL), and the official documentation states that durability is guaranteed in case of a process crash or power failure. Even so, key production scenarios should still add backups and file-system-level disaster recovery.

Q: Does it support multiple processes?

A: Multiple processes can read the same Collection at the same time. Writes are exclusive to a single process.

Q: Can it do hybrid search?

A: Yes. It supports combinations of vector search, full-text search, scalar filtering, and multi-vector reranking. Note that FTS and vector search are mutually exclusive within a single query route, so complex fusion is usually achieved through multi-route recall with a reranker, or by merging at the application layer.

Q: How much data can it handle?

A: The official materials emphasize support for large-scale vector retrieval and cite Cohere 1M/10M benchmark tests. Whether it meets a customer's SLA depends on dimension, index type, hardware, recall target, concurrency, and filtering ratio. For a pre-sales PoC, stress-test with the customer's real data.

9. PoC Recommendations

9.1 PoC Goals

The PoC should prove more than that the system "can run". It should prove that:

-Data can be imported.

-Search results are acceptable to the business.

-Latency and QPS reach their targets.

-Memory and disk costs are acceptable.

-Integration with existing applications is simple.

9.2 PoC Data

Prioritize real customer data:

-Document knowledge base: tiered tests at 10,000, 100,000, and 1 million chunks.

-Product catalog: product titles, categories, attributes, and image vectors.

-Customer service knowledge base: FAQs, work orders, and conversation summaries.

-Permission fields: department, role, security level, and tenant ID.

9.3 PoC Metrics

| Metric | What to observe |
| --- | --- |
| Recall / hit quality | Whether the Top-5/Top-10 results contain the answer the business considers correct |
| P95/P99 latency | Whether the online Q&A or search SLA is met |
| QPS | Throughput under concurrency |
| Build time | Time for full import and index creation |
| Memory footprint | HNSW vs. RaBitQ vs. DiskANN |
| Disk footprint | Collection directory size |
| Write mode | Whether single-process writes meet business requirements |
| Filtering performance | Permission, time, and category filtering |
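
The latency and throughput rows can be measured with a small harness like the one below. It is a minimal single-threaded sketch in plain Python: `benchmark` and `percentile` are illustrative names, and `search_fn` is a placeholder for whatever callable runs one query (in a real PoC it would wrap a call such as the 'collection.query' examples in section 7). Concurrent load, warm-up, and cache effects are not covered and would need a fuller test rig.

```python
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (pct is an integer)."""
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def benchmark(search_fn, queries):
    """Run each query once and report P95/P99 latency and sequential QPS."""
    latencies = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        search_fn(query)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "p95_ms": percentile(latencies, 95) * 1000,
        "p99_ms": percentile(latencies, 99) * 1000,
        "qps": len(queries) / elapsed,
    }


# Stand-in workload so the harness runs on its own; replace with a real query.
stats = benchmark(lambda q: sum(x * x for x in q), [[0.3] * 768] * 200)
```

Reporting P95 and P99 rather than the average matters because interactive search SLAs are usually broken by the slow tail, not by the typical query.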

9.4 Demo Route

  1. Prepare 1,000 to 10,000 Chinese document fragments.
  2. Generate 768- or 1024-dimensional vectors with the embedding model.
  3. Build a Collection: 'content' is an FTS field, 'embedding' uses HNSW, and 'source', 'category', and 'year' are scalar fields.
  4. Demonstrate three queries:

-Pure semantic Q&A recall.

-Keyword full-text search.

-Vector search + permission/time/category filtering.

  5. Show Zvec Studio or print the results to show that the local file directory is the database.
  6. Run a small-scale stress test to show latency, QPS, disk, and memory.

10. Risks and Considerations

-The project is relatively new: the repository was created in 2025-12, development is fast, and releases are frequent. Production use should track API stability and release notes.

-DiskANN is platform-limited: currently Linux only, and it depends on 'libaio'.

-HNSW-RaBitQ is hardware-limited: x86_64 only, AVX2 or AVX-512 is required, and ARM is not supported.

-Multi-writer setups need architecture design: writes are exclusive to a single process, so uncoordinated multi-instance concurrent writes are not suitable.

-Cross-Collection queries are not supported: data modeling should be planned in advance.

-Benchmarks must be re-run with the customer's real data: official results are based on specific hardware and datasets and cannot be promised as-is for the customer's production environment.

11. Customer Profiles Suited to Pre-Sales Promotion

Recommend first to:

-Customers who want to run a RAG or knowledge base PoC quickly but do not want to deploy a complex vector database.

-Customers with private deployments whose data cannot leave their domain.

-Customers who need to embed retrieval capabilities in desktop, edge, mobile, or local tools.

-Teams that already have Python or Node.js applications and want to add vector retrieval quickly.

-Enterprise search scenarios that need semantic retrieval and Chinese full-text retrieval at the same time.

Recommend with caution to:

-Large platform projects that require multi-node distributed clusters, built-in HA, and multi-tenant governance.

-Real-time data streaming scenarios with high-frequency multi-instance writes.

-Customers with high requirements for search-engine ecosystem plug-ins, complex aggregation analysis, and an O&M console.

12. My Pre-Sales Judgment

The biggest highlight of Zvec is not that it is "another vector database" but its embedded form: it turns vector database capabilities into a local SDK that can be integrated directly. This makes it particularly suitable for two types of sales opportunities:

  1. Low-threshold AI application rollout: the customer only wants RAG, agent memory, or a local knowledge base, and does not want to stand up a database service first.
  2. Embedded product capability: our products or the customer's products need built-in search that is simple to install, works offline, and costs little.

For pre-sales, do not package it as a comprehensive alternative to Milvus or Elasticsearch. A better positioning is:

  • **Zvec is an open-source option worth evaluating when customers need "lightweight, local, embedded, low-O&M" vector retrieval capabilities. When customers need a "large-scale distributed database platform", it should be combined with other service-based solutions.**