1. Project Overview
| Item | Information |
|---|---|
| GitHub | alibaba/zvec |
| Official website and documentation | zvec.org |
| License | Apache License 2.0 |
| Primary language | C |
| Latest release | `v0.5.1`, released 2026-06-24 |
| GitHub popularity | About 12.5k stars and 748 forks (as of 2026-06-27) |
| Official positioning | Lightweight, lightning-fast, in-process vector database |
| Official SDKs | Python, Node.js, Go, Rust, Dart/Flutter |
| Supported platforms | Linux x86_64/ARM64, macOS ARM64, Windows x86_64 |
In one sentence:
Zvec is like a "small, high-performance vector database kernel" that can be embedded directly into business applications. It performs local vector retrieval, text retrieval, and hybrid search without a separately deployed service such as Milvus, Elasticsearch, or Qdrant.
2. What does it mainly do?
2.1 Vector similarity retrieval
Zvec stores embeddings of objects such as documents, images, products, knowledge fragments, and user behaviors, and returns the Top-K most similar results for a query vector.
Typical capabilities:
- Supports dense vectors, such as large-model embeddings and image embeddings.
- Supports sparse vectors, such as SPLADE or BM25-style sparse semantic representations.
- Supports single-vector and multi-vector queries.
- Supports indexes such as HNSW, HNSW-RaBitQ, DiskANN, IVF, and Flat.
- Supports similarity metrics such as `COSINE`, `L2`, and `IP`.
Suggested pre-sales pitch:
**If the customer already has an embedding model, Zvec can serve as a lightweight vector retrieval layer that quickly adds vector recall to existing applications.**
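To make the Top-K idea concrete, here is a minimal pure-Python sketch of brute-force cosine retrieval. It does not use the Zvec SDK; the `docs` mapping and the helper names are hypothetical, and a real index such as HNSW or IVF exists precisely to avoid scanning every vector like this.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Return the k (doc_id, score) pairs most similar to the query.

    docs is a hypothetical in-memory mapping of doc_id -> vector.
    """
    scored = ((doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


docs = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
results = top_k([1.0, 0.1], docs, k=2)
```

This is roughly what a Flat index does conceptually: an exact linear scan, which is why it gives full recall but slows down as data grows.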
2.2 Full-text search (FTS)
Starting with `v0.5.0`, Zvec supports native full-text search. It can build full-text indexes on string fields and return results ranked by BM25 relevance.
Key capabilities:
- Supports natural-language queries.
- Supports exact phrase matching.
- Supports Boolean expressions, such as requiring or excluding a word.
- Supports Chinese word segmentation with Jieba, which suits Chinese knowledge bases, articles, product titles, and similar text.
- A Collection can be full-text only; vector fields are not required.
Pre-sales value:
**Many customers' search needs are neither purely semantic nor purely keyword-based, but both. Zvec now covers keyword search and vector search in one local database, reducing dependence on external search engines.**
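For readers unfamiliar with BM25 ranking, the following is a small, self-contained sketch of the textbook Okapi BM25 formula. It is not Zvec's implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the smoothed IDF variant are assumptions chosen for illustration, and Chinese text would need a segmenter such as Jieba instead of `split()`.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25.

    Tokenization is a plain lowercase whitespace split (an assumption
    for illustration only).
    """
    tokenized = [doc.lower().split() for doc in documents]
    doc_count = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / doc_count
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            # Smoothed IDF: rarer terms contribute more.
            idf = math.log(1.0 + (doc_count - n + 0.5) / (n + 0.5))
            # Longer-than-average documents are penalized via b.
            length_norm = 1.0 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


documents = [
    "vector search for local applications",
    "full text search with keyword ranking",
    "deploying a distributed cluster",
]
scores = bm25_scores("keyword search", documents)
```

The second document matches both query terms and ranks first; the third matches neither and scores zero.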
2.3 Hybrid retrieval
Zvec supports combining vector retrieval, full-text retrieval, and scalar filtering. Common forms include:
- Vector + structured filtering: for example, "semantically similar documents, but only those published after 2024".
- Dense vector + sparse vector: combines semantic recall with keyword weights.
- Multi-way query + reranker: uses `WeightedReRanker` or `RrfReRanker` to fuse results from multiple vector fields.
- FTS + application-layer reranking: when both keywords and semantics matter, recall each separately, then merge or rerank.
Please note:
According to the official documentation, the full-text search route and the vector search route are mutually exclusive within a single `Query`. To combine keywords and vectors, the usual approach is multiple query routes plus reranking, or merging at the application layer.
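One common way to merge separately recalled lists at the application layer is Reciprocal Rank Fusion (RRF). The sketch below is a generic RRF implementation, not the code behind Zvec's `RrfReRanker`; the function name, the `k=60` constant, and the sample IDs are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60, topn=10):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); the contributions are summed per document.
    """
    fused = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    ordered = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return ordered[:topn]


# Hypothetical recall results: IDs from a vector query and from an FTS query.
vector_hits = ["doc_3", "doc_1", "doc_7"]
keyword_hits = ["doc_1", "doc_9", "doc_3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits], topn=3)
```

RRF only needs ranks, not scores, so it sidesteps the problem that BM25 scores and vector similarities are on different scales.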
2.4 Local persistence and in-process operation
Zvec is an in-process database and does not require starting a separate server.
Features:
- A Collection is persisted in a directory on disk.
- A write-ahead log (WAL) is used to ensure durability.
- Multiple processes can read the same Collection at the same time.
- Writes are exclusive to a single process.
- A Collection directory can be moved and reopened as long as the path is correct.
Pre-sales value:
Suitable for private, offline, and on-premises deployments, edge devices, desktop tools, notebooks, and CLI tools, without requiring customers to maintain an additional database service cluster.
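The write-ahead-log idea behind this durability can be illustrated with a toy key-value store. This is a conceptual sketch only: the `TinyWalStore` class, the JSON-lines log format, and the file name `wal.log` are all hypothetical and say nothing about Zvec's actual on-disk layout.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, directory):
        os.makedirs(directory, exist_ok=True)
        self.log_path = os.path.join(directory, "wal.log")
        self.data = {}
        self._replay()
        self._log = open(self.log_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild in-memory state from the log left by earlier runs."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # A torn final record from a crash is ignored.
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the record and force it to disk ...
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. ... and only then apply it to the in-memory state.
        self.data[key] = value

    def close(self):
        self._log.close()


directory = tempfile.mkdtemp()
store = TinyWalStore(directory)
store.put("doc_1", {"publish_year": 2025})
store.close()

# Reopening the same directory replays the log and restores the data.
reopened = TinyWalStore(directory)
recovered = reopened.data["doc_1"]
reopened.close()
```

Because the record reaches disk before the write is acknowledged, a crash between the two steps loses nothing: the next open replays the log.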
3. Applicable scenarios
3.1 RAG knowledge base retrieval
This is the most intuitive scenario. Knowledge base documents are split into chunks, each chunk is embedded, and the embeddings are stored in Zvec. When a user asks a question, vector recall finds the relevant chunks and passes them to the large model to generate the answer.
Applicable customers:
- Question answering over an internal enterprise knowledge base.
- Retrieval over after-sales knowledge bases, product manuals, and system documentation.
- Privately deployed intelligent customer service or intelligent assistants.
- Customers who require simple deployment and data localization.
Why it fits:
- Runs locally; no additional services required.
- Supports structured filtering, such as by department, time, document type, or permission label.
- Supports combining Chinese full-text search with vector search.
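The "split documents into chunks" step of a RAG pipeline can be sketched as follows. This is one simple strategy (fixed-size character windows with overlap), assumed here for illustration; the function name and the sizes are hypothetical, and real pipelines often split on sentences or headings instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    The overlap keeps a sentence that straddles a boundary visible
    in both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reached the end of the text.
    return chunks


document = "abcdefghij" * 5  # 50 characters of stand-in text
chunks = chunk_text(document, chunk_size=20, overlap=5)
```

Each chunk would then be passed to the embedding model and stored together with its metadata (for example department or document type) so that structured filters can be applied at query time.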
3.2 Agent long-term memory and local memory
An Agent needs to store memories such as user preferences, historical tasks, tool execution results, and document fragments. Zvec can serve as a local memory index.
Applicable customers:
- AI Agent platforms.
- Personal knowledge management tools.
- Enterprise Copilots.
- Local desktop AI assistants.
Why it fits:
- Runs in-process, so the call chain is short and responses are fast.
- Needs no separate database service, so it is easy to ship with the application.
- Supports access from multiple languages, such as Node.js, Python, Go, and Rust.
3.3 Similar-item recommendation for products, content, and images
After objects such as products, images, short videos, and articles are vectorized, features such as "similar products", "similar images", and "related recommendations" can be implemented through similarity retrieval.
Applicable customers:
- E-commerce search and recommendation.
- Related recommendations on content platforms.
- Image libraries and design asset libraries.
- Industry knowledge content recommendation.
Why it fits:
- HNSW suits low-latency online recall.
- HNSW-RaBitQ can reduce memory usage in x86_64 environments.
- DiskANN can be used in very large-scale scenarios where the memory budget is limited but higher latency is acceptable.
3.4 Enterprise search: keyword + semantic
Many enterprise search scenarios need to satisfy all of the following at once:
- Exact matching when the user enters keywords.
- Semantic recall when the user describes the need in natural language.
- Filtering by business fields, such as time, source, permission, category, and status.
Zvec's FTS, vector retrieval, and filtering capabilities can cover such lightweight search scenarios.
Applicable customers:
- Enterprise document search.
- Retrieval of rules and regulations.
- Search over contracts, bids, and project information.
- Local knowledge base search tools.
3.5 Edge, desktop, and mobile AI applications
Zvec supports Flutter/Dart and runs in-process, which makes it suitable for embedding in on-device applications.
Applicable customers:
- Offline AI assistants.
- Local search on mobile devices.
- Data retrieval on industrial edge devices.
- Desktop knowledge base software.
Pre-sales highlight:
If the customer does not want to upload data to the cloud, or the network is unstable, the value of Zvec as a local embedded database is easy to explain.
4. Less suitable scenarios
4.1 Full distributed database cluster capabilities are required
Zvec is not positioned as a replacement for service-based or distributed databases such as Milvus, Elasticsearch, and Qdrant. It is more like an embedded database kernel.
If the customer requires:
- Multi-node distributed writes.
- Built-in high availability.
- Horizontal scaling.
- Multi-tenant service governance.
- An operations console and cluster monitoring.
then Zvec may not be the first choice, unless the upper-layer product provides these capabilities itself.
4.2 High-concurrency, multi-writer scenarios
According to the official documentation, multiple processes can read concurrently, but writes are exclusive to a single process.
Therefore, if the customer needs multiple service instances to write to the same vector store at the same time, the architecture must be evaluated carefully. The usual options are to coordinate writes in the upper layer or to choose a service-based vector database.
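As a rough illustration of upper-layer write coordination, the sketch below funnels writes from many producers through one writer. It uses threads inside a single process to keep the example runnable; the `SingleWriter` class and the `written` list standing in for the store are hypothetical. Across separate service instances the same pattern would need a message queue or a dedicated writer service, which is an architectural assumption rather than something the Zvec documentation prescribes.

```python
import queue
import threading


class SingleWriter:
    """Funnel writes from many producers through one writer thread."""

    _STOP = object()

    def __init__(self, write_fn):
        self._write_fn = write_fn  # Called only from the writer thread.
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._write_fn(item)

    def submit(self, item):
        """Enqueue a write; safe to call from any thread."""
        self._queue.put(item)

    def close(self):
        """Drain pending writes, then stop the writer thread."""
        self._queue.put(self._STOP)
        self._thread.join()


written = []  # Stand-in for the store that only one writer may touch.
writer = SingleWriter(written.append)

producers = [
    threading.Thread(target=lambda n=n: writer.submit(f"doc_{n}"))
    for n in range(5)
]
for producer in producers:
    producer.start()
for producer in producers:
    producer.join()
writer.close()
```

All five writes arrive, but only the writer thread ever touches the store, so the single-writer constraint is respected without locking in the producers.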
4.3 Cross-Collection queries
Zvec does not support Join, Union, or retrieval across multiple Collections. Collection boundaries must be designed in advance during business modeling.
5. Core capabilities
| Capability | Description | Pre-sales value |
|---|---|---|
| In-process operation | No separate server needed | Reduces deployment and O&M complexity |
| Vector search | Supports dense and sparse vectors | Supports RAG, recommendation, and similarity search |
| Multiple index types | Flat, HNSW, HNSW-RaBitQ, DiskANN, IVF | Trade-offs based on scale, latency, and memory |
| FTS full-text search | BM25, phrase, Boolean, Jieba | Better for Chinese search and keyword recall |
| Hybrid search | Vector + filtering + multi-vector reranking | Closer to enterprise search and knowledge base scenarios |
| WAL persistence | Data is unlikely to be lost after a crash or power failure | Higher production reliability |
| Dynamic schema | Fields and vectors can be added or removed | Low cost of business iteration |
| Multi-language SDKs | Python, Node.js, Go, Rust, Dart/Flutter | Easier to embed into existing technology stacks |
| Apache 2.0 | Business-friendly license | Easy for enterprises to customize and integrate |
| Zvec Studio | Visual browsing and debugging | Easier demos and troubleshooting |
6. How to choose the index type
| Index | Suitable scenario | Advantages | Caveats |
|---|---|---|---|
| Flat | Small scale, prototypes, accuracy baseline | 100% recall, zero configuration | Linear scan; slow on large data volumes |
| HNSW | General-purpose online production retrieval | Low latency, high recall, mature | High memory footprint |
| HNSW-RaBitQ | Large-scale, high-dimensional vectors on x86_64 | HNSW quality with lower memory | x86_64 only; requires AVX2/AVX-512; ARM not supported |
| DiskANN | Very large scale, limited memory | Index body lives on disk, significantly reducing memory | Linux + libaio; QPS and latency are worse than in-memory indexes |
| IVF | Data with a clustering structure; needs tuning | Better memory efficiency, suitable for large scale | Parameter-sensitive; clustering has a build cost |
Pre-sales advice:
- Early demos and PoCs: start with Flat or HNSW, which are easy to explain and debug.
- Online RAG/knowledge base: prefer HNSW.
- Large data volume but memory-sensitive: evaluate HNSW-RaBitQ or DiskANN.
- Billion-scale with higher latency acceptable: evaluate DiskANN.
- Sensitive to result correctness: first use Flat to establish an accuracy baseline, then switch to an ANN index and compare recall.
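The last point, comparing an ANN index against a Flat baseline, comes down to computing recall@k per query. A minimal sketch, assuming the result IDs from both indexes have already been collected (the ID lists below are made up for illustration):

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k that the approximate top-k also returned."""
    exact = set(exact_ids[:k])
    if not exact:
        return 0.0
    approx = set(approx_ids[:k])
    return len(exact & approx) / len(exact)


# Hypothetical result IDs for one query: Flat as ground truth, ANN as candidate.
flat_result = ["d1", "d2", "d3", "d4", "d5"]
ann_result = ["d1", "d3", "d2", "d9", "d5"]
recall = recall_at_k(flat_result, ann_result, k=5)
```

In a PoC this would be averaged over a representative set of real customer queries rather than read off a single one.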
7. How to use
7.1 Installation
Python:
pip install zvec
Node.js:
npm install @zvec/zvec
Flutter/Dart:
flutter pub add zvec
Go and Rust also have official SDKs for system-level or high-performance service integration.
7.2 Create a Collection and write vectors
import zvec

# Schema: one indexed scalar field plus one 768-dimensional vector field with an HNSW index.
schema = zvec.CollectionSchema(
    name="knowledge_base",
    fields=[
        zvec.FieldSchema(
            name="publish_year",
            data_type=zvec.DataType.INT32,
            index_param=zvec.InvertIndexParam(enable_range_optimization=True),
        ),
    ],
    vectors=[
        zvec.VectorSchema(
            name="embedding",
            data_type=zvec.DataType.VECTOR_FP32,
            dimension=768,
            index_param=zvec.HnswIndexParam(metric_type=zvec.MetricType.COSINE),
        ),
    ],
)

# The Collection is persisted in this local directory.
collection = zvec.create_and_open(
    path="./knowledge_base_zvec",
    schema=schema,
)

# Insert one document with a placeholder vector and a scalar field value.
collection.insert([
    zvec.Doc(
        id="doc_1",
        vectors={"embedding": [0.1] * 768},
        fields={"publish_year": 2025},
    )
])
collection.optimize()
7.3 Vector retrieval
# Return the 10 documents most similar to the query vector.
result = collection.query(
    queries=zvec.Query(
        field_name="embedding",
        vector=[0.3] * 768,
    ),
    topk=10,
)
7.4 Vector retrieval with conditional filtering
# Same query, restricted to documents published after 2023.
result = collection.query(
    queries=zvec.Query(
        field_name="embedding",
        vector=[0.3] * 768,
    ),
    filter="publish_year > 2023",
    topk=10,
)
7.5 Create a full-text search field
import zvec

# "content" gets an FTS index with the Jieba tokenizer; no vector field is defined.
schema = zvec.CollectionSchema(
    name="article_collection",
    fields=[
        zvec.FieldSchema(
            name="category",
            data_type=zvec.DataType.STRING,
            nullable=False,
        ),
        zvec.FieldSchema(
            name="content",
            data_type=zvec.DataType.STRING,
            nullable=False,
            index_param=zvec.FtsIndexParam(
                tokenizer_name="jieba",
            ),
        ),
    ],
)
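7.6 Full-text search
from zvec.model.param.query import Fts, Query

# Natural-language match; the query text means "machine learning".
result = collection.query(
    queries=Query(
        field_name="content",
        fts=Fts(match_string="机器学习"),
    ),
    topk=5,
)
Advanced query:
# Must contain "学习" (learning), must not contain "神经网络" (neural network),
# and should match the exact phrase "向量搜索" (vector search).
result = collection.query(
    queries=Query(
        field_name="content",
        fts=Fts(query_string='+学习 -神经网络 "向量搜索"'),
    ),
    topk=5,
)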
7.7 Multi-vector retrieval and reranking
# Query a dense and a sparse vector field, then fuse the results with weights.
result = collection.query(
    topk=5,
    queries=[
        zvec.Query(field_name="dense_embedding", vector=[0.1] * 768),
        zvec.Query(field_name="sparse_embedding", vector={1: 0.1, 37: 0.43}),
    ],
    reranker=zvec.WeightedReRanker(
        topn=3,
        metric=zvec.MetricType.IP,
        weights={
            "dense_embedding": 1.2,
            "sparse_embedding": 1.0,
        },
    ),
)
8. Pre-sales talking points
8.1 Elevator pitch
Zvec is an open-source embedded vector database for building vector retrieval and full-text retrieval directly into existing applications. It needs no separately deployed service and supports local persistence, multi-language SDKs, indexes such as HNSW and DiskANN, and Chinese full-text retrieval. It is especially suitable for RAG, Agent memory, enterprise knowledge bases, on-device AI, and lightweight private deployments.
8.2 Customer value points
- Simple deployment: a single SDK is enough to run; there is no database service to operate.
- Private and secure: data is stored locally, which suits customers sensitive to data leaving their domain.
- Controllable cost: no additional servers or clusters are required, which suits lightweight services and edge devices.
- Complete retrieval capabilities: vector, full-text, structured filtering, and multi-vector fusion.
- Chinese-friendly: FTS supports Jieba word segmentation.
- Open technology: Apache 2.0, which makes integration, secondary development, and productization easy.
8.3 Differences from traditional vector databases
| Dimension | Zvec | Service-based/distributed vector database |
|---|---|---|
| Deployment form | In-process SDK | Independent service or cluster |
| O&M complexity | Low | Medium to high |
| Scale-out | Depends on upper-layer application design | Usually built in |
| High availability | Depends on the upper-layer system | Usually has a supporting solution |
| Local/on-device | Good fit | Usually heavier |
| Suitable scenarios | Embedded, local, lightweight private deployment | Massive online services, multi-tenant platforms |
8.4 FAQ
Q: Can it replace Milvus or Elasticsearch?
A: Not completely. Zvec is more like an embedded vector retrieval kernel, suited to local, lightweight, in-app integration. If the customer needs distributed clusters, high availability, and multi-tenant governance, a service-based database should still be evaluated.
Q: Does it support Chinese search?
A: Yes. FTS can use Jieba word segmentation, which suits Chinese and mixed Chinese-English text.
Q: Will data be lost?
A: Zvec uses a write-ahead log (WAL), and the official documentation states that data durability is guaranteed if the process crashes or power fails. For critical production scenarios, it is still recommended to add backups and file-system-level disaster recovery.
Q: Does it support multiple processes?
A: Multiple processes can read the same Collection at the same time; writes are exclusive to a single process.
Q: Can it do hybrid search?
A: Yes. It supports combinations of vector search, full-text search, scalar filtering, and multi-vector reranking. Note that FTS and vector search are mutually exclusive within a single query route; more complex fusion is usually achieved through multi-way recall plus a reranker, or by merging at the application layer.
Q: How much data can it handle?
A: The official materials emphasize support for large-scale vector retrieval and cite Cohere 1M/10M benchmark tests. Whether it meets a specific customer's SLA depends on dimensionality, index type, hardware, recall, concurrency, and filtering ratio. For a pre-sales PoC, stress testing with the customer's real data is recommended.
9. PoC recommendations
9.1 PoC goals
The PoC should prove more than "it runs". It should show that:
- Data can be imported.
- Search results are acceptable to the business.
- Latency and QPS reach their targets.
- Memory and disk costs are acceptable.
- Integration with existing applications is simple.
9.2 PoC data
Prioritize real customer data:
- Document knowledge base: tiered tests at 10,000, 100,000, and 1 million chunks.
- Product catalog: product titles, categories, attributes, and image vectors.
- Customer service knowledge base: FAQs, tickets, and conversation summaries.
- Permission fields: department, role, security level, and tenant ID.
9.3 PoC metrics
| Metric | What to observe |
|---|---|
| Recall/hit quality | Whether the Top-5/Top-10 results contain the answer the business considers correct |
| P95/P99 latency | Whether the SLA for online Q&A or search is met |
| QPS | Throughput under concurrency |
| Build time | Time for full import and index creation |
| Memory footprint | HNSW vs. RaBitQ vs. DiskANN |
| Disk footprint | Size of the Collection directory |
| Write mode | Whether single-process writes meet business requirements |
| Filtering performance | Permission, time, and category filtering |
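The latency and throughput rows can be collected with a small harness like the one below. It is a sketch under stated assumptions: `fake_query` is a placeholder workload to be replaced by the real search call, the percentile uses the nearest-rank method, and the QPS figure is single-threaded, so it understates what a concurrent test would show.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct is 0-100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(run_query, queries):
    """Time each call to run_query and return latencies in milliseconds."""
    latencies = []
    for query in queries:
        started = time.perf_counter()
        run_query(query)
        latencies.append((time.perf_counter() - started) * 1000.0)
    return latencies


def fake_query(query):
    """Placeholder workload; replace with the real search call in a PoC."""
    return sum(x * x for x in query)


latencies = measure_latencies(fake_query, [[0.1] * 768 for _ in range(200)])
p95 = percentile(latencies, 95)
p99 = percentile(latencies, 99)
# Sequential throughput: queries completed per second of total query time.
qps = len(latencies) / (sum(latencies) / 1000.0)
```

Running the same harness against each candidate index with the customer's real queries makes the HNSW, RaBitQ, and DiskANN trade-offs directly comparable.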
9.4 Demo route
- Prepare 1,000 to 10,000 Chinese document fragments.
- Generate 768- or 1024-dimensional vectors with the embedding model.
- Build a Collection: `content` is the FTS field, `embedding` uses HNSW, and `source`/`category`/`year` are scalar fields.
- Demonstrate three queries:
  - Pure semantic question-answering recall.
  - Keyword full-text search.
  - Vector + permission/time/category filtering.
- Show Zvec Studio or print the results, and point out that the local file directory is the database.
- Run a small-scale stress test to show latency, QPS, disk, and memory usage.
10. Risks and Considerations
- The project is relatively new: the repository was created in December 2025, development is fast, and releases are frequent. For production use, watch API stability and the release notes.
- DiskANN is platform-limited: it currently runs only on Linux and depends on `libaio`.
- HNSW-RaBitQ is hardware-limited: x86_64 only, AVX2 or AVX-512 required, ARM not supported.
- Multi-writer setups require architecture design: writes are exclusive to a single process, so uncoordinated concurrent writes from multiple instances are not suitable.
- Cross-Collection queries are not supported: data modeling should be planned in advance.
- Benchmarks must be rerun by the customer: the official results are based on specific hardware and datasets and cannot be promised directly for the customer's production environment.
11. Customer profiles suited to pre-sales promotion
Recommend first:
- Customers who want to run a RAG/knowledge base PoC quickly but do not want to deploy a complex vector database.
- Customers with private deployments whose data cannot leave their domain.
- Customers who need to embed retrieval capabilities in desktop, edge, mobile, or local tools.
- Teams that already have Python/Node.js applications and want to add vector retrieval quickly.
- Enterprise search scenarios that need semantic retrieval and Chinese full-text retrieval at the same time.
Recommend with caution:
- Large platform projects that require multi-node distributed clusters, built-in HA, and multi-tenant governance.
- Real-time data streaming scenarios with high-frequency writes from multiple instances.
- Customers with high requirements for search-engine ecosystem plug-ins, complex aggregation analysis, and an O&M console.
12. My pre-sales judgment
Zvec's biggest highlight is not that it is "another vector database" but its embedded form: it turns vector database capabilities into a local SDK that can be integrated directly. This makes it particularly suitable for two types of sales opportunity:
- Low-threshold AI application rollout: the customer only wants RAG, Agent memory, or a local knowledge base, and does not want to stand up a database service first.
- Embedded product capability: our product or the customer's product needs built-in search that is simple to install, available offline, and low-cost.
In pre-sales, it is not recommended to pitch it as a comprehensive alternative to Milvus/Elasticsearch. A better positioning is:
**Zvec is an open-source option worth evaluating when the customer needs "lightweight, local, embedded, low-O&M" vector retrieval. When the customer needs a "large-scale distributed database platform", it should be combined with other service-based solutions.**