QAnything - AI Navigation

← Back to Project List

QAnything is netease's open source local knowledge base question-and-answer system (AGPL-3.0,14,025 Stars), which was released in January 2024 to locate "throw any file into it and you can accurately ask questions". The core highlight lies in Youdao's self-developed BCEmbedding retrieval model (bilingual cross-language SOTA) and profound OCR/document analysis accumulation-Chinese scanned PDF, form recognition and cross-page layout processing capabilities are unique among domestic open source schemes. V2.0 (2024-08) completed the architecture migration from GPU to pure CPU, Docker one-click deployment, and the image was slimmed from 18.94GB to 4.88GB. However, the code update of the project has basically stalled since March 2025, and 403 Open Issues have not been processed. AGPL-3.0 agreement poses legal risks to commercial deployment. If the customer scenario happens to be "a large number of Chinese scanned documents → structured analysis → local Q & A" and can accept AGPL agreement or purchase enterprise license, QAnything is still one of the best choices.

1. Project/Product Overview

Dimension	Information
Project name	QAnything(Question and Answer based on Anything)
Developer	Netease Youdao (NetEase Youdao)
Open Source Protocol	AGPL-3.0 (⚠Strong Copyleft, commercial need special attention)
Main Language	Python
GitHub Stars	14,025
Forks	1,347
Commits	754
Created	2024-01-03 (approximately 2.5 years)
Recent Code Push	2025-03-24 (⚠There has been no substantial code update for 15 months)
Latest Release	v2.0.0(2024-08-23, about 22 months ago)
Open Issues	403 (a large number of unclosed issues, low maintenance activity)
Default Branch	qanything-v2
official website	https://qanything.ai
Online Experience	https://qanything.ai
Enterprise Edition	QAnything Enterprise Edition (closed source, commercial license, Turbo/Plus/Long/Max multi-size model available)
Core Components	BCEmbedding(Embedding Rerank, Independent Open Source Project Apache-2.0)
Community	WeChat Community 3,000 People, HuggingFace Downloads 1,025,440
Business Contact	010-82558901 / AIcloud_Business@corp.youdao.com

2. What does it mostly do?

The core of the QAnything is a complete document parsing → vector indexing → two-stage retrieval → LLM generation pipeline. Users only need to upload files to obtain accurate answers based on document content.

2.1 Document Analysis Ability (Youdao Core Accumulation)

Parse Dimension	Capability Description
Supported formats	PDF, Word(docx), PPT(pptx), XLS(xlsx), Markdown(md), EML (email), TXT, Image (jpg/jpeg/png), CSV, HTML
PDF Table Parsing	v2.0 Rewrite Logic to Recognize Cross-Page Table Structure, Row-Column Layout, and Automatic Header Extraction; Tables with embedded text columns can also be correctly recognized instead of being processed as plain text
Column/multi-column layouts	Identify double-column or multi-column layouts, sort text blocks by reading habits; cross-page text to ensure that they belong to the same chunk
Image Extraction	Images in PDF are completely retained, embedded in the corresponding chunk, and "Answer with Image" is supported
Subtitle attribution	The text under a specific subtitle is assigned to the same chunk first. If it is too long, repeat the title before the new chunk to maintain semantic coherence
Metadata Embedding	metadata information is embedded in both the retrieval and Q & A phases to improve retrieval accuracy
Chunk Visualization	The front end can preview and manually edit the content of each chunk online and take effect in real time

2.2 OCR Capability

The technical accumulation of Youdao Translation/OCR is the core barrier that distinguishes QAnything from other domestic RAG schemes. For Chinese scanned PDF (image PDF), the OCR recognition accuracy of the QAnything is significantly better than that of the general OCR engine. In v2.0, OCR can be called separately as a separate service (HTTP).

115.00g Phase Retrieval (Core Architecture Differentiation)

QAnything uses the self-developed BCEmbedding model (Apache-2.0 open source) and consists of two components:

- One-stage Embedding(bce-embedding-base_v1) :MTEB has a comprehensive score of 59.43, significantly better than BGE-large-zh-v1.5(54.21) and M3E-base(53.54), especially in Chinese cross-language scenes.

-Phase II Rerank(bce-reranker-base_v1):Reranking score 60.06, better than BGE-reranker-large(59.69)

-Combination effect SOTA: In the LlamaIndex RAG evaluation, the combination of BCEmbedding BCReranker reaches the current best.

The core value of two-stage retrieval: the larger the amount of data in the knowledge base, the problem of "retrieval degradation" will appear in one-stage Embedding retrieval, and Rerank rearrangement can reverse the trend and realize "the more data, the better the effect".

2.4 Hybrid Search & Other Features

Function	Description
hybrid search	BM25 (keyword) Embedding (semantic) two-way fusion
Online Search	Support External Network Search to Supplement Information Outside the Knowledge Base (VPN Required)
Quick Start Mode	Similar to Kimi, you can upload files without creating a knowledge base.
Fileless Conversation	Pure LLM Chat Mode Not Dependant on Knowledge Base
Retrieve-only mode	Only the retrieved document fragment is returned, without calling LLM
Custom Bot	Can bind knowledge base, customize role prompt, configure model parameters, and share with others
FAQ	Built-in FAQ matching engine
File traceability	The answer can be traced back to the specific location of the original document. Click to open it directly.
Web UI	Supports multiple dialog windows, saving Q & A records as pictures, and configuring API Key/Base URL/fragment size on the front end

3. Applicable Scenario

Scenario	Description	Typical Customer
Chinese Scanned Documents/Pictures PDF Q & A	Youdao OCR has accumulated a deep accumulation, and the recognition rate of Chinese Scanned Documents is the leading in the industry	Government Files, Legal Contracts, Financial Bills
Enterprise internal knowledge base	Employee handbooks, product documents, technical specifications, etc. upload and ask	Knowledge management for medium and large enterprises
offline/classified environment deployment	supports the installation and use of the whole network disconnection, and the data is not available in the LAN	military industry, government affairs, finance
Form-intensive document processing	PDF form parsing capability is a differentiating highlight	Financial report analysis, research reports, experimental data
Cross-language Document Q & A	High Accuracy of Cross-language Semantic Retrieval in Mixed Chinese and English Documents	Foreign Enterprise China Branch, Import and Export Trade
Enterprise Digital Employee	Sales Assistant, Customer Service Robot, Technical Consultant 7 × 24 Service	Enterprise Customer Service Center, IT Service Desk

Report study/investment research, content summary, key point extraction, document question and answer, investment institutions, consulting companies.

4. Not quite the scene

Scenario	Reason	Alternative Suggestions
Requires commercial License security	AGPL-3.0 requires open source derivative codes when providing services to the outside world, lawyers usually do not recommend direct use in production	Purchase enterprise license/exchange Apache-2.0 protocol (such as MaxKB)
Long-term maintenance and ecological importance	⚠The project basically stagnated after March 2025, 403 Open Issues, no new Release for nearly 2 years	RAGFlow (update very active)/ MaxKB(20,000 Stars)
Complex Agent/Workflow Orchestration	QAnything is a straight pipeline of "Document → Question and Answer" without visual workflow orchestration	Dify / RAGFlow(v0.21 introduces Ingestion Pipeline)
Non-document knowledge management	Non-document scenarios such as knowledge graph and structured database query	GraphRAG / LlamaIndex
Requires GPU acceleration	Completely migrated from v2.0 to pure CPU, no longer provides native GPU inference	RAGFlow self-hosted LLM (with GPU support)
Large-scale concurrent production environment (open source version)	The open source version cannot perform parallel operations when uploading files, and the number and size of files are limited.	Enterprise Edition/RAGFlow/MaxKB

5. Core Competence List

Capability Category	Capability Item	Detailed Description
Document parsing	PDF parsing (including tables)	Self-developed parser to identify table structure/cross-page table/column layout/embedded picture
	OCR Recognition	Youdao OCR Technology, PDF Recognition of Scanned Documents, Obvious Advantages of Chinese Scenes
	Multi-format support	PDF/Word/PPT/XLS/Markdown/EML/TXT/Picture/CSV/HTML
	Visual Chunk Editing	The front-end directly previews the contents of a chunk, supports manual editing, and takes effect in real time
	Metadata embedding	Both the retrieval phase and the Q & A phase carry metadata
Retrieval	Embedding Retrieval	Self-developed BCEmbedding,MTEB Comprehensive 59.43, Chinese Scene SOTA
	Rerank Reordering	Self-developed BCReranker with a score of 60.06 to solve large-scale retrieval degradation
	Mixed retrieval	BM25 keyword Embedding semantic two-way fusion
	Fragment Fusion Sort	Aggregates chunk fragments of single or double documents
LLM	Multi-model access	Supports all models compatible with OpenAI APIs (Ollama, Tongyi Qiwen DashScope, etc.)
	Front-end and Back-end Configuration	API Key/Base URL/Fragment Size/Number of Output Tokens/Number of Context Messages can be configured on the front-end
	Custom Bot	Configure model parameters, role prompt, and binding knowledge base independently for each Bot
Q & A	Multiple Conversations	Support multiple conversation windows and save multiple sets of history records at the same time
	File traceability	The answer can be traced back to the original document location and opened directly
	Retrieval-only mode	Only return results without calling LLM
	Fileless Conversations	Pure LLM Chat Mode
	Internet Search	Extranet Search Supplementary Knowledge
Deployment	One-click Docker deployment	Start with the 'docker compose up -d' single-line command
	CPU-only operation	v2.0 completely migrated to CPU,Mac/Linux/Win three-terminal unified
	Offline use	Support full network disconnection installation and operation
	Mirror slimming	Compressed from 18.94GB to 4.88GB(1/4)
	Independent service calls	Embed/Rerank/OCR/PDF parsing can be independently HTTP calls
Enterprise Edition Extra	Large Model Customization	Turbo/Plus/Long/Max Available in Various Sizes
	Large-scale support	The number and size of files is 10-100 times that of the open source version
	Parallel operation	Upload files in parallel with other operations
	Field landing	Fine tuning prompt to reduce illusion, multi-industry landing cases

6. Architecture/deployment/integration approach

Overall Architecture

用户上传文档（PDF/Word/PPT/...）
    │
    ▼
文档解析层（PDF Parser / OCR / 格式转换器）
    │
    ▼
文本分块 + 元数据提取
    │
    ▼
向量化索引（Embedding 服务 + Elasticsearch/Milvus 等）
    │
    ▼
用户提问 → 一阶段 Embedding 检索（粗筛）
    │
    ▼
二阶段 Rerank 重排序（精排）
    │
    ▼
LLM 生成回答（OpenAI 兼容接口）
    │
    ▼
返回答案 + 溯源引用

Deployment Mode

Mode	Description
Docker Compose (recommended)	'docker compose up -d' one-click start, support Linux, Mac, and Windows (no WSL required)
Pure Python	v1.4.2 supports the 'pip install' method, but is not recommended for production use.
Offline deployment	Docker images can be downloaded in advance and imported offline. The whole process can be run offline.

Hardware Requirements

Environment	Requirements
CPU	v2.0 pure CPU operation, 32GB memory recommended
Storage	Mirroring 4.88GB of knowledge base data
network	offline can run, network retrieval requires external network

Model Integration

-LLM: all OpenAI API-compatible models (Ollama, Tongyi Qiwen DashScope, DeepSeek, GLM, etc.)

-Embedding Rerank: default BCEmbedding (self-developed, replaceable)

-Vector storage:Elasticsearch (built-in Chinese word segmentation IK)

API Support

Provides RESTful API, which can perform all operations such as file upload, knowledge base management, and Q & A. In v2.0, Embed, Rerank, OCR, and PDF parsing all support independent HTTP calls.

How to use #7.

Docker one-click deployment (recommended)

# 克隆项目
git clone https://github.com/netease-youdao/QAnything.git
cd QAnything

# 启动服务（根据操作系统选择 compose 文件）
# Linux
docker compose -f docker-compose-linux.yaml up -d

# Mac
docker compose -f docker-compose-mac.yaml up -d

# Windows
docker compose -f docker-compose-win.yaml up -d

# 访问 Web UI
# 浏览器打开 http://localhost:5052

Configure LLM

To set the page configuration in the Web UI frontend:

-'API_BASE': the API address of the LLM service (e. g. 'https://api.openai.com/v1' or Ollama's' http:// localhost:11434/v1')

-'API_KEY: API Key

-'MODEL': model name (e. g. gpt-4o, qwen-plus, deepseek-chat)

API call example

import requests

# 上传文件到知识库
url = "http://localhost:5052/api/local_doc_qa/upload_files"
files = {"files": open("contract.pdf", "rb")}
data = {"kb_id": "KB123456", "user_id": "user001"}
resp = requests.post(url, files=files, data=data)

# 问答
url = "http://localhost:5052/api/local_doc_qa/local_doc_chat"
payload = {
    "question": "这份合同的关键条款是什么？",
    "kb_ids": ["KB123456"],
    "user_id": "user001"
}
resp = requests.post(url, json=payload)
print(resp.json()["response"])

Use BCEmbedding independent components

# BCEmbedding 是 Apache-2.0 协议，可单独使用
from BCEmbedding import EmbeddingModel, RerankerModel

# Embedding
embed_model = EmbeddingModel(model_name_or_path="maidalun1020/bce-embedding-base_v1")
embeddings = embed_model.encode(["什么是RAG?", "RAG是检索增强生成"])

# Rerank
reranker = RerankerModel(model_name_or_path="maidalun1020/bce-reranker-base_v1")
scores = reranker.compute_score(["什么是RAG?"], ["RAG是检索增强生成技术"])

8. What can I say before sales

8.1 a sentence positioning

* "QAnything is a document intelligent question and answer system produced by Netease Youdao-throw in the scanned Chinese document and give accurate answers in seconds. "**

8.2 customer pain points → solutions

Customer pain points	QAnything solutions
"We have a large number of scanned PDF,OCR is not allowed, and the question-and-answer effect is poor"	Youdao OCR technology has accumulated deeply, and the recognition rate of Chinese scanned documents is the industry leader, which is the core differentiation advantage
"Company data cannot go to the cloud and must be deployed on the intranet"	The whole network is disconnected for installation and use, Docker is deployed with one click, and the data cannot go out of the LAN
"The larger the knowledge base, the more inaccurate the retrieval"	Two-stage retrieval (Embedding Rerank), the more data, the better the effect, BCEmbedding evaluation SOTA
"The table in PDF cannot be recognized"	v2.0 rewrites the table parsing logic, and the cross-page table, embedded table and header recognition have been optimized
"Can the open source solution agreement be commercially available?"	Enterprise Edition provides commercial license, Turbo/Plus/Long/Max models are available
"Can I run without GPU?"	v2.0 is completely migrated to pure CPU operation, Mac/Linux/Win all
"The deployment is too complicated, the team does not have ML engineers"	'docker compose up -d' one-line command to start, out-of-the-box

8.3 Differentiated Selling Points

vs RAGFlow(InfiniFlow open source):

-QAnything advantages: Youdao OCR accumulation → better PDF analysis of Chinese scanned documents/forms; BCEmbedding bilingual and cross-language SOTA; Pure CPU deployment is lighter

-RAGFlow advantages: more comprehensive DeepDoc parsing engine (support for layout recognition YOLOv8); V0.21 introduces Ingestion Pipeline that can be arranged; The update is very active (2025-2026 continuous high frequency iteration);Apache-2.0 protocol is more friendly

-Conclusion: If the customer's core requirement is "a large number of complex documents in various formats can be Pipeline by deep analysis", select RAGFlow; If it is "simple deployment of Chinese scanned documents/forms PDF Q & A", the QAnything is more accurate.

vs MaxKB(1Panel open source):

-QAnything advantages: stronger retrieval model (BCEmbedding SOTA vs MaxKB basic retrieval), deeper document analysis (especially OCR/table), commercial support in enterprise version

-MaxKB Advantages: Stars More (20,600), Protocol Friendly (GPL-3.0 with MaxKB EULA), Workflow Orchestration and MCP Tool Call, Much Higher Community Activity, Deep Integration with 1Panel Operation and Maintenance Ecology

-Conclusion: MaxKB is more suitable as a general platform Agent base; QAnything focus more on the "document → question and answer" path to the extreme.

vs Dify:

-QAnything is "Documentation Q & A Special Tool";Dify is "AI Application Development Platform"

-Dify has visual workflow orchestration, rich tools and plug-in ecology, but the document analysis depth is not as deep as QAnything.

-QAnything is suitable for customers who do not need complicated choreography and only need document questions and answers; Dify is suitable for enterprises that need custom AI applications.

Core Differentiation Summary: Youdao OCR BCEmbedding Pure CPU Deployment Scan Friendly *

8.4 Customer Value Story Line

Cut in *:"Your company has a lot of scanned contracts/files/reports, want to use AI Q & A but OCR effect is not good?"
Resonance :"The universal OCR engine has a low recognition rate for Chinese scanned documents, the contents of the form are broken after page spanning, and the column layout is misread-these are common pits."
Demo: Upload a Chinese scanned PDF contract (including forms), and ask "What is the liability clause for breach of contract?" -- QAnything accurately identify the scanned text, parse the forms, locate the answers, and trace the original text.
Advanced : Build Enterprise Knowledge Base from Single Document → Batch Upload → Custom Bot (Sales Assistant/Legal Assistant/Technical Consultant) → API Integration into OA System
Ends :"Netease has 20 years of OCR technology accumulation, 14,000 GitHub Stars,3000 WeChat community users-there is no stronger open source scheme on this subdivision track."

9. Frequently Asked Customer Questions

Question	Answer
Can AGPL-3.0 agreements be commercially available?	This is probably the most critical question. AGPL-3.0 requirements: If your system provides external network services (such as SaaS), you must provide users with the complete source code of derivative works. For internal use only and no external service, open source is not required. If the customer needs to "provide external document question and answer service" and modify the QAnything source code, it must be open source. Suggestion: The use of pure intranet is not a big problem. If you do not want to undertake open source obligations or external services, purchase the enterprise version of the commercial license.
Is the project not maintained?	Objectively speaking, the code warehouse of the project has been basically stagnant since March 2025 (no new submissions have been made in 15 months). The latest Release v2.0.0 is nearly 2 years ago, and 403 Open Issues have not been processed. However, the v2.0 version itself has complete and stable functions, which is sufficient for the relatively mature requirement of "document question and answer. If customers need new features for continuous iteration, they need to be evaluated carefully.
Which is better than RAGFlow?	Look at the scene. QAnything OCR/scan processing is a strong point, and pure CPU deployment is lighter. The RAGFlow DeepDoc analysis engine is more comprehensive and updated more actively (new functions such as Ingestion Pipeline and Long-Context RAG continue to be added). If you value the depth of document analysis and retrieval accuracy, the RAGFlow ecology is currently stronger. If the scene happens to be a "Chinese scan question and answer", the QAnything is more right.
Which major models are supported?	All OpenAI API-compatible models can be accessed: OpenAI GPT series, Tongyi Qiwen (DashScope), DeepSeek, GLM, Ollama deployed local models, etc. The front-end directly configures API Key and Base URL without changing the code.
How many files can you handle?	The open source version has a limit (the official limit is not clear, but the enterprise version claims to be 10-100 times that of the open source version). In actual use, the knowledge base experience of hundreds of documents is good; more than thousands of recommended enterprise edition or evaluation performance.
Does GPU acceleration be supported?	Version 2.0 has been completely migrated to pure CPU and no longer supports GPU acceleration. This is a deliberate architectural choice-lowering the threshold for deployment, but at the expense of processing large-scale documents faster than GPU solutions.
Can I export or back up the knowledge base?	The knowledge base data is stored in the Elasticsearch and can be backed up through the snapshot API of ES. Q & A records and bot configurations are exportable through the API.
Is there a mobile terminal?	No official mobile App. The web UI is responsive and can be used in mobile browsers. The mobile terminal needs to develop its own docking API.

10. PoC Recommendations

Recommended PoC Direction: Chinese Scanned PDF Q & A

This is the core differentiation scenario of QAnything, and it is recommended that PoC focus on this to maximize its irreplaceability.

Phase	Content	Time	Output
1. Environment preparation	One-click deployment of Docker and configuration of LLM API (such as Tongyi Thousand Questions or DeepSeek)	0.5 days	QAnything instances that can be run
2. Data Preparation	Collect 30-50 real documents from customers (it is recommended to include scanned PDF, PDF with forms, and double-column typesetting documents)	0.5 days	Test Document Set
3. Document storage	Batch upload documents, observe OCR analysis effect, and check chunk quality	1 day	Indexed knowledge base
4. Q & A verification	Design 20-30 test questions (covering: scanned text recognition, table data Q & A, cross-page content understanding, original source tracing), score one by one	1 day	Accuracy report
5. Competition Comparison	Run the same document set with RAGFlow/MaxKB to compare OCR recognition rate, table analysis and answer accuracy	1 day	Comparative analysis report
6. Integration Demonstration	Interface the customer's existing system (OA/customer service) through API to demonstrate the actual business process	1 day	Demonstrable integration scheme

Validation Metrics:

-OCR text recognition accuracy> 95% (Chinese scan)

-Complete extraction rate of table data> 90%

-End-to-end answer accuracy> 85% (based on document factual validation)

-Average response time <5 seconds (CPU-only environment)

-Answer traceability is accurate (the location of the cited document matches the answer)

PoC Note:

-Be sure to use the customer's own real documents, not clean typeset test documents.

-If the customer has concerns about the AGPL protocol, the PoC phase should be clear: the open source version is only used for technical verification, and the commercial use requires the enterprise version authorization.

-Inform customers in advance of the expected indexing speed of large-scale documents in CPU-only mode

11. Risks and Considerations

Risk	Level	Description	Mitigation
AGPL-3.0 Agreement	🔴The high	strong Copyleft protocol requires open source derivative code when providing network services to the outside world. Most corporate legal departments would object to direct use of AGPL open source in production systems.	Pure intranet use security; Purchase enterprise commercial license for external services; Or use Apache-2.0 BCEmbedding to build RAG system
Project Maintenance Stagnation	🔴High	Last code push 2025-03-24(15 months ago), latest Release 2024-08-23 (nearly 2 years),403 Open Issues not processed. The project has essentially gone into maintenance hibernation.	Available if the current v2.0 features meet the requirements; if you want to continue to add new features, recommended RAGFlow/MaxKB
Enterprise Edition Closed Source Dependency	🟡The medium	open source version has limited functions and performance-the document parsing effect is general, the number of files is limited, parallel operation is not supported, and the production environment is not supported. True production-level capability in Enterprise Edition.	PoC phase clarifies the difference between open source version and enterprise version; the enterprise version license fee is reserved in the budget
Pure CPU Performance	🟡Medium	v2.0 gives up GPU acceleration, and performance may become a bottleneck when processing a large number of documents or high concurrency.	Evaluate the actual document level; If there is a high concurrency requirement, consider enterprise version or GPU scheme
Competitive Ecological Suppression	🟡Medium	RAGFlow update is extremely fast (v0.21 introduces Ingestion Pipeline and Long-Context RAG),MaxKB is ecologically active (20,600 Stars, workflow MCP tool call), and the two protocols are more friendly	Focus on the QAnything OCR/scan differentiation advantages to avoid "comprehensive functions" compared with competitors
NetEase Strategy Shifting to Risk	🟡China	QAnything may be Netease Youdao's exploration project in the AI boom, and its core energy may have shifted to the commercialization of the enterprise version or other product lines	Pay attention to the relationship between the open source version and the enterprise version; Evaluate Netease Youdao's long-term investment willingness
Security Vulnerability Response	🟡The stagnation of project maintenance means that security vulnerabilities may not be fixed in time. Security audit before use; Run in an intranet isolation environment; Monitor the security announcements of dependent components (ES, Nginx, etc.).

12. My Pre-Sales Judgment

Recommendation: Cautiously recommended (very suitable for specific scenarios, but the overall risk is high)

Reason:

irreplaceable differentiation advantage : There is a combination of OCR BCEmbedding. At present, there is no better open source scheme for processing the subdivision scenario of Chinese scanned PDF. If the customer happens to be in this scenario, QAnything is the most preferred.
Deployment Friendly :Docker one-click startup, pure CPU operation, mirror image 4.88GB, extremely friendly to small and medium-sized enterprises without GPU.
Full-featured and stable: Although it is no longer updated, v2.0 is already a mature and fully functional document answering system.
But-AGPL and maintenance stagnation are two thunder : Legal Risk Technology Stagnation = Customers must be fully informed before sales and cannot be avoided.

Recommended Customer Persona:

-Core Appeal Accurately Hits "Chinese Scanned Document/Form PDF → Local Q & A"

-Pure intranet deployment, do not provide external document Q & A service (to avoid AGPL risks)

-The requirement for function update frequency is not high, and the existing functions of v2.0 can already meet the demand

-Limited budget but need OCR advantage (open source version can meet)

-Or have a budget to buy the enterprise version (more powerful business license)

Not recommended situations:

-Customer Legal Affairs explicitly prohibits the use of AGPL protocol → Directly exclude open source version, push enterprise version or change RAGFlow/MaxKB

-Need long-term continuous function iteration and community support → recommend RAGFlow (update the most active)

-Workflow orchestration, multi-Agent, plug-in ecology required → Dify/MaxKB recommended

-Documents are mainly in English or non-scanned documents → OCR advantage is not obvious, RAGFlow or Haystack is recommended

-High concurrency external services (and do not want to pay) → AGPL risk unacceptable

Pre-sales strategy recommendations:

-First judge the type of customer document: there are a large number of scanned documents → QAnything is the first recommendation

-reconfirm AGPL's position: legal OK and intranet use → direct push of open source PoC

-Legal service is not OK or needs external service → Push enterprise version or BCEmbedding self-built scheme (BCEmbedding Apache-2.0!)

13. REFERENCE

-GitHub repository: https://github.com/netease-youdao/QAnything

-official website/online experience: https://qanything.ai

-BCEmbedding (Retrieval Model, Apache-2.0):https://github.com/netease-youdao/BCEmbedding

-Youdao Speed Reading (online trial):https://read.youdao.com

-FAQ (Chinese):https://github.com/netease-youdao/QAnything/blob/qanything-v2/FAQ_zh.md

-Demand feedback: https://qanything.canny.io/feature-requests

-HuggingFace model: https://huggingface.co/maidalun1020

-Enterprise Business Contact: AIcloud_Business@corp.youdao.com -8255-8901

Analysis Date: 2026-07-02 | Data Aging: GitHub Information Pull in Real Time, Official Website Content from qanything. AI, Competition Comparison Based on 2026 Latest Data *