← Back to Project List
QAnything is netease's open source local knowledge base question-and-answer system (AGPL-3.0,14,025 Stars), which was released in January 2024 to locate "throw any file into it and you can accurately ask questions". The core highlight lies in Youdao's self-developed BCEmbedding retrieval model (bilingual cross-language SOTA) and profound OCR/document analysis accumulation-Chinese scanned PDF, form recognition and cross-page layout processing capabilities are unique among domestic open source schemes. V2.0 (2024-08) completed the architecture migration from GPU to pure CPU, Docker one-click deployment, and the image was slimmed from 18.94GB to 4.88GB. However, the code update of the project has basically stalled since March 2025, and 403 Open Issues have not been processed. AGPL-3.0 agreement poses legal risks to commercial deployment. If the customer scenario happens to be "a large number of Chinese scanned documents → structured analysis → local Q & A" and can accept AGPL agreement or purchase enterprise license, QAnything is still one of the best choices.

1. Project/Product Overview

DimensionInformation
Project nameQAnything(Question and Answer based on Anything)
DeveloperNetease Youdao (NetEase Youdao)
Open Source ProtocolAGPL-3.0 (⚠Strong Copyleft, commercial need special attention)
Main LanguagePython
GitHub Stars14,025
Forks1,347
Commits754
Created2024-01-03 (approximately 2.5 years)
Recent Code Push2025-03-24 (⚠There has been no substantial code update for 15 months)
Latest Releasev2.0.0(2024-08-23, about 22 months ago)
Open Issues403 (a large number of unclosed issues, low maintenance activity)
Default Branchqanything-v2
official websitehttps://qanything.ai
Online Experiencehttps://qanything.ai
Enterprise EditionQAnything Enterprise Edition (closed source, commercial license, Turbo/Plus/Long/Max multi-size model available)
Core ComponentsBCEmbedding(Embedding Rerank, Independent Open Source Project Apache-2.0)
CommunityWeChat Community 3,000 People, HuggingFace Downloads 1,025,440
Business Contact010-82558901 / AIcloud_Business@corp.youdao.com

2. What does it mostly do?

The core of the QAnything is a complete document parsing → vector indexing → two-stage retrieval → LLM generation pipeline. Users only need to upload files to obtain accurate answers based on document content.

2.1 Document Analysis Ability (Youdao Core Accumulation)

Parse DimensionCapability Description
Supported formatsPDF, Word(docx), PPT(pptx), XLS(xlsx), Markdown(md), EML (email), TXT, Image (jpg/jpeg/png), CSV, HTML
PDF Table Parsingv2.0 Rewrite Logic to Recognize Cross-Page Table Structure, Row-Column Layout, and Automatic Header Extraction; Tables with embedded text columns can also be correctly recognized instead of being processed as plain text
Column/multi-column layoutsIdentify double-column or multi-column layouts, sort text blocks by reading habits; cross-page text to ensure that they belong to the same chunk
Image ExtractionImages in PDF are completely retained, embedded in the corresponding chunk, and "Answer with Image" is supported
Subtitle attributionThe text under a specific subtitle is assigned to the same chunk first. If it is too long, repeat the title before the new chunk to maintain semantic coherence
Metadata Embeddingmetadata information is embedded in both the retrieval and Q & A phases to improve retrieval accuracy
Chunk VisualizationThe front end can preview and manually edit the content of each chunk online and take effect in real time

2.2 OCR Capability

The technical accumulation of Youdao Translation/OCR is the core barrier that distinguishes QAnything from other domestic RAG schemes. For Chinese scanned PDF (image PDF), the OCR recognition accuracy of the QAnything is significantly better than that of the general OCR engine. In v2.0, OCR can be called separately as a separate service (HTTP).

115.00g Phase Retrieval (Core Architecture Differentiation)

QAnything uses the self-developed BCEmbedding model (Apache-2.0 open source) and consists of two components:

- One-stage Embedding(bce-embedding-base_v1) :MTEB has a comprehensive score of 59.43, significantly better than BGE-large-zh-v1.5(54.21) and M3E-base(53.54), especially in Chinese cross-language scenes.

-Phase II Rerank(bce-reranker-base_v1):Reranking score 60.06, better than BGE-reranker-large(59.69)

-Combination effect SOTA: In the LlamaIndex RAG evaluation, the combination of BCEmbedding BCReranker reaches the current best.

The core value of two-stage retrieval: the larger the amount of data in the knowledge base, the problem of "retrieval degradation" will appear in one-stage Embedding retrieval, and Rerank rearrangement can reverse the trend and realize "the more data, the better the effect".

2.4 Hybrid Search & Other Features

FunctionDescription
hybrid searchBM25 (keyword) Embedding (semantic) two-way fusion
Online SearchSupport External Network Search to Supplement Information Outside the Knowledge Base (VPN Required)
Quick Start ModeSimilar to Kimi, you can upload files without creating a knowledge base.
Fileless ConversationPure LLM Chat Mode Not Dependant on Knowledge Base
Retrieve-only modeOnly the retrieved document fragment is returned, without calling LLM
Custom BotCan bind knowledge base, customize role prompt, configure model parameters, and share with others
FAQBuilt-in FAQ matching engine
File traceabilityThe answer can be traced back to the specific location of the original document. Click to open it directly.
Web UISupports multiple dialog windows, saving Q & A records as pictures, and configuring API Key/Base URL/fragment size on the front end

3. Applicable Scenario

ScenarioDescriptionTypical Customer
Chinese Scanned Documents/Pictures PDF Q & AYoudao OCR has accumulated a deep accumulation, and the recognition rate of Chinese Scanned Documents is the leading in the industryGovernment Files, Legal Contracts, Financial Bills
Enterprise internal knowledge baseEmployee handbooks, product documents, technical specifications, etc. upload and askKnowledge management for medium and large enterprises
offline/classified environment deployment supports the installation and use of the whole network disconnection, and the data is not available in the LANmilitary industry, government affairs, finance
Form-intensive document processingPDF form parsing capability is a differentiating highlightFinancial report analysis, research reports, experimental data
Cross-language Document Q & AHigh Accuracy of Cross-language Semantic Retrieval in Mixed Chinese and English DocumentsForeign Enterprise China Branch, Import and Export Trade
Enterprise Digital EmployeeSales Assistant, Customer Service Robot, Technical Consultant 7 × 24 ServiceEnterprise Customer Service Center, IT Service Desk

Report study/investment research, content summary, key point extraction, document question and answer, investment institutions, consulting companies.

4. Not quite the scene

ScenarioReasonAlternative Suggestions
Requires commercial License securityAGPL-3.0 requires open source derivative codes when providing services to the outside world, lawyers usually do not recommend direct use in productionPurchase enterprise license/exchange Apache-2.0 protocol (such as MaxKB)
Long-term maintenance and ecological importance⚠The project basically stagnated after March 2025, 403 Open Issues, no new Release for nearly 2 yearsRAGFlow (update very active)/ MaxKB(20,000 Stars)
Complex Agent/Workflow OrchestrationQAnything is a straight pipeline of "Document → Question and Answer" without visual workflow orchestrationDify / RAGFlow(v0.21 introduces Ingestion Pipeline)
Non-document knowledge managementNon-document scenarios such as knowledge graph and structured database queryGraphRAG / LlamaIndex
Requires GPU accelerationCompletely migrated from v2.0 to pure CPU, no longer provides native GPU inferenceRAGFlow self-hosted LLM (with GPU support)
Large-scale concurrent production environment (open source version)The open source version cannot perform parallel operations when uploading files, and the number and size of files are limited.Enterprise Edition/RAGFlow/MaxKB

5. Core Competence List

Capability CategoryCapability ItemDetailed Description
Document parsingPDF parsing (including tables)Self-developed parser to identify table structure/cross-page table/column layout/embedded picture
OCR RecognitionYoudao OCR Technology, PDF Recognition of Scanned Documents, Obvious Advantages of Chinese Scenes
Multi-format supportPDF/Word/PPT/XLS/Markdown/EML/TXT/Picture/CSV/HTML
Visual Chunk EditingThe front-end directly previews the contents of a chunk, supports manual editing, and takes effect in real time
Metadata embeddingBoth the retrieval phase and the Q & A phase carry metadata
Retrieval Embedding RetrievalSelf-developed BCEmbedding,MTEB Comprehensive 59.43, Chinese Scene SOTA
Rerank ReorderingSelf-developed BCReranker with a score of 60.06 to solve large-scale retrieval degradation
Mixed retrievalBM25 keyword Embedding semantic two-way fusion
Fragment Fusion SortAggregates chunk fragments of single or double documents
LLMMulti-model accessSupports all models compatible with OpenAI APIs (Ollama, Tongyi Qiwen DashScope, etc.)
Front-end and Back-end ConfigurationAPI Key/Base URL/Fragment Size/Number of Output Tokens/Number of Context Messages can be configured on the front-end
Custom BotConfigure model parameters, role prompt, and binding knowledge base independently for each Bot
Q & AMultiple ConversationsSupport multiple conversation windows and save multiple sets of history records at the same time
File traceabilityThe answer can be traced back to the original document location and opened directly
Retrieval-only modeOnly return results without calling LLM
Fileless ConversationsPure LLM Chat Mode
Internet SearchExtranet Search Supplementary Knowledge
DeploymentOne-click Docker deploymentStart with the 'docker compose up -d' single-line command
CPU-only operationv2.0 completely migrated to CPU,Mac/Linux/Win three-terminal unified
Offline useSupport full network disconnection installation and operation
Mirror slimmingCompressed from 18.94GB to 4.88GB(1/4)
Independent service callsEmbed/Rerank/OCR/PDF parsing can be independently HTTP calls
Enterprise Edition ExtraLarge Model CustomizationTurbo/Plus/Long/Max Available in Various Sizes
Large-scale supportThe number and size of files is 10-100 times that of the open source version
Parallel operationUpload files in parallel with other operations
Field landingFine tuning prompt to reduce illusion, multi-industry landing cases

6. Architecture/deployment/integration approach

Overall Architecture

用户上传文档(PDF/Word/PPT/...)
    │
    ▼
文档解析层(PDF Parser / OCR / 格式转换器)
    │
    ▼
文本分块 + 元数据提取
    │
    ▼
向量化索引(Embedding 服务 + Elasticsearch/Milvus 等)
    │
    ▼
用户提问 → 一阶段 Embedding 检索(粗筛)
    │
    ▼
二阶段 Rerank 重排序(精排)
    │
    ▼
LLM 生成回答(OpenAI 兼容接口)
    │
    ▼
返回答案 + 溯源引用

Deployment Mode

ModeDescription
Docker Compose (recommended)'docker compose up -d' one-click start, support Linux, Mac, and Windows (no WSL required)
Pure Pythonv1.4.2 supports the 'pip install' method, but is not recommended for production use.
Offline deploymentDocker images can be downloaded in advance and imported offline. The whole process can be run offline.

Hardware Requirements

EnvironmentRequirements
CPUv2.0 pure CPU operation, 32GB memory recommended
StorageMirroring 4.88GB of knowledge base data
networkoffline can run, network retrieval requires external network

Model Integration

-LLM: all OpenAI API-compatible models (Ollama, Tongyi Qiwen DashScope, DeepSeek, GLM, etc.)

-Embedding Rerank: default BCEmbedding (self-developed, replaceable)

-Vector storage:Elasticsearch (built-in Chinese word segmentation IK)

API Support

Provides RESTful API, which can perform all operations such as file upload, knowledge base management, and Q & A. In v2.0, Embed, Rerank, OCR, and PDF parsing all support independent HTTP calls.

How to use #7.

Docker one-click deployment (recommended)

# 克隆项目
git clone https://github.com/netease-youdao/QAnything.git
cd QAnything

# 启动服务(根据操作系统选择 compose 文件)
# Linux
docker compose -f docker-compose-linux.yaml up -d

# Mac
docker compose -f docker-compose-mac.yaml up -d

# Windows
docker compose -f docker-compose-win.yaml up -d

# 访问 Web UI
# 浏览器打开 http://localhost:5052

Configure LLM

To set the page configuration in the Web UI frontend:

-'API_BASE': the API address of the LLM service (e. g. 'https://api.openai.com/v1' or Ollama's' http:// localhost:11434/v1')

-'API_KEY: API Key

-'MODEL': model name (e. g. gpt-4o, qwen-plus, deepseek-chat)

API call example

import requests

# 上传文件到知识库
url = "http://localhost:5052/api/local_doc_qa/upload_files"
files = {"files": open("contract.pdf", "rb")}
data = {"kb_id": "KB123456", "user_id": "user001"}
resp = requests.post(url, files=files, data=data)

# 问答
url = "http://localhost:5052/api/local_doc_qa/local_doc_chat"
payload = {
    "question": "这份合同的关键条款是什么?",
    "kb_ids": ["KB123456"],
    "user_id": "user001"
}
resp = requests.post(url, json=payload)
print(resp.json()["response"])

Use BCEmbedding independent components

# BCEmbedding 是 Apache-2.0 协议,可单独使用
from BCEmbedding import EmbeddingModel, RerankerModel

# Embedding
embed_model = EmbeddingModel(model_name_or_path="maidalun1020/bce-embedding-base_v1")
embeddings = embed_model.encode(["什么是RAG?", "RAG是检索增强生成"])

# Rerank
reranker = RerankerModel(model_name_or_path="maidalun1020/bce-reranker-base_v1")
scores = reranker.compute_score(["什么是RAG?"], ["RAG是检索增强生成技术"])

8. What can I say before sales

8.1 a sentence positioning

  • * "QAnything is a document intelligent question and answer system produced by Netease Youdao-throw in the scanned Chinese document and give accurate answers in seconds. "**

8.2 customer pain points → solutions

Customer pain pointsQAnything solutions
"We have a large number of scanned PDF,OCR is not allowed, and the question-and-answer effect is poor"Youdao OCR technology has accumulated deeply, and the recognition rate of Chinese scanned documents is the industry leader, which is the core differentiation advantage
"Company data cannot go to the cloud and must be deployed on the intranet"The whole network is disconnected for installation and use, Docker is deployed with one click, and the data cannot go out of the LAN
"The larger the knowledge base, the more inaccurate the retrieval"Two-stage retrieval (Embedding Rerank), the more data, the better the effect, BCEmbedding evaluation SOTA
"The table in PDF cannot be recognized"v2.0 rewrites the table parsing logic, and the cross-page table, embedded table and header recognition have been optimized
"Can the open source solution agreement be commercially available?"Enterprise Edition provides commercial license, Turbo/Plus/Long/Max models are available
"Can I run without GPU?"v2.0 is completely migrated to pure CPU operation, Mac/Linux/Win all
"The deployment is too complicated, the team does not have ML engineers"'docker compose up -d' one-line command to start, out-of-the-box

8.3 Differentiated Selling Points

vs RAGFlow(InfiniFlow open source):

-QAnything advantages: Youdao OCR accumulation → better PDF analysis of Chinese scanned documents/forms; BCEmbedding bilingual and cross-language SOTA; Pure CPU deployment is lighter

-RAGFlow advantages: more comprehensive DeepDoc parsing engine (support for layout recognition YOLOv8); V0.21 introduces Ingestion Pipeline that can be arranged; The update is very active (2025-2026 continuous high frequency iteration);Apache-2.0 protocol is more friendly

-Conclusion: If the customer's core requirement is "a large number of complex documents in various formats can be Pipeline by deep analysis", select RAGFlow; If it is "simple deployment of Chinese scanned documents/forms PDF Q & A", the QAnything is more accurate.

vs MaxKB(1Panel open source):

-QAnything advantages: stronger retrieval model (BCEmbedding SOTA vs MaxKB basic retrieval), deeper document analysis (especially OCR/table), commercial support in enterprise version

-MaxKB Advantages: Stars More (20,600), Protocol Friendly (GPL-3.0 with MaxKB EULA), Workflow Orchestration and MCP Tool Call, Much Higher Community Activity, Deep Integration with 1Panel Operation and Maintenance Ecology

-Conclusion: MaxKB is more suitable as a general platform Agent base; QAnything focus more on the "document → question and answer" path to the extreme.

vs Dify:

-QAnything is "Documentation Q & A Special Tool";Dify is "AI Application Development Platform"

-Dify has visual workflow orchestration, rich tools and plug-in ecology, but the document analysis depth is not as deep as QAnything.

-QAnything is suitable for customers who do not need complicated choreography and only need document questions and answers; Dify is suitable for enterprises that need custom AI applications.

  • Core Differentiation Summary: Youdao OCR BCEmbedding Pure CPU Deployment Scan Friendly *

8.4 Customer Value Story Line

  1. Cut in *:"Your company has a lot of scanned contracts/files/reports, want to use AI Q & A but OCR effect is not good?"
  2. Resonance :"The universal OCR engine has a low recognition rate for Chinese scanned documents, the contents of the form are broken after page spanning, and the column layout is misread-these are common pits."
  3. Demo: Upload a Chinese scanned PDF contract (including forms), and ask "What is the liability clause for breach of contract?" -- QAnything accurately identify the scanned text, parse the forms, locate the answers, and trace the original text.
  4. Advanced : Build Enterprise Knowledge Base from Single Document → Batch Upload → Custom Bot (Sales Assistant/Legal Assistant/Technical Consultant) → API Integration into OA System
  5. Ends :"Netease has 20 years of OCR technology accumulation, 14,000 GitHub Stars,3000 WeChat community users-there is no stronger open source scheme on this subdivision track."

9. Frequently Asked Customer Questions

QuestionAnswer
Can AGPL-3.0 agreements be commercially available?This is probably the most critical question. AGPL-3.0 requirements: If your system provides external network services (such as SaaS), you must provide users with the complete source code of derivative works. For internal use only and no external service, open source is not required. If the customer needs to "provide external document question and answer service" and modify the QAnything source code, it must be open source. Suggestion: The use of pure intranet is not a big problem. If you do not want to undertake open source obligations or external services, purchase the enterprise version of the commercial license.
Is the project not maintained?Objectively speaking, the code warehouse of the project has been basically stagnant since March 2025 (no new submissions have been made in 15 months). The latest Release v2.0.0 is nearly 2 years ago, and 403 Open Issues have not been processed. However, the v2.0 version itself has complete and stable functions, which is sufficient for the relatively mature requirement of "document question and answer. If customers need new features for continuous iteration, they need to be evaluated carefully.
Which is better than RAGFlow?Look at the scene. QAnything OCR/scan processing is a strong point, and pure CPU deployment is lighter. The RAGFlow DeepDoc analysis engine is more comprehensive and updated more actively (new functions such as Ingestion Pipeline and Long-Context RAG continue to be added). If you value the depth of document analysis and retrieval accuracy, the RAGFlow ecology is currently stronger. If the scene happens to be a "Chinese scan question and answer", the QAnything is more right.
Which major models are supported?All OpenAI API-compatible models can be accessed: OpenAI GPT series, Tongyi Qiwen (DashScope), DeepSeek, GLM, Ollama deployed local models, etc. The front-end directly configures API Key and Base URL without changing the code.
How many files can you handle?The open source version has a limit (the official limit is not clear, but the enterprise version claims to be 10-100 times that of the open source version). In actual use, the knowledge base experience of hundreds of documents is good; more than thousands of recommended enterprise edition or evaluation performance.
Does GPU acceleration be supported?Version 2.0 has been completely migrated to pure CPU and no longer supports GPU acceleration. This is a deliberate architectural choice-lowering the threshold for deployment, but at the expense of processing large-scale documents faster than GPU solutions.
Can I export or back up the knowledge base?The knowledge base data is stored in the Elasticsearch and can be backed up through the snapshot API of ES. Q & A records and bot configurations are exportable through the API.
Is there a mobile terminal?No official mobile App. The web UI is responsive and can be used in mobile browsers. The mobile terminal needs to develop its own docking API.

10. PoC Recommendations

Recommended PoC Direction: Chinese Scanned PDF Q & A

This is the core differentiation scenario of QAnything, and it is recommended that PoC focus on this to maximize its irreplaceability.

PhaseContentTimeOutput
1. Environment preparationOne-click deployment of Docker and configuration of LLM API (such as Tongyi Thousand Questions or DeepSeek)0.5 daysQAnything instances that can be run
2. Data PreparationCollect 30-50 real documents from customers (it is recommended to include scanned PDF, PDF with forms, and double-column typesetting documents)0.5 daysTest Document Set
3. Document storageBatch upload documents, observe OCR analysis effect, and check chunk quality1 dayIndexed knowledge base
4. Q & A verificationDesign 20-30 test questions (covering: scanned text recognition, table data Q & A, cross-page content understanding, original source tracing), score one by one1 dayAccuracy report
5. Competition ComparisonRun the same document set with RAGFlow/MaxKB to compare OCR recognition rate, table analysis and answer accuracy1 dayComparative analysis report
6. Integration DemonstrationInterface the customer's existing system (OA/customer service) through API to demonstrate the actual business process1 dayDemonstrable integration scheme

Validation Metrics:

-OCR text recognition accuracy> 95% (Chinese scan)

-Complete extraction rate of table data> 90%

-End-to-end answer accuracy> 85% (based on document factual validation)

-Average response time <5 seconds (CPU-only environment)

-Answer traceability is accurate (the location of the cited document matches the answer)

PoC Note:

-Be sure to use the customer's own real documents, not clean typeset test documents.

-If the customer has concerns about the AGPL protocol, the PoC phase should be clear: the open source version is only used for technical verification, and the commercial use requires the enterprise version authorization.

-Inform customers in advance of the expected indexing speed of large-scale documents in CPU-only mode

11. Risks and Considerations

RiskLevelDescriptionMitigation
AGPL-3.0 Agreement🔴The highstrong Copyleft protocol requires open source derivative code when providing network services to the outside world. Most corporate legal departments would object to direct use of AGPL open source in production systems.Pure intranet use security; Purchase enterprise commercial license for external services; Or use Apache-2.0 BCEmbedding to build RAG system
Project Maintenance Stagnation🔴HighLast code push 2025-03-24(15 months ago), latest Release 2024-08-23 (nearly 2 years),403 Open Issues not processed. The project has essentially gone into maintenance hibernation.Available if the current v2.0 features meet the requirements; if you want to continue to add new features, recommended RAGFlow/MaxKB
Enterprise Edition Closed Source Dependency🟡The mediumopen source version has limited functions and performance-the document parsing effect is general, the number of files is limited, parallel operation is not supported, and the production environment is not supported. True production-level capability in Enterprise Edition.PoC phase clarifies the difference between open source version and enterprise version; the enterprise version license fee is reserved in the budget
Pure CPU Performance🟡Mediumv2.0 gives up GPU acceleration, and performance may become a bottleneck when processing a large number of documents or high concurrency.Evaluate the actual document level; If there is a high concurrency requirement, consider enterprise version or GPU scheme
Competitive Ecological Suppression🟡MediumRAGFlow update is extremely fast (v0.21 introduces Ingestion Pipeline and Long-Context RAG),MaxKB is ecologically active (20,600 Stars, workflow MCP tool call), and the two protocols are more friendlyFocus on the QAnything OCR/scan differentiation advantages to avoid "comprehensive functions" compared with competitors
NetEase Strategy Shifting to Risk🟡ChinaQAnything may be Netease Youdao's exploration project in the AI boom, and its core energy may have shifted to the commercialization of the enterprise version or other product linesPay attention to the relationship between the open source version and the enterprise version; Evaluate Netease Youdao's long-term investment willingness
Security Vulnerability Response🟡The stagnation of project maintenance means that security vulnerabilities may not be fixed in time. Security audit before use; Run in an intranet isolation environment; Monitor the security announcements of dependent components (ES, Nginx, etc.).

12. My Pre-Sales Judgment

Recommendation: Cautiously recommended (very suitable for specific scenarios, but the overall risk is high)

Reason:

  1. irreplaceable differentiation advantage : There is a combination of OCR BCEmbedding. At present, there is no better open source scheme for processing the subdivision scenario of Chinese scanned PDF. If the customer happens to be in this scenario, QAnything is the most preferred.
  2. Deployment Friendly :Docker one-click startup, pure CPU operation, mirror image 4.88GB, extremely friendly to small and medium-sized enterprises without GPU.
  3. Full-featured and stable: Although it is no longer updated, v2.0 is already a mature and fully functional document answering system.
  4. But-AGPL and maintenance stagnation are two thunder : Legal Risk Technology Stagnation = Customers must be fully informed before sales and cannot be avoided.

Recommended Customer Persona:

-Core Appeal Accurately Hits "Chinese Scanned Document/Form PDF → Local Q & A"

-Pure intranet deployment, do not provide external document Q & A service (to avoid AGPL risks)

-The requirement for function update frequency is not high, and the existing functions of v2.0 can already meet the demand

-Limited budget but need OCR advantage (open source version can meet)

-Or have a budget to buy the enterprise version (more powerful business license)

Not recommended situations:

-Customer Legal Affairs explicitly prohibits the use of AGPL protocol → Directly exclude open source version, push enterprise version or change RAGFlow/MaxKB

-Need long-term continuous function iteration and community support → recommend RAGFlow (update the most active)

-Workflow orchestration, multi-Agent, plug-in ecology required → Dify/MaxKB recommended

-Documents are mainly in English or non-scanned documents → OCR advantage is not obvious, RAGFlow or Haystack is recommended

-High concurrency external services (and do not want to pay) → AGPL risk unacceptable

Pre-sales strategy recommendations:

-First judge the type of customer document: there are a large number of scanned documents → QAnything is the first recommendation

-reconfirm AGPL's position: legal OK and intranet use → direct push of open source PoC

-Legal service is not OK or needs external service → Push enterprise version or BCEmbedding self-built scheme (BCEmbedding Apache-2.0!)

13. REFERENCE

-GitHub repository: https://github.com/netease-youdao/QAnything

-official website/online experience: https://qanything.ai

-BCEmbedding (Retrieval Model, Apache-2.0):https://github.com/netease-youdao/BCEmbedding

-Youdao Speed Reading (online trial):https://read.youdao.com

-FAQ (Chinese):https://github.com/netease-youdao/QAnything/blob/qanything-v2/FAQ_zh.md

-Demand feedback: https://qanything.canny.io/feature-requests

-HuggingFace model: https://huggingface.co/maidalun1020

-Enterprise Business Contact: AIcloud_Business@corp.youdao.com -8255-8901

  • Analysis Date: 2026-07-02 | Data Aging: GitHub Information Pull in Real Time, Official Website Content from qanything. AI, Competition Comparison Based on 2026 Latest Data *