1. One-Sentence Positioning
html-video is an open-source, Agent-native workflow that lets an AI Agent convert HTML/CSS, data, articles, and repository content into MP4 video.
Its core idea is not to build yet another standalone video editor, but to put an Agent meta-layer on top of existing HTML-to-video / code-to-video engines:
- The user describes the desired video, or pastes an article URL or GitHub repo.
- Studio fetches the source content and converts it to Markdown.
- The Agent reads the content and generates a content-graph storyboard plus per-frame HTML.
- The Hyperframes engine records each frame's HTML animation as a video clip.
- ffmpeg encodes, concatenates, and mixes the clips, then exports an MP4.
Pre-sales pitch:
html-video connects "writing web pages" with "making videos". For enterprises, it can turn articles, reports, repository introductions, product selling points, and data charts into short videos. Rendering happens locally, so there is no per-render cloud fee and no lock-in to a particular video-generation SaaS.
2. What Does It Mainly Do?
2.1 Prompt / Article / GitHub Repository to Video
html-video supports three input sources:
| Input | Official Description | Pre-Sales Value |
|---|---|---|
| Prompt | The user describes a theme and the Agent generates content from scratch | Quickly produce creative shorts, product concept videos, and social media videos |
| Web article | Fetches a web article and flattens it into Markdown. The README specifically mentions support for WeChat Official Account articles | Suited to turning Official Account posts, blogs, news, and white papers into explainer videos |
| GitHub repo | Pulls the repository description, top-level structure, and README | Suited to turning open-source projects and technical products into explainer videos |
This is very practical for pre-sales. The existing "open-source project pre-sales analysis notes" can be taken a step further with html-video:
- Turn an open-source project analysis into a 60-second explainer video.
- Turn technical solutions into customer demos.
- Turn corporate reports and Official Account articles into short-video material.
- Turn a product changelog into a social media release video.
2.2 Multi-Frame Storyboards and the Content-Graph
html-video does not just record a single HTML page as a video. It supports multi-frame storyboards:
- The Agent reads the source material.
- It extracts the key points.
- It builds a content-graph: nodes are entities, data, and text; edges represent relationships such as order, dependency, and comparison.
- The content-graph is topologically sorted to determine frame order and timing.
- Each frame becomes a self-contained animated HTML frame.
This means a long article is not simply stuffed into a template; it can be broken down into a well-paced, multi-scene explainer.
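To make the "topologically sorted into frame order and time" step concrete, here is a minimal sketch using Kahn's algorithm. The `Node` fields, the `seconds` duration, and the function names are assumptions for illustration; the actual types in 'packages/content-graph' are not described in this note and may differ.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    """One storyboard unit: an entity, a data point, or a piece of text."""
    id: str
    text: str
    seconds: float  # hypothetical per-frame duration


def frame_order(nodes: list[Node], edges: list[tuple[str, str]]) -> list[Node]:
    """Kahn's algorithm: order nodes so every (before, after) edge is respected."""
    by_id = {n.id: n for n in nodes}
    indegree = {n.id: 0 for n in nodes}
    successors: dict[str, list[str]] = {n.id: [] for n in nodes}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    # Start from nodes with no prerequisites, in their original listing order.
    ready = deque(n.id for n in nodes if indegree[n.id] == 0)
    ordered: list[Node] = []
    while ready:
        current = ready.popleft()
        ordered.append(by_id[current])
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(ordered) != len(nodes):
        raise ValueError("content graph has a cycle; no valid frame order exists")
    return ordered


def timeline(ordered: list[Node]) -> list[tuple[str, float, float]]:
    """Lay frames end to end and return (id, start, end) for each one."""
    result, clock = [], 0.0
    for node in ordered:
        result.append((node.id, clock, clock + node.seconds))
        clock += node.seconds
    return result
```

An "order" or "dependency" edge maps naturally onto a `(before, after)` pair; how "comparison" edges influence ordering is left open here.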
2.3 Local, Real MP4 Rendering
The rendering path that currently runs is:
HTML + CSS + GSAP
-> Hyperframes
-> Headless Chromium frame-by-frame / segment-by-segment recording
-> webm per frame
-> ffmpeg libx264 encoding
-> concat into MP4
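The last two steps of this path can be sketched as ffmpeg command lines built in Python. This is only an illustration of the general technique (libx264 re-encode, then the concat demuxer); the flags that 'packages/adapter-hyperframes' actually passes are not documented in this note, and the function names and the default `fps` are assumptions.

```python
from pathlib import Path


def encode_cmd(webm: Path, mp4: Path, fps: int = 30) -> list[str]:
    """Re-encode one recorded WebM clip to H.264 so all clips share a codec."""
    return [
        "ffmpeg", "-y",
        "-i", str(webm),
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",  # widely supported pixel format for players
        "-r", str(fps),
        str(mp4),
    ]


def concat_list(clips: list[Path]) -> str:
    """Body of the text file read by ffmpeg's concat demuxer."""
    return "".join(f"file '{clip.as_posix()}'\n" for clip in clips)


def concat_cmd(list_file: Path, output: Path) -> list[str]:
    """Join already-encoded clips without re-encoding them."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",
        str(output),
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Stream copy in the concat step only works when every clip was encoded with the same parameters, which is why the per-frame encode comes first.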
This has two points of pre-sales value:
- Local rendering: unlike cloud video-generation SaaS, there is no per-render charge, and less source material needs to be transferred off the machine.
- Real MP4: the end product is a distributable video file, not just a web preview.
2.4 Two Ways to Use It: Studio and CLI
html-video provides:
- Local browser Studio: template library, Agent chat, frame-by-frame copy editing, soundtrack, and export.
- CLI: suited to automation, scripting, and checking agents and templates.
Common commands:
pnpm install
pnpm -r build
node packages/cli/dist/bin.js studio
node packages/cli/dist/bin.js doctor
node packages/cli/dist/bin.js search-templates --intent "github stars race" --top 3
Studio default address:
http://127.0.0.1:3071
2.5 21 License-Clean Templates
The README emphasizes that templates are not ad hoc snippets. Each one is described by a 'template.html-video.yaml' manifest that includes:
- category / tags / best_for.
- Supported resolutions, aspect ratios, fps, and duration range.
- Input JSON schema.
- SPDX license.
- attribution_required / redistribution_allowed / commercial_use.
- assets_attribution.
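Because these fields are machine-readable, a commercial review can start with a simple filter. The sketch below models a subset of the manifest fields listed above as a dataclass; the `name` field, the flat structure, and the sample values in the usage note are assumptions, since the exact YAML layout is not reproduced in this note.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateManifest:
    """Subset of the fields listed for template.html-video.yaml."""
    name: str                 # hypothetical identifier, not confirmed by the README
    category: str
    license: str              # SPDX identifier, e.g. "Apache-2.0" or "MIT"
    attribution_required: bool
    redistribution_allowed: bool
    commercial_use: bool
    tags: list[str] = field(default_factory=list)


def commercially_usable(
    templates: list[TemplateManifest], allowed_spdx: set[str]
) -> list[TemplateManifest]:
    """Keep templates whose flags and license pass a company's allow-list."""
    return [
        t for t in templates
        if t.commercial_use and t.redistribution_allowed and t.license in allowed_spdx
    ]


def attribution_notes(templates: list[TemplateManifest]) -> list[str]:
    """Credit lines for templates that require attribution."""
    return [f"{t.name} ({t.license})" for t in templates if t.attribution_required]
```

For example, calling `commercially_usable(all_templates, {"Apache-2.0", "MIT"})` would return only the templates a legal team has already cleared by license family; the final decision still belongs to the enterprise review process.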
Template types include:
- Data visualization: NYT-style chart, Swiss/Vignelli data cards.
- Titles and VFX: glitch title, kinetic type, typewriter cursor.
- Hero visuals and cinematic feel: liquid background, light leak, warm grain.
- Product promotion: 15s / 30s multi-scene product promo.
- Explainer frameworks: decision tree explainer.
- Logo outro.
The template NOTICE shows:
- Multiple templates come from heygen-com/hyperframes (Apache-2.0).
- One 30s product promo template comes from nateherkai/hyperframes-student-kit (MIT), with brand assets replaced.
- There are also several original html-video templates under Apache-2.0.
Pre-sales value:
For enterprise content production, clear template licensing matters a great deal. Because html-video writes each template's source, SPDX identifier, and commercial-use flag into its metadata, the cost of later commercial review goes down.
2.6 AI Soundtrack and Narration
The README mentions MiniMax support:
- Background music: describe the mood in words and generate an instrumental track.
- Narration: provide a script and MiniMax TTS reads it aloud.
- On export, ffmpeg mixes the audio into the MP4; the music can duck under the voice, and fade-in and fade-out are supported.
Pre-sales note: this part requires a MiniMax API key and is optional. Without a key, Studio's other functions are unaffected.
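For readers unfamiliar with "ducking", the sketch below shows one common way to do it with ffmpeg: fade the music with `afade`, compress it against the narration with `sidechaincompress`, and combine both with `amix`. This is not taken from the html-video source; the filter chain, threshold, ratio, input order (video, music, voice), and function names are all assumptions.

```python
def duck_filter(video_seconds: float, fade_seconds: float = 2.0) -> str:
    """Filter graph: fade the music in/out, then lower it while the voice plays."""
    fade_out_start = max(video_seconds - fade_seconds, 0.0)
    return ";".join([
        # The narration is needed twice: as the sidechain key and in the final mix.
        "[2:a]asplit=2[voice][key]",
        f"[1:a]afade=t=in:st=0:d={fade_seconds:g},"
        f"afade=t=out:st={fade_out_start:g}:d={fade_seconds:g}[music]",
        "[music][key]sidechaincompress=threshold=0.05:ratio=8[ducked]",
        "[ducked][voice]amix=inputs=2:duration=longest[aout]",
    ])


def mix_cmd(video: str, music: str, voice: str, output: str,
            video_seconds: float) -> list[str]:
    """ffmpeg command that muxes the ducked mix onto an existing video stream."""
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", music, "-i", voice,
        "-filter_complex", duck_filter(video_seconds),
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy", "-c:a", "aac",
        "-shortest",  # stop at the shortest stream so audio does not outlast the video
        output,
    ]
```

The video stream is copied rather than re-encoded, so adding or changing the soundtrack stays cheap compared with re-rendering frames.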
2.7 Support for 14 Agent Backends
The Agents listed in the README include:
- Open Design (Vela).
- Windsurf CLI.
- Trae CLI.
- Claude Code.
- Cursor Agent.
- Codex CLI.
- Gemini CLI.
- Grok Build.
- Qwen Code.
- OpenCode.
- GitHub Copilot CLI.
- Aider.
- Hermes.
- Anthropic Messages API.
html-video automatically detects Agents installed on 'PATH', and you can switch between them in the Studio top bar. If no CLI is installed on the local machine, you can also bring your own key (BYOK) for the Anthropic API.
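Detection on 'PATH' with an API-key fallback can be sketched with Python's `shutil.which`. The executable names below are hypothetical placeholders; the binaries html-video actually probes for, and its real selection logic in 'packages/runtime', are not listed in this note.

```python
import shutil
from typing import Callable, Optional

# Hypothetical executable names, for illustration only.
CANDIDATE_AGENTS = {
    "Claude Code": "claude",
    "Codex CLI": "codex",
    "Gemini CLI": "gemini",
    "Aider": "aider",
}


def detect_agents(
    candidates: dict[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict[str, str]:
    """Map each agent label to the resolved path of its executable, if found."""
    found = {}
    for label, executable in candidates.items():
        path = which(executable)
        if path:
            found[label] = path
    return found


def pick_backend(found: dict[str, str], api_key: Optional[str]) -> str:
    """Prefer a locally installed CLI; fall back to BYOK via the Anthropic API."""
    if found:
        return next(iter(found))  # first detected agent, in candidate order
    if api_key:
        return "Anthropic Messages API"
    raise RuntimeError("no agent CLI on PATH and no API key provided")
```

Passing `which` as a parameter keeps the lookup testable without touching the real environment.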
3. Distinguishing Currently Available Capabilities from the Roadmap
This must be made clear in pre-sales.
| Capability | Status | Notes |
|---|---|---|
| Engine adapter spec | Completed | One interface that can connect multiple backends in the future |
| Template metadata | Completed | License-first, agent-readable |
| Content-graph multi-frame storyboard | Completed | Intermediate representation from article/repo to multi-scene video |
| Studio | Completed | Template library, Agent switching, frame-by-frame editing |
| Source fetch | Completed | Articles / GitHub repo |
| AI soundtrack | Completed | MiniMax music and narration |
| Real MP4 render | Completed | Hyperframes + Chromium + ffmpeg |
| Open Design Vela backend | Completed | Model selection and catalog |
| Remotion adapter | Planned | README explicitly marks it as not complete |
| Motion Canvas / Revideo adapter | Planned | README explicitly marks it as not complete |
| Manim | In research | Math/3D direction, not yet adapted |
| Agent skill packages | Planned | Not completed |
| Template marketplace | Planned | Not completed |
Recommended wording:
It can currently be used as a Hyperframes-driven local HTML-to-MP4 video generation Studio. The pluggable engine architecture has been designed, but multi-engine support must not be presented as a current delivery commitment.
4. Applicable Scenarios
4.1 Technical Content to Short Video
Suitable for:
- GitHub open-source project explainer videos.
- Tech blog explainers.
- Product update announcements.
- API/SDK usage tutorials.
- Short highlight videos for technical solutions.
The existing open-source project notes in Obsidian can feed a content production chain:
GitHub project -> pre-sales analysis notes -> html-video -> technical explainer short video -> customer communication / social media release
4.2 Content Automation for Enterprise Marketing Departments
Suitable for:
- Product launch short videos.
- Animated charts for data reports.
- Brand intro/outro sequences.
- Vertical short videos for social media.
- Event teaser videos.
- Customer case study videos.
Advantages:
- Templates keep the style consistent.
- Local batch generation.
- HTML/CSS/templates can be modified.
- No per-render charge.
4.3 Data Reports and Visualization Videos
The templates include an NYT-style data chart, Swiss/Vignelli data cards, and more, suitable for:
- Weekly/monthly highlights.
- Sales data changes.
- Industry trend analysis.
- Investment and financing / market size charts.
Pre-sales pitch:
Many customer reports stay buried in PPT and PDF files. html-video can turn the key data into animated short videos for internal reporting, sales enablement, or social media distribution.
4.4 Internal Knowledge Dissemination
Suitable for:
- Turning company policy announcements into videos.
- Turning training materials into short videos.
- Turning project weekly reports into animated summaries.
- Turning R&D changelogs into business-oriented explainer videos.
The advantage is not "cinema-grade video", but low-cost, high-frequency, templated video production.
4.5 Agent Video Production Workbench
For customers building Agent platforms, html-video can serve as a "content production Agent tool":
- The Agent reads the material.
- The Agent selects a template.
- The Agent generates the frame content.
- The user makes adjustments in Studio.
- The video is rendered and exported locally.
This produces a tangible deliverable more readily than a text-only Agent does.
5. Unsuitable Scenarios
| Unsuitable scenario | Reason | Suggestion |
|---|---|---|
| Cinema-grade live-action video generation | html-video is mainly HTML/CSS/template-driven, not a diffusion video model | Use video models such as Runway, Kling, Pika, or Sora |
| Professional editing with complex timelines | Not a Premiere/DaVinci substitute | Generate the footage, then hand it to a professional editing tool |
| Highly realistic 3D / mathematical animation | Manim/3D is still a research direction | Use Manim, Blender, or a specialized engine directly |
| Zero-configuration use by non-technical users | Requires Node, pnpm, ffmpeg, Chromium, and an Agent or API key | Needs product packaging or integration services |
| Hard requirement for Remotion / Motion Canvas | Adapters are still on the roadmap | Do not commit to support |
6. Architecture Overview
6.1 Official Pipeline, Summarized
Fetch (URL/repo -> Markdown)
-> Agent Loop (reads source material + template style)
-> Content Graph (nodes + edges + timing)
-> Per-frame HTML (self-contained animated frames)
-> Hyperframes Render (Headless Chromium -> WebM)
-> ffmpeg (libx264 concat + audio mux)
-> Final MP4
Side inputs:
- 21 License-clean Templates -> Agent Loop and Per-frame HTML
- MiniMax Music / TTS -> ffmpeg
6.2 Monorepo Modules
| Module | Role |
|---|---|
| 'packages/core' | Project / Asset / ContentGraph types, registry, orchestrator, MiniMax provider, ffmpeg audio mux |
| 'packages/content-graph' | Multi-frame storyboard IR: nodes, edges, topological sort |
| 'packages/runtime' | Agent runtime: detection, startup, and streaming communication |
| 'packages/adapter-hyperframes' | Hyperframes engine adapter, Chromium + ffmpeg rendering |
| 'packages/cli' | html-video command, Studio HTTP server, source fetching |
| 'packages/project-studio' | Browser Studio UI |
| 'templates' | 21 license-clean video templates |
| 'research' | RFCs: engine adapter, template metadata, agent skill, content-graph |
7. How to Use
7.1 Prerequisites
| Dependency | Minimum version / notes |
|---|---|
| Node.js | 20 |
| pnpm | 9 |
| ffmpeg | Any recent version |
| Chromium / Playwright browser | Needed for Headless Chromium rendering |
Install Chromium:
npx playwright install chromium
7.2 Start Studio
pnpm install
pnpm -r build
node packages/cli/dist/bin.js studio
Open:
http://127.0.0.1:3071
7.3 CLI Tools
node packages/cli/dist/bin.js doctor
node packages/cli/dist/bin.js search-templates --intent "github stars race" --top 3
7.4 Recommended Demo Flow
- Start Studio.
- Enter a GitHub repository link.
- Let the Agent generate a 5-8 frame explainer video.
- Select a data visualization or explainer template.
- Modify the title and copy frame by frame.
- Optionally add MiniMax narration.
- Export MP4.
8. Pre-Sales Talking Points
8.1 For Marketing/Content Teams
html-video can automatically turn articles, reports, product selling points, and open-source projects into short videos. It is not a substitute for professional editing; it addresses high-frequency, templated, data-driven video content, letting the team produce more shareable material at lower cost.
8.2 For Technical Teams
It uses HTML/CSS/GSAP to describe each frame and renders MP4 locally with Headless Chromium and ffmpeg. Each template carries a JSON schema and license metadata, so the Agent can understand the template's inputs and generate frame-by-frame HTML. The whole stack is extensible: templates can be modified and local Agents can be plugged in.
8.3 For Management
This is an open-source, locally run video production tool with no per-video rendering cost. It suits a PoC for enterprise content automation: first fix the templates and content sources, reliably convert one type of document or article into short videos, and only then consider product integration.
9. Frequently Asked Customer Questions
| Question | Answer |
|---|---|
| Is it an AI video generation model? | No, it is not a diffusion video model. It is a local rendering workflow in which an Agent drives HTML/CSS animation, which makes it better suited to informational, data-driven, and templated videos. |
| Which rendering engines are supported today? | Hyperframes is currently fully available. Remotion, Motion Canvas, Revideo, and Manim are on the roadmap or under research. |
| Can the templates be used commercially? | The README and NOTICE emphasize that templates are license-clean. Many come from Apache-2.0 / MIT sources, and the metadata records SPDX identifiers and commercial-use flags. For actual commercial use, a review under the enterprise's own process is still recommended. |
| Is a network connection required? | Rendering is local and needs no cloud rendering. However, fetching articles/repos, calling Agent APIs, and MiniMax soundtrack/narration all generate network requests. |
| Can it make Chinese-language videos? | The Chinese README's examples include explainers of WeChat Official Account articles. Copy generation depends on the selected Agent/model. The templates themselves are HTML and can display Chinese text. |
| Can it generate videos in batches? | It has a CLI and a local architecture, so it is suited to automation in principle. Production-grade batching, however, still needs task queues, failure retries, template management, and material review to be added. |
| Can we use our own brand templates? | Yes. A template is self-contained HTML plus a YAML manifest, which suits a corporate brand template library. |
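The "failure retries" gap mentioned for production-grade batching can be sketched as a small wrapper with exponential backoff. Everything here is an assumption for illustration: html-video does not ship such a function according to this note, and each `job` stands in for whatever callable a team writes to render one video and return its output path.

```python
import time
from typing import Callable


def run_with_retry(job: Callable[[], str], attempts: int = 3,
                   base_delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Run one render job, retrying with exponential backoff on failure."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the caller record the failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


def run_batch(jobs: dict[str, Callable[[], str]], attempts: int = 3,
              sleep: Callable[[float], None] = time.sleep
              ) -> tuple[dict[str, str], dict[str, str]]:
    """Render every job; return (outputs by name, error messages by name)."""
    done: dict[str, str] = {}
    failed: dict[str, str] = {}
    for name, job in jobs.items():
        try:
            done[name] = run_with_retry(job, attempts=attempts, sleep=sleep)
        except Exception as exc:
            failed[name] = str(exc)
    return done, failed
```

One failed video does not stop the batch, and the `failed` map gives a reviewer a list to re-queue or inspect manually.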
10. PoC Recommendations
10.1 PoC Topic Selection
It is recommended to pick one well-defined content type:
- A corporate Official Account article turned into a 60-second explainer video.
- A GitHub project turned into a 90-second technical introduction video.
- A monthly data report turned into a 30-second animated chart video.
- A product feature update turned into a social media short video.
10.2 Acceptance Criteria
| Indicator | Description |
|---|---|
| Content accuracy | Whether the video copy is faithful to the original article / repository README |
| Generation time | Time from entering the link to exporting the MP4 |
| Manual editing effort | How much text and layout must be changed frame by frame |
| Visual consistency | Whether the output complies with corporate branding and template style |
| Rendering stability | Whether Chromium/ffmpeg export reliably |
| Output quality | Whether resolution, fps, subtitles, audio tracks, and transitions are usable |
| Template reusability | Whether the same template can cover different content |
| Compliance | Whether the licenses for articles, images, music, fonts, and templates are clear |
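Part of the output-quality check (resolution and fps) can be automated with ffprobe, which ships with ffmpeg. The sketch below builds the probe command and parses its JSON output; the function names and the tolerance are assumptions, and subtitles, audio, and transitions would still need their own checks.

```python
import json
from fractions import Fraction


def probe_cmd(path: str) -> list[str]:
    """ffprobe command that prints the first video stream's size and frame rate as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=width,height,r_frame_rate",
        "-of", "json",
        path,
    ]


def parse_probe(raw: str) -> dict:
    """Extract width, height, and fps from ffprobe's JSON output."""
    stream = json.loads(raw)["streams"][0]
    return {
        "width": stream["width"],
        "height": stream["height"],
        # r_frame_rate is a ratio string such as "30/1" or "30000/1001".
        "fps": float(Fraction(stream["r_frame_rate"])),
    }


def meets_spec(info: dict, width: int, height: int, fps: float,
               tolerance: float = 0.01) -> bool:
    """True when the exported file matches the requested resolution and fps."""
    return (
        info["width"] == width
        and info["height"] == height
        and abs(info["fps"] - fps) <= tolerance
    )
```

In a PoC script, `raw` would come from `subprocess.run(probe_cmd("out.mp4"), capture_output=True, text=True).stdout`.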
10.3 Pre-Sales Demo Script
- Prepare a URL related to the customer's industry.
- Open Studio and paste the link.
- Let the Agent generate a multi-frame storyboard.
- Select an explainer or data-viz template.
- Modify 2-3 frames of text in Studio.
- Optionally generate narration.
- Export as MP4.
- Explain local rendering, template licensing, and subsequent branding template customization paths.
11. Risks and Considerations
11.1 Product Maturity
html-video is still at an early stage. Although the README shows that the core pipeline runs end to end, additional engines, skill packages, and the template marketplace are still on the roadmap.
Do not present it in pre-sales as a mature commercial video platform. Position it instead as:
An open-source local video generation foundation that is suitable for PoCs, custom development, and exploring internal content automation.
11.2 Unstable Agent Output Quality
The video content is generated by the Agent, and its quality depends on:
- Quality of the input material.
- Model capability.
- Template constraints.
- The prompt.
- Manual review.
Enterprise use must retain manual review, especially for public marketing releases and customer deliverables.
11.3 Rendering Environment Dependencies
Node.js, pnpm, ffmpeg, and Chromium/Playwright are required. If the customer environment is an intranet, runs Windows, has no GUI, or lacks the browser dependencies, verify it before deployment.
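A pre-deployment check against the minimum versions from section 7.1 (Node.js 20, pnpm 9, any ffmpeg) can be sketched as below. The function names and message formats are assumptions; this does not reproduce what the project's own 'doctor' command checks, which is not detailed in this note.

```python
import re
from typing import Optional

# Minimum major versions taken from the prerequisites table.
MINIMUMS = {"node": 20, "pnpm": 9}


def major_version(version_output: str) -> Optional[int]:
    """Pull the major version out of output such as 'v20.11.0' or '9.1.0'."""
    match = re.search(r"(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def preflight(reported: dict[str, Optional[str]]) -> list[str]:
    """Return a list of problems; an empty list means the environment looks usable.

    `reported` maps a tool name to its `--version` output, or None if missing.
    """
    problems = []
    for tool, minimum in MINIMUMS.items():
        output = reported.get(tool)
        if output is None:
            problems.append(f"{tool}: not found")
            continue
        major = major_version(output)
        if major is None:
            problems.append(f"{tool}: could not parse version")
        elif major < minimum:
            problems.append(f"{tool}: found major version {major}, need >= {minimum}")
    if reported.get("ffmpeg") is None:
        problems.append("ffmpeg: not found")
    return problems
```

The `reported` map can be filled by running `node --version`, `pnpm --version`, and `ffmpeg -version` on the target machine; the Chromium/Playwright install still has to be verified separately.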
11.4 Copyright and Material Compliance
While the template licenses are clear, the final video may still contain:
- Original article content.
- GitHub README content.
- Customer-uploaded images.
- Fonts.
- AI-generated music/narration.
All of these need to be reviewed under the enterprise's release process.
12. My Pre-Sales Assessment
The value of html-video lies in stringing together Agents, templates, HTML motion, and local rendering into a working video production workbench.
It is best suited not to "replacing all video production", but to "quickly turning large volumes of informational content into video":
- Technical explainers.
- Data reports.
- Product introductions.
- Corporate announcements.
- Introductions to open-source projects.
- Social media short-video material.
Customers worth pursuing:
- Marketing departments that need high-frequency short video content but have limited budget and staff.
- Technical teams that want to automatically convert documents/repositories/reports into shareable content.
- Enterprise Agent platforms that want to add a "video generation" tool capability.
- Customers who are sensitive to cloud video generation costs, material transfer, and vendor lock-in.
Promises to avoid:
- It completely replaces professional editing.
- All major video-as-code engines are supported.
- It automatically generates commercial-grade videos with no review.
13. References
- GitHub repository: nexu-io/html-video
- English README: README.md
- Chinese README: README.zh-CN.md
- Template NOTICE: templates/NOTICE.md
- package.json: package.json
- Hero image: hero.png
- Template example:
Information verification date: 2026-06-30. This note does not include live stars/forks counts because anonymous access to the GitHub API triggered rate limiting. The project capabilities, roadmap, installation method, and template licenses are based mainly on the official README, the Chinese README, NOTICE, and package.json.