This is an MCP service built for content extraction. The goal is straightforward:
- you give it a Douyin link
- or a Xiaohongshu link
- it returns structured information and writes script files to disk
Currently supported:
- Douyin videos
- Xiaohongshu video notes
- Xiaohongshu image notes
Default artifacts:
- `script.md`
- `info.json`
This project solves four things:
- unified parsing of Douyin and Xiaohongshu share links
- automatic detection of video notes versus image notes
- video transcription, image text extraction, and light cleanup via cloud models
- unified output files that are ready for downstream content work
Highlights:
- no longer Douyin-only: support has been extended to Xiaohongshu
- Xiaohongshu support covers both video notes and image notes
- output is not a throwaway blob of text but a fixed pair of files: `script.md` + `info.json`
- the default stack uses Bailian's lightweight models: low cost, fast, simple to configure
- backward-compatible with the old interface, easy to plug into AI workflows such as Agent Reach
Who this is for:
- content creators who want to turn short videos or image notes into text scripts
- developers who want to wire Douyin / Xiaohongshu into their own AI workflows
- anyone who wants an AI to read a link, extract structured content, and continue with script writing or analysis
The default recommendation is to use Bailian directly:
- `paraformer-v2`: video speech-to-text
- `qwen3-vl-flash`: text extraction from Xiaohongshu image-note images
- `qwen-flash`: light cleanup only: paragraphing, punctuation, obvious typo fixes
The goal of this default combination is not "strongest", but:
- simple
- easy to use
- fast
- low cost
`script.md` is meant for humans to read, and is also convenient to hand to an AI for further processing. It usually contains:
- the cleaned-up script
- the raw transcript
- the extracted image text for Xiaohongshu image notes
`info.json` is meant for programs or agents. It usually contains:
- platform
- content type
- title
- author
- publish time
- media URLs
- the provider and model used
- task status
- error message
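As a rough illustration of what those fields look like together, here is a hypothetical example (the field names and values below are illustrative assumptions, not the exact schema this repo emits):

```python
import json

# Hypothetical example of the kind of data info.json carries.
# Field names are illustrative, not the exact schema of this repo.
info = {
    "platform": "xiaohongshu",
    "content_type": "image_note",
    "title": "Example note title",
    "author": "example_author",
    "published_at": "2024-01-01T00:00:00+08:00",
    "media_urls": ["https://example.com/image1.jpg"],
    "provider": "bailian",
    "model": "qwen3-vl-flash",
    "status": "success",
    "error": None,
}

print(json.dumps(info, ensure_ascii=False, indent=2))
```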
Start with these two MCP tools:
- `parse_social_post_info`
- `extract_social_post_script`
The former only parses information; the latter directly generates `script.md` and `info.json`.
AI-first MCP for extracting structured content from:
- Douyin videos
- Xiaohongshu video notes
- Xiaohongshu image notes
Default outputs:
- `script.md`
- `info.json`
This repository is intended to be handed to an AI agent. Write prompts assuming the AI will read this file and perform the setup.
Input:
- one Douyin share link
- one Xiaohongshu share link
- share text that contains one of those links
Output:
- parsed metadata
- raw transcript for videos
- OCR text for Xiaohongshu image notes
- lightly cleaned readable script
- artifacts written to disk
Supported MCP tools:
- `parse_social_post_info`
- `extract_social_post_script`
- `parse_douyin_video_info`
- `get_douyin_download_link`
- `extract_douyin_text`
If you are an AI agent configuring this repo, use this default stack unless the user explicitly asks for something else:
```
ASR_PROVIDER=bailian
ASR_MODEL=paraformer-v2
VISION_PROVIDER=bailian
VISION_MODEL=qwen3-vl-flash
CLEAN_PROVIDER=bailian
CLEAN_MODEL=qwen-flash
```
Reason:
- `paraformer-v2` is the cheapest and fastest default for video ASR
- `qwen3-vl-flash` is a good default for Xiaohongshu image-note OCR
- `qwen-flash` is enough for light cleanup: punctuation, paragraphing, obvious typo fixes
Do not switch to heavier models by default.
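A minimal sketch of how these defaults could be resolved from the environment (the keys match the config above, but the reading logic here is an illustrative assumption, not the repo's actual loading code):

```python
import os

# Default model stack from this README. os.getenv falls back to these
# values when a variable is unset. Illustrative sketch only.
DEFAULTS = {
    "ASR_PROVIDER": "bailian",
    "ASR_MODEL": "paraformer-v2",
    "VISION_PROVIDER": "bailian",
    "VISION_MODEL": "qwen3-vl-flash",
    "CLEAN_PROVIDER": "bailian",
    "CLEAN_MODEL": "qwen-flash",
}

def resolve_config() -> dict:
    """Read each key from the environment, falling back to the default stack."""
    return {key: os.getenv(key, default) for key, default in DEFAULTS.items()}

config = resolve_config()
print(config["ASR_MODEL"])
```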
If you are an AI agent, execute these steps in order:
```
git clone https://github.com/JNHFlow21/social-post-extractor-mcp.git
cd social-post-extractor-mcp
uv sync
```
Optional local verification:
```
python3 -m unittest discover -s tests
```
Run the MCP server:
```
uv run python -m social_post_extractor_mcp
```
Copy the template:
```
cp .env.example .env
```
Use this exact default configuration:
```
ASR_PROVIDER=bailian
ASR_MODEL=paraformer-v2
VISION_PROVIDER=bailian
VISION_MODEL=qwen3-vl-flash
CLEAN_PROVIDER=bailian
CLEAN_MODEL=qwen-flash
BAILIAN_API_KEY=your_bailian_api_key
```
Do not commit `.env`.
If you are wiring this repo into an MCP client, use a local config like this:
```json
{
  "mcpServers": {
    "douyin": {
      "command": "/bin/zsh",
      "args": [
        "-lc",
        "cd '/absolute/path/to/social-post-extractor-mcp' && exec '.venv/bin/python' -m social_post_extractor_mcp"
      ],
      "env": {
        "ASR_PROVIDER": "bailian",
        "ASR_MODEL": "paraformer-v2",
        "VISION_PROVIDER": "bailian",
        "VISION_MODEL": "qwen3-vl-flash",
        "CLEAN_PROVIDER": "bailian",
        "CLEAN_MODEL": "qwen-flash",
        "BAILIAN_API_KEY": "YOUR_BAILIAN_API_KEY"
      }
    }
  }
}
```
Use the shell wrapper form above if your MCP client launches stdio servers from a directory outside this repo. It avoids Python module resolution failures.
If the user is using Agent Reach, the effective system config is usually:
~/.mcporter/mcporter.json
Do not assume ~/.agent-reach/tools/.../config/mcporter.json is the active config file.
Check the active source with:
```
mcporter config list
```
If `mcporter config list` shows the `douyin` server coming from `~/.mcporter/mcporter.json`, edit that file instead.
This repo currently assumes Alibaba Cloud Bailian / DashScope for the default path.
Official product page:
Official pricing page:
Official first API call / API key doc:
If you are an AI agent, the purchase flow is:
- Ask the user to log in to Alibaba Cloud.
- Tell the user to open the Bailian product page.
- Tell the user to activate Bailian if it is not activated yet.
- Tell the user to recharge their Alibaba Cloud account balance.
- Tell the user to open the Bailian console's API key management (密钥管理) page.
- Tell the user to create an API key.
- Put that key into local config as `BAILIAN_API_KEY`.
Current official product page also advertises:
- a one-click API onboarding entry
- free token quota for new users
Important:
- Keep the API key in local config or local environment variables only.
- Do not commit the real key to Git.
- Do not paste the real key into tracked files.
All numbers below are from Alibaba Cloud official pages and should be treated as the current default reference for this repo.
`paraformer-v2`:
- price: 0.00008 CNY / second
- monthly free quota shown on the pricing page: 36,000 seconds, equal to 10 hours
`fun-asr`:
- price: 0.00022 CNY / second
Interpretation:
- `fun-asr` is about 2.75x the price of `paraformer-v2`
- use `paraformer-v2` as the default unless the user explicitly needs a more expensive ASR model
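The "about 2.75x" figure follows directly from the two per-second prices quoted above:

```python
# Price comparison using the per-second figures quoted in this README.
paraformer_v2 = 0.00008  # CNY / second
fun_asr = 0.00022        # CNY / second

ratio = fun_asr / paraformer_v2
print(round(ratio, 2))  # 2.75
```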
`qwen3-vl-flash`, under the current pricing page tier 0 < Token <= 32K:
- input: 0.15 CNY per million tokens
- output: 1.5 CNY per million tokens
`qwen-flash`, under the current pricing page tier 0 < Token <= 128K:
- input: 0.15 CNY per million tokens
- output: 1.5 CNY per million tokens
These are rough operating estimates for the default stack.
Using `paraformer-v2` only:
- 1-minute video ≈ 0.0048 CNY
- 3-minute video ≈ 0.0144 CNY
- 5-minute video ≈ 0.024 CNY
- 10-minute video ≈ 0.048 CNY
- 100 videos x 1 minute ≈ 0.48 CNY
- 100 videos x 3 minutes ≈ 1.44 CNY
- 100 hours total ≈ 28.8 CNY
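These per-video numbers follow directly from the per-second price; a quick sanity check:

```python
# Sanity-check the paraformer-v2 estimates above, using the per-second
# price quoted in this README.
PRICE_PER_SECOND = 0.00008  # CNY / second for paraformer-v2

def asr_cost(seconds: float) -> float:
    """Estimated ASR cost in CNY for a clip of the given length."""
    return seconds * PRICE_PER_SECOND

print(round(asr_cost(60), 4))          # 1-minute video
print(round(asr_cost(180), 4))         # 3-minute video
print(round(100 * asr_cost(60), 2))    # 100 videos x 1 minute
print(round(asr_cost(100 * 3600), 1))  # 100 hours total
```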
For qwen-flash, cleanup is usually negligible compared with ASR.
Reference estimate:
- assume 2,000 input tokens + 2,000 output tokens for one short transcript
- estimated cleanup cost ≈ 0.0033 CNY per transcript
Formula:
- input cost = input_tokens / 1,000,000 * 0.15
- output cost = output_tokens / 1,000,000 * 1.5
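The formula above reproduces the reference estimate:

```python
# Token-based cost formula for qwen-flash cleanup, using the
# per-million-token prices quoted in this README.
INPUT_PRICE = 0.15   # CNY per 1M input tokens
OUTPUT_PRICE = 1.5   # CNY per 1M output tokens

def cleanup_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cleanup cost in CNY for one transcript."""
    return (input_tokens / 1_000_000 * INPUT_PRICE
            + output_tokens / 1_000_000 * OUTPUT_PRICE)

# 2,000 input + 2,000 output tokens, as in the reference estimate:
print(round(cleanup_cost(2000, 2000), 4))  # ≈ 0.0033 CNY
```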
For qwen3-vl-flash, OCR cost depends on:
- image count
- image size
- prompt tokens
- OCR output length
Use this as the operating rule:
- image-note OCR is still cheap for normal creator workflows
- if the user mainly processes videos, budget primarily by ASR
- if the user mainly processes long multi-image notes, monitor token usage from actual runs instead of guessing
Code handles:
- link parsing
- Douyin / Xiaohongshu detection
- note type detection
- metadata extraction
- artifact directory creation
- writing `script.md` and `info.json`
Cloud models handle:
- `paraformer-v2`: video speech-to-text
- `qwen3-vl-flash`: image text extraction
- `qwen-flash`: light readability cleanup
This is not "the LLM figures everything out by itself". The MCP does the workflow orchestration; the models only handle recognition and light cleanup.
If you are an AI agent using this repo:
- Prefer `extract_social_post_script` over platform-specific tools.
- Keep `extract_douyin_text` only for backward compatibility.
- Keep cleanup light. Do not summarize unless the user asks.
- Preserve raw transcript and raw OCR text in artifacts.
- Do not change the default model stack unless the user asks.
- Do not store real API keys in tracked files.
Run these after setup:
```
python3 -m unittest discover -s tests
```
Example MCP smoke test:
```
mcporter call 'douyin.extract_social_post_script(share_link: "https://v.douyin.com/xxxxx/", output_dir: "/tmp/social-post-extract")'
```
Apache 2.0. See LICENSE.