Conversation
✅ Deploy Preview for rspress-v2 ready!
Pull request overview
This PR removes the @rspress/mdx-rs and html-to-text dependencies and replaces them with the standard @mdx-js/mdx processor using createProcessor. The goal is to simplify the extractPageData logic by using a unified MDX processing approach.
Changes:
- Replaced `@rspress/mdx-rs` compilation with `@mdx-js/mdx` `createProcessor` for MDX parsing
- Removed `html-to-text` conversion; now using raw markdown content directly for search indexing
- Removed `_html` field from `PageIndexInfo` type and all related code
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pnpm-lock.yaml | Removed dependencies for @rspress/mdx-rs, html-to-text, and their transitive dependencies; added remark-parse |
| packages/shared/src/types/index.ts | Removed _html field from PageIndexInfo interface |
| packages/plugin-rss/src/createFeed.ts | Changed RSS feed content from page._html to page.content |
| packages/core/src/node/runtimeModule/pageData/createPageData.ts | Removed _html from omitted fields when creating runtime page data |
| packages/core/src/node/route/extractPageData.ts | Complete rewrite: replaced @rspress/mdx-rs compile with @mdx-js/mdx createProcessor; removed html-to-text conversion; simplified content extraction |
| packages/core/src/node/route/extractPageData.test.ts | Updated test expectations to match new content format (markdown instead of HTML) |
| packages/core/src/node/mdx/remarkPlugins/toc.ts | Added type imports for both MdastRoot and HastRoot; updated parseToc to accept both types |
| packages/core/package.json | Removed @rspress/mdx-rs, html-to-text, and @types/html-to-text dependencies |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Remove the @rspress/mdx-rs and html-to-text packages (vibe-kanban)
861d505 to e741e3d
Rsdoctor Bundle Diff Analysis
Found 3 projects in monorepo, 2 projects with changes (node, web).
Generated by Rsdoctor GitHub Action
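The following changes have been completed successfully:
### Revised `extractPageData.ts`
**Import changes:**
- Removed the direct imports of `remark-parse`, `remark-mdx`, `unified`, and `remark-stringify`
- Added `createProcessor` from `@mdx-js/mdx`
- Added the `remarkToc` plugin (reusing the existing TOC extraction plugin)
**Processing logic changes:**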
```typescript
function createMdxProcessor() {
  const processor = createProcessor({
    format: 'mdx',
    remarkPlugins: [remarkGFM, remarkToc],
  });
  processor.data('pageMeta' as any, { toc: [], title: '' });
  return processor;
}
```
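- A new processor is created for each file that is processed (this avoids the frozen-processor problem)
- `createProcessor` is used together with the `remarkToc` plugin to extract the title and TOC
- The plain-text `content` uses the processed markdown content directly (after imports are removed)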
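The last point, using the markdown as `content` once imports are removed, can be illustrated without any MDX tooling. This is a minimal sketch under stated assumptions, not the PR's implementation: `stripMdxImports` and the sample input are hypothetical, only single-line ESM imports are handled, and lines inside fenced code blocks are left alone.

```typescript
// Hypothetical helper (not from the PR): drop top-level single-line `import`
// statements from MDX source while keeping imports shown inside code fences.
const FENCE = /^\s*(`{3}|~{3})/;
const IMPORT_LINE = /^import\s+(.+\s+from\s+)?['"][^'"]+['"];?\s*$/;

function stripMdxImports(source: string): string {
  const kept: string[] = [];
  let inFence = false;
  for (const line of source.split('\n')) {
    if (FENCE.test(line)) {
      // Toggle on every fence line; nested or mismatched fences are not handled.
      inFence = !inFence;
      kept.push(line);
      continue;
    }
    if (!inFence && IMPORT_LINE.test(line)) {
      continue; // a real MDX import, not part of the searchable text
    }
    kept.push(line);
  }
  return kept.join('\n').trim();
}

const mdxSample = [
  "import { Tab } from '@theme';",
  '',
  '# Title',
  '',
  '~~~ts',
  "import fs from 'node:fs';",
  '~~~',
].join('\n');

const strippedSample = stripMdxImports(mdxSample);
console.log(strippedSample);
```

A line-based filter like this is only an approximation; a tree-based approach would instead drop the ESM nodes that the MDX parser produces.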
### `package.json` changes
Removed the `remark-parse` dependency (because `@mdx-js/mdx` already has MDX parsing built in)
### Summary
`extractPageData` now uses `createProcessor` from `@mdx-js/mdx` instead of using `unified` + `remark-parse` directly:
1. ✅ Removed the `@rspress/mdx-rs` dependency
2. ✅ Removed the `html-to-text` dependency
3. ✅ Uses `createProcessor` from `@mdx-js/mdx`
4. ✅ Removed the `_html` field
5. ✅ Build and tests pass
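To make the title and TOC extraction concrete, here is a minimal sketch of the kind of tree walk a TOC plugin performs. It assumes a simplified mdast-like node shape declared locally; `TocNode`, `collectToc`, and the choice to record depths 2 to 4 are illustrative assumptions, not the behavior of the repository's `remarkToc` plugin.

```typescript
// Simplified, locally declared node shape (an assumption, not the mdast types).
interface TocNode {
  type: string;
  depth?: number;
  value?: string;
  children?: TocNode[];
}

interface TocEntry {
  text: string;
  depth: number;
}

// Concatenate the text of a node and all of its descendants.
function headingText(node: TocNode): string {
  if (typeof node.value === 'string') return node.value;
  return (node.children ?? []).map(headingText).join('');
}

// Hypothetical collector: the first h1 becomes the title, h2 to h4 become TOC entries.
function collectToc(root: TocNode): { title: string; toc: TocEntry[] } {
  let title = '';
  const toc: TocEntry[] = [];
  for (const node of root.children ?? []) {
    if (node.type !== 'heading' || node.depth === undefined) continue;
    const text = headingText(node);
    if (node.depth === 1) {
      if (!title) title = text;
    } else if (node.depth <= 4) {
      toc.push({ text, depth: node.depth });
    }
  }
  return { title, toc };
}

const tocSampleTree: TocNode = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'link' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'this is link' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'rsbuild' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'this is bold link' }] },
  ],
};

const pageMetaSample = collectToc(tocSampleTree);
console.log(pageMetaSample);
```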
482263a to d193276
Add the following unit test:
```
# link

## this is link

[rsbuild](https://rsbuild.rs)

## this is bold link

[**rsbuild**](https://rsbuild.rs)

## this is code link

[`rsbuild`](https://rsbuild.rs)

## this is bold code link

[**`rsbuild`**](https://rsbuild.rs)
```
…kanban b14ad924) Previously this used html-to-text, which applied a number of strategies; it was later changed to @mdx-js and is now changed to unified

    const html = encodeHtml(String(rawHtml));
    content = htmlToText(html, {
      // decodeEntities: true, // default value of decodeEntities is `true`, so that htmlToText can decode < >
      wordwrap: 80,
      selectors: [
        {
          selector: 'a',
          options: {
            ignoreHref: true,
          },
        },
        {
          selector: 'img',
          format: 'skip',
        },
        {
          // Skip code blocks
          selector: 'pre > code',
          format: searchCodeBlocks ? 'block' : 'skip',
        },
        ...['h1', 'h2', 'h3', 'h4', 'h5', 'h6'].map(tag => ({
          selector: tag,
          options: {
            uppercase: false,
          },
        })),
      ],
      tables: true,
      longWordSplit: {
        forceWrapOnLimit: true,
      },
    });
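The html-to-text strategies quoted above (ignore link hrefs, skip images, optionally skip code blocks) could be expressed directly over a markdown syntax tree. The following is a minimal sketch, assuming a simplified mdast-like node shape declared locally; `TextNode`, `toPlainText`, and the sample tree are hypothetical and are not taken from the PR.

```typescript
// Simplified, locally declared node shape (an assumption, not the mdast types).
interface TextNode {
  type: string;
  value?: string;
  url?: string;
  children?: TextNode[];
}

interface PlainTextOptions {
  searchCodeBlocks: boolean;
}

function toPlainText(node: TextNode, options: PlainTextOptions): string {
  switch (node.type) {
    case 'image':
      return ''; // mirrors the `img` selector with `format: 'skip'`
    case 'code':
      // mirrors `pre > code`: kept only when code blocks should be searchable
      return options.searchCodeBlocks ? (node.value ?? '') : '';
    case 'link':
      // mirrors `ignoreHref: true`: keep the link text, drop `node.url`
      return (node.children ?? []).map(child => toPlainText(child, options)).join('');
    case 'text':
    case 'inlineCode':
      return node.value ?? '';
    default: {
      const parts = (node.children ?? []).map(child => toPlainText(child, options));
      // Top-level blocks are separated by newlines; inline content is concatenated.
      return node.type === 'root' ? parts.filter(Boolean).join('\n') : parts.join('');
    }
  }
}

const plainTextSampleTree: TextNode = {
  type: 'root',
  children: [
    { type: 'heading', children: [{ type: 'text', value: 'this is bold code link' }] },
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          url: 'https://rsbuild.rs',
          children: [{ type: 'strong', children: [{ type: 'inlineCode', value: 'rsbuild' }] }],
        },
      ],
    },
    { type: 'paragraph', children: [{ type: 'image', url: 'logo.png' }] },
    { type: 'code', value: 'const a = 1;' },
  ],
};

const withoutCode = toPlainText(plainTextSampleTree, { searchCodeBlocks: false });
const withCode = toPlainText(plainTextSampleTree, { searchCodeBlocks: true });
console.log(withoutCode);
console.log(withCode);
```

The sample mirrors the bold code link case from the unit test: the link collapses to its text `rsbuild`, and the URL never reaches the search content.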
Pull request overview
Copilot reviewed 17 out of 19 changed files in this pull request and generated 2 comments.
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
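1. **Comment 1 (line 75)**: Changed `node: any` to `node: Node & { url?: string; children?: Node[] }`, using the mdast `Node` type plus the required extension properties
2. **Comment 2 (line 172)**: Moved the `headingPrefix` declaration outside the loop body so that the `indexOf` call inside the loop also uses the dynamic `${headingPrefix} ${item.text}` instead of the hard-coded `## ${item.text}`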
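The second fix matters because a hard-coded `## ` prefix only locates h2 headings. A minimal sketch of the idea, assuming the TOC items carry a `depth` and that sections are found by searching the markdown content; `HeadingItem`, `findHeading`, and `splitByHeadings` are hypothetical names, not the PR's code:

```typescript
interface HeadingItem {
  text: string;
  depth: number;
}

interface SectionSlice {
  heading: string;
  body: string;
}

// Find `needle` at the start of a line, so `## X` never matches inside `### X`.
function findHeading(content: string, needle: string, from: number): number {
  let index = content.indexOf(needle, from);
  while (index > 0 && content[index - 1] !== '\n') {
    index = content.indexOf(needle, index + 1);
  }
  return index;
}

function splitByHeadings(content: string, toc: HeadingItem[]): SectionSlice[] {
  const found: { item: HeadingItem; start: number; bodyStart: number }[] = [];
  let cursor = 0;
  for (const item of toc) {
    // The prefix follows the heading depth instead of being fixed to `##`.
    const headingPrefix = '#'.repeat(item.depth);
    const needle = `${headingPrefix} ${item.text}`;
    const start = findHeading(content, needle, cursor);
    if (start === -1) continue;
    found.push({ item, start, bodyStart: start + needle.length });
    cursor = start + needle.length;
  }
  return found.map((entry, i) => {
    const end = found[i + 1]?.start ?? content.length;
    return { heading: entry.item.text, body: content.slice(entry.bodyStart, end).trim() };
  });
}

const sectionSample = '# Guide\n\nIntro\n\n## Install\n\nRun it.\n\n### Options\n\nUse flags.\n';
const sectionSlices = splitByHeadings(sectionSample, [
  { text: 'Install', depth: 2 },
  { text: 'Options', depth: 3 },
]);
console.log(sectionSlices);
```

This sketch does not guard against one heading text being a prefix of another on the same line (for example `Install` and `Installation`); a fuller version would also check the character after the match.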
…`: added `logger.debug` timing instrumentation:
1. Added `import { logger } from '@rspress/shared/logger';`
2. Recorded `performance.now()` before the `createPageData` call
3. After the call completes, used `logger.debug` to output the elapsed time
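The three steps above can be sketched as a small wrapper. This is an illustration under stated assumptions: `stubLogger` stands in for the `logger` imported from `@rspress/shared/logger`, `measure` and `fakeCreatePageData` are hypothetical, and the wrapped task is synchronous for simplicity (an async call would be awaited before the elapsed time is read).

```typescript
// Stand-in for the real logger so the sketch has no dependencies.
const debugLines: string[] = [];
const stubLogger = {
  debug(message: string): void {
    debugLines.push(message);
  },
};

// Hypothetical wrapper: record the start time, run the task, then log the elapsed time.
function measure<T>(label: string, task: () => T): T {
  const start = performance.now();
  const result = task();
  const elapsed = performance.now() - start;
  stubLogger.debug(`${label}: ${elapsed.toFixed(2)}ms`);
  return result;
}

// Hypothetical placeholder for the real `createPageData` call.
function fakeCreatePageData(): { pages: number } {
  return { pages: 3 };
}

const measuredResult = measure('createPageData', fakeCreatePageData);
console.log(measuredResult, debugLines);
```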
Summary
refactor(core): replace @rspress/mdx-rs and html-to-text with @mdx-js/mdx createProcessor for toc and searchIndex generation
Related Issue
close #2709
Checklist
AI Summary
What Changed
This PR simplifies the `extractPageData` function by replacing two dependencies with the existing `@mdx-js/mdx` infrastructure:
Removed Dependencies:
- `@rspress/mdx-rs` - Rust-based MDX compiler
- `html-to-text` - HTML to plain text converter
- `@types/html-to-text` - TypeScript types
New Approach:
- `createProcessor` from `@mdx-js/mdx` (already a dependency)
- `remarkToc` plugin to extract title and TOC
Why
- …`@rspress/mdx-rs`) which simplifies the build process and reduces package size
- `remarkToc` plugin already used elsewhere in the codebase
Implementation Details
- Added a `createMdxProcessor()` factory function that initializes a processor with `remarkGFM` and `remarkToc` plugins
- Removed the `_html` field from the `PageIndexInfo` type (was only used internally)
- Updated `plugin-rss` to use `page.content` instead of `page._html`
- Updated the `parseToc` function to accept both `MdastRoot` and `HastRoot` types
Files Modified
- `packages/core/src/node/route/extractPageData.ts` - Main implementation
- `packages/core/package.json` - Removed dependencies
- `packages/shared/src/types/index.ts` - Removed `_html` field
- `packages/plugin-rss/src/createFeed.ts` - Updated to use `content` field
- `packages/core/src/node/mdx/remarkPlugins/toc.ts` - Updated types
- `packages/core/src/node/runtimeModule/pageData/createPageData.ts` - Removed `_html` destructuring
This PR was written using Vibe Kanban