@su0as su0as commented Aug 24, 2025

Description

CSS files were not being indexed because codeChunker expects code structures (classes/functions) that don't exist in CSS. The chunker would return zero chunks, preventing CSS files from being indexed.

This fix routes CSS, HTML, JSON and similar non-code files to basicChunker instead of codeChunker, ensuring they are properly indexed while maintaining intelligent chunking for actual code files.

Root Cause Found

  • CSS files were being processed by codeChunker which expects code structures (classes/functions)
  • CSS doesn't have these structures, so codeChunker returns 0 chunks
  • No chunks = no indexing = no retrieval
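
To illustrate the failure mode described above, here is a minimal sketch. `toyStructureChunker` is a hypothetical stand-in, not the real codeChunker: it only emits a chunk when it sees a function or class declaration, which is enough to show why a stylesheet produces nothing.

```typescript
// Toy stand-in for a structure-aware chunker (hypothetical, NOT the real codeChunker).
// It emits one chunk per function/class declaration line and ignores everything else.
function toyStructureChunker(contents: string): string[] {
  const chunks: string[] = [];
  const declaration = /^\s*(export\s+)?(async\s+)?(function|class)\s+\w+/;
  for (const line of contents.split("\n")) {
    if (declaration.test(line)) {
      chunks.push(line.trim());
    }
  }
  return chunks;
}

const tsSource =
  "export function add(a: number, b: number) {\n  return a + b;\n}";
const cssSource =
  ".button {\n  color: red;\n}\n\n.card {\n  padding: 8px;\n}";

const tsChunks = toyStructureChunker(tsSource);
const cssChunks = toyStructureChunker(cssSource);

console.log(tsChunks.length); // 1
console.log(cssChunks.length); // 0: nothing to index for the CSS input
```

With zero chunks returned, there is nothing to embed or store for the file, so it can never be retrieved.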

Screen recording or screenshot

[Screenshot 2025-08-24 at 11:53:20 PM]

Tests

• Manually verified CSS files are now indexed and retrievable via @codebase
• Confirmed all existing chunk tests pass (10/10)
• TypeScript compilation succeeds with no errors
• Prettier formatting applied


Summary by cubic

Fixes CSS files being skipped during indexing by sending non-code files to the basic chunker so they produce chunks and can be retrieved. Code files still use the code chunker.

  • Bug Fixes
    • Route css, html/htm, json, toml, yaml/yml to basicChunker.
    • Verified CSS indexing and retrieval; existing chunk tests pass.
    • No changes to behavior for code files.
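
The routing rule summarized above can be sketched roughly as follows. The extension list comes from the summary; the names `NON_CODE_EXTENSIONS`, `getExtension`, and `pickChunker` are hypothetical, and the fallback to the basic chunker for unsupported extensions is an assumption rather than something this PR states.

```typescript
type ChunkerName = "basicChunker" | "codeChunker";

// Extensions listed in the summary; the constant name is hypothetical.
const NON_CODE_EXTENSIONS: ReadonlyArray<string> = [
  "css", "html", "htm", "json", "toml", "yaml", "yml",
];

// Returns the lowercase extension without the dot, or "" if there is none.
function getExtension(filepath: string): string {
  const base = filepath.split("/").pop() ?? "";
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
}

// Non-code extensions are checked first, so they go to the basic chunker
// even when they also appear in the supported-languages list.
function pickChunker(
  filepath: string,
  supportedLanguages: ReadonlyArray<string>,
): ChunkerName {
  const ext = getExtension(filepath);
  if (NON_CODE_EXTENSIONS.indexOf(ext) !== -1) {
    return "basicChunker";
  }
  // Assumption: anything without structure-aware support falls back to basic.
  return supportedLanguages.indexOf(ext) !== -1 ? "codeChunker" : "basicChunker";
}

const supported = ["css", "ts", "py"]; // illustrative list only
console.log(pickChunker("src/styles/app.css", supported)); // basicChunker
console.log(pickChunker("src/index.ts", supported)); // codeChunker
console.log(pickChunker("notes.txt", supported)); // basicChunker
```

Checking the non-code list before the supported-languages lookup is what leaves the supported-languages list itself untouched for other consumers.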

@su0as su0as requested a review from a team as a code owner August 24, 2025 16:11
@su0as su0as requested review from RomneyDa and removed request for a team August 24, 2025 16:11
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Aug 24, 2025
su0as commented Aug 25, 2025

@RomneyDa Initially, this appeared to be a retrieval issue: CSS files seemed to be in the database but were not being returned by @codebase queries. In fact, CSS files were never being chunked or indexed at all.

Route CSS, HTML, JSON and similar non-code files to basicChunker instead of codeChunker. These files have structure but not code constructs, so they need simple line-based chunking rather than AST-based chunking.
This minimal change (adding one array check) ensures all file types are properly indexed while maintaining intelligent chunking for actual code files.
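
For comparison, simple line-based chunking can be sketched as below. This is a toy under stated assumptions, not the real basicChunker: `toyLineChunker` and `LineChunk` are hypothetical names, and the size budget is measured in characters purely to keep the example self-contained.

```typescript
interface LineChunk {
  content: string;
  startLine: number; // 0-based, inclusive
  endLine: number; // 0-based, inclusive
}

// Groups consecutive lines into chunks of at most roughly `maxChars` characters.
// No parsing is involved, so any text file yields chunks.
function toyLineChunker(contents: string, maxChars: number): LineChunk[] {
  const chunks: LineChunk[] = [];
  const lines = contents.split("\n");
  let current: string[] = [];
  let currentSize = 0;
  let startLine = 0;

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    // Close the current chunk if adding this line would exceed the budget.
    if (current.length > 0 && currentSize + line.length + 1 > maxChars) {
      chunks.push({ content: current.join("\n"), startLine, endLine: i - 1 });
      current = [];
      currentSize = 0;
      startLine = i;
    }
    current.push(line);
    currentSize += line.length + 1; // +1 accounts for the newline
  }

  // Emit the trailing chunk unless it is only whitespace.
  if (current.length > 0 && current.join("\n").trim().length > 0) {
    chunks.push({
      content: current.join("\n"),
      startLine,
      endLine: lines.length - 1,
    });
  }
  return chunks;
}

const sampleCss = ".button {\n  color: red;\n}\n.card {\n  padding: 8px;\n}";
const lineChunks = toyLineChunker(sampleCss, 30);
console.log(lineChunks.length); // 2
console.log(lineChunks[0].startLine, lineChunks[0].endLine); // 0 2
console.log(lineChunks[1].startLine, lineChunks[1].endLine); // 3 5
```

Because the split depends only on line lengths, a CSS, HTML, or JSON file always yields at least one chunk as long as it has non-whitespace content.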

Could you review it and let me know if any changes are needed?

@RomneyDa RomneyDa left a comment

This seems right; we can't trivially remove these from supportedLanguages since it is used for autocomplete, etc.

@github-project-automation github-project-automation bot moved this from Todo to In Progress in Issues and PRs Aug 25, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Aug 25, 2025
@RomneyDa RomneyDa merged commit 752b3d8 into continuedev:main Aug 25, 2025
57 of 58 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Issues and PRs Aug 25, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Aug 25, 2025
sestinj commented Aug 25, 2025

🎉 This PR is included in version 1.10.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

sestinj commented Aug 27, 2025

🎉 This PR is included in version 1.11.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Labels

lgtm This PR has been approved by a maintainer released size:S This PR changes 10-29 lines, ignoring generated files.

Projects

Status: Done
