Enhancement: Dynamic High Star Ranges for Spider #9042

@tobiu

Description

Enhance the "Core: High Stars" strategy in the Spider so that it stops repeatedly scanning the same top-starred repositories.

Current Behavior:
The strategy uses a fixed query, stars:>1000 (or the configured minStars). This always returns the same top repositories (React, Vue, etc.), which have likely been visited already, so the run is wasted.

Goal:
Implement "Deep Slicing" by querying random star-count ranges. This lets the Spider "jump" into the middle of the dataset (e.g., repositories with 1200-1500 stars) and treat that slice as "Page 1" of a new search, effectively bypassing the 1000-result limit of the GitHub Search API and discovering repositories that would otherwise sit on Page 50+.

Logic:
Instead of stars:>1000:

  1. Pick a random lower bound (e.g., 1000 + random(0..10000)).
  2. Pick a random upper bound (e.g., lower + 1000).
  3. Query stars:LOWER..UPPER.
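
The three steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the Spider's actual code: the function name `build_star_range_query` and the parameters `min_stars`, `max_offset`, and `width` are hypothetical, and the defaults simply mirror the example numbers in this issue.

```python
import random


def build_star_range_query(min_stars=1000, max_offset=10000, width=1000, rng=None):
    """Return a GitHub search qualifier for a random star-count slice.

    All names and defaults here are assumptions for illustration only.
    """
    if rng is None:
        rng = random  # fall back to the module-level generator

    # Step 1: random lower bound, e.g. 1000 + random(0..10000)
    lower = min_stars + rng.randint(0, max_offset)

    # Step 2: upper bound, e.g. lower + 1000
    upper = lower + width

    # Step 3: range qualifier in GitHub search syntax
    return f"stars:{lower}..{upper}"


if __name__ == "__main__":
    # Each call yields a different slice, e.g. "stars:4213..5213"
    print(build_star_range_query())
```

Passing a seeded `random.Random` instance as `rng` makes the chosen slice reproducible, which may help when testing the strategy.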
