
feat(sites): fix false positives: disable 74 broken sites, fix 8 with… #2302

Merged
soxoj merged 1 commit into main from updates-220326
Mar 22, 2026



soxoj commented Mar 22, 2026

… API probes and better markers

  • Disable 74 sites: Cloudflare/captcha blocks, identical responses, dead domains, vBulletin/phpBB engine failures
  • Fix Roblox, Salon24.pl, Planetaexcel → status_code (clear 404 signal)
  • Fix en.brickimedia.org → message with "noarticletext" absenceStr
  • Fix Arduino → narrower title-based presenseStrs/absenceStrs
  • Re-enable Fandom (3 wikis) via MediaWiki api.php urlProbe
  • Re-enable Substack via /api/v1/user/{}/public_profile urlProbe
  • Re-enable hashnode via GraphQL GET urlProbe (URL-encoded query)
  • Document lessons: engine template drift, search-by-author fragility, always-200 sites, TLS degradation, API bypassing Cloudflare, GraphQL GET support, URL-encoding for template safety
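
The check types and the URL-encoding lesson in the list above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `account_exists` and `graphql_probe_template` are hypothetical helpers, the `example.invalid` endpoint and the GraphQL query are invented, and the `{}` username placeholder is assumed from the Substack path in the list. Only the names `status_code`, `message`, `presenseStrs`, and `absenceStrs` come from the description itself.

```python
from urllib.parse import quote


def account_exists(check_type, status, body, presense_strs=(), absence_strs=()):
    """Hypothetical classifier: True if the profile appears to exist."""
    if check_type == "status_code":
        # Only trustworthy when the site answers a missing user with a real 404.
        return 200 <= status < 300
    if check_type == "message":
        # An absence marker wins; otherwise require a presence marker if any are set.
        if any(marker in body for marker in absence_strs):
            return False
        if presense_strs:
            return any(marker in body for marker in presense_strs)
        return True
    raise ValueError(f"unknown check type: {check_type}")


def graphql_probe_template(endpoint, query, marker="USERNAME"):
    """Hypothetical helper: build a GET probe URL with a `{}` username slot."""
    # Percent-encoding turns the query's own braces into %7B / %7D, so the
    # only literal braces left are the `{}` placeholder inserted afterwards.
    return endpoint + "?query=" + quote(query, safe="").replace(marker, "{}")


# Illustrative values only; the endpoint and query are invented.
template = graphql_probe_template(
    "https://example.invalid/graphql",
    'query { user(username: "USERNAME") { id } }',
)
probe_url = template.format("alice")
```

A site that answers every request with 200 makes the `status_code` branch useless, which is presumably why some entries rely on `message` markers or an API probe instead.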

soxoj mentioned this pull request Mar 22, 2026
soxoj merged commit 959b2be into main Mar 22, 2026
4 checks passed
soxoj deleted the updates-220326 branch March 22, 2026 19:47
soxoj added a commit that referenced this pull request Mar 22, 2026
soxoj added a commit that referenced this pull request Apr 7, 2026