@jimwins
Contributor

@jimwins jimwins commented Feb 27, 2024

Summary

This fixes the generic_json parser: it no longer assumes the JSON always needs special handling, and it uses a more straightforward workaround in the cases where it might.

Also adds support for a tags field.

Related issues

Fixes #1347.

Changes these areas

  • Bugfixes
  • Feature behavior
  • Command line interface
  • Configuration options
  • Internal architecture
  • Snapshot data layout on disk

Rather than assuming the JSON file we are parsing has junk at the beginning
(which maybe only used to happen?), try parsing it as-is first, and then fall
back to trying again after skipping the first line.

Fixes ArchiveBox#1347
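
For readers skimming the thread, the parse-then-fall-back approach described above can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: the function name `load_json_with_fallback` is hypothetical, and the sketch assumes the input is an open text file object and that any junk is confined to the first line.

```python
import json
from io import StringIO


def load_json_with_fallback(json_file):
    """Parse a JSON text stream, retrying without its first line on failure.

    Hypothetical helper for illustration; not ArchiveBox's actual function.
    """
    json_file.seek(0)
    text = json_file.read()
    try:
        # First attempt: assume the file is plain, valid JSON.
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: drop everything up to and including the first newline,
        # then try again. A second failure propagates to the caller.
        _, _, rest = text.partition("\n")
        return json.loads(rest)


clean = StringIO('[{"url": "https://example.com", "tags": "a,b"}]')
junk = StringIO('some junk header\n[{"url": "https://example.com"}]')
clean_links = load_json_with_fallback(clean)
junk_links = load_json_with_fallback(junk)
```

Trying the unmodified input first means well-formed files never lose a line, and the fallback only runs when the first parse has already failed.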
@pirate
Member

pirate commented Feb 29, 2024

Looks good, thanks! Ready to merge @jimwins?

@jimwins
Contributor Author

jimwins commented Mar 1, 2024

Yeah, I think this is good to go. I'll open a new issue to track adding JSONL handling.

@pirate pirate merged commit 7b042c8 into ArchiveBox:dev Mar 1, 2024