Conversation
equals215
approved these changes
Sep 26, 2025
equals215
left a comment
Member
There was a problem hiding this comment.
LGTM besides the 2 comments that are more good additions IMO
Collaborator
|
Some mend tests are failing: Reviewing everything else in a moment! |
NGTmeaty
reviewed
Sep 27, 2025
NGTmeaty
left a comment
Collaborator
There was a problem hiding this comment.
Initial review looks good! I want to check this against internal warctool next week (or maybe this weekend) but otherwise it looks really good! Thanks again!
Collaborator
Author
|
@NGTmeaty I made it so that it asks for deletion and ultimately delete empty WARCs, that also includes WARCs with only 1 warcinfo (e.g. when Zeno starts but no URL gets crawled) |
equals215
approved these changes
Oct 1, 2025
equals215
left a comment
Member
There was a problem hiding this comment.
LGTM, not sure the 2 first comments are actual issues or just a me skill issue
Restructure cmd/ to cmd/warc/ to fix go install installing the binary as 'cmd' instead of 'warc'. Update all documentation and examples to reflect the new binary name.
NGTmeaty
approved these changes
Oct 7, 2025
NGTmeaty
left a comment
Collaborator
There was a problem hiding this comment.
Looks good. Thank you for the modifications!
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces significant improvements to the
gowarcproject by restructuring the CLI tools, enhancing usability, and improving code organization and documentation. The main changes include modularizing command implementations, introducing a robust logging and flag system, and providing comprehensive documentation for CLI usage, especially for the newmendcommand.CLI mend command:
This command replicates the mend feature of warctool, allowing to properly truncate and close/rename .open WARC files left by gowarc in case of crashes.
CLI and Documentation Enhancements:
README.md, documenting installation, available commands (extract,mend,verify,completion), usage examples, global flags, and linking to further documentation formend.cmd/mend/README.mdwith extensive documentation on themendcommand, outlining features, usage, safety, and example outputs.Codebase Refactoring and Modularization:
extractcommand implementation:cmd/extract.gotocmd/extract/extract.gofor better modularity and maintainability.utilspackage for shared functionality.utilsfor better maintainability [1] [2] [3] [4] [5].CLI Framework and Logging Improvements:
cmd/main.goto:extract,mend,verify) using their modularizedCommandobjects instead of monolithic command definitions [1] [2].slogbased on these flags.These changes collectively improve the usability, maintainability, and extensibility of the
gowarcCLI, making it easier for users to interact with WARC files and for developers to add new features in the future.References: