OSSFuzz Bug Fix #1203

Merged
Gallaecio merged 11 commits into scrapinghub:master from ennamarie19:feat/oss-fuzz-bug-fix
Dec 21, 2023

Conversation

@ennamarie19
Contributor

This pull request fixes a crash found during OSSFuzz continuous fuzzing. Previously, a pytz InvalidTimeError raised while computing the UTC offset for 01-01-1970 0400 caused parsing to exit prematurely. This fix catches the exception and falls back to the secondary way of resolving dateobj_time.
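
For readers following along, the catch-and-fall-back pattern described above can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: `InvalidTimeError` here is a local stand-in for `pytz.InvalidTimeError`, and `StubTz` and `resolve_time` are hypothetical names invented so the example runs with the standard library alone.

```python
from datetime import datetime, time, timedelta, tzinfo


class InvalidTimeError(Exception):
    """Local stand-in for pytz.InvalidTimeError (assumption, avoids a pytz install)."""


class StubTz(tzinfo):
    """Hypothetical timezone: fixed +04:00 offset, but one wall-clock time is rejected."""

    bad_moment = datetime(1970, 1, 1, 4, 0)

    def utcoffset(self, dt):
        # Mimic a timezone library refusing to compute an offset for this moment.
        if dt == self.bad_moment:
            raise InvalidTimeError(dt)
        return timedelta(hours=4)


def resolve_time(dateobj, tz=None):
    """Return the offset-adjusted time of day, or the unadjusted time if the lookup fails."""
    dateobj_time = None
    if tz is not None:
        try:
            dateobj_time = (dateobj - tz.utcoffset(dateobj)).time()
        except InvalidTimeError:
            pass  # fall through to the unadjusted time below
    if dateobj_time is None:
        dateobj_time = dateobj.time()
    return dateobj_time


# The rejected moment no longer aborts; it falls back to the unadjusted time.
fallback = resolve_time(datetime(1970, 1, 1, 4, 0), StubTz())
# Any other moment is shifted by the offset as before.
shifted = resolve_time(datetime(1970, 1, 2, 6, 30), StubTz())
```

The sketch checks `is None` rather than truthiness so that the fallback only triggers when the offset lookup actually failed or no timezone was given.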

Additionally, to detect bugs prior to them being merged into the codebase, the CIFuzz.yml job that comes with OSSFuzz is proposed in this PR. With this, each PR will be fuzz-tested as part of the CI workflow.

Thank you for your time and your review!

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@Gallaecio
Contributor

Could we include a test case for this scenario as well?

@capuanob
Contributor

@Gallaecio Good evening, we just pushed a test case to go with the change. It utilizes the same settings and args that caused the bug during fuzzing to ensure that this bug is not reintroduced down the line.

Thank you!

Comment on lines +570 to +578
    dateobj_time = None
    if tz:
        try:
            dateobj_time = (dateobj - tz.utcoffset(dateobj)).time()
        except pytz.InvalidTimeError:
            pass

    if not dateobj_time:
        dateobj_time = dateobj.time()

This code looks like a no-op at the moment: it defines a dateobj_time variable that is never used.

@ennamarie19
Contributor Author

@Gallaecio It looks like someone updated the same function and merged it into main after this fix was written, which caused a merge conflict that was not resolved correctly. The latest commit actually handles the bug.

@Gallaecio left a comment
Contributor

Thanks!

@Gallaecio merged commit 5355e86 into scrapinghub:master on Dec 21, 2023