Refactor scoring jobs to use IocRepository. Addresses #633#696
Conversation
- Add scoring-specific methods to IocRepository: - get_scanners_for_scoring(): Fetch scanners for scoring updates - get_scanners_by_pks(): Retrieve scanners by primary keys - get_recent_scanners(): Get scanners seen after cutoff date - bulk_update_scores(): Bulk update score fields - Refactor UpdateScores to use dependency injection with IocRepository - Add optional ioc_repo parameter (backward compatible) - Replace IOC.objects.filter() with repository methods - Replace IOC.objects.bulk_update() with repository.bulk_update_scores() - Update utility functions to accept IocRepository parameter: - get_current_data(days_lookback, ioc_repo) - get_data_by_pks(primary_keys, ioc_repo) - Remove unused Django ORM imports (Q, F, ArrayAgg, IOC model) - Add comprehensive test coverage (47 tests total): - 11 basic repository method tests - 10 edge case tests (empty results, inactive honeypots, etc.) - 6 integration tests proving end-to-end functionality - All tests passing with full Ruff compliance This change improves testability and consistency with the extraction pipeline. Addresses Phase 1 of issue intelowlproject#633.
a10d460 to
ec0f633
Compare
CI Failures & FixesThe initial CI runs failed due to two issues in the Ruff Formatting IssueFile: Error: Root Cause:
Question: I'm not entirely sure how this passed CI initially. Looking at the GitHub Actions configuration, Ruff runs with the Fix: Commit Test FailureFile: Error: AssertionError: 21 != 3 # Expected 3 honeypots, found 21Root Cause (my best guess):
Uncertainty: I'm not 100% certain why this didn't fail in recent PRs before mine. It's possible the test only fails when multiple Fix: Commit Note: I've included these fixes as separate commits since they address pre-existing fragility in the test suite. Happy to discuss if there's a better approach for handling this! |
|
The tests merged in #669 caused other tests to fail. I'm also not sure how the CI did not catch this, I will investigate next week. I pushed a fix for this into |
regulartim
left a comment
There was a problem hiding this comment.
All in all very good! There should now be several functions/methods that are not used anymore, right? If so, please delete them.
| from django.contrib.postgres.aggregates import ArrayAgg | ||
| from django.db.models import F |
There was a problem hiding this comment.
Why do you import inside of the method?
There was a problem hiding this comment.
Good catch! I put them inside the methods to follow the lazy import pattern I saw in other parts of the codebase (like in UpdateScores.__init__), but you're absolutely right that it's more conventional to have them at the top.
(I initially thought it might help avoid any potential circular import issues, but these are Django core imports so that's not a concern here.)
There was a problem hiding this comment.
I don't see any imports in UpdateScores.__init__. Out of curiosity, would you please provide an example?
Generally, I would prefer to stick to the convention and import at the top of the file. Except there is a good reason not to do so, of course.
- Move all Django imports (ArrayAgg, F, Q) to module top - Use ternary operators for cleaner None checks - Fix test_200_active_general_honeypots to properly verify filtering logic Changes: - ioc.py: Moved imports to top, removed duplicates from methods - scoring_jobs.py: Simplified __init__ and update_db with ternary operators - utils.py: Moved IocRepository import to top, simplified both functions - test_views.py: Restored proper active/inactive honeypot filtering assertions Addresses code review comments from @regulartim
- Move IocRepository import from UpdateScores.__init__ to module level - Consistent with maintainer's feedback on keeping all imports at top - No circular dependency issue, safe to move
|
Hi @regulartim, Thank you for the feedback! I've thoroughly analyzed the codebase to identify any unused functions/methods that may have become redundant after the refactoring. Here's what I found: Analysis SummaryI checked all functions/methods across the refactored files:
|
|
My bad, I thought of |
regulartim
left a comment
There was a problem hiding this comment.
Very nice! I really appreciate your work.
Description
This PR refactors the scoring module to use the repository pattern, making it consistent with the extraction pipeline and significantly improving testability.
Related issues
Addresses Phase 1 of #633
Type of change
Changes Made
1. Extended IocRepository with scoring-specific methods
Added four new methods to greedybear/cronjobs/repositories/ioc.py:
2. Refactored UpdateScores class
File: greedybear/cronjobs/scoring/scoring_jobs.py
ioc_repoparameter (backward compatible withioc_repo=None)IOC.objects.filter()calls withioc_repo.get_scanners_for_scoring()IOC.objects.bulk_update()withioc_repo.bulk_update_scores()Qimport3. Updated utility functions
File: greedybear/cronjobs/scoring/utils.py
Q,F,ArrayAgg, IOC4. Comprehensive test coverage
File: tests/test_repositories.py
Benefits
ioc_repo=Nonefallback)Checklist
develop.Ruff) gave 0 errors. Pre-commit hooks passed successfully.