refactor:honeypot extraction using DB-driven exclusion. closes #631 #670

Merged
regulartim merged 11 commits into GreedyBear-Project:develop from drona-gyawali:refactor/no-hardcode-val
Jan 8, 2026

@drona-gyawali drona-gyawali commented Jan 2, 2026

Description

This PR refactors the is_ready_for_extraction method to respect the GeneralHoneypot.active flag, making honeypot exclusion fully database-driven. No honeypots are hardcoded, and no migrations were added as discussed previously.

Changes

  • Normalizes honeypot names for cache lookup while preserving original casing in the DB.
  • Uses name__iexact for case-insensitive DB lookup.
  • Dynamically creates new honeypots with active=True.
  • Updates _honeypot_cache to reflect the DB active state.
  • Extraction pipeline now automatically skips honeypots where active=False.
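
The flow described in the list above can be pictured with a small, framework-free sketch. It is not the code from this PR: `Honeypot`, `FakeHoneypotTable` and `HoneypotRegistry` are hypothetical stand-ins for the `GeneralHoneypot` model and the extraction class, `ExamplePot` is an invented name, and the list scan only imitates what a `name__iexact` lookup would do in Django.

```python
from dataclasses import dataclass, field


@dataclass
class Honeypot:
    """Hypothetical stand-in for a GeneralHoneypot row."""
    name: str
    active: bool = True


@dataclass
class FakeHoneypotTable:
    """In-memory imitation of the DB table (assumption, not the real model)."""
    rows: list = field(default_factory=list)

    def get_iexact(self, name):
        # Imitates a case-insensitive lookup such as name__iexact.
        for row in self.rows:
            if row.name.lower() == name.lower():
                return row
        return None

    def create(self, name, active=True):
        row = Honeypot(name=name, active=active)
        self.rows.append(row)
        return row


class HoneypotRegistry:
    """Sketch of the lookup flow: normalized cache first, then the table."""

    def __init__(self, table):
        self.table = table
        self._honeypot_cache = {}  # normalized name -> active flag

    @staticmethod
    def normalize(name):
        return name.strip().lower()

    def is_ready_for_extraction(self, name):
        key = self.normalize(name)
        if key not in self._honeypot_cache:
            row = self.table.get_iexact(name.strip())
            if row is None:
                # Unknown honeypots are created as active, keeping the original casing.
                row = self.table.create(name.strip(), active=True)
            self._honeypot_cache[key] = row.active
        return self._honeypot_cache[key]


table = FakeHoneypotTable([Honeypot("Cowrie"), Honeypot("ExamplePot", active=False)])
registry = HoneypotRegistry(table)
```

In this sketch `registry.is_ready_for_extraction("cowrie")` finds the row stored as "Cowrie", while a lookup for the inactive `ExamplePot` is skipped regardless of casing.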

Notes / Observations

  • The code assumes _honeypot_cache always uses normalized keys. If the cache is populated elsewhere with original casing, it could cause a cache miss and fall back to the DB. This was already the case before this change.
  • Currently, the GeneralHoneypot.name field is not unique at the DB level. I didn’t change this, but we might consider adding a unique constraint. Am I missing anything here?
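
To make the missing-uniqueness concern concrete, here is a minimal sketch of a check for names that collide once casing is ignored. The function `find_case_collisions` and the sample names are hypothetical and not part of GreedyBear; the sketch only illustrates what a case-insensitive unique constraint would rule out.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that become identical once lowercased; return only real collisions."""
    groups = defaultdict(list)
    for name in names:
        groups[name.strip().lower()].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# Sample input is invented for illustration.
collisions = find_case_collisions(["Cowrie", "cowrie", "Log4Pot", "ExamplePot"])
```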

Related issues

closes #631

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue).
  • New feature (non-breaking change which adds functionality).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist

  • I have read and understood the rules about how to Contribute to this project.
  • The pull request is for the branch develop.
  • I have added documentation of the new features.
  • Linters (Black, Flake, Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved. All the tests (new and old ones) gave 0 errors.
  • If changes were made to an existing model/serializer/view, the docs were updated and regenerated (check CONTRIBUTE.md).
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.

Important Rules

  • If you fail to complete the Checklist properly, your PR won't be reviewed by the maintainers.
  • If your changes decrease the overall test coverage (you will know after the Codecov CI job is done), you should add the required tests to fix the problem.
  • Every time you make changes to the PR and you think the work is done, you should explicitly ask for a review. After the PR has been reviewed and you have received a "change request", you should explicitly ask for a review again once you have made the requested changes.

@drona-gyawali
Contributor Author

Hi @regulartim,
I’ve refactored is_ready_for_extraction to fully respect GeneralHoneypot.active, and all tests are passing.

Could you clarify the plan for setting up the initial honeypot exclusion list? Will it be handled via a migration file, raw SQL, the admin interface, or some other method? I want to make sure our implementation aligns with the intended workflow.

@regulartim regulartim marked this pull request as ready for review January 2, 2026 19:02
Collaborator

@regulartim regulartim left a comment

I think your approach makes matters more complicated than they were before: it has a cache that contains the not-normalized honeypot names as keys (they usually start with an upper case character) so your first cache lookup always fails and, in the next step, you write a cache entry for the normalized name. Am I right?
What do you think about only writing normalized keys to the cache? That would make things easier, right?
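
One way to read this suggestion: normalize on every write and on every read, so the two can never disagree. The `NormalizedCache` class below is a hypothetical illustration of that idea, not GreedyBear code, and `ExamplePot` is an invented name.

```python
class NormalizedCache:
    """Dict-like cache whose keys are always stored in normalized form."""

    def __init__(self, initial=None):
        self._data = {}
        for name, active in (initial or {}).items():
            self[name] = active  # goes through __setitem__, so keys get normalized

    @staticmethod
    def _normalize(name):
        return name.strip().lower()

    def __setitem__(self, name, active):
        self._data[self._normalize(name)] = active

    def __contains__(self, name):
        return self._normalize(name) in self._data

    def get(self, name, default=None):
        return self._data.get(self._normalize(name), default)

    def keys(self):
        return list(self._data)


# Names arrive with their original casing; only normalized keys are stored.
cache = NormalizedCache({"Cowrie": True, "Log4Pot": True})
cache["ExamplePot"] = False
```

Because normalization lives inside the cache, callers cannot accidentally write a key in original casing.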

@regulartim
Collaborator

Could you clarify the plan for setting up the initial honeypot exclusion list? Will it be handled via a migration file, raw SQL, the admin interface, or some other method? I want to make sure our implementation aligns with the intended workflow.

Yep, will write that into the issue. 👍

@regulartim
Collaborator

Is this ready to get reviewed again?

@drona-gyawali
Contributor Author

Is this ready to get reviewed again?

Hi @regulartim,

Yes, it’s ready for review again.

I added a migration file for the initial honeypot setup. In the migration, I’m using a try/except (get → create) pattern intentionally, since it’s limited strictly to the migration and only used for the one-time initial setup.
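
The get → create pattern mentioned here can be sketched without Django as follows. This is not the migration from the PR: `DoesNotExist`, `FakeManager` and `seed_honeypots` are hypothetical stand-ins, and the honeypot names and flags are invented. The point is only that existing rows are left untouched, so running the setup twice is harmless.

```python
class DoesNotExist(Exception):
    """Stand-in for the exception a Django model raises from .get()."""


class FakeManager:
    """Tiny in-memory imitation of a model manager (assumption for this sketch)."""

    def __init__(self):
        self.rows = {}  # name -> active flag

    def get(self, name):
        if name not in self.rows:
            raise DoesNotExist(name)
        return self.rows[name]

    def create(self, name, active):
        self.rows[name] = active
        return active


def seed_honeypots(manager, initial):
    """Create each (name, active) entry only if it is missing; return created names."""
    created = []
    for name, active in initial:
        try:
            manager.get(name)  # already present: leave the existing row untouched
        except DoesNotExist:
            manager.create(name, active)
            created.append(name)
    return created


manager = FakeManager()
manager.create("ExamplePotA", True)  # pretend this row existed before the migration
initial = [("ExamplePotA", False), ("ExamplePotB", False)]
first_run = seed_honeypots(manager, initial)
second_run = seed_honeypots(manager, initial)
```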

Runtime extraction now relies solely on the normalized cache, with no DB creation or fallback logic involved.

Please let me know if you’d like any adjustments.

@regulartim
Collaborator

Ah, I didn't notice it was ready. You can click "re-request review" next time so that I get notified. I'll take a look now.

Collaborator

@regulartim regulartim left a comment

It seems like we are having some communication issues. If you're not sure what to do, please thoroughly read the issue and the comments in the PR again. If you still have questions or things are unclear, please ask (in the issue or the PR, as you like).

Collaborator

@regulartim regulartim left a comment

Looks good. Thanks for your work. One last thing that I am concerned about:
Say we have a honeypot named "Cowrie" in our database. Now for some odd reason we extract an event from T-Pot where the name of the honeypot is lower case "cowrie". What do you think would happen then? Can we cover this with one or more tests?

@drona-gyawali
Contributor Author

Looks good. Thanks for your work. One last thing that I am concerned about:
Say we have a honeypot named "Cowrie" in our database. Now for some odd reason we extract an event from T-Pot where the name of the honeypot is lower case "cowrie". What do you think would happen then? Can we cover this with one or more tests?

Thanks for the review!

I’ve added three tests that together cover the case-insensitive handling of honeypot names:

  1. Ensures an enabled honeypot like "Cowrie" works even when called as lowercase "cowrie".
  2. Ensures a disabled honeypot returns False even with a lowercase lookup.
  3. Confirms that special honeypots (Cowrie, Log4Pot) are always enabled and normal honeypots respect their active status, all in a case-insensitive manner.

I hope these tests address the concern about case-insensitive extraction from T-Pot events.
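
For readers without the diff at hand, the three cases listed above might look roughly like this. It is a simplified sketch, not the tests added in this PR: `build_cache` and `is_ready_for_extraction` are hypothetical stand-ins for the real method, the names `ExamplePot` and `DisabledPot` are invented, and treating a name missing from the cache as not ready is an assumption.

```python
import unittest

# Assumption: these two are treated as always enabled, as item 3 describes.
SPECIAL_HONEYPOTS = {"cowrie", "log4pot"}


def build_cache(db_rows):
    """Map normalized names to the active flag; db_rows holds (name, active) pairs."""
    return {name.strip().lower(): active for name, active in db_rows}


def is_ready_for_extraction(name, cache):
    """Hypothetical stand-in for the method under test."""
    key = name.strip().lower()
    if key in SPECIAL_HONEYPOTS:
        return True
    # Assumption: a name missing from the cache is treated as not ready.
    return cache.get(key, False)


class CaseInsensitiveExtractionTest(unittest.TestCase):
    def setUp(self):
        # Names keep their original casing in the "DB"; only cache keys are normalized.
        self.cache = build_cache([("ExamplePot", True), ("DisabledPot", False)])

    def test_enabled_honeypot_matches_lowercase_name(self):
        self.assertTrue(is_ready_for_extraction("examplepot", self.cache))

    def test_disabled_honeypot_stays_disabled_in_lowercase(self):
        self.assertFalse(is_ready_for_extraction("disabledpot", self.cache))

    def test_special_honeypots_ignore_casing(self):
        for name in ("Cowrie", "cowrie", "LOG4POT"):
            self.assertTrue(is_ready_for_extraction(name, self.cache))
        # Normal honeypots still follow their active flag in any casing.
        self.assertTrue(is_ready_for_extraction("EXAMPLEPOT", self.cache))
        self.assertFalse(is_ready_for_extraction("DISABLEDPOT", self.cache))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CaseInsensitiveExtractionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```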

Collaborator

@regulartim regulartim left a comment

We're getting close! 👍

@regulartim regulartim merged commit fc5b5f1 into GreedyBear-Project:develop Jan 8, 2026
5 checks passed
@regulartim
Collaborator

Thanks for your work! :) As a follow up, would you like to open a new issue describing the problem of the not-normalized honeypot names in the DB and how to handle this?

@drona-gyawali
Contributor Author

Thanks for your work! :) As a follow up, would you like to open a new issue describing the problem of the not-normalized honeypot names in the DB and how to handle this?

Thanks for the review and merge!
I’ll open a new issue describing the problem and the approach to handle this.
