Replace regex IP validation with ipaddress stdlib in EnrichmentSerializer. Closes #881 by tanmayjoddar · Pull Request #885 · intelowlproject/GreedyBear

tanmayjoddar · 2026-02-25T22:12:22Z

Description

EnrichmentSerializer.validate() used REGEX_IP which only checked digit count per octet (\d{1,3}), not the valid 0-255 range. This caused invalid IPs like 999.999.999.999 to be accepted. Additionally, valid IPv6 addresses like 2001:db8::1 were incorrectly rejected.

Fixed by replacing the regex with ipaddress.ip_address() from Python stdlib.

Note: attempted to use the existing is_ip_address() from api/views/utils.py as suggested, but this caused a circular import since api/views/utils.py imports from api/serializers.py. Used ipaddress.ip_address() directly instead, which is the same underlying implementation. Happy to refactor if there is a preferred way to break the circular dependency.

Related issues

Closes #881

Type of change

Bug fix (non-breaking change which fixes an issue).
New feature (non-breaking change which adds functionality).
Breaking change (fix or feature that would cause existing functionality to not work as expected).
Chore (refactoring, dependency updates, CI/CD changes, code cleanup, docs-only changes).

Checklist

Formalities

I have read and understood the rules about how to Contribute to this project.
I chose an appropriate title for the pull request in the form: <feature name>. Closes #999
My branch is based on develop.
The pull request is for the branch develop.
I have reviewed and verified any LLM-generated code included in this PR.

Docs and tests

I documented my code changes with docstrings and/or comments.
I have checked if my changes affect user-facing behavior that is described in the docs. No user-facing behavior is affected.
Linter (Ruff) gave 0 errors.
I have added tests for the feature/bug I solved.
All the tests gave 0 errors.

GUI changes

No GUI changes were made.

Review process

PR was opened as ready for review after all 468 tests passed locally.

Copilot

Pull request overview

This PR fixes IP validation in EnrichmentSerializer by replacing regex-based validation (which incorrectly accepted invalid IPs like "999.999.999.999") with Python's ipaddress.ip_address() from the stdlib. The change also adds IPv6 support and properly rejects IPs with octets outside the 0-255 range.

Changes:

Replaced REGEX_IP pattern matching with ipaddress.ip_address() for proper IP validation (IPv4 and IPv6)
Enhanced domain validation to require at least one alphabetic character
Added comprehensive test coverage for invalid IPs, IPv6 addresses, and whitespace handling

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
api/serializers.py	Replaced regex IP validation with `ipaddress.ip_address()`, added `.strip()` for whitespace handling, enhanced domain validation logic, removed unused `REGEX_IP` import
tests/api/views/test_enrichment_view.py	Added 4 new test cases covering invalid IPs with out-of-range octets (999.999.999.999, 256.1.1.1), valid IPv6 addresses, and whitespace handling

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-25T22:17:09Z

api/serializers.py

+        observable = data["query"].strip()
+
+        try:
+            ipaddress.ip_address(observable)
+            is_ip = True
+        except ValueError:
+            is_ip = False
+
+        # A valid domain must match the domain regex AND contain at least one
+        is_domain = bool(re.match(REGEX_DOMAIN, observable)) and any(c.isalpha() for c in observable)
+
+        if not is_ip and not is_domain:
+            raise serializers.ValidationError("Observable is not a valid IP address or domain")
+
        try:
            required_object = IOC.objects.get(name=observable)


The stripped observable is validated but not written back to data["query"]. This means whitespace in the query parameter will be preserved in the database lookup and potentially stored. You should add data["query"] = observable after line 42 to ensure the stripped value is used consistently throughout the validation and database operations.

Copilot · 2026-02-25T22:17:10Z

api/serializers.py

+        except ValueError:
+            is_ip = False
+
+        # A valid domain must match the domain regex AND contain at least one


The comment on line 50 is incomplete: "A valid domain must match the domain regex AND contain at least one". It should complete the thought, e.g., "A valid domain must match the domain regex AND contain at least one alphabetic character".

Suggested change

# A valid domain must match the domain regex AND contain at least one

# A valid domain must match the domain regex AND contain at least one alphabetic character

Copilot is right here, no?

Copilot · 2026-02-25T22:17:10Z

tests/api/views/test_enrichment_view.py

+    def test_query_with_whitespace(self):
+        """Query with leading/trailing whitespace should be handled"""
+        response = self.client.get("/api/enrichment?query=  192.168.0.1  ")
+        self.assertEqual(response.status_code, 200)


The new validation logic adds a requirement that domains must contain at least one alphabetic character (line 51 in api/serializers.py). However, there are no new tests verifying that valid domains still work correctly after this change. Consider adding a test case that queries for a valid domain (e.g., "malicious.example.com" from the test fixtures) to ensure domain validation hasn't regressed.

regulartim

Hey @tanmayjoddar ! We need to make some changes here.

I'll use the existing is_ip_address() from api/views/utils.py directly instead of reimplementing.

You did not do that, why?

Also, you did not use our PR template and you did not follow the convention for PR titles. Please make sure you use the PR template and maybe read the contribution guidelines again.

tanmayjoddar · 2026-02-26T12:04:21Z

Thanks for the review @regulartim!

I initially tried importing is_ip_address() directly from api.views.utils as you suggested. However, it causes a circular import:

api/serializers.py → from api.views.utils import is_ip_address

api/views/utils.py (line 19) → from api.serializers import FeedsRequestSerializer

This raises:

ImportError: cannot import name 'is_ip_address' from partially initialized module 'api.views.utils'

I used ipaddress.ip_address() directly as a workaround since it's the same underlying call that is_ip_address() makes.

I have also updated the PR title and body to follow the contribution guidelines and PR template. Sorry for missing that initially.

regulartim

Thanks for the clarification! 👍

I initially tried importing is_ip_address() directly from api.views.utils as you suggested. However, it causes a circular import

Then we should move the function to a different file I guess. What do you think about creating a greedybear/utils.py file and populate it with this function? We might find other shared functions that also should live there.

regulartim · 2026-02-26T21:06:46Z

api/serializers.py

+        except ValueError:
+            is_ip = False
+
+        # A valid domain must match the domain regex AND contain at least one


Copilot is right here, no?

tanmayjoddar · 2026-02-27T07:12:57Z

Agreed, utils.py makes sense as a dependency-free home for shared validation helpers.
Looking at the codebase, is_sha256hash() is in the same situation as is_ip_address() — both are pure stdlib wrappers with no dependency on Django views, serializers, or models, but used across multiple layers:
-command_sequence.py— uses both
-cowrie_session.py — uses both
-erializers.py — needs is_ip_address() (this PR)
-test_validation_helpers.py — tests both
I'll move both to utils.py and update all import sites accordingly.

--Also fixed the incomplete comment flagged by Copilot in the latest push.

regulartim

I'll move both to utils.py and update all import sites accordingly.

You did not do that yet. Do you want to do it here or in a separate PR?

tanmayjoddar · 2026-02-27T07:26:49Z

I'll do it in this PR. Working on it now.

tanmayjoddar · 2026-02-27T07:43:36Z

Done — created greedybear/utils.py with both is_ip_address() and is_sha256hash(), updated all import sites, and removed the moved functions from api/views/utils.py. Ready for re-review.

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-27T07:44:01Z

api/serializers.py

+        data["query"] = observable
+
+        # A valid domain must match the domain regex AND contain at least one alphabetic character
+        is_domain = bool(re.match(REGEX_DOMAIN, observable)) and any(c.isalpha() for c in observable)


The domain validation logic with the alphabetic character requirement is not covered by tests in the enrichment endpoint. While the IP validation tests are comprehensive, consider adding a test for valid domain queries (e.g., querying "malicious.example.com" which exists in test data) to ensure this validation logic works correctly.

…izer. Closes intelowlproject#881 - Create greedybear/utils.py with shared validation helpers (is_ip_address, is_sha256hash) - Replace REGEX_IP with is_ip_address() for proper IPv4/IPv6 validation - Add .strip() with write-back for whitespace handling - Improve domain validation to require at least one alphabetic character - Update all import sites to use greedybear.utils - Remove moved functions and unused imports from api/views/utils.py - Add regression tests for invalid IPs, valid IPv6, whitespace, and domains

regulartim

Good job, thanks! :)

…izer. Closes intelowlproject#881 (intelowlproject#885) - Create greedybear/utils.py with shared validation helpers (is_ip_address, is_sha256hash) - Replace REGEX_IP with is_ip_address() for proper IPv4/IPv6 validation - Add .strip() with write-back for whitespace handling - Improve domain validation to require at least one alphabetic character - Update all import sites to use greedybear.utils - Remove moved functions and unused imports from api/views/utils.py - Add regression tests for invalid IPs, valid IPv6, whitespace, and domains

Copilot AI review requested due to automatic review settings February 25, 2026 22:12

Copilot started reviewing on behalf of tanmayjoddar February 25, 2026 22:12 View session

Copilot AI reviewed Feb 25, 2026

View reviewed changes

tanmayjoddar mentioned this pull request Feb 25, 2026

fix(api): EnrichmentSerializer accepts invalid IPs like 999.999.999.999 #881

Closed

regulartim reviewed Feb 26, 2026

View reviewed changes

tanmayjoddar changed the title ~~fix(api): replace regex IP validation with ipaddress stdlib in EnrichmentSerializer~~ fix(api): use ipaddress stdlib for IP validation in EnrichmentSerializer. Closes #881 Feb 26, 2026

tanmayjoddar changed the title ~~fix(api): use ipaddress stdlib for IP validation in EnrichmentSerializer. Closes #881~~ Replace regex IP validation with ipaddress stdlib in EnrichmentSerializer. Closes #881 Feb 26, 2026

regulartim requested changes Feb 26, 2026

View reviewed changes

tanmayjoddar force-pushed the fix/enrichment-ip-validation-881-v2 branch from e9bc9a6 to fa36d1b Compare February 27, 2026 07:05

tanmayjoddar requested a review from regulartim February 27, 2026 07:17

regulartim reviewed Feb 27, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings February 27, 2026 07:39

tanmayjoddar force-pushed the fix/enrichment-ip-validation-881-v2 branch from fa36d1b to 863009e Compare February 27, 2026 07:39

Copilot started reviewing on behalf of tanmayjoddar February 27, 2026 07:39 View session

tanmayjoddar requested a review from regulartim February 27, 2026 07:43

Copilot AI reviewed Feb 27, 2026

View reviewed changes

tanmayjoddar force-pushed the fix/enrichment-ip-validation-881-v2 branch from 863009e to 69d2575 Compare February 27, 2026 07:50

regulartim approved these changes Feb 27, 2026

View reviewed changes

regulartim merged commit d68c85c into intelowlproject:develop Feb 27, 2026
5 checks passed

	# A valid domain must match the domain regex AND contain at least one
	# A valid domain must match the domain regex AND contain at least one alphabetic character

Uh oh!

Conversation

tanmayjoddar commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related issues

Type of change

Checklist

Formalities

Docs and tests

GUI changes

Review process

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

regulartim Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

regulartim left a comment

Choose a reason for hiding this comment

Uh oh!

tanmayjoddar commented Feb 26, 2026

Uh oh!

regulartim left a comment

Choose a reason for hiding this comment

Uh oh!

regulartim Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

tanmayjoddar commented Feb 27, 2026

Uh oh!

regulartim left a comment

Choose a reason for hiding this comment

Uh oh!

tanmayjoddar commented Feb 27, 2026

Uh oh!

tanmayjoddar commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

Uh oh!

regulartim left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tanmayjoddar commented Feb 25, 2026 •

edited

Loading

tanmayjoddar commented Feb 27, 2026 •

edited

Loading