Improve API performance. Closes #394#397
Improve API performance. Closes #394#397regulartim merged 18 commits intoGreedyBear-Project:developfrom
Conversation
…x. This commit will be included in a future PR. This reverts commit ca95b9a.
… class with custom function
888b400 to
6461156
Compare
| VALID_ATTACK_TYPES = {"scanner", "payload_request", "all"} | ||
| VALID_AGES = {"persistent", "recent"} | ||
| VALID_FORMATS = {"csv", "json", "txt"} |
There was a problem hiding this comment.
ideally it would be better to have these values separated in a consts.py file so other modules can leverage it without importing all the stuff here.
There was a problem hiding this comment.
I removed them completely as they are not needed anymore.
| @cache | ||
| def feed_type_validation(feed_type: str) -> bool: | ||
| general_honeypots = GeneralHoneypot.objects.all().filter(active=True) | ||
| valid_types = set(["log4j", "cowrie", "all"] + [hp.name.lower() for hp in general_honeypots]) | ||
| return feed_type in valid_types |
There was a problem hiding this comment.
caching the query is a great idea, thank you!
However, this would make the output of the GeneralHoneypot.objects.all().filter(active=True) always the same, even if an admin changed something in that table. So this should not be cached in this way.
In my opinion, the better solution would be to calculate the items from GeneralHoneypots only once at the start of the available API views (feeds and feeds_pagination) and then pass the result as a parameter to the get_queryset and the feeds_response function and then, utlimately, to the Serializer for the usage.
There was a problem hiding this comment.
Problem: I am not able to pass the set of valid feed types through the Serializer to the validation function. If I understand the documentation correctly, the validation functions are build to only validate the data that was serialized by the serializer. But I do not serialize the set of valid feed types, so it ca not be passed to the validation function.
Suggestion: Instead of calling the feed_type_validation function inside the serializer class, we call it after the instantiation of the serializer in get_queryset and feeds_response. Are you happy with that?
There was a problem hiding this comment.
yep, I mean, I don't want to force any kind of specific pattern, like the DRF, because I think it is better to find a solution fast than to focus to code perfectionism because that can be always opinionable.
Another option can be using the object-level validation and extra-context. In this way you can pass all the data that you want through the serializers and with the more generic validator you can do all the stuff that you want
There was a problem hiding this comment.
The extra-context hint was what i needed! Thanks a lot! :)
| def feed_request_validation(data: dict) -> bool: | ||
| if not feed_type_validation(data["feed_type"]): | ||
| raise serializers.ValidationError(f"Invalid feed_type: {data['feed_type']}") | ||
| if data["attack_type"] not in VALID_ATTACK_TYPES: | ||
| raise serializers.ValidationError(f"Invalid attack_type: {data['attack_type']}") | ||
| if data["age"] not in VALID_AGES: | ||
| raise serializers.ValidationError(f"Invalid age: {data['age']}") | ||
| if data["format"] not in VALID_FORMATS: | ||
| raise serializers.ValidationError(f"Invalid format: {data['format']}") | ||
| return True | ||
|
|
||
|
|
||
| def ioc_validation(data: dict) -> bool: | ||
| if not feed_type_validation(data["feed_type"]): | ||
| raise serializers.ValidationError(f"Invalid feed_type: {data['feed_type']}") | ||
| if data["times_seen"] < 1: | ||
| raise serializers.ValidationError(f"Invalid value for 'times_seen'': {data['times_seen']}") | ||
| if not (data[SCANNER] or data[PAYLOAD_REQUEST]): | ||
| raise serializers.ValidationError(f"Invalid data: IOC is neither {SCANNER} nor {PAYLOAD_REQUEST}") | ||
| return True | ||
|
|
||
|
|
||
| def serialize_ioc(ioc, feed_type: str) -> dict: | ||
| data = { | ||
| "value": ioc.name, | ||
| SCANNER: ioc.scanner, | ||
| PAYLOAD_REQUEST: ioc.payload_request, | ||
| "first_seen": ioc.first_seen.strftime("%Y-%m-%d"), | ||
| "last_seen": ioc.last_seen.strftime("%Y-%m-%d"), | ||
| "times_seen": ioc.times_seen, | ||
| "feed_type": feed_type, | ||
| } | ||
| if not ioc_validation(data): | ||
| raise serializers.ValidationError(f"Unknown error while validating {data}") | ||
| return data |
There was a problem hiding this comment.
even if I am not an admirer of the Django Rest Framework at all, the custom Serializers are way more easy to read and to maintain. I would not change that stuff to simple functions.
There was a problem hiding this comment.
Ok, I'll look into it again. Maybe we can find a way to use the Django Serializers without such a massive performance loss.
There was a problem hiding this comment.
Unfortunately I have not found a possibility to do that. Since we do not pass the IOC object itself, but a dictionary into the Serializer, validation is mandatory. This seems to be expensive. I would like to add a constant that allows us to skip the serialisation via the Serializer and thus bypass the validation.
| POST = "POST" | ||
|
|
||
| FEEDS_LICENSE = "https://github.com/honeynet/GreedyBear/blob/main/FEEDS_LICENSE.md" | ||
| SKIP_FEED_VALIDATION = False |
There was a problem hiding this comment.
- if you want to dynamically change this without having to re-build the code, you can add it as an environment variable in
env_file_templateand then extract it as a Django constant insettings.py
There was a problem hiding this comment.
Great idea, thanks! :) Will do that and merge.
…ch may lead to ignoring admin changes
This reverts commit 3fbbcc0.
Description
Two changes are introduced here:
rest_framework, the request time could be reduced further to ~200ms. The downside is that some IOC properties (e.g. value, first_seen, ...) are not checked as carefully as before. But since these values come from our DB anyway, I personally think that the performance improvement is worth the inferior validation.Related issues
Type of change
Please delete options that are not relevant.
Checklist
develop.Black,Flake,Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.Important Rules