Following up on #624, the new way of extracting hits from Elasticsearch might be too memory intensive.
https://github.com/intelowlproject/GreedyBear/blob/d5a9906da5cd3ebf293ffedc77f466400cf0b1be/greedybear/cronjobs/repositories/elastic.py#L74-L79
In line 74, the `list` constructor is called on `search.scan()`. The `scan` method returns an iterator over all hits from the requested time span. But because we want to cache the hits, we have to create a list from the iterator, which means that all hits are held in system memory at once. This might be an issue when
a) the time span is very long (e.g. 3 days on the initial extraction run) and
b) the T-Pot instance records a high number of attacks
My only T-Pot instance has very few active honeypots and runs on a residential internet connection, so I only see ~25,000 honeypot hits a day, which is not a problem. It would be nice if someone could test this on a GreedyBear instance that gets attacked more frequently.
Currently this code is only in `develop`, and before it gets merged we should make sure that its memory usage stays acceptable under heavier load.
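
For anyone who wants a rough feel for the difference before testing on a real instance, here is a minimal sketch that compares peak memory when an iterator is turned into a list with peak memory when it is consumed in chunks. It does not talk to Elasticsearch: `fake_hits` is a hypothetical stand-in for `search.scan()`, the hit fields are made up, and real hits are much larger than these small dicts, so the absolute numbers are not representative. Chunking is also only shown for comparison; it would not satisfy the caching requirement by itself.

```python
import tracemalloc
from itertools import islice


def fake_hits(n):
    """Hypothetical stand-in for search.scan(): lazily yields one dict per hit."""
    for i in range(n):
        yield {"src_ip": f"10.0.{i % 256}.{i % 251}", "type": "Cowrie", "dest_port": 22}


def peak_memory(func, *args):
    """Run func and return (result, peak number of bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_materialized(n):
    """Mirror the list(search.scan()) pattern: all hits are in memory at once."""
    hits = list(fake_hits(n))
    return len(hits)


def count_chunked(n, chunk_size=1000):
    """Consume the iterator in fixed-size chunks; only one chunk is alive at a time."""
    iterator = fake_hits(n)
    total = 0
    while chunk := list(islice(iterator, chunk_size)):
        total += len(chunk)
    return total


n = 75_000  # roughly 3 days at ~25,000 hits per day
count_a, peak_list = peak_memory(count_materialized, n)
count_b, peak_chunked = peak_memory(count_chunked, n)
print(f"list(): {peak_list / 1e6:.1f} MB, chunked: {peak_chunked / 1e6:.1f} MB")
```

On a real instance, the same `peak_memory` helper could wrap the actual extraction call to see how the peak grows with the time span.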