ARC handler emits lots of false positives due to bad handler/yara rule combination. #129

@qkaiser

The ARC handler matches on the following YARA rule:

$arc_magic = /\x1A[\x00-\x07][\S]{12}[\x00|\xf0-\xff][\x00-\xff]{4}[\x00-\x8d][\x00-\x8f][\x00-\xc7][\x00-\x9f]/

And then does this:

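offset = start_offset
while True:
    file.seek(offset)
    read_bytes = file.read(2)

    # End-of-archive marker: stop here and count its two bytes in the chunk.
    if read_bytes == b"\x1A\x00":
        offset += 2
        break
    # Otherwise parse an entry header and skip over the header and its payload.
    file.seek(offset)
    header = self.parse_header(file)
    offset += len(header) + header.size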

I identified the issue by generating random ARC files. If the header starts with \x1A\x00, the loop breaks on its first iteration, before any header is parsed, so the header object is never set. As a result, we generate lots of false-positive ARC chunks of length 2.
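The failure mode can be reproduced in isolation. The following is a minimal sketch, not the handler's actual code: `find_chunk_end` and the `parse_header` callback (assumed here to return a header length and a payload size) are hypothetical stand-ins for the loop shown above.

```python
import io

END_MARKER = b"\x1A\x00"


def find_chunk_end(file, start_offset, parse_header):
    """Walk entries until the end marker; return (end_offset, last_header).

    parse_header is a hypothetical callback returning
    (header_length, payload_size) for the entry at the current position.
    """
    offset = start_offset
    header = None
    while True:
        file.seek(offset)
        if file.read(2) == END_MARKER:
            offset += 2
            break
        file.seek(offset)
        header = parse_header(file)
        header_length, payload_size = header
        offset += header_length + payload_size
    return offset, header


def never_called(file):
    # If the loop reaches this, the premise of the bug report is wrong.
    raise AssertionError("parse_header should not be reached")


# A buffer that the original YARA rule accepts but that starts with the end marker.
data = io.BytesIO(b"\x1A\x00" + b"A" * 12 + bytes(9))
end_offset, header = find_chunk_end(data, 0, never_called)
chunk_length = end_offset - 0
```

Here the loop exits on its first iteration, so `header` stays `None` and the reported chunk is just the two marker bytes.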

We can fix it by changing the YARA rule to this one:

$arc_magic = /\x1A[\x01-\x07][\S]{12}[\x00|\xf0-\xff][\x00-\xff]{4}[\x00-\x8d][\x00-\x8f][\x00-\xc7][\x00-\x9f]/

We won't lose anything: a second byte set to "\x00" in the ARC header means "no compression", and we should only use "\x1A\x00" to match the end of the archive. Interestingly, the false positives disappear once we do so. Magic 🌟
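The difference between the two rules can be checked quickly with Python's `re` module on bytes. This is only a sketch: it assumes Python's regex engine treats these patterns the same way YARA does, and the sample buffers are made up for illustration.

```python
import re

# The tail of the pattern is identical in both rules.
TAIL = rb"[\S]{12}[\x00|\xf0-\xff][\x00-\xff]{4}[\x00-\x8d][\x00-\x8f][\x00-\xc7][\x00-\x9f]"

OLD_RULE = re.compile(rb"\x1A[\x00-\x07]" + TAIL)
NEW_RULE = re.compile(rb"\x1A[\x01-\x07]" + TAIL)


def make_sample(second_byte: int) -> bytes:
    """Build a 23-byte buffer shaped like an ARC header (illustrative only)."""
    return b"\x1A" + bytes([second_byte]) + b"A" * 12 + bytes(9)


starts_with_end_marker = make_sample(0x00)
regular_entry = make_sample(0x02)

old_hits_marker = OLD_RULE.match(starts_with_end_marker) is not None
new_hits_marker = NEW_RULE.match(starts_with_end_marker) is not None
old_hits_entry = OLD_RULE.match(regular_entry) is not None
new_hits_entry = NEW_RULE.match(regular_entry) is not None
```

Only the original rule matches the buffer that begins with `\x1A\x00`; both rules still match a buffer whose second byte is in the `\x01-\x07` range.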
