regex – A Passionate Techie

OSSEC – Custom rules examples

Silencing certain rules

<rule id="100030" level="0">
  <if_sid>503,502</if_sid>
  <description>List of rules to be ignored.</description>
</rule>

OSSEC will not produce an alert when rule 502 or 503 is triggered; level="0" tells OSSEC to ignore the match.

Ignore alerts if a rule is triggered by a certain IP

<rule id="100225" level="0">
  <if_sid>40101</if_sid>
  <srcip>127.0.0.1</srcip>
  <description>Ignore this</description>
</rule>

If rule 40101 is triggered by 127.0.0.1, do not produce any alert.

Ignore alert if the log contains certain strings

<rule id="100223" level="0">
  <if_sid>1002</if_sid>
  <match>terrorist|terror|femmefatale|heart-attack</match>
  <description>Ignore 1002 false positive</description>
</rule>

OSSEC uses OS_Match (sregex) syntax in <match>.

Ignore alert if the log contains certain strings (using regex)

<rule id="100207" level="4">
  <if_sid>1002,1003</if_sid>
  <regex>^WordPress database error You have an error in your SQL syntax(\.*)functionName$</regex>
  <description>Unescaped SQL query, known issue</description>
</rule>

OSSEC uses OS_Regex (regex) syntax in <regex>.

Trigger a custom rule when a certain field matches a value in a cdb list

<rule id="100215" level="5">
  <if_sid>31101</if_sid>
  <list lookup="match_key" field="url">rules/badurl</list>
  <description>URL is in badurl</description>
</rule>

Trigger a custom rule when a certain rule fires x times within n seconds from the same srcip

<rule id="100216" level="10" frequency="4" timeframe="90">
  <if_matched_sid>100215</if_matched_sid>
  <same_source_ip />
  <description>Multiple badurl access </description>
  <description>from same source ip.</description>
  <group>web_scan,recon,</group>
</rule>
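
The frequency/timeframe idea in the rule above can be pictured as a sliding-window counter per source IP. The following Python sketch is purely conceptual: the class name `FrequencyTracker` and the sample IP addresses are made up for illustration, and the exact number of matches OSSEC needs before firing may differ from the naive count used here.

```python
from collections import defaultdict, deque


class FrequencyTracker:
    """Count matches per source IP inside a sliding time window (conceptual model)."""

    def __init__(self, frequency, timeframe):
        self.frequency = frequency  # matches needed inside the window
        self.timeframe = timeframe  # window length in seconds
        self._hits = defaultdict(deque)  # srcip -> timestamps of recent matches

    def record(self, srcip, timestamp):
        """Record one match; return True once enough matches fall inside the window."""
        hits = self._hits[srcip]
        hits.append(timestamp)
        # Drop matches that are older than the timeframe.
        while hits and timestamp - hits[0] > self.timeframe:
            hits.popleft()
        return len(hits) >= self.frequency


# Hypothetical values mirroring frequency="4" timeframe="90" from the rule.
tracker = FrequencyTracker(frequency=4, timeframe=90)
results = [tracker.record("203.0.113.5", t) for t in (0, 10, 20, 30)]
print(results)
```

Matches from a different source IP are counted separately, which is the role `<same_source_ip />` plays in the rule.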

Overriding rules

<rule id="1003" level="13" overwrite="yes" maxsize="2000">
  <description>Non standard syslog message (size too large).</description>
</rule>

The original rule 1003 has 1025 as its maxsize. Using overwrite="yes" makes OSSEC overwrite the corresponding fields of the original rule.

Custom rule group

<group name="app_error">
  <rule id="100207" level="4">
    <if_sid>1002,1003</if_sid>
    <regex>^WordPress database error You have an error in your SQL syntax(\.*)functionName$</regex>
    <description>Unescaped SQL query, known issue</description>
  </rule>

  <rule id="100218" level="0">
    <if_sid>1003</if_sid>
    <match>WUID | WTB</match>
    <description>ignorance is bliss</description>
  </rule>
</group>

Writing OSSEC Custom Rules, Decoders and Alerts

OSSEC (http://www.ossec.net) is an open-source host-based intrusion detection system (HIDS). OSSEC can be used to monitor your local files and logs to check for intrusions, alert you of rootkit installation and do file integrity checking. OSSEC is a wonderful tool because it is highly customizable. By default, OSSEC monitors many of the programs commonly installed on a machine, but its real power comes from the ability of system administrators to customize OSSEC. By writing custom rules and decoders, you can allow OSSEC to parse through non-standard log files and generate alerts based on custom criteria. This allows OSSEC to monitor custom applications and provide intrusion detection services that might otherwise not be available, or would have to be developed on a per-application basis.

OSSEC rules are based on log file parsing. The log files that OSSEC monitors are specified in the /var/ossec/etc/ossec.conf file in the following format:

  <!-- Files to monitor (localfiles) -->

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/messages</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/secure</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/maillog</location>
  </localfile>

  <localfile>
    <log_format>apache</log_format>
    <location>/var/log/httpd/error_log</location>
  </localfile>

  <localfile>
    <log_format>apache</log_format>
    <location>/var/log/httpd/access_log</location>
  </localfile>

Each monitored file depends on a “decoder”, a set of regular expressions used to break each log entry into fields such as the source IP, the time, and the actual log message. The decoders are stored in /var/ossec/etc/decoder.xml. The following is an extract of the SSH portion of decoder.xml:

<decoder name="sshd">
  <program_name>^sshd</program_name>
</decoder>

<decoder name="sshd-success">
  <parent>sshd</parent>
  <prematch>^Accepted</prematch>
  <regex offset="after_prematch">^ \S+ for (\S+) from (\S+) port </regex>
  <order>user, srcip</order>
  <fts>name, user, location</fts>
</decoder>

<decoder name="ssh-denied">
  <parent>sshd</parent>
  <prematch>^User \S+ from </prematch>
  <regex offset="after_parent">^User (\S+) from (\S+) </regex>
  <order>user, srcip</order>
</decoder>

You can see that decoder.xml parses log entries using regular expression pattern matching. This means that you can add additional files to the list of those OSSEC checks if you would like. You’ll also note that the decoders in decoder.xml are nested: the <parent> tag makes one decoder a child of another, and a decoder with a “parent” only attempts to match if the parent decoder matched successfully. Using the <order> tag, you can populate OSSEC’s predefined variables with the portions of the log entry captured by the regex. The following variables are supported:

location
hostname
log_tag
srcip
dstip
srcport
dstport
protocol
action
user
dstuser
id
command
url
data

Supposing you have a log file produced by an application that isn’t covered by the default decoders, you could write your own decoder and parsing rules. Unfortunately, OSSEC only supports logs in the formats syslog, snort-full, snort-fast, squid, iis, eventlog, mysql_log, postgresql_log, nmapg or apache. Therefore any custom logging you write must conform to one of these formats. Syslog is probably the easiest to use, as it is designed to handle any one-line log entry.

Let us suppose we have a custom PHP-based application that resides in /var/www/html/custom. Our application will write one-line log entries to a file called ‘alert.log’ in the ‘logs/’ application subdirectory. This program has the following lines in example.php:

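<?php

// Log any non-numeric "id" parameter as a possible attack.
$id = isset($_GET['id']) ? $_GET['id'] : '';
$logfile = 'logs/alert.log';
if (! is_numeric($id)) {
	// "i" is minutes; "m" would print the month instead.
	$timestamp = date("Y-m-d H:i:s ");
	$ip = $_SERVER['REMOTE_ADDR'];
	$message = $timestamp . $ip . ' PHP app Attempt at non-numeric input (possible attack) detected!' . "\n";
	$log = fopen($logfile, 'a');
	if ($log !== false) {
		fwrite($log, $message);
		fclose($log);
	}
}

?>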

This would write a log file to /var/www/html/custom/logs/alert.log in the format:

2009-10-13 11:10:36 192.168.97.1 PHP app Attempt at non-numeric input (possible attack) detected!

Once this application log is set up, we need to adjust our OSSEC configuration so that it reads the new log file. This change needs to be made in the ossec.conf file on both the agent and the server. Add the following lines to /var/ossec/etc/ossec.conf to enable OSSEC to read the new log file:

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/www/html/custom/logs/alert.log</location>
  </localfile>

Once OSSEC is monitoring this file (which requires restarting OSSEC), we need an appropriate decoder. Make this change on the OSSEC server by adding the decoder to /var/ossec/etc/local_decoder.xml. By default OSSEC reads only two decoder files: decoder.xml and local_decoder.xml. Because decoder.xml can be overwritten during upgrades, put all custom decoders in local_decoder.xml. A decoder for this format is quite simple:

<!-- Custom decoder for example -->
<decoder name="php-app">
  <prematch>^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d</prematch>
</decoder>

<decoder name="php-app-alert">
  <parent>php-app</parent>
  <regex offset="after_parent">^ (\d+.\d+.\d+.\d+) PHP app</regex>
  <order>srcip</order>
</decoder>

What we’re doing here is telling OSSEC how to extract the IP address from the log. Each parenthesized group in the regex of the new decoder is assigned, in order, to the fields listed in the order tag. You can populate any of OSSEC’s supported variables this way, telling OSSEC how to identify them in the logs.
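
To make the parent/child flow concrete, here is a rough Python model of the two decoders. It is only a sketch: the patterns are hand-translated to Python `re` syntax (with the dots escaped), the `decode` function is a hypothetical name, and it assumes that offset="after_parent" means the child regex starts where the parent's prematch ended.

```python
import re

# Hand-translated stand-ins for the OSSEC patterns (assumed equivalents).
PREMATCH = re.compile(r"^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d")
CHILD_REGEX = re.compile(r"^ (\d+\.\d+\.\d+\.\d+) PHP app")
ORDER = ["srcip"]  # field names from the <order> tag


def decode(log_line):
    """Return the decoded fields, or None if the php-app decoder does not apply."""
    parent = PREMATCH.match(log_line)
    if parent is None:
        return None
    # after_parent: continue matching where the parent match stopped.
    rest = log_line[parent.end():]
    fields = {"decoder": "php-app"}
    child = CHILD_REGEX.match(rest)
    if child:
        # Assign each captured group, in order, to the names in ORDER.
        fields.update(zip(ORDER, child.groups()))
    return fields


line = ("2009-10-13 11:10:36 192.168.97.1 PHP app Attempt at non-numeric "
        "input (possible attack) detected!")
print(decode(line))
```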

Once we have our decoder we can write custom rules based on the log file. This is done on the OSSEC server. There are two ways to create custom rules for OSSEC. The first is to alter the ossec.conf configuration file and add a new rule file to the list. The second is to simply append your rules to the local_rules.xml rules file. Either one works, but the second makes upgrading to newer versions of OSSEC a little cleaner.

We’ll add the following group to our local_rules.xml file, found in the rules directory under the OSSEC installation root:

<group name="syslog,php-app,">
  <rule id="110000" level="0">
    <decoded_as>php-app</decoded_as>
    <description>PHP custom app group.</description>
  </rule>

  <rule id="110001" level="10">
    <if_sid>110000</if_sid>
    <srcip>127.0.0.1</srcip>
    <match>attack</match>
    <description>Possible attack from localhost?!?</description>
  </rule>
</group>

You’ll notice that we have two rules. Because rules can be nested, it is usually helpful to subdivide them into small, hierarchical pieces. In this case, one rule serves as a catch-all for our custom application alerts. After that, we can write rules for any number of circumstances and have them checked only if the parent rule matches. The child rule here fires if an entry written to the custom alert.log has the source IP 127.0.0.1 and contains the word “attack”. The rule ID is extremely important in this definition: OSSEC reserves rule IDs above 100,000 for custom rules. It is useful to develop a scheme for your new rules, for instance allocating each block of 1000 above 100,000 to a generic, catch-all rule and writing child rules in that space. This avoids intermingled rule numbers and aids long-term maintenance.

To clarify the case above, there are two rules. The first rule only fires if a log entry is “decoded_as”, that is, matched by the decoder for, “php-app”. If this decoder is used, rule 110000 is triggered. The second rule is only checked if rule 110000 is triggered, as specified in the if_sid tag. It is only triggered if the source IP, specified in the srcip tag, is equal to ‘127.0.0.1’. If so, the rule does a string match for the word “attack” in the log entry. If this match is successful, the rule triggers at level 10, as specified in the rule tag. This causes an OSSEC alert to be logged with the associated description. By default, OSSEC also attempts to e-mail alerts with level 7 or higher to the recipients specified in the ossec.conf file. As you can see, with the addition of the decoder and these rules we’ve allowed OSSEC to read our custom-format log file.
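
The same decision chain can be written out as a few lines of Python. This is a simplified model, not how OSSEC is implemented: the `evaluate` function and the event dictionary layout are assumptions made for illustration, and `<match>` is treated as a plain case-sensitive substring test.

```python
def evaluate(event):
    """Return (rule_id, level) for a decoded event, or None if no rule matches."""
    # Rule 110000: fires for anything decoded by the php-app decoder.
    if event.get("decoder") != "php-app":
        return None
    matched = (110000, 0)
    # Rule 110001: only checked because 110000 matched (the if_sid condition).
    if event.get("srcip") == "127.0.0.1" and "attack" in event.get("log", ""):
        matched = (110001, 10)
    return matched


event = {
    "decoder": "php-app",
    "srcip": "127.0.0.1",
    "log": "2009-10-13 12:10:09 127.0.0.1 PHP app Attempt to attack the host!",
}
print(evaluate(event))
```

An event from any other source IP stops at the level 0 catch-all rule, so no alert is generated for it.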

While this example may seem straightforward, writing your own decoders and rules can be maddening. Because OSSEC does not dynamically reload the XML files that define your decoders, rules, or watched files, you must restart it to propagate changes. This can be a real hassle when you’re debugging new XML rules or decoders. To avoid constantly restarting the server, you can use the ossec-logtest program, found in the bin directory of the OSSEC installation root on the OSSEC server. It lets you paste or type one log line at a time, then traces the decoders and rules that the line matches, like so:

# bin/ossec-logtest 
2009/10/13 13:30:25 ossec-testrule: INFO: Started (pid: 14330).
ossec-testrule: Type one log per line.

2009-10-13 12:10:09 127.0.0.1 PHP app Attempt to attack the host!


**Phase 1: Completed pre-decoding.
       full event: '2009-10-13 12:10:09 127.0.0.1 PHP app Attempt to attack the host!'
       hostname: 'webdev'
       program_name: '(null)'
       log: '2009-10-13 12:10:09 127.0.0.1 PHP app Attempt to attack the host!'

**Phase 2: Completed decoding.
       decoder: 'php-app'
       srcip: '127.0.0.1'

**Phase 3: Completed filtering (rules).
       Rule id: '110001'
       Level: '10'
       Description: 'Possible attack from localhost?!?'
**Alert to be generated.

Note that this program will not reload changes, but you can quit ossec-logtest, make changes to any of the XML files then restart it to test your changes. Using ossec-logtest is invaluable when trying to create new rules as it saves you the hassle of restarting the server and the hassle of actually triggering events for which you want to generate alerts.

If ossec-logtest reports something like “No decoders match”, check your decoder file. The most likely cause is a syntax error in your decoder XML.

Once testing in ossec-logtest is successful, restart your OSSEC agent and server with:

/var/ossec/bin/ossec-control restart

After this your alerting system will be in place for your custom logs just like other alerts.

How to delete a specific line from a text file in command line on Linux?

sed -i '4d' ./file

Here, -i means edit the file in-place. d is the command to “delete the pattern space; immediately start next cycle”. 4 means the 4th line.

Remove the last line:

sed '$d' filename.txt

Remove all empty lines:

sed '/^$/d' ./file

sed '/./!d' ./file

Remove lines from 7 to 9:

sed '7,9d' ./file

Remove the lines matching a regular expression REGULAR:

sed '/REGULAR/d' ./file

For a simple example, remove the lines containing “oops”:

sed '/oops/d' ./file

How to give a pattern for new line in grep?

grep patterns are matched against individual lines so there is no way for a pattern to match a newline found in the input.

However you can find empty lines like this:

grep '^$' file
grep '^[[:space:]]*$' file # include white spaces

grep for special characters in Unix

Tell grep to treat your input as a fixed string using -F option.

grep -F '*^%Q&$*&^@$&*!^@$*&^&^*&^&' application.log

Add the -n option to also print the line number:

grep -Fn '*^%Q&$*&^@$&*!^@$*&^&^*&^&' application.log

Select only regex match from a continuous string

I want to use this regex

r"Summe\d+\W\d+"

to match this string

150,90‡50,90‡8,13‡Summe50,90•50,90•8,13•Kreditkartenzahlung

but I want to only filter out this specific part

Summe50,90

I can select the entire string with this regex but I’m not sure how to filter out only the matching part

Here is the function where I am trying to get the amount from a PDF:

    def get_amount(url):
      data = requests.get(url)
      with open('/Users/derricdonehoo/code/derric-d/price-processor/exmpl.pdf', 'wb') as f:
        f.write(data.content)

      pdfFileObj = open('exmpl.pdf', 'rb')
      pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

      pageObj = pdfReader.getPage(0)
      text = pageObj.extractText().split()

      regex = re.compile(r"Summe\d+\W\d+")

      matches = list(filter(regex.search, text))
      for i in range(len(matches)):
        matchString = '\n'.join(matches)


      print(matchString)

As described above, I would like guidance on how best to filter out part of this string so that only the matching portion is returned, preferably with varying lengths of characters on either side, but that’s not a priority.

thanks!!

Solution:

This is what you want. Your regex is correct, but you must extract the match after searching for it.

  import re

  regex = re.compile(r"Summe\d+\W\d+")
  text = ["150,90‡50,90‡8,13‡Summe50,90•50,90•8,13•Kreditkartenzahlung"]

  matches = []
  for t in text:
    m = regex.search(t)
    if m:
      # group(0) is the text of the whole match
      matches.append(m.group(0))

  print(matches)

re.search returns a Match object on success, None on failure, and that object contains all the information about your matching regex. To get the whole match you call Match.group().
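
As a small extension of the answer above, a capture group can narrow the result further. This sketch assumes you might want only the amount without the “Summe” prefix, which goes slightly beyond what the question asked; the parenthesized pattern and the variable names are illustrative.

```python
import re

text = "150,90‡50,90‡8,13‡Summe50,90•50,90•8,13•Kreditkartenzahlung"

# Parentheses capture only the amount that follows "Summe".
pattern = re.compile(r"Summe(\d+\W\d+)")

m = pattern.search(text)
if m:
    whole = m.group(0)   # the entire match, including "Summe"
    amount = m.group(1)  # only the first capture group
    print(whole, amount)
```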

Use regexp in bash to obtain substring of string

I have a string like follows:

my-name-is-yes-111111.maybe.text.here?-34.34.34

I’d like to use a regular expression to capture all text before the first instance of -[0-9] so in this case I would get:

my-name-is-yes

I’m going to be porting this to ansible so it must use regexp and not sed or awk or something like that.

I’ve used sed to come up with something, but again, I need regexp:

echo $x | rev |cut -d. -f6 | rev | sed -e 's/-[0-9]*$//g'
my-name-is-yes

The issue here is that there may be more than 6 period-separated fields, so the field I would need to cut on is not fixed.

Solution:

The expression that can extract our desired output here would be as simple as:

([A-Za-z-]+)(-[0-9].+)

and our desired data is in this capturing group: ([A-Za-z-]+).

Advice

user3299633 has simplified this considerably with the following solution:

if [[ $x =~ ([[:alnum:]-]+)(-[[:digit:]].+) ]]; then echo "${BASH_REMATCH[1]}"; fi
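
Since the question mentions porting to Ansible, whose regex filters are built on Python's `re` module, a Python version of the same idea may be useful. This is a sketch under that assumption; the lazy pattern below is an alternative formulation, not the one from the answers above.

```python
import re

x = "my-name-is-yes-111111.maybe.text.here?-34.34.34"

# Lazily take everything up to the first "-" that is followed by a digit.
m = re.match(r"(.*?)-[0-9]", x)
prefix = m.group(1) if m else None
print(prefix)
```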

Split string only if BOTH the negative lookahead and negative lookbehind are satisfied

Hello, I came across this question, where the author wanted to convert the String:

exampleString =  "2 Marine Cargo       14,642 10,528  Denver Factory North     16,016 more text 8,609 argA 2,106 argB"

into an array / list that looks similar to this:

String[] resultArray = {"2", "Marine Cargo", "14,642", "10,528", "Denver Factory North", "16,016",
                "more text", "8,609", "argA", "2,106", "argB"};

So numeric parts (with or without a comma) are each considered an element, and pure alpha sequences (separated by zero, one, or multiple spaces) are each considered an element.

This can be done by matching the groups, or by splitting on the spaces where the previous and the next part of the string are not both alpha sequences. I am curious whether the latter is possible.
I think part of it should be done with a negative lookahead:

\s+(?![A-Za-z]+)

and part with a negative look behind.

(?<![a-zA-Z])\s+

I am looking to combine both statements so that the split is skipped only when the parts before and after the sequence of spaces are both alpha, so that multiple words can be chained together without splitting in between. I found another question on this topic, but I am not able to reverse engineer it for this particular case. Is this possible?

Solution:

You may use

String[] results = exampleString.split("(?<=\\d)\\s+(?=[a-zA-Z])|(?<=[a-zA-Z])\\s+(?=\\d)|(?<=\\d)\\s+(?=\\d)");

Details

(?<=\d)\s+(?=[a-zA-Z]) – 1+ whitespaces that have a digit on the left and a letter on the right
| – or
(?<=[a-zA-Z])\s+(?=\d) – 1+ whitespaces that have a letter on the left and a digit on the right
| – or
(?<=\d)\s+(?=\d) – 1+ whitespaces that have a digit on the left and a digit on the right.

Java demo:

String exampleString =  "2 Marine Cargo       14,642 10,528  Denver Factory North     16,016 more text 8,609 argA 2,106 argB";
String results[] = exampleString.split("(?<=\\d)\\s+(?=[a-zA-Z])|(?<=[a-zA-Z])\\s+(?=\\d)|(?<=\\d)\\s+(?=\\d)");
for (String s: results) {
    System.out.println(s);
}

Output:

2
Marine Cargo
14,642
10,528
Denver Factory North
16,016
more text
8,609
argA
2,106
argB
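
The question also mentions the other route, matching the groups instead of splitting. A minimal Python sketch of that approach follows; the token pattern is an assumption about what counts as a numeric or alpha element, and words separated by several spaces would keep those spaces inside the token.

```python
import re

example = ("2 Marine Cargo       14,642 10,528  Denver Factory North     "
           "16,016 more text 8,609 argA 2,106 argB")

# Either a number (digits with optional commas) or a run of words
# separated by whitespace.
TOKEN = re.compile(r"\d[\d,]*|[A-Za-z]+(?:\s+[A-Za-z]+)*")

tokens = TOKEN.findall(example)
for token in tokens:
    print(token)
```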

How to combine all the words of a sentence extracted with a regex?

I would like to combine with a linux command, if possible, all the words that start with a capital letter, excluding the one at the beginning of the line. The goal is to create edges between these words.
For example:

My friend John met Beatrice and Lucio.

The result I would like to have should be:

John, Beatrice
John, Lucio
Beatrice, Lucio

I managed to get all the words that start with a capital letter, excluding the word at the beginning of the line through a regex. The regex is:

cat gov.json | grep -oP "\b([A-Z][a-z']*)(\s[A-Z][a-z']*)*\b | ^(\s*.*?\s).*" > nodes.csv

I managed to write the nodes one per line in a column, i.e.:

John
Beatrice
Lucio

The goal now is to create the possible combinations between names that start with a capital letter and put them into a file. Any suggestions?

Solution:

Here is another awk script doing the task, building the output while reading input.

script.awk, allowing duplicate names (FPAT requires GNU awk):

BEGIN {FPAT =  " [[:upper:]][[:alpha:]]+"}
{
    for (i = 1; i <= NF; i++ ) {
        for (name in namesArr) {
            namePairsArr[pairsCount++] = namesArr[name] $i;
        }
        namesArr[namesCount++] = $i;
    }   
}
END {for (i = 0; i < pairsCount; i++) print namePairsArr[i];}

If duplicate names are not allowed, script.awk is:

BEGIN {FPAT =  " [[:upper:]][[:alpha:]]+"}
{
    for (i = 1; i <= NF; i++ ) {
        if (nameSeenArr[$i]) continue;
        nameSeenArr[$i] = 1;
        for (name in namesArr) {
              namePairsArr[pairsCount++] = namesArr[name] $i;
        }
        namesArr[namesCount++] = $i;
    }
}
END {for (i = 0; i < pairsCount; i++) print namePairsArr[i];}

run

awk -f script.awk gov.json > nodes.csv

sample input file:

My friend John met Beatrice and Lucio
My friend Johna met Beatricea and Lucioa

sample output:

 John Beatrice
 John Lucio
 Beatrice Lucio
 John Johna
 Beatrice Johna
 Lucio Johna
 John Beatricea
 Beatrice Beatricea
 Lucio Beatricea
 Johna Beatricea
 John Lucioa
 Beatrice Lucioa
 Lucio Lucioa
 Johna Lucioa
 Beatricea Lucioa
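
For comparison, here is a short Python sketch that produces the comma-separated pairs shown in the question. Unlike the awk scripts above, it pairs names within a single line only, and it assumes a name is any capitalized word that does not start the line; the function name `name_pairs` is made up for illustration.

```python
import re
from itertools import combinations


def name_pairs(line):
    """Pair up capitalized words on one line, skipping the word that starts the line."""
    names = [m.group(0)
             for m in re.finditer(r"\b[A-Z][a-z']*", line)
             if m.start() > 0]
    return list(combinations(names, 2))


for a, b in name_pairs("My friend John met Beatrice and Lucio."):
    print(f"{a}, {b}")
```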

Getting function Content and function name in C with regular expression in python

I am trying to get a function’s content (body) if the function’s name matches a defined pattern.

What I have tried so far:

(Step1) Use recursion to get all function bodies in a given C file:
{(?:[^{}]+|(?R))*+}

(Step2) Find all matches of the wanted function’s name.

(Step3) Combine both steps. This is where I am struggling.

Input

TASK(arg1)
{
    if (cond)
    {
      /* Comment */
      function_call();
      if(condIsTrue)
      {
         DoSomethingelse();
      }
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }
}


void FunctionIDoNotWant(void)
{
    if (cond)
    {
      /* Comment */
      function_call();
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }
}

I am looking for the function TASK. When I add the regex to match TASK in front of “{(?:[^{}]+|(?R))*+}”, nothing works.

(TASK\s*\(.*?\)\s)({((?>[^{}]+|(?R))*)})

Desired Output

Group1:
   TASK(arg1)
Group2:
    if (cond)
    {
      /* Comment */
      function_call();
      if(condIsTrue)
      {
         DoSomethingelse();
      }
    }
    if (cond1)
    {
      /* Comment */
      function_call1();
    }

Solution:

You are recursing the whole pattern with (?R), which is the same as (?0), whereas you want to recurse (?2), the second group. Group one contains your (TASK…)

(TASK\s*\(.*?\)\s)({((?>[^{}]+|(?2))*)})
                  ^ here starts the second group -> recursion with (?2)
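
Note that Python's built-in `re` module does not support recursion such as (?R) or (?2), so the pattern above needs a recursion-capable engine. As an alternative that uses only the standard library, the sketch below swaps the recursive regex for plain brace counting. The function name `extract_function` is hypothetical, and the code assumes braces inside comments or string literals do not occur.

```python
import re


def extract_function(source, name_pattern):
    """Return (header, body) for the first function whose header matches name_pattern."""
    header = re.search(name_pattern + r"\s*\([^)]*\)\s*(?=\{)", source)
    if header is None:
        return None
    start = header.end()  # index of the opening brace
    depth = 0
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                # The body is everything between the outermost braces.
                return header.group(0).strip(), source[start + 1:i]
    return None  # unbalanced braces


code = """TASK(arg1)
{
    if (cond)
    {
      function_call();
    }
}

void FunctionIDoNotWant(void)
{
    other_call();
}
"""

header, body = extract_function(code, r"TASK")
print(header)
print(body)
```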
