How to do static code analysis in C/C++ (using sparse, splint, cpplint and clang)

Static program analysis examines source code without executing it (as opposed to dynamic analysis, which observes a running program). It is generally used to find bugs or to check conformance to coding guidelines.

  • sparse is a static analysis tool that was initially designed to flag only constructs likely to be of interest to kernel developers, such as the mixing of pointers to user and kernel address spaces. cgcc is a Perl-script compiler wrapper that runs Sparse after compiling.
## install
$ sudo apt-get install sparse    # Debian/Ubuntu
$ sudo yum install sparse        # RHEL/CentOS, from EPEL

## example
$ cat test.c
#include <stdio.h>
int main(void) {
        int *p = 0;
        printf("Hello, World\n");
        return 0;
}

$ cgcc -Wsparse-all -c test.c
(or make CC=cgcc)
test.c:4:18: warning: Using plain integer as NULL pointer

# if you get error: "unable to open 'sys/cdefs.h'" then
$ sudo ln -s /usr/include/x86_64-linux-gnu/sys /usr/include/sys
$ sudo ln -s /usr/include/x86_64-linux-gnu/bits /usr/include/bits
$ sudo ln -s /usr/include/x86_64-linux-gnu/gnu /usr/include/gnu
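
The address-space checks mentioned above depend on annotations that only Sparse sees. The following is a minimal sketch modeled on the way the Linux kernel defines `__user`. The helper `copy_in` is hypothetical, and the macros are defined locally for illustration (the kernel ships its own definitions). Sparse defines `__CHECKER__` while it runs, so a regular compiler sees empty macros.

```c
#include <stddef.h>
#include <string.h>

/* Sparse defines __CHECKER__ while it runs; a normal compiler does not,
 * so these annotations expand to nothing in a regular build. */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical helper: copies n bytes out of a buffer annotated as
 * "user" memory. The __force cast marks the single place where
 * crossing address spaces is intentional. */
void copy_in(char *dst, const char __user *src, size_t n)
{
    memcpy(dst, (__force const char *)src, n);
}
```

Without the `__force` cast, Sparse should report that the argument passed to `memcpy` comes from a different address space. The exact wording of that warning depends on the Sparse version.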
  • splint statically checks C programs for security vulnerabilities and coding mistakes. Formerly called LCLint, it is a modern version of the Unix lint tool. The project’s last update was in November 2010.
## install
$ sudo apt-get install splint    # Debian/Ubuntu
$ sudo yum install splint        # RHEL/CentOS, from EPEL

## example
$ cat test2.c
#include <stdio.h>
int main()
{
    char c;
    while (c != 'x');
    {
        c = getchar();
        if (c = 'x')
            return 0;
        switch (c) {
        case '\n':
        case '\r':
            printf("Newline\n");

    }
    return 0;
}

$ splint -hints test2.c
test2.c: (in function main)
test2.c:5:12: Variable c used before definition
test2.c:5:12: Suspected infinite loop.  No value used in loop test (c) is modified by test or loop body.
test2.c:7:9: Assignment of int to char: c = getchar()
test2.c:8:13: Test expression for if is assignment expression: c = 'x'
test2.c:8:13: Test expression for if not boolean, type char: c = 'x'
test2.c:18:1: Parse Error. (For help on parse errors, see splint -help parseerrors.)
*** Cannot continue.
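
For comparison, here is a sketch of what test2.c may have been meant to do, with each issue splint reported addressed. The function name `count_breaks_until_x` and the `FILE *` parameter are assumptions made so the logic can be called from other code. It counts line-break characters instead of printing a message.

```c
#include <stdio.h>

/* Reads characters from `in` until an 'x' or end of input.
 * Returns the number of '\n' and '\r' characters seen before that. */
int count_breaks_until_x(FILE *in)
{
    int c;           /* int, not char: fgetc() returns int so EOF fits */
    int breaks = 0;

    /* c is assigned before it is tested, and there is no stray ';'
     * after the condition, so the loop body really is the loop body. */
    while ((c = fgetc(in)) != EOF) {
        if (c == 'x')            /* comparison, not assignment */
            break;
        switch (c) {
        case '\n':
        case '\r':
            breaks++;
            break;
        default:
            break;
        }
    }                            /* every opened brace is closed */
    return breaks;
}
```

Calling it with `stdin` reproduces the interactive behavior of the original program.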
  • cpplint is a Python script that checks C++ source files against Google's C++ style guide.
## install
$ wget http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py
$ chmod +x cpplint.py

## example
$ ./cpplint.py --extensions=c test2.c 
test2.c:0:  No copyright message found.  You should have a line: "Copyright [year] <Copyright Owner>"  [legal/copyright] [5]
test2.c:3:  { should almost always be at the end of the previous line  [whitespace/braces] [4]
test2.c:5:  Empty loop bodies should use {} or continue  [whitespace/empty_loop_body] [5]
test2.c:14:  Line ends in whitespace.  Consider deleting these extra spaces.  [whitespace/end_of_line] [4]
test2.c:14:  Redundant blank line at the end of a code block should be deleted.  [whitespace/blank_line] [3]
Done processing test2.c
Total errors found: 5
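  • clang includes a static analyzer that can be run directly with clang --analyze or wrapped around an existing build with scan-build.
## install
$ sudo aptitude install clang    # Debian/Ubuntu
$ sudo yum install clang         # RHEL/CentOS, from EPEL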

## example
$ cat test3.c 
void test() {
  int x;
  x = 1; // warn
}

$ clang --analyze test3.c 
test3.c:3:3: warning: Value stored to 'x' is never read
  x = 1; // warn
  ^   ~
1 warning generated.

$ scan-build gcc -c test3.c 
scan-build: Using '/usr/lib/llvm-3.5/bin/clang' for static analysis
test3.c:3:3: warning: Value stored to 'x' is never read
  x = 1; // warn
  ^   ~
1 warning generated.
scan-build: 1 bug found.
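
The dead store above is the simplest kind of finding. The analyzer also follows execution paths, and its null-dereference checker is meant to flag code that tests a pointer against NULL on one path and dereferences it anyway. The sketch below shows a function written so that no such path exists. The name `first_byte` and its -1 error convention are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns the first byte of buf,
 * or -1 when there is nothing to read.
 * Both guards come before the dereference, so no path reaches
 * buf[0] with a NULL pointer or an empty buffer. */
int first_byte(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return -1;
    return buf[0];
}
```

Saving this to a file and running `clang --analyze` on it should produce no warnings. Dropping the NULL test while a caller passes NULL is the kind of change the analyzer is designed to catch.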

Using semantic patching with Coccinelle (a patching tool that knows C)

Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code.

## install (see http://coccinelle.lip6.fr/download.php)
$ sudo apt-get install coccinelle    # Debian/Ubuntu
$ sudo yum install coccinelle        # Fedora Rawhide

## usage
spatch -sp_file <SP> <files> [-o <outfile> ] [-iso_file <iso> ] [ options ]

## examples
$ cat test.cocci
// Replaces calls to alloca by malloc and checks return value
@@
expression E;
identifier ptr;
@@
-ptr = alloca(E);
+ptr = malloc(E);
+if (ptr == NULL)
+        return 1;

$ cat test.c
#include <alloca.h>
int main(int argc, char *argv[]) {
    unsigned int bytes = 1024 * 1024;
    char *buf;
    /* allocate memory */
    buf = alloca(bytes);
    return 0;
}

$ spatch -sp_file test.cocci test.c
--- test.c
+++ /tmp/cocci-output-29896-40280c-test.c
@@ -3,6 +3,8 @@ int main(int argc, char *argv[]) {
     unsigned int bytes = 1024 * 1024;
     char *buf;
     /* allocate memory */
-    buf = alloca(bytes);
+    buf = malloc(bytes);
+    if (buf == NULL)
+        return 1;
     return 0;
}

from coccinelle, coccinelle@lwn, coccinelle for the newbie and coccinelle patch examples
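
The semantic patch above rewrites only the allocation call. Code that moves from alloca to malloc also needs `<stdlib.h>` and a matching `free()`, because heap memory is not released when the function returns. Here is a minimal sketch of the finished result, assuming a hypothetical function `use_buffer` in place of the patched `main`:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patched main(): allocates `bytes`
 * bytes on the heap, touches them, and releases them.
 * Returns 0 on success, 1 if the allocation fails. */
int use_buffer(size_t bytes)
{
    char *buf;

    /* allocate memory */
    buf = malloc(bytes);
    if (buf == NULL)
        return 1;

    memset(buf, 0, bytes);   /* placeholder for the real work */

    free(buf);               /* alloca() memory vanished on return; malloc() memory does not */
    return 0;
}
```

A second semantic patch could add the `free()` call, but reviewing each rewritten call site by hand is a reasonable first step.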

How to count number of source code lines (using cloc)

cloc counts lines of source code and comments, and can compute the differences in those counts between two versions.

# install
$ sudo yum install cloc        # Fedora Rawhide
$ sudo apt-get install cloc    # Debian/Ubuntu

cloc [options] <FILE|DIR> ...
$ find . -name "*.py" | xargs cloc
$ cloc package.tar.gz

# processing options: '--diff', '--ignore-whitespace'
$ cloc --diff old.tar.bz2 new.tar.bz2

# filter options: '--exclude-dir', '--exclude-lang', '--match-d', '--match-f'
$ cloc --show-lang

# output options: '--quiet', '--csv', '--report-file', '--xml', '--out'

from How to count lines of source code in Linux
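
To make the three categories concrete (blank, comment, code), here is a deliberately simplified sketch of line classification in C. It is not cloc's algorithm. It recognizes only whole-line `//` comments and ignores `/* ... */` blocks, and the names `line_counts` and `count_lines` are made up for this example.

```c
#include <ctype.h>
#include <string.h>

struct line_counts {
    int blank;
    int comment;
    int code;
};

/* Classifies each line of `text` as blank, comment, or code.
 * A line counts as a comment only if its first non-space characters
 * are "//"; a line with code followed by a comment counts as code. */
struct line_counts count_lines(const char *text)
{
    struct line_counts n = {0, 0, 0};
    const char *p = text;

    while (*p != '\0') {
        const char *end = strchr(p, '\n');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        size_t i = 0;

        /* skip leading whitespace */
        while (i < len && isspace((unsigned char)p[i]))
            i++;

        if (i == len)
            n.blank++;
        else if (len - i >= 2 && p[i] == '/' && p[i + 1] == '/')
            n.comment++;
        else
            n.code++;

        p += len;
        if (*p == '\n')
            p++;             /* step over the line terminator */
    }
    return n;
}
```

A real counter also has to handle block comments, string literals that contain comment markers, and per-language syntax. Those cases are the main reason to use cloc instead of a hand-written script.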