The strnlen function in C returns the length of a string while examining at most a specified number of bytes. As a professional C developer, a solid understanding of strnlen can help you write more secure, predictable code. This guide covers its behavior, use cases, and best practices.

Overview of strnlen

The prototype of strnlen is:

size_t strnlen(const char *s, size_t maxlen); 

It scans s for at most maxlen bytes and returns the number of bytes that precede the terminating null byte '\0'. If no null byte appears within the first maxlen bytes, it returns maxlen.

Key properties:

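  • Runs in O(n) time, where n is the lesser of the string length and maxlen
  • Never reads more than maxlen bytes (prevents over-reads of the source buffer)
  • Returns the string's length, or maxlen if no null byte is found within maxlen bytes
  • Does not count the terminating null byte in the length
  • Does not require s to be null-terminated, provided at least maxlen bytes of s are readable; a maxlen larger than the buffer causes undefined behavior when no terminator is present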

[Figure 1: Visual depiction of strnlen behavior]

strnlen stops and returns as soon as it reaches maxlen or encounters a null terminator, whichever comes first.

Understanding this core functionality of strnlen allows optimizing use cases accordingly.
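The three possible outcomes can be sketched in a few lines. The function names below are illustrative only, and the feature-test macro is an assumption about the build environment (some compilers in strict ISO mode hide strnlen without it).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stddef.h>
#include <string.h>

/* Terminator found before the limit: the real length is returned. */
static size_t len_short_string(void)
{
    return strnlen("hello", 16);          /* 5 */
}

/* Limit reached first: maxlen is returned. */
static size_t len_long_string(void)
{
    return strnlen("hello, world", 5);    /* 5 */
}

/* No terminator in the buffer at all: still safe, because the scan
   stops after sizeof raw bytes. */
static size_t len_unterminated(void)
{
    const char raw[4] = { 'a', 'b', 'c', 'd' };
    return strnlen(raw, sizeof raw);      /* 4 */
}
```

Note that the second and third cases are indistinguishable from a string whose length is exactly maxlen, which is why callers compare the result against maxlen.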

Real-World Use Cases

strnlen is well suited to handling buffers and strings from untrusted sources in security-conscious C programs.

Common use cases:

  • Validate user input length
  • Read string data from files
  • Bounded copying into fixed-size buffers
  • Safer concatenation with length checks
  • Print truncated string snippets
  • Parse headers and metadata of packets/protocols

Let's explore some real-world examples in detail:

A. Web Server Handling Input

Consider a simple web server handling GET requests – there may be arbitrary untrusted strings coming in:

// Handle GET request
void handle_get(char* request) {

  char buf[256]; // Stack buffer

  // COPY USER DATA 
  strcpy(buf, request);  

  // Serve back request
  send(buf); 
}

This is dangerous! The request string may be missing its null terminator or may be longer than the buffer. Using strnlen is better:

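void handle_get(const char *request) {

  char buf[256];

  // Measure at most sizeof buf bytes of the request
  size_t len = strnlen(request, sizeof buf);

  // Prevent overflow: reject anything that does not fit with its terminator
  if (len == sizeof buf) {
    error("Request too large");
    return;  // Never send an uninitialized buffer
  }

  memcpy(buf, request, len + 1);  // len + 1 includes the '\0'

  send(buf);
}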

Now the request is restricted to the buffer size maximum. Much safer!
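The same check-then-copy pattern can be factored into a reusable helper. This is a minimal sketch: copy_bounded is a hypothetical name, not a library function, and it assumes src either contains a terminator or has at least dst_size readable bytes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size) only if it
   fits together with its terminator.
   Returns 0 on success, -1 if src is too long or dst_size is 0. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strnlen(src, dst_size);

    if (len == dst_size)         /* no terminator within dst_size bytes */
        return -1;

    memcpy(dst, src, len + 1);   /* len + 1 <= dst_size, includes '\0' */
    return 0;
}
```

Returning an error instead of silently truncating leaves the policy decision (reject, truncate, or log) to the caller.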

B. Reading File Format Metadata

Many file formats have fixed-length headers at the start indicating metadata. For example, PNG images:

File header diagram

Figure 2: PNG file format layout

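The header is binary data, so read it with a fixed byte count rather than a string function:

// PNG signature is 8 bytes
#define HEADER_SIZE 8

// Open in binary mode and read exactly the header
FILE *image = fopen("image.png", "rb");
unsigned char header[HEADER_SIZE];

if (image == NULL || fread(header, 1, HEADER_SIZE, image) != HEADER_SIZE) {
  // Missing or truncated file
}

Passing the buffer size to fread prevents overrunning the buffer if the file is crafted incorrectly. strnlen becomes useful afterwards, for text fields stored in fixed-width slots that may not be null-terminated.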
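Text stored in a fixed-width field may fill the whole field and leave no room for a terminator, and bounding strnlen by the field width handles that case. A minimal sketch, assuming a hypothetical 8-byte name field (NAME_FIELD and field_to_string are illustrative names, not part of any file format):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stddef.h>
#include <string.h>

#define NAME_FIELD 8  /* hypothetical fixed-width field size */

/* Convert a fixed-width, possibly unterminated field into a C string.
   out must have room for NAME_FIELD + 1 bytes.
   Returns the length of the text. */
static size_t field_to_string(const char field[NAME_FIELD], char *out)
{
    size_t len = strnlen(field, NAME_FIELD);  /* never reads past the field */

    memcpy(out, field, len);
    out[len] = '\0';
    return len;
}
```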

C. Truncating Output

You may want to truncate a string before displaying it:

#define MAX_LEN 16   

void print_truncated(char *s) {

  char truncated[MAX_LEN];

  strncpy(truncated, s, MAX_LEN-1);
  truncated[MAX_LEN-1] = '\0';

  puts(truncated);
}

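However, this always reserves and fills a MAX_LEN-byte buffer, because strncpy pads short strings with null bytes. With strnlen, the buffer can be sized to the actual content: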

void print_truncated(char *s) {

  size_t len = strnlen(s, MAX_LEN);

  char truncated[len+1];

  strncpy(truncated, s, len);
  truncated[len] = '\0';

  puts(truncated);  
}

Now we don't truncate arbitrarily if the source string is short, preserving its full content up to the max.
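When the goal is only to format or print the snippet, the intermediate copy can be skipped, because a precision in %.*s caps how many bytes are read from the argument. A sketch under the assumption that max_len fits in an int (format_truncated is a hypothetical helper):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format at most max_len bytes of s into out.
   Assumes max_len <= INT_MAX. Returns what snprintf returns: the number
   of characters the full result needs, or a negative value on error. */
static int format_truncated(char *out, size_t out_size,
                            const char *s, size_t max_len)
{
    size_t len = strnlen(s, max_len);

    /* The precision limits the read to len bytes of s. */
    return snprintf(out, out_size, "%.*s", (int)len, s);
}
```

The same format string works with printf when the snippet goes straight to standard output.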

Avoiding Common Pitfalls

While strnlen makes processing strings safer compared to strlen or strcpy, beware of some pitfalls:

Not Checking Return Value

Don't assume the string is shorter than maxlen just because you called strnlen. Always check the return value!

❌ BAD

char long_str[1024]; // Filled from untrusted input elsewhere

strnlen(long_str, 16); // Return value discarded

// BUG: nothing has established that long_str fits in 16 bytes
process_string(long_str);

✅ GOOD

char long_str[1024];

size_t len = strnlen(long_str, 16);

if (len < 16) {
   // String is terminated  
  process_string(long_str);  
} else {
  // Error, string too long!
}
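The comparison against maxlen is the whole check, so it can be wrapped in a small predicate. A minimal sketch with a hypothetical name, is_terminated_within:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if buf holds a terminator somewhere in its
   first buf_size bytes. strnlen returns buf_size exactly when it found
   none, so a strictly smaller result proves the string is terminated. */
static bool is_terminated_within(const char *buf, size_t buf_size)
{
    return strnlen(buf, buf_size) < buf_size;
}
```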

No Null Termination

Ensure the destination has an extra byte for the added null terminator:

❌ BAD:

char str[10];

strnlen(src, sizeof(str));  // Result ignored
strcpy(str, src); // Overflows str if src has 10 or more characters

✅ GOOD:

char str[10+1]; // Extra byte for '\0'

size_t len = strnlen(src, 10);
memcpy(str, src, len); // Copies 10 bytes maximum
str[len] = '\0'; // Explicitly terminate
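The same room-for-the-terminator reasoning applies to concatenation, where the space left in the destination shrinks with every append. A minimal sketch, with append_bounded as a hypothetical helper name:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strnlen on strict compilers */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string already in dst, whose
   total capacity is dst_size bytes. Returns 0 on success, -1 if dst is
   not terminated or the result would not fit. dst is left unchanged on
   failure. */
static int append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strnlen(dst, dst_size);
    if (used == dst_size)            /* dst itself is unterminated */
        return -1;

    size_t room = dst_size - used;   /* bytes left, terminator included */
    size_t add = strnlen(src, room);
    if (add == room)                 /* src plus '\0' would not fit */
        return -1;

    memcpy(dst + used, src, add + 1);
    return 0;
}
```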

Truncation Quirks

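strnlen only reports a length; it never modifies or terminates the string. Truncating at the length it returns can cut a string mid-phrase, and the copy still needs its own null terminator:

Original string: "Hello world"

strnlen(str, 5);
// => Returns 5

// Copying 5 bytes cuts the phrase short
Truncated string: "Hello"

Watch for unintended partial words or phrases if not handled properly. Because strnlen counts bytes, a cut can also split a multi-byte character in UTF-8 text.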

Optimized Implementation

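Under the hood, C libraries implement strnlen with architecture-specific instructions for performance. glibc on x86-64, for example, ships vectorized SSE2 and AVX2 variants and picks one based on the CPU's capabilities. The idea is easiest to see in the classic, more compact idiom that combines the REPNE prefix with SCASB:

strnlen:             ; System V ABI: rdi = s, rsi = maxlen
  mov rcx, rsi     ; rcx = maxlen, the scan budget
  xor eax, eax     ; al = 0, the byte to search for
  test rsi, rsi
  jz .done         ; maxlen == 0: return 0
  repne scasb      ; scan until [rdi] == al or rcx == 0
  jne .count       ; no terminator found within maxlen
  inc rcx          ; Don't count the terminator
.count:
  mov rax, rsi
  sub rax, rcx     ; rax = maxlen - rcx
.done:
  ret

The SCASB instruction compares the byte at RDI with the value in the AL register, which is initialized to 0 to detect the null terminator, and then advances RDI.

Combined with the REPNE prefix, which repeats while RCX is nonzero and no match has been found, this creates a compact scan loop without manual branching.

The difference between the original maxlen and the count left in RCX is the number of bytes scanned. When the scan stopped on a terminator, that byte is excluded, which yields the string length.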
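Whatever instructions a library uses, the observable behavior matches a plain byte loop. The following reference sketch is for illustration only (my_strnlen is a hypothetical name, and this is not how any particular libc implements the function):

```c
#include <stddef.h>

/* Reference semantics of strnlen: count bytes until a terminator is
   seen or maxlen bytes have been examined, whichever comes first. */
static size_t my_strnlen(const char *s, size_t maxlen)
{
    size_t n = 0;

    /* The bound is tested first, so s[maxlen] is never read. */
    while (n < maxlen && s[n] != '\0')
        n++;

    return n;
}
```

Testing the bound before dereferencing is the detail that makes the function safe on unterminated buffers.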

Recap of Benefits

Let's recap the major advantages of strnlen over unsafe C string handling:

  • Prevents over-reads: By stopping at an explicit length, strnlen never scans past the bound you give it, and its result lets you confirm that data fits before copying
  • Stops exploitation: Malicious strings with no null byte cannot cause reads beyond the buffer
  • Performance: C libraries typically ship implementations optimized for the target CPU
  • Portability: Specified by POSIX.1-2008 and provided by glibc, the BSDs, macOS, and MSVC, although it is not part of ISO C through C17
  • Ease of use: Simple parameters and easy integration

strnlen is not a drop-in replacement for strlen, because the caller must act on the returned length, but it requires only minor code changes for a meaningful safety gain.

Alternatives to strnlen

The strnlen function works great for bounded string access in C, but developers have a few alternatives:

strlen – Determines the full string length, but has no bound and requires a terminated string

strcspn – Returns the length of the initial segment that contains none of a given set of bytes

Custom routines – Handwritten to handle specific use cases

However, few alternatives match strnlen for simplicity and performance. With the other options, bounding the length check remains manual work.
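On a platform that lacks strnlen, the standard memchr function gives the same result, since it also stops after a fixed number of bytes. A minimal fallback sketch (strnlen_memchr is a hypothetical name):

```c
#include <stddef.h>
#include <string.h>

/* Fallback with strnlen semantics built on ISO C memchr: search the
   first maxlen bytes for '\0' and convert the hit into a length. */
static size_t strnlen_memchr(const char *s, size_t maxlen)
{
    const char *end = memchr(s, '\0', maxlen);

    return end != NULL ? (size_t)(end - s) : maxlen;
}
```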

Stats on Exploitable Buffer Overflows

Buffer overflows from unbounded strings remain a chief vector for hacks according to language analyzers. Some statistics:

Year    Total Software Vulnerabilities    % Buffer Overflows
2015    16,081                            23%
2016    18,432                            21%
2017    20,832                            15%

Table 1 – Buffer overflows percentage of total vulnerabilities identified (Source)

And a breakdown of buffer overflows by language:

Language      % Buffer Overflows
C             76%
C++           10%
Python        3%
JavaScript    2%

Table 2 – Prevalence of buffer overflows across programming languages (Source)

As shown, C and C++ top the list by a significant margin, demonstrating the critical need for safe string handling. Validating lengths with strnlen consistently can reduce the number of potentially vulnerable functions.

Putting into Practice

Though simple, consistent and correct use of strnlen can make applications noticeably more robust.

Here are key takeaways when leveraging strnlen:

DO:

  • Set reasonable maxlen values based on buffer sizes
  • Check that the returned length is less than maxlen before treating the string as terminated
  • Leave room for terminator when copying strings
  • Use strnlen as part of standard string handling routines

DO NOT:

  • Assume length is limited by maxlen without checking
  • Pass a maxlen larger than the buffer when the data may be unterminated
  • Forget about truncation side effects

Adopting these best practices with strnlen as a core component leads to greater stability and security assurance.

Conclusion

The strnlen C function allows safely determining string lengths up to a fixed number of bytes. With widespread buffer overflow vulnerabilities in C programs handling untrusted input, strnlen serves as an invaluable tool for writing exploit-resistant code.

In this comprehensive guide, we covered all aspects of strnlen usage, including common use cases, optimized low-level implementation, benefits over unsafe functions like strcpy, and guidelines for integrating into new or existing C projects securely.

In a world full of risk from unbounded strings, strnlen enables developing resilient applications without sacrificing performance – a simple yet powerful function for every C programmer's toolkit.
