Strings are one of the most fundamental data types in programming. In C, a string is simply an array of characters terminated by a null byte ('\0'), so comparing two strings is an extremely common task.

In this guide, we'll explore the various methods available for comparing strings in C, including:

  • User-defined string comparison functions
  • The standard string.h library functions:
    • strcmp()
    • strncmp()
  • The POSIX strings.h functions:
    • strcasecmp()
    • strncasecmp()
  • Raw memory comparison with memcmp()

We'll look at the syntax, examples, and use cases of each method. By the end, you'll have a solid grasp of string comparison in C.

Building a Custom String Compare Function

Let's start from first principles by creating our own user-defined stringCompare() function.

The logic is straightforward:

  1. Initialize index i to 0 to track current position
  2. Use a loop to compare characters at each index
  3. Increment i when characters match
  4. Break if the end of the string ('\0') is reached or the characters don't match
  5. After the loop, check whether both strings ended at the same index; if so, the strings are equal

Here is an implementation:

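int stringCompare(const char str1[], const char str2[]) {
  int i = 0;

  /* Advance while the characters at index i match */
  while (str1[i] == str2[i]) {
    /* The characters are equal here, so one check covers both strings */
    if (str1[i] == '\0')
        break;

    i++;
  }

  /* Equal only if both strings ended at the same index */
  if (str1[i] == '\0' && str2[i] == '\0')
      return 0;
  else
      return -1;
}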

To use it:

if(stringCompare(str1, str2) == 0) {
    // strings are equal
}
else {
   // strings are not equal  
}

This checks characters at each index, stopping when:

  • End of string is reached ('\0')
  • Characters at current position are not equal

It returns 0 for equal strings, -1 otherwise.

There are some limitations with this approach:

  • It only reports equal or not equal; the -1 result says nothing about which string sorts first
  • No control over case sensitivity
  • No way to limit the comparison to the first n characters

So while it is simple to code ourselves, the built-in library functions provide more flexibility.
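As a rough sketch of how the first-principles approach could also report ordering, the variant below returns a negative, zero, or positive value in the style of strcmp(). The name stringOrder() is hypothetical and not part of any library.

```c
/* Hypothetical helper: returns <0, 0, or >0, in the style of strcmp(). */
int stringOrder(const char *str1, const char *str2) {
  /* Compare bytes as unsigned char so values above 127 order consistently */
  const unsigned char *a = (const unsigned char *)str1;
  const unsigned char *b = (const unsigned char *)str2;

  /* Walk both strings while the bytes match and str1 has not ended */
  while (*a != '\0' && *a == *b) {
    a++;
    b++;
  }

  /* Difference of the first mismatched bytes, or 0 if both strings ended */
  return (int)*a - (int)*b;
}
```

A shorter string that is a prefix of a longer one sorts first, because its terminating '\0' is smaller than any other byte.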

strcmp()

The strcmp() function compares two strings lexicographically.

Syntax and Parameters

int strcmp(const char *str1, const char *str2);  

It takes pointers to two null-terminated strings and returns an integer that describes the result of the comparison.

Return Value

  • 0 – Strings are identical
  • < 0 – str1 is lexicographically smaller than str2
  • > 0 – str1 is lexicographically greater than str2

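Only the sign of a nonzero result is specified. Some implementations return the difference between the first pair of mismatched characters (compared as unsigned char), but portable code should not rely on the magnitude. The sign allows not just checking equality, but also ordering strings lexicographically.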
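Because portable code should look only at the sign of the result, it can help to collapse it to -1, 0, or 1 before storing or comparing it. The helper name compareSign() below is an assumption for illustration.

```c
#include <string.h>

/* Hypothetical helper: collapse strcmp()'s result to -1, 0, or 1. */
int compareSign(const char *str1, const char *str2) {
  int result = strcmp(str1, str2);

  /* (result > 0) and (result < 0) each evaluate to 0 or 1 */
  return (result > 0) - (result < 0);
}
```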

Examples

strcmp("hello", "hello") == 0 // true, equal strings

strcmp("abc", "abcd") < 0  // true, "abc" sorts before "abcd"

strcmp("abcx", "abc") > 0 // true, "abcx" sorts after "abc"

This compares characters one by one from index 0, stopping at the first null byte or mismatch. The comparison is case sensitive: because it works on character codes, every uppercase ASCII letter sorts before every lowercase one.

Use Cases

strcmp() is useful for:

  • Checking exact string equality
  • Lexicographic ordering of strings
  • Sorting string arrays alphabetically
  • Dictionary lookup by exact match

It runs in linear time, O(N), where N is the length of the shorter string, and it stops early at the first mismatch.
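Sorting is probably the most common of these use cases. A minimal sketch with the standard qsort() follows; the names compareStringPointers() and sortStrings() are made up for this example. The detail that is easy to get wrong is that qsort() hands the comparator pointers to the array elements, not the strings themselves.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() passes pointers to the array elements, so each argument
   is really a pointer to a (const char *). */
static int compareStringPointers(const void *a, const void *b) {
  const char *const *left = a;
  const char *const *right = b;
  return strcmp(*left, *right);
}

/* Hypothetical helper: sort an array of C strings in strcmp() order. */
void sortStrings(const char **strings, size_t count) {
  qsort(strings, count, sizeof strings[0], compareStringPointers);
}
```

Given the array {"pear", "apple", "fig"}, calling sortStrings(words, 3) reorders it to "apple", "fig", "pear".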

strncmp()

The strncmp() function compares at most the first n characters of the strings.

Syntax

int strncmp(const char *str1, const char *str2, size_t n);

The additional n parameter limits the comparison to at most n characters of each string. The comparison still stops early if a null byte is reached.

Examples

strncmp("hello", "hell", 4) == 0 // Compare only first 4 chars 

strncmp("abcde", "abc", 3) == 0 // First 3 chars same

This is useful when you only need to check whether a string starts with a given prefix.
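A prefix check can be wrapped in a small helper so the length never has to be hard-coded. This is a sketch; hasPrefix() is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if str begins with prefix. */
bool hasPrefix(const char *str, const char *prefix) {
  /* If str is shorter than prefix, strncmp() hits str's '\0' first
     and reports a mismatch, so no separate length check is needed. */
  return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

With this definition, hasPrefix("hello world", "hello") is true, hasPrefix("hell", "hello") is false, and an empty prefix matches any string.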

strcasecmp()

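The strcasecmp() function works the same as strcmp() but is case insensitive. It is a POSIX function declared in strings.h, not part of standard C; Windows compilers provide _stricmp() instead.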

Syntax

int strcasecmp(const char *str1, const char *str2); 

It doesn't take a length parameter. Case is ignored for the entire string.

Examples

strcasecmp("Hello", "hello") == 0 // Case insensitive match 

strcasecmp("AppLe", "apple") == 0 // Same after lowercasing both

Use Cases

Case insensitive string checking is useful for:

  • User login validation
  • Lookup in case insensitive dictionaries
  • URL/path comparison on filesystems without case sensitivity

The downside is that it runs somewhat slower than strcmp(). It is still O(N) in the length of the shorter string, but every character must be case-folded before it is compared.
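Since strcasecmp() is not available everywhere, a portable fallback can be sketched with tolower() from ctype.h. The function name compareIgnoreCase() is hypothetical, and the sketch handles only single-byte characters in the current locale, not multi-byte encodings such as UTF-8.

```c
#include <ctype.h>

/* Hypothetical portable fallback for strcasecmp(). */
int compareIgnoreCase(const char *str1, const char *str2) {
  /* tolower() requires values representable as unsigned char */
  const unsigned char *a = (const unsigned char *)str1;
  const unsigned char *b = (const unsigned char *)str2;

  /* Advance while the lowercased bytes match and str1 has not ended */
  while (*a != '\0' && tolower(*a) == tolower(*b)) {
    a++;
    b++;
  }

  /* Sign of the difference gives the case-insensitive ordering */
  return tolower(*a) - tolower(*b);
}
```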

strncasecmp()

The strncasecmp() function combines strncmp() and strcasecmp() for case insensitive prefix checking. Like strcasecmp(), it is declared in strings.h.

Syntax

int strncasecmp(const char *str1, const char *str2, size_t n);

It takes an additional length parameter for limiting comparison.

Examples

strncasecmp("HelloWorld", "helloworld", 5) == 0 // "hello" matches

strncasecmp("HelloWorld", "helloword", 8) == 0 // "hellowor" matches 

Use Cases

Checking user input against allowed values is a common use case for strncasecmp():

// Check first 6 chars only
if(strncasecmp(input, "SELECT", 6) == 0) {
  // Handle SQL SELECT statement  
}

This checks the start of the input without worrying about the overall string length or case differences.
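One caveat with a bare prefix check is that it also accepts longer words, such as "SELECTED". A possible refinement, sketched below with the hypothetical name startsWithKeyword(), also requires that the keyword is followed by the end of the input or by a non-alphanumeric character. It assumes a POSIX system that provides strings.h.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>  /* strncasecmp() (POSIX) */

/* Hypothetical helper: true if input starts with keyword as a whole word,
   ignoring case. */
bool startsWithKeyword(const char *input, const char *keyword) {
  size_t length = strlen(keyword);
  unsigned char next;

  if (strncasecmp(input, keyword, length) != 0)
    return false;

  /* The prefix matched, so input has at least `length` characters and
     input[length] is either another character or the terminating '\0'. */
  next = (unsigned char)input[length];
  return next == '\0' || !isalnum(next);
}
```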

memcmp() – Raw Binary Comparison

The memcmp() function compares blocks of memory without assuming null-terminated C strings.

Syntax and Parameters

int memcmp(const void *str1, const void *str2, size_t n);

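It takes two void pointers to memory blocks along with n, the number of bytes to compare.

Because it treats its input as raw memory, it does not stop at null bytes and compares up to n bytes, so both blocks must be at least n bytes long. Combined with pointer arithmetic, this allows comparing regions in the middle of strings rather than just from the start.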

Return Value

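It returns an integer, similar to strcmp(), with bytes compared as unsigned char:

  • 0 if the blocks are equal
  • < 0 if the first differing byte in str1 is less than the corresponding byte in str2
  • > 0 if the first differing byte in str1 is greater than the corresponding byte in str2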

Examples

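// Compare 12 bytes starting at offset 10 of both strings
// (both must be at least 22 bytes long)
memcmp(str1+10, str2+10, 12);

// Check whether the final 5 chars are the same
// (len1 and len2 are the string lengths; both must be at least 5)
memcmp(str1+(len1-5), str2+(len2-5), 5);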

This has some key advantages over string compares:

  • Can compare substrings rather than just prefixes
  • Works for raw byte arrays and other data too
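  • No need for null termination, as long as both buffers hold at least n bytes

The main disadvantage is that the caller must track lengths and offsets; asking memcmp() to read past the end of either buffer is undefined behavior.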
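To keep the offset arithmetic safe, the length checks can live in one place. The sketch below tests whether a string ends with a given suffix; endsWith() is a hypothetical name, not a library function.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if str ends with suffix. */
bool endsWith(const char *str, const char *suffix) {
  size_t strLength = strlen(str);
  size_t suffixLength = strlen(suffix);

  /* Guard first: without it the pointer arithmetic below would
     point before the start of str. */
  if (suffixLength > strLength)
    return false;

  return memcmp(str + (strLength - suffixLength), suffix, suffixLength) == 0;
}
```

For example, endsWith("report.txt", ".txt") is true, while endsWith("txt", ".txt") is false because the suffix is longer than the string.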

Putting it All Together

Here is some sample code to demonstrate different string comparison approaches:

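#include <string.h>
#include <strings.h>  /* strncasecmp() (POSIX) */
#include <stdio.h>

/* User-defined comparison from earlier: 0 if equal, -1 otherwise */
int stringCompare(const char str1[], const char str2[]) {
  int i = 0;

  while (str1[i] == str2[i] && str1[i] != '\0')
    i++;

  return (str1[i] == '\0' && str2[i] == '\0') ? 0 : -1;
}

int main(void) {

  char str1[15];
  char str2[15];

  /* "Hello World" needs 12 bytes including the '\0', so it fits in 15 */
  strcpy(str1, "Hello World");
  strcpy(str2, "Hello World");

  if(stringCompare(str1, str2) == 0) {
     printf("User defined match\n");
  }

  if(strcmp(str1, str2) == 0) {
     printf("strcmp full match\n");
  }

  if(strncmp(str1, str2, 5) == 0) {
     printf("strncmp 5 char match\n");
  }

  if(strncasecmp(str1, str2, 5) == 0) {
     printf("strncasecmp 5 char case insensitive match\n");
  }

  /* Offset 6 is the 'W'; 5 bytes cover "World" */
  if(memcmp(str1+6, str2+6, 5) == 0) {
     printf("memcmp offset match\n");
  }

  return 0;
}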

This shows different types of matching possible:

  • User-defined compares full string
  • strcmp() checks full string equality
  • strncmp() matches first 5 characters only
  • strncasecmp() does case insensitive 5 character check
  • memcmp() compares 5 characters from offset 6

Choosing the right approach depends on the specific string handling needs.

Performance and Optimization

For most use cases, the standard library string functions offer the best combination of speed and convenience.

But there are some optimizations possible around string length:

Compare String Lengths First

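Instead of directly calling strcmp(), check the string lengths first when they are already known:

/* len1 and len2 are lengths cached earlier, for example when the
   strings were created. Calling strlen() here instead would scan
   both strings in full and usually cost more than strcmp() itself. */
if(len1 != len2)
   return STRINGS_NOT_EQUAL;

return strcmp(str1, str2);

This avoids comparing character by character for inputs of different lengths. It only pays off when the lengths are available without scanning: strlen() walks each string to its end, whereas strcmp() stops at the first mismatch.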
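One way the length check can be made cheap is to store each string's length next to its data, so it is computed once rather than on every comparison. The LengthString type and both function names below are assumptions for illustration only.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical string view that caches its length. */
typedef struct {
  const char *data;
  size_t length;
} LengthString;

/* Computes the length once, at construction time */
LengthString makeLengthString(const char *text) {
  LengthString result = { text, strlen(text) };
  return result;
}

bool lengthStringEqual(LengthString a, LengthString b) {
  /* Different lengths can never be equal: O(1) rejection */
  if (a.length != b.length)
    return false;

  /* Lengths are known, so memcmp() can replace strcmp() */
  return memcmp(a.data, b.data, a.length) == 0;
}
```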

Use Unsigned Characters

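The standard comparison functions interpret each character as unsigned char. A custom function should do the same, because plain char may be signed, in which case bytes above 127 would compare as negative values and sort in a different order than strcmp() gives: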

int stringCompare(unsigned char *str1, unsigned char *str2)
{
  ...    
} 

This keeps the ordering consistent with strcmp(), and on some targets it may also avoid sign extension instructions in the generated assembly code.
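The effect of signedness on ordering can be seen with a byte above 127. The sketch below defines two hypothetical functions that differ only in the pointer type used to read the bytes; it assumes 8-bit bytes and a two's complement platform, which covers practically all current systems.

```c
/* Reads bytes as signed char: 0xE9 is seen as a negative value. */
int compareAsSigned(const char *str1, const char *str2) {
  const signed char *a = (const signed char *)str1;
  const signed char *b = (const signed char *)str2;

  while (*a != '\0' && *a == *b) {
    a++;
    b++;
  }
  return *a - *b;
}

/* Reads bytes as unsigned char: 0xE9 is seen as 233, matching strcmp(). */
int compareAsUnsigned(const char *str1, const char *str2) {
  const unsigned char *a = (const unsigned char *)str1;
  const unsigned char *b = (const unsigned char *)str2;

  while (*a != '\0' && *a == *b) {
    a++;
    b++;
  }
  return *a - *b;
}
```

Comparing "\xe9" with "z" gives a negative result from the signed version and a positive result from the unsigned version, so the two functions would sort these strings in opposite orders.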

Unrolled Loop Implementation

Manual loop unrolling handles 4 or 8 bytes per loop iteration instead of just 1. This can give a speedup on some processors by reducing loop overhead and making better use of instruction pipelines, although optimizing compilers often unroll loops on their own.

For example, a quad byte compare:

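#include <stddef.h>

/* Example values; any two distinct constants would do */
#define STRINGS_EQUAL 0
#define STRINGS_NOT_EQUAL 1

int stringCompare(const unsigned char *str1, const unsigned char *str2) {

  size_t i;
  for (i=0; ; i+=4) {

    /* Each step checks for a mismatch, then for the terminator, so
       no byte past the end of either string is ever read. */
    if(str1[i] != str2[i]) return STRINGS_NOT_EQUAL;
    if(str1[i] == '\0') return STRINGS_EQUAL;

    if(str1[i+1] != str2[i+1]) return STRINGS_NOT_EQUAL;
    if(str1[i+1] == '\0') return STRINGS_EQUAL;

    if(str1[i+2] != str2[i+2]) return STRINGS_NOT_EQUAL;
    if(str1[i+2] == '\0') return STRINGS_EQUAL;

    if(str1[i+3] != str2[i+3]) return STRINGS_NOT_EQUAL;
    if(str1[i+3] == '\0') return STRINGS_EQUAL;
  }
}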

The same idea can be applied to create an unrolled memcmp() as well.

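On some platforms this can be faster than a naive byte-by-byte loop. However, the C library versions of strcmp() and memcmp() are usually hand-tuned, often with vector instructions, so benchmark before replacing them. Unrolling also makes the code much more complex.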
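For buffers of known length, a related idea is to compare one 8-byte word at a time. The following is only a sketch of that technique under the assumption that equality, not ordering, is all that is needed; buffersEqualWordwise() is a hypothetical name, and whether it beats the library memcmp() should be measured on the target platform.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if the first n bytes of both buffers match. */
bool buffersEqualWordwise(const void *buf1, const void *buf2, size_t n) {
  const unsigned char *a = buf1;
  const unsigned char *b = buf2;
  size_t i = 0;

  /* Compare 8 bytes per iteration. memcpy() into a local word avoids
     unaligned reads and is typically compiled to a single load. */
  for (; i + sizeof(uint64_t) <= n; i += sizeof(uint64_t)) {
    uint64_t wordA, wordB;
    memcpy(&wordA, a + i, sizeof wordA);
    memcpy(&wordB, b + i, sizeof wordB);
    if (wordA != wordB)
      return false;
  }

  /* Compare the remaining 0 to 7 bytes one at a time */
  for (; i < n; i++) {
    if (a[i] != b[i])
      return false;
  }

  return true;
}
```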

Conclusion

We've explored various methods for string comparison in C covering:

  • Writing custom compare functions
  • Library functions for case sensitive, length-limited, and case insensitive compares
  • Binary memcmp() for raw memory buffer comparison
  • Optimization techniques like length checks, unsigned character comparison, and loop unrolling

Here is a quick guide for choosing the right approach:

  • strcmp() – fastest case sensitive comparison for simple equality check or sorting
  • strncmp() – for limiting the comparison to a prefix of a given length
  • strcasecmp() – enables case insensitive comparisons
  • strncasecmp() – case insensitive prefix substring checking
  • memcmp() – comparing binary buffers and data other than strings

With the ability to compare strings in multiple ways, you are equipped to handle a wide range of string processing needs in C programming!