Anagram Substring Search using Python

An anagram is a rearrangement of the characters of a word or phrase that forms a new word or phrase, using every original character exactly once. For example, "thing" and "night" are anagrams of each other.

An anagram substring search involves finding all substrings in a text that are anagrams of a given pattern. This means we need to find substrings that contain the same characters as the pattern, regardless of their order.
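The definition above can be checked directly by comparing character counts. The following is a minimal sketch; the helper name is_anagram is an illustrative assumption and does not appear elsewhere in this article.

```python
from collections import Counter

def is_anagram(a, b):
    """Return True if a and b contain the same characters with the same counts."""
    # Hypothetical helper for illustration: two strings are anagrams
    # exactly when their character frequency tables are equal.
    return Counter(a) == Counter(b)

print(is_anagram("thing", "night"))   # True
print(is_anagram("thing", "nights"))  # False: the lengths differ
```

An anagram substring search applies this same test to every substring of the text whose length equals the pattern length.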

Problem Statement

Given two strings text and pattern, find all starting indices of substrings in the text that are anagrams of the pattern.

# Example scenario
text = "cbaebabacd"
pattern = "abc"
# Expected output: [0, 6]
# Explanation: 
# Substring "cba" at index 0 is an anagram of "abc"
# Substring "bac" at index 6 is an anagram of "abc"

Sliding Window Approach

The sliding window technique suits this problem because every candidate substring has the same length as the pattern. We keep a window of that fixed size and slide it through the text one character at a time. Instead of recalculating character frequencies for each substring, we update the counts by adding the character that enters on the right and removing the one that leaves on the left.
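For contrast, here is a rough sketch of the naive approach that the sliding window avoids: it rebuilds the frequency table from scratch for every window. The function name find_anagram_indices_naive is an assumption made for this illustration only.

```python
from collections import Counter

def find_anagram_indices_naive(text, pattern):
    """Baseline sketch: recount every window instead of updating incrementally."""
    m = len(pattern)
    # Assumption: an empty or over-long pattern yields no matches
    if m == 0 or m > len(text):
        return []
    target = Counter(pattern)
    # Each Counter(text[i:i + m]) call costs O(m), so the total is O(n * m)
    return [i for i in range(len(text) - m + 1)
            if Counter(text[i:i + m]) == target]

print(find_anagram_indices_naive("cbaebabacd", "abc"))  # [0, 6]
```

Consecutive windows share all but two characters, so most of the recounting here is repeated work. The sliding window removes that repetition.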

Algorithm Steps

  • Count the frequency of characters in the pattern
  • Create a sliding window of pattern length over the text
  • For each window position, compare character frequencies
  • If frequencies match, record the starting index
  • Slide the window by adding the next character and removing the leftmost character

Implementation

from collections import Counter

def find_anagram_indices(text, pattern):
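    pattern_len = len(pattern)
    # An empty pattern, or one longer than the text, has no valid windows
    if pattern_len == 0 or pattern_len > len(text):
        return []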
    pattern_count = Counter(pattern)
    window_count = Counter()
    result = []
    
    for i in range(len(text)):
        # Add current character to window
        window_count[text[i]] += 1
        
        # If window size exceeds pattern length, remove leftmost character
        if i >= pattern_len:
            left_char = text[i - pattern_len]
            if window_count[left_char] == 1:
                del window_count[left_char]
            else:
                window_count[left_char] -= 1
        
        # Check if current window is an anagram of pattern
        if window_count == pattern_count:
            result.append(i - pattern_len + 1)
    
    return result

# Test the function
text = "cbaebabacd"
pattern = "abc"
indices = find_anagram_indices(text, pattern)
print(f"Anagram indices: {indices}")

Output:

Anagram indices: [0, 6]

How It Works

Text:    c b a e b a b a c d
Index:   0 1 2 3 4 5 6 7 8 9
Pattern: abc

Window 1: cba -> Anagram at index 0
Window 2: bae -> Not an anagram
Window 7: bac -> Anagram at index 6

The algorithm slides a 3-character window through the text, comparing character frequencies with the pattern.
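To see every window rather than only a few, the walkthrough can be reproduced with a small tracing helper. This is a sketch for illustration: trace_windows is a hypothetical name, it assumes a non-empty pattern, and it recounts each window for clarity rather than speed.

```python
from collections import Counter

def trace_windows(text, pattern):
    """Return (start, window, is_match) for each pattern-sized window."""
    m = len(pattern)
    target = Counter(pattern)
    rows = []
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        rows.append((start, window, Counter(window) == target))
    return rows

for start, window, ok in trace_windows("cbaebabacd", "abc"):
    print(start, window, "anagram" if ok else "not an anagram")
```

For a text of length 10 and a pattern of length 3, the helper reports 8 windows, and only those starting at indices 0 and 6 match.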

Another Example

# Test with a different pattern
text = "abab"
pattern = "ab"
result = find_anagram_indices(text, pattern)
print(f"Text: {text}")
print(f"Pattern: {pattern}")
print(f"Anagram indices: {result}")
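
Output:

Text: abab
Pattern: ab
Anagram indices: [0, 1, 2]

The window "ba" at index 1 is also an anagram of "ab", so overlapping matches are all reported.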

Time and Space Complexity

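Aspect   Complexity   Explanation
Time     O(n · k)     One pass through the text; each Counter comparison costs up to O(k). For a fixed alphabet this is O(n).
Space    O(k)         k = number of distinct characters held in the counters, bounded by the alphabet size

Here n is the length of the text.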
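Comparing two Counter objects at every step costs time proportional to the number of distinct characters they hold. One common variant, which is not the implementation shown earlier, avoids that comparison by tracking how many pattern characters the window is still missing. The sketch below is written under that assumption; the names find_anagram_indices_fast, need, and missing are illustrative.

```python
from collections import Counter

def find_anagram_indices_fast(text, pattern):
    """Variant sketch: O(1) work per step by tracking a 'missing' count."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    # need[c] > 0: the window still lacks copies of c; need[c] < 0: surplus
    need = Counter(pattern)
    missing = m  # pattern characters not yet covered by the window
    result = []
    for i, ch in enumerate(text):
        # Character entering on the right
        if need[ch] > 0:
            missing -= 1
        need[ch] -= 1
        # Character leaving on the left once the window is full
        if i >= m:
            left = text[i - m]
            need[left] += 1
            if need[left] > 0:
                missing += 1
        # A full window with nothing missing has no surplus either
        if missing == 0:
            result.append(i - m + 1)
    return result

print(find_anagram_indices_fast("cbaebabacd", "abc"))  # [0, 6]
```

The trade-off is that need also records characters that occur only in the text, so its size is bounded by the number of distinct characters in the text rather than in the pattern.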

Conclusion

The sliding window technique solves anagram substring search in a single pass over the text, which is O(n) for a fixed alphabet. By maintaining character frequency counts and updating the window incrementally, we avoid recounting every substring from scratch, which a brute-force approach does at a cost of O(n · m) for a pattern of length m.

Updated on: 2026-03-25T06:45:14+05:30
