Anagram Substring Search using Python
An anagram is a rearrangement of the characters of a word or a phrase to generate a new word, using all the original characters exactly once. For example, thing and night are anagrams of each other.
An anagram substring search involves finding all substrings in a text that are anagrams of a given pattern. This means we need to find substrings that contain the same characters as the pattern, regardless of their order.
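The definition above can be checked directly by comparing character counts. The following is a minimal sketch; the helper name `is_anagram` is a hypothetical choice for illustration and does not come from the original article.

```python
from collections import Counter

def is_anagram(a, b):
    """Return True if a and b use the same characters the same number of times."""
    # Counter builds a character -> frequency mapping; two strings are
    # anagrams exactly when these mappings are equal.
    return Counter(a) == Counter(b)

print(is_anagram("thing", "night"))   # True
print(is_anagram("thing", "nights"))  # False
```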
Problem Statement
Given two strings text and pattern, find all starting indices of substrings in the text that are anagrams of the pattern.
# Example scenario
text = "cbaebabacd"
pattern = "abc"
# Expected output: [0, 6]
# Explanation:
# Substring "cba" at index 0 is an anagram of "abc"
# Substring "bac" at index 6 is an anagram of "abc"
Sliding Window Approach
The sliding window technique suits this problem well because every candidate substring has the same length as the pattern, so a fixed-size window can slide through the text one character at a time. Instead of recalculating character frequencies for each substring, we update the window by adding the character that enters on the right and removing the character that leaves on the left.
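For comparison, here is a sketch of the naive approach that the sliding window avoids: it recounts every substring from scratch. The function name `find_anagram_indices_bruteforce` is an assumption made for this illustration.

```python
from collections import Counter

def find_anagram_indices_bruteforce(text, pattern):
    """Recount each length-m substring; roughly len(text) * len(pattern) work."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = Counter(pattern)
    # Counter(text[i:i + m]) rebuilds the frequency table for every window.
    return [i for i in range(len(text) - m + 1)
            if Counter(text[i:i + m]) == target]

print(find_anagram_indices_bruteforce("cbaebabacd", "abc"))  # [0, 6]
```

This baseline is also handy for cross-checking a faster implementation on small inputs.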
Algorithm Steps
- Count the frequency of characters in the pattern
- Create a sliding window of pattern length over the text
- For each window position, compare character frequencies
- If frequencies match, record the starting index
- Slide the window by adding the next character and removing the leftmost character
Implementation
from collections import Counter

def find_anagram_indices(text, pattern):
    """Return the starting indices of all anagrams of pattern in text."""
    pattern_len = len(pattern)
    # An empty pattern or one longer than the text cannot match anywhere
    if pattern_len == 0 or pattern_len > len(text):
        return []

    pattern_count = Counter(pattern)
    window_count = Counter()
    result = []

    for i in range(len(text)):
        # Add current character to window
        window_count[text[i]] += 1

        # If window size exceeds pattern length, remove leftmost character
        if i >= pattern_len:
            left_char = text[i - pattern_len]
            if window_count[left_char] == 1:
                # Delete the key so no zero-count entries remain in the window
                del window_count[left_char]
            else:
                window_count[left_char] -= 1

        # Check if current window is an anagram of pattern
        if window_count == pattern_count:
            result.append(i - pattern_len + 1)

    return result

# Test the function
text = "cbaebabacd"
pattern = "abc"
indices = find_anagram_indices(text, pattern)
print(f"Anagram indices: {indices}")
Anagram indices: [0, 6]
How It Works
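To see the window move, the sketch below records which character leaves and which enters at each step, along with whether the window matches. The helper `trace_sliding_window` is a hypothetical name introduced only for this walkthrough; it applies the same incremental update as the implementation above.

```python
from collections import Counter

def trace_sliding_window(text, pattern):
    """Return (start, window, leaving, entering, is_match) for each window."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = Counter(pattern)
    window = Counter(text[:m])  # counts for the first window
    rows = [(0, text[:m], None, None, window == target)]
    for start in range(1, len(text) - m + 1):
        leaving = text[start - 1]        # character dropping off the left
        entering = text[start + m - 1]   # character arriving on the right
        window[entering] += 1
        window[leaving] -= 1
        if window[leaving] == 0:
            del window[leaving]          # keep the counter free of zero entries
        rows.append((start, text[start:start + m], leaving, entering,
                     window == target))
    return rows

for start, win, leaving, entering, match in trace_sliding_window("cbaebabacd", "abc"):
    print(start, win, leaving, entering, "match" if match else "-")
```

For the example text, only the windows starting at indices 0 and 6 are reported as matches.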
Another Example
# Test with a different pattern
text = "abab"
pattern = "ab"
result = find_anagram_indices(text, pattern)
print(f"Text: {text}")
print(f"Pattern: {pattern}")
print(f"Anagram indices: {result}")
Text: abab
Pattern: ab
Anagram indices: [0, 1, 2]
Time and Space Complexity
| Aspect | Complexity | Explanation |
|---|---|---|
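| Time | O(n) for a fixed alphabet, O(n · k) in general | Single pass through the text of length n; each window comparison inspects up to k distinct characters |
| Space | O(k) | k = number of distinct characters in the pattern or the current window (at most the pattern length) |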
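Comparing two counters at every step costs time proportional to the number of distinct characters. A common variant, sketched below and not taken from the original article, tracks how many characters currently differ from the pattern so that each step does constant work. The names `find_anagram_indices_counted`, `need`, and `mismatched` are assumptions for this illustration.

```python
from collections import Counter

def find_anagram_indices_counted(text, pattern):
    """Sliding window that tracks mismatches instead of comparing counters."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    # need[c] > 0: the window lacks copies of c; need[c] < 0: it has too many.
    need = Counter(pattern)
    mismatched = len(need)  # number of characters with need[c] != 0
    result = []

    def adjust(ch, delta):
        nonlocal mismatched
        before = need[ch]
        after = before + delta
        need[ch] = after
        if before == 0 and after != 0:
            mismatched += 1
        elif before != 0 and after == 0:
            mismatched -= 1

    for i, ch in enumerate(text):
        adjust(ch, -1)               # ch enters the window
        if i >= m:
            adjust(text[i - m], +1)  # leftmost character leaves the window
        if i >= m - 1 and mismatched == 0:
            result.append(i - m + 1)
    return result

print(find_anagram_indices_counted("cbaebabacd", "abc"))  # [0, 6]
```

The trade-off is a little extra bookkeeping in exchange for a per-step cost that does not depend on the alphabet size.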
Conclusion
The sliding window technique solves anagram substring search in a single pass over the text. By maintaining character frequency counts and updating the window incrementally, it avoids recounting every substring from scratch, which is what makes a brute-force approach cost time proportional to the text length multiplied by the pattern length.
