Module gopy.strings.rabin_karp
The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each M-character subsequences of text to be compared.
If the hash values are unequal, the algorithm will determine the hash value for next M-character sequence. If the hash values are equal, the algorithm will analyze the pattern and the M-character sequence.
In this way, there is only one comparison per text subsequence, and character matching is only required when the hash values match.
RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← dm-1 mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7. do p ← (dp + P[i]) mod q
8. t0 ← (dt0+T [i]) mod q
9. for s ← 0 to n-m
10. do if p = ts
11. then if P [1.....m] = T [s+1.....s + m]
12. then "Pattern occurs with shift" s
13. If s < n-m
14. then ts+1 ← (d (ts-T [s+1]h)+T [s+m+1])mod q
Complexity:
The running time of RABIN-KARP-MATCHER in the worst case scenario O ((n-m+1) m but it has a good average case running time. If the expected number of strong shifts is small O (1) and prime q is chosen to be quite large, then the Rabin-Karp algorithm can be expected to run in time O (n+m) plus the time to require to process spurious hits.
Expand source code
'''
The Rabin-Karp string matching algorithm calculates a hash value for the pattern,
as well as for each M-character subsequences of text to be compared.
If the hash values are unequal, the algorithm will determine the hash value for next M-character
sequence. If the hash values are equal, the algorithm will analyze the pattern and the
M-character sequence.
In this way, there is only one comparison per text subsequence,
and character matching is only required when the hash values match.
```
RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← dm-1 mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7. do p ← (dp + P[i]) mod q
8. t0 ← (dt0+T [i]) mod q
9. for s ← 0 to n-m
10. do if p = ts
11. then if P [1.....m] = T [s+1.....s + m]
12. then "Pattern occurs with shift" s
13. If s < n-m
14. then ts+1 ← (d (ts-T [s+1]h)+T [s+m+1])mod q
```
### Complexity:
The running time of RABIN-KARP-MATCHER in the worst case scenario O ((n-m+1) m but it has a
good average case running time. If the expected number of strong shifts is small O (1) and
prime q is chosen to be quite large, then the Rabin-Karp algorithm can be expected to run in
time O (n+m) plus the time to require to process spurious hits.
'''
def match(text,pat):
m = len(text)
n = len(pat)
p_hash = hash(pat)
pos = []
for i in range(m-(n-1)):
if hash(text[i:i+n])==p_hash and text[i:i+n] == pat:
pos.append(i)
if len(pos):
return pos
else:
return "No Match Found"
Functions
def match(text, pat)
-
Expand source code
def match(text,pat): m = len(text) n = len(pat) p_hash = hash(pat) pos = [] for i in range(m-(n-1)): if hash(text[i:i+n])==p_hash and text[i:i+n] == pat: pos.append(i) if len(pos): return pos else: return "No Match Found"