Algorithm Design
First, a helper function to check if a pattern matches a string for given values of a and b:
def matches(pattern, a, b, astring):
"""Does astring match pattern using proposed values for a and b?
>>> matches("aaba", "foo", "bar", "foofoobarfoo")
True
>>> matches("aaba", "foo", "nope", "foofoobarfoo")
False
"""
test_string = ""
for p in pattern:
test_string += a if p == "a" else b
return test_string == astring
For the main function, there are some observations we can make about the problem that will allow us to solve it optimallly:
Since a must be in the pattern at least once (the problem states that the pattern must start with a), we’ll concentrate on a first.
We can learn possible lengths of a by looking at the length of the string and the number of times a appears in a it. For example, for a string that is 10 characters long and a pattern of aaabb, a can only be 0, 1, 2, or 3 characters long — if a were longer, there would be no way it would fit, as 4 × 3 is greater than 10.
Once we find the possible values for the length of a, we can use this to find the possible values for the length of b. For our same example of a 10-character string that should match pattern aaabb, we can figure out:
length of each a |
total space of a |
remainder |
length of each b |
notes |
---|---|---|---|---|
0 |
0 |
10 |
5 |
ok |
1 |
3 |
7 |
3.5 |
not possible |
2 |
6 |
4 |
2 |
ok |
3 |
9 |
1 |
0.333 |
not possible |
So, as we loop over the possible lengths of a (0, 1, 2, 3), we only have to check possible b lengths that are possible, reducing our work.
Then, we can look at the string, figure out where a and b would have to start in the string, and use our helper function to see if this is a possible match. For example, for the 10-charater string “hihihimeme” and the pattern aaabb, we’ll try the following:
length of each a |
length of each b |
b starts at |
proposed a |
proposed b |
works? |
---|---|---|---|---|---|
0 |
5 |
pos 0 |
“” |
“hihih” |
no |
2 |
2 |
pos 4 |
“hi” |
“me” |
yes |
Since we found a match, we win!
def pattern_match(pattern, astring):
"""Can we make this pattern match this string?"""
# Q&D sanity check on pattern
assert (pattern.replace("a", "").replace("b", "") == ""
and pattern.startswith("a")), "invalid pattern"
# START SOLUTION
count_a = pattern.count("a") # num times a appears in pattern
count_b = pattern.count("b") # num times b appears in pattern
first_b = pattern.find("b") # index of first b (-1 if not in pattern)
# We'll check every possible length of a, from 0 to the max
# (where the max is affected by the count of how many a's must appear)
for a_length in range(0, len(astring) // count_a + 1):
# For this length of a, find required length of b
if count_b:
b_length = (len(astring) - (a_length * count_a)) / float(count_b)
else:
b_length = 0
# Fast fail optimization: b_length must be int and >= 0
if int(b_length) != b_length or b_length < 0:
continue
# Find where b would need to begin
b_start = first_b * a_length
# Check if this is a workable match; if so, we win!
if matches(pattern=pattern,
a=astring[0:a_length],
b=astring[b_start:b_start + int(b_length)],
astring=astring):
return True
return False # sad panda