Hackbright Code Challenges

Pattern Matching: Solution

Pattern Matching: Solution

Problem

Pattern Matching

Challenge

Harder

Concepts

Algorithm Design

Download

pattern-match-solution.zip


First, a helper function to check if a pattern matches a string for given values of a and b:

def matches(pattern, a, b, astring):
    """Does astring match pattern using proposed values for a and b?

        >>> matches("aaba", "foo", "bar", "foofoobarfoo")
        True

        >>> matches("aaba", "foo", "nope", "foofoobarfoo")
        False
    """

    test_string = ""

    for p in pattern:
        test_string += a if p == "a" else b

    return test_string == astring

For the main function, there are some observations we can make about the problem that will allow us to solve it optimallly:

  • Since a must be in the pattern at least once (the problem states that the pattern must start with a), we’ll concentrate on a first.

  • We can learn possible lengths of a by looking at the length of the string and the number of times a appears in a it. For example, for a string that is 10 characters long and a pattern of aaabb, a can only be 0, 1, 2, or 3 characters long — if a were longer, there would be no way it would fit, as 4 × 3 is greater than 10.

  • Once we find the possible values for the length of a, we can use this to find the possible values for the length of b. For our same example of a 10-character string that should match pattern aaabb, we can figure out:

    length of each a

    total space of a

    remainder

    length of each b

    notes

    0

    0

    10

    5

    ok

    1

    3

    7

    3.5

    not possible

    2

    6

    4

    2

    ok

    3

    9

    1

    0.333

    not possible

    So, as we loop over the possible lengths of a (0, 1, 2, 3), we only have to check possible b lengths that are possible, reducing our work.

  • Then, we can look at the string, figure out where a and b would have to start in the string, and use our helper function to see if this is a possible match. For example, for the 10-charater string “hihihimeme” and the pattern aaabb, we’ll try the following:

    length of each a

    length of each b

    b starts at

    proposed a

    proposed b

    works?

    0

    5

    pos 0

    “”

    “hihih”

    no

    2

    2

    pos 4

    “hi”

    “me”

    yes

  • Since we found a match, we win!

def pattern_match(pattern, astring):
    """Can we make this pattern match this string?"""

    # Q&D sanity check on pattern

    assert (pattern.replace("a", "").replace("b", "") == ""
            and pattern.startswith("a")), "invalid pattern"

    # START SOLUTION

    count_a = pattern.count("a")  # num times a appears in pattern
    count_b = pattern.count("b")  # num times b appears in pattern
    first_b = pattern.find("b")  # index of first b (-1 if not in pattern)

    # We'll check every possible length of a, from 0 to the max
    # (where the max is affected by the count of how many a's must appear)

    for a_length in range(0, len(astring) // count_a + 1):

        # For this length of a, find required length of b
        if count_b:
            b_length = (len(astring) - (a_length * count_a)) / float(count_b)
        else:
            b_length = 0

        # Fast fail optimization: b_length must be int and >= 0
        if int(b_length) != b_length or b_length < 0:
            continue

        # Find where b would need to begin
        b_start = first_b * a_length

        # Check if this is a workable match; if so, we win!
        if matches(pattern=pattern,
                   a=astring[0:a_length],
                   b=astring[b_start:b_start + int(b_length)],
                   astring=astring):
            return True

    return False  # sad panda