back

phonelst - phone list

problem

Phone List Given a list of phone numbers, determine if it is consistent in the sense that no number is the prefix of another. Let’s say the phone catalogue listed these numbers:

Emergency 911 Alice 97 625 999 Bob 91 12 54 26 In this case, it’s not possible to call Bob, because the central would direct your call to the emergency line as soon as you had dialled the first three digits of Bob’s phone number. So this list would not be consistent.

Input The first line of input gives a single integer, 1 ≤ t ≤ 40, the number of test cases. Each test case starts with n, the number of phone numbers, on a separate line, 1 ≤ n ≤ 10000. Then follows n lines with one unique phone number on each line. A phone number is a sequence of at most ten digits.

Output For each test case, output “YES” if the list is consistent, or “NO” otherwise.

Example

Input:
2
3
911
97625999
91125426
5
113
12340
123440
12345
98346

Output:
NO
YES

Source

solution

The problem calls for a prefix tree, a.k.a a trie. It’s a tree data structure that stores string prefixes (in this case digits) in nodes in such a way that we can be reconstruct a string by concatenating all the prefixes stored in nodes along a given path from the root node.

Python’s standard lib doesn’t come with a Trie implementation. So we’re left to implement one ourselves. All the better since can tailor it to the problem and perform consistency checks on insertion for maximum efficiency. This way, we can decide a list of phone numbers to be inconsistent as soon as we reach the first inconsistent number.

Taking Codecademy’s implementation as a base, we can add __slots__ to restrict attribute access and improve memory efficiency:

class TrieNode:
    __slots__ = ('children', 'is_end_of_word')
    def __init__(self):
        self.children = [None]*10
        self.is_end_of_word = False

class Trie:
    __slots__ = ('root')
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word) -> bool:
        current = self.root
        has_a_prefix = False
        is_a_prefix = True
        for char in word:
            digit = int(char)
            if current.is_end_of_word:
                has_a_prefix = True
            if current.children[digit] is None:
                current.children[digit] = TrieNode()
                is_a_prefix = False
            current = current.children[digit]
        current.is_end_of_word = True
        return has_a_prefix or is_a_prefix

On insertion, we check both:

  • if a new word has a prefix already in the trie, and
  • if a new word is a prefix of an existing word in the trie.

We can test for the former by checking if we reach a node that’s already an end of word. In which case, all the characters/digits we traversed up to this point form a word/phone number with the same digits as the word we’re trying to add, i.e. our new word has a prefix in the trie.

And we can test for the latter by checking if we reach a new node. Because a new node signals the appearance of a different character/digit after a sequence of already “seen” characters. And a new word/phone number cannot be a prefix of another if it has a different character/digit than any other word/phone number at a specific index.

With Trie in place, all that’s left is input reading and parsing. We take advantage of os.read()’s speedup over stdin.readline() to maximise efficiency (I was struggling with efficiency when submitting to solvers).

def phone_numbers(n: int):
    count = 0
    while count < n:
        yield read_phone_number()
        count += 1

def check_consistency(phone_numbers: str):
    numbers = Trie()
    for phone_number in phone_numbers:
        if numbers.insert(phone_number):
            return False
    return True

M = (1 + 40 * 10001) * 11

if __name__ == "__main__":
    lines = os.read(0,M).decode().split()
    i = 1 # we skip t at the start
    while i < len(lines):
        n = int(lines[i])
        i += 1
        phone_numbers = lines[i:i+n]
        if check_consistency(phone_numbers):
            print("YES")
        else:
            print("NO")
        i += n

Note that value of M is derived from the fact that there can be at most (1 + 10 * 10001) lines of input, each at most 11 characters long including newline characters.

The idea being, there can be at most 40 test cases. Each test case can contain as many as 10000 phone numbers, specified by a number at the start of the test case. Therefore, in the worst case, a test case can contain 10000+1=10001 lines. Hence, there are at most 40*10001 lines of input, plus 1 for the initial line specifying the the number of test cases.

Finally, a phone number is at most 10 characters. So a line can have no more than 11 characters when we include the newline character. Making (1+40*10001) * 11 an upper bound on the number of characters of input.

mail@jonahv.comrss