Daily Question 09/25/2024: Sum of Prefix Scores of Strings
The problem statement may be found on leetcode.
Analysis
Dynamic Programming Solution
This is pretty clearly a situation where using a trie is very helpful. I already implemented a trie for yesterdays problem but decided to update the terminates attribute to keep count of repeated additions since repeats may show up in the list. More or less, the answer is the sum of terminates on the subtree for each prefix for each word. For instance
words: ["ab", "abc", "abf", "b", "bc"]
drawing: |
a b
/ ^\
b c
/^\ ^
c f
^ ^
solution: [6, 7, 1, 2, 3]
#[3 + 3, 3 + 3 + 1, 3 + 3 + 1, 2, 2 + 1]Counting the Number of Entries for a Subtree.
Given a prefix, the tree should be traversed until this prefix is matched. If it is not, then return 0.
Count the number of terminating nodes on the subtrees.
Counting the Number of Entries for a Subtree for a Word
- Start with the longest prefix. Count all entries bellow and memoize or use memoized result.
- Truncate the prefix by one. Count all entries bellow that. This will involve calculating entries in any sibling trees, which should be remembered.
- Repeat 2.
Repeated Words
When a word is repeated, it is useful to use the terminates attribute as an integer.
Improved Solution
Since the terminates value is incremented every time a value that has occurred is reinserted, then it makes since to make a trie of prefixes, e.g.
a -> a -> a
/^ /^ /^
b 1 b 2 b 3
^ /^ /^\
1 c 2 c 3 f
^ ^ ^
1 1 1
and determine the sum over the path to each word.
Solution
The trie implementation is as follows:
class TrieNode:
terminates: int
children: dict[str, Self]
def __init__(
self,
children: dict[str, Self],
terminates: int = False,
):
self.children = children
self.terminates = terminates
The Top Down Dynamic Programming Solution
The following solution works, but can be optimized. Unfortunately, in one case the solution ran out of memory. This goes to show that dynamic programming is not always the solution.
class SolutionInitial:
def sumPrefixScores(self, words: list[str]) -> list[int]:
def countOne(pfx: str, *, node: TrieNode | None = None) -> int:
"""Count for a specific prefix."""
if pfx in memo:
return memo[pfx]
# NOTE: Traverse until prefix is done. If the prefix cannot be
# traversed through, then no words start with the prefix.
if node is None:
node = root
for char in pfx:
if char not in node.children:
memo[pfx] = 0
return 0
node = node.children[char]
# NOTE: If the current node is a terminating node, then one word
# is matched.
count = 0
if node.terminates:
count += node.terminates
# NOTE: Count the number of terminating nodes bellow path. This
# should be the sum for each subtree.
for char in node.children:
count += countOne(pfx + char, node=node.children[char])
memo[pfx] = count
return count
memo: dict[str, int] = dict()
root = TrieNode(dict())
for word in words:
root.insert(word)
out = []
for word in words:
count = 0
for end in range(len(word), 0, -1):
count += countOne(word[:end])
out.append(count)
return outThe Improved Solution
This solution placed in the \(50\%\) quantile of runtime and \(70\%\) quantile of runtime:
class Solution:
def sumPrefixScores(self, words: list[str]) -> list[int]:
root = TrieNode(dict())
for word in words:
node = root
for char in word:
if char not in node.children:
new_node = TrieNode(dict(), terminates=1)
node.children[char] = new_node
node = new_node
else:
node = node.children[char]
node.terminates += 1
def count_path(word: str):
node = root
count = 0
for char in word:
node = node.children[char]
count += node.terminates
return count
return [count_path(word) for word in words]