Daily Question 09/25/2024: Sum of Prefix Scores of Strings
The problem statement may be found on leetcode.
Analysis
Dynamic Programming Solution
This is pretty clearly a situation where using a trie is very helpful. I already implemented a trie for yesterdays problem but decided to update the terminates
attribute to keep count of repeated additions since repeats may show up in the list. More or less, the answer is the sum of terminates
on the subtree for each prefix for each word. For instance
words: ["ab", "abc", "abf", "b", "bc"]
drawing: |
a b
/ ^\
b c
/^\ ^
c f
^ ^solution: [6, 7, 1, 2, 3]
#[3 + 3, 3 + 3 + 1, 3 + 3 + 1, 2, 2 + 1]
Counting the Number of Entries for a Subtree.
Given a prefix, the tree should be traversed until this prefix is matched. If it is not, then return 0.
Count the number of terminating nodes on the subtrees.
Counting the Number of Entries for a Subtree for a Word
- Start with the longest prefix. Count all entries bellow and memoize or use memoized result.
- Truncate the prefix by one. Count all entries bellow that. This will involve calculating entries in any sibling trees, which should be remembered.
- Repeat 2.
Repeated Words
When a word is repeated, it is useful to use the terminates
attribute as an integer.
Improved Solution
Since the terminates
value is incremented every time a value that has occurred is reinserted, then it makes since to make a trie of prefixes, e.g.
a -> a -> a
/^ /^ /^
b 1 b 2 b 3
^ /^ /^\
1 c 2 c 3 f
^ ^ ^
1 1 1
and determine the sum over the path to each word.
Solution
The trie implementation is as follows:
class TrieNode:
int
terminates: dict[str, Self]
children:
def __init__(
self,
dict[str, Self],
children: int = False,
terminates:
):self.children = children
self.terminates = terminates
The Top Down Dynamic Programming Solution
The following solution works, but can be optimized. Unfortunately, in one case the solution ran out of memory. This goes to show that dynamic programming is not always the solution.
class SolutionInitial:
def sumPrefixScores(self, words: list[str]) -> list[int]:
def countOne(pfx: str, *, node: TrieNode | None = None) -> int:
"""Count for a specific prefix."""
if pfx in memo:
return memo[pfx]
# NOTE: Traverse until prefix is done. If the prefix cannot be
# traversed through, then no words start with the prefix.
if node is None:
= root
node for char in pfx:
if char not in node.children:
= 0
memo[pfx] return 0
= node.children[char]
node
# NOTE: If the current node is a terminating node, then one word
# is matched.
= 0
count if node.terminates:
+= node.terminates
count
# NOTE: Count the number of terminating nodes bellow path. This
# should be the sum for each subtree.
for char in node.children:
+= countOne(pfx + char, node=node.children[char])
count
= count
memo[pfx] return count
dict[str, int] = dict()
memo: = TrieNode(dict())
root for word in words:
root.insert(word)
= []
out for word in words:
= 0
count for end in range(len(word), 0, -1):
+= countOne(word[:end])
count
out.append(count)
return out
The Improved Solution
This solution placed in the \(50\%\) quantile of runtime and \(70\%\) quantile of runtime:
class Solution:
def sumPrefixScores(self, words: list[str]) -> list[int]:
= TrieNode(dict())
root for word in words:
= root
node for char in word:
if char not in node.children:
= TrieNode(dict(), terminates=1)
new_node = new_node
node.children[char] = new_node
node else:
= node.children[char]
node += 1
node.terminates
def count_path(word: str):
= root
node = 0
count for char in word:
= node.children[char]
node += node.terminates
count
return count
return [count_path(word) for word in words]