Palindrome Completion

Completing a string to a palindrome with the minimal number of characters.
Author
Published

September 23, 2024

Modified

September 23, 2024

This was meant to address the Shortest Palindrome question on stack exchange, however I did not notice that completions by adding to the back of the string were not permitted. There will be another post detailing my solution to that problem. Here, the details of any completion are discussed.

It turns out that I misread the question, so see the subsequent article detailing my actual submissions.

Analysis

A few cases

I have the thought that only terminal and initial palindromes are worth searching for. To this end, I want to inspect a few cases first.

Case 1: No Nontrivial Palindromes.

abcd -----> abcd -----> abcd -----> abcd
^            ^            ^            ^
size=0      size=0      size=0    size=0
biggest=a


so complete around 'a' since it is the biggest initial palindrome.

Case 2: Significant Terminating Palindrome

aaca ----> aaca ----> aaca ----> aaca
^           ^           ^           ^
size=0     size=0     size=1        size=0
best_start=0          best_start=1
best_stop=0           best_stop=3
biggest=a             biggest=aca


complete around index 2 since it is the biggest, i.e

best[0 : start] + best[start: stop + 1]
  = 'a' + 'aca' + 'a'

Case 3: No Significant Initial/Terminal Palindrome

faacaab 
 ^^^^^

Case 4: Significant Initial and Terminal Palindrome

abbacdefgfedca
^^^^ 
   ^^^^^^^^^^^

Algorithm

In this case, it is clearly best to use the terminal palindrome. The algorithm I will try initially will go like

  1. Search the string for initial and terminal palindromes - ignore internal palindromes - and if the entire string is a palindrome return it.
  2. Reflect around the terminal or initial palindrome, whichever is larger. There will always be such a palindrome, because a single character is trivially a palindrome.

To elaborate on 2, if \(s=a_0a_1a_2a_3...a_n\), and \(a_k...a_n\) is bigger than the initial palindrome, then the completion is

\[\begin{equation} \color{green}a_0a_1...a_{k-1} \color{red}a_k...a_n \color{blue} a_{k-1}..a_1a_0 \end{equation}\]

otherwise if the initial palindrome \(a_0...a_j\) is the bigger palindrome, then

\[\begin{equation} \color{blue}a_na{n-1}...a{j+1}\color{red}a_0...a_j\color{green}a_{j+1}...a{n-1}a_n \end{equation}\]

Note that in the above, the uncolored section is the compliment of the palindrome in \(s\), the red section is the best palindrome in \(s\), and the blue section is the reflection of the compliment of the palindrome.

The next question is how to find a palindrome efficiently. It is easy to find an odd/even palindrome starting from some character in the string, however could optimizations be made to look for both simultaneously.

Finding the Initial/Terminal Palindrome

My initial solution looks at every character in \(s\) - this is rather expensive in terms of runtime and will likely find many useless palindromes (those which are neither terminal nor initial).

To look for an initial palindrome, the initial character should be looked for elsewhere in the string and its locations recorded. For each such location, a palindrome of that particular radius should be searched for. Further, since the locations of such characters are known, it is immediately known if the candidate string will be an even/odd palindrome - and, in fact, this has no bearing on the check.

Solutions

Initial Solution

My initial solution passed almost all of the test cases, but was rather slow and timed out on one of the final test cases.

import pytest


# start snippit solution_initial
class SolutionInitial:
    def searchPalindrome(
        self,
        s: str,
        *,
        index_final: int,
        index: int,
        even: bool = False,
    ):

        stop, start = index, index
        if even:
            if stop == index_final:
                return start, start

            stop += 1
            if s[start] != s[stop]:
                return start, start
            delta = min(index_final - 1 - index, index)
        else:
            delta = min(index_final - index, index)

        for _ in range(delta):
            start -= 1
            stop += 1

            if s[start] != s[stop]:
                start += 1
                stop -= 1
                break

        return start, stop

    def shortestPalindrome(self, s: str) -> str:
        n = len(s)
        if n == 1:
            return s

        index_final = n - 1
        best_initial, best_terminal = 0, index_final

        for index in range(n):
            start, stop = self.searchPalindrome(
                s,
                index_final=index_final,
                index=index,
                even=False,
            )
            start_even, stop_even = self.searchPalindrome(
                s,
                index_final=index_final,
                index=index,
                even=True,
            )
            if start_even <= start and stop_even >= stop:
                start = start_even
                stop = stop_even

            # NOTE: If a palindrome is found that is the entire string, return
            #       the entire string. When the stop is is n, a terminal
            #       palindrome is found When the start point is 0, an initial
            #       palindrome is found.
            if start != 0 and stop != n:
                continue
            elif start == 0 and stop == n:
                return s
            elif start == 0 and stop > best_initial:
                best_initial = stop
            elif stop == index_final and start < best_terminal:
                best_terminal = start

        size_terminal = index_final - best_terminal
        size_initial = best_initial

        if size_initial >= size_terminal:
            palindrome = s[: best_initial + 1]
            mantisa = s[best_initial + 1 :]
            return mantisa[::-1] + palindrome + mantisa
        else:
            palindrome = s[best_terminal:]
            mantisa = s[:best_terminal]
            return mantisa + palindrome + mantisa[::-1]

    # stop snippit solution_initial


# start snippet solution
class Solution:
    def isPalindrome(
        self,
        s: str,
        start: int,
        stop: int,
    ):
        t = s[start : stop + 1]
        return t == t[::-1]

    def shortestPalindrome(self, s: str):

        n = len(s)
        if not n:
            return s

        index_final = n - 1
        char_init, char_term = s[0], s[-1]
        best_init, best_term = 0, index_final

        if self.isPalindrome(s, 0, n - 1):
            return s

        for index in range(n - 1):
            char_front = s[index]
            char_back = s[index_final - index]

            if (
                char_front == char_init
                and self.isPalindrome(s, 0, index)
                # and index > best_init
            ):
                best_init = index

            if (
                char_back == char_term
                and self.isPalindrome(s, index_final - index, index_final)
                # and index < best_term
            ):
                best_term = index_final - index

        size_terminal = index_final - best_term
        size_initial = best_init

        if size_initial >= size_terminal:
            palindrome = s[: best_init + 1]
            mantisa = s[best_init + 1 :]
            return mantisa[::-1] + palindrome + mantisa
        else:
            palindrome = s[best_term:]
            mantisa = s[:best_term]
            return mantisa + palindrome + mantisa[::-1]

    # end snippet solution


@pytest.fixture
def solution():
    return Solution()


big = 10 * "a"


@pytest.mark.parametrize(
    "question, answer",
    (
        ("a", "a"),
        ("ab", "bab"),
        ("aacecaaa", "aaacecaaa"),
        ("abcd", "dcbabcd"),
        ("aac", "caac"),
        ("kcaaffaack", "kcaaffaack"),
        ("aa", "aa"),
        (
            (big + "c" + big + "d" + big),
            (big + "d" + big + "c" + big + "d" + big),
        ),
    ),
)
def test_solution(solution: Solution, question: str, answer: str):

    answer_computed = solution.shortestPalindrome(question)
    if len(answer) < 1000:
        # print(answer_computed, answer)
        pass
    assert answer_computed == answer


@pytest.mark.parametrize(
    "question, answer",
    (
        ("aa", True),
        ("abbbbba", True),
        ("", True),
        (10**1 * "a", True),
    ),
)
def test_is_palindrome(solution: Solution, question: str, answer: bool):

    n = len(question)
    assert solution.isPalindrome(question, 0, n) == answer

This is because searchPalindrome is terribly slow, and does not leverage the builtin functionality of python. Further, as mentioned in the section about finding the initial and terminal palindromes, there is a great deal of wasted effort looking for palindromes that cannot be used to complete in any efficient way.

Improved Solution

My improved solution used

import pytest


# start snippit solution_initial
class SolutionInitial:
    def searchPalindrome(
        self,
        s: str,
        *,
        index_final: int,
        index: int,
        even: bool = False,
    ):

        stop, start = index, index
        if even:
            if stop == index_final:
                return start, start

            stop += 1
            if s[start] != s[stop]:
                return start, start
            delta = min(index_final - 1 - index, index)
        else:
            delta = min(index_final - index, index)

        for _ in range(delta):
            start -= 1
            stop += 1

            if s[start] != s[stop]:
                start += 1
                stop -= 1
                break

        return start, stop

    def shortestPalindrome(self, s: str) -> str:
        n = len(s)
        if n == 1:
            return s

        index_final = n - 1
        best_initial, best_terminal = 0, index_final

        for index in range(n):
            start, stop = self.searchPalindrome(
                s,
                index_final=index_final,
                index=index,
                even=False,
            )
            start_even, stop_even = self.searchPalindrome(
                s,
                index_final=index_final,
                index=index,
                even=True,
            )
            if start_even <= start and stop_even >= stop:
                start = start_even
                stop = stop_even

            # NOTE: If a palindrome is found that is the entire string, return
            #       the entire string. When the stop is is n, a terminal
            #       palindrome is found When the start point is 0, an initial
            #       palindrome is found.
            if start != 0 and stop != n:
                continue
            elif start == 0 and stop == n:
                return s
            elif start == 0 and stop > best_initial:
                best_initial = stop
            elif stop == index_final and start < best_terminal:
                best_terminal = start

        size_terminal = index_final - best_terminal
        size_initial = best_initial

        if size_initial >= size_terminal:
            palindrome = s[: best_initial + 1]
            mantisa = s[best_initial + 1 :]
            return mantisa[::-1] + palindrome + mantisa
        else:
            palindrome = s[best_terminal:]
            mantisa = s[:best_terminal]
            return mantisa + palindrome + mantisa[::-1]

    # stop snippit solution_initial


# start snippet solution
class Solution:
    def isPalindrome(
        self,
        s: str,
        start: int,
        stop: int,
    ):
        t = s[start : stop + 1]
        return t == t[::-1]

    def shortestPalindrome(self, s: str):

        n = len(s)
        if not n:
            return s

        index_final = n - 1
        char_init, char_term = s[0], s[-1]
        best_init, best_term = 0, index_final

        if self.isPalindrome(s, 0, n - 1):
            return s

        for index in range(n - 1):
            char_front = s[index]
            char_back = s[index_final - index]

            if (
                char_front == char_init
                and self.isPalindrome(s, 0, index)
                # and index > best_init
            ):
                best_init = index

            if (
                char_back == char_term
                and self.isPalindrome(s, index_final - index, index_final)
                # and index < best_term
            ):
                best_term = index_final - index

        size_terminal = index_final - best_term
        size_initial = best_init

        if size_initial >= size_terminal:
            palindrome = s[: best_init + 1]
            mantisa = s[best_init + 1 :]
            return mantisa[::-1] + palindrome + mantisa
        else:
            palindrome = s[best_term:]
            mantisa = s[:best_term]
            return mantisa + palindrome + mantisa[::-1]

    # end snippet solution


@pytest.fixture
def solution():
    return Solution()


big = 10 * "a"


@pytest.mark.parametrize(
    "question, answer",
    (
        ("a", "a"),
        ("ab", "bab"),
        ("aacecaaa", "aaacecaaa"),
        ("abcd", "dcbabcd"),
        ("aac", "caac"),
        ("kcaaffaack", "kcaaffaack"),
        ("aa", "aa"),
        (
            (big + "c" + big + "d" + big),
            (big + "d" + big + "c" + big + "d" + big),
        ),
    ),
)
def test_solution(solution: Solution, question: str, answer: str):

    answer_computed = solution.shortestPalindrome(question)
    if len(answer) < 1000:
        # print(answer_computed, answer)
        pass
    assert answer_computed == answer


@pytest.mark.parametrize(
    "question, answer",
    (
        ("aa", True),
        ("abbbbba", True),
        ("", True),
        (10**1 * "a", True),
    ),
)
def test_is_palindrome(solution: Solution, question: str, answer: bool):

    n = len(question)
    assert solution.isPalindrome(question, 0, n) == answer