DS Menu


Suffix Trie




A Suffix Trie is a specialized trie data structure that represents all the suffixes of a given text. It's an extremely useful structure for various string processing applications, enabling efficient solutions for problems like substring search, pattern matching, and even more complex tasks like finding the longest repeated substring. Constructing a Suffix Trie involves inserting all suffixes of a string into the trie, including a special terminal character (often denoted as $) to mark the end of each suffix.

Construction of a Suffix Trie

The construction of a Suffix Trie for a string S involves inserting all the suffixes of S into the trie. To mark the end of the string and handle cases where one suffix is a prefix of another, a unique end symbol (often denoted as "$") is appended to S.

Given a string S of length N, here's a step-by-step explanation of constructing its Suffix Trie:

  • Initialization: Start with an empty root node. The trie will eventually have N+1 leaf nodes, each representing a suffix of SS explicitly.
  • Insertion of Suffixes: For each suffix of SS, insert it into the trie:
    • Start from the first character of S and insert the entire string S+$.
    • For the next step, start from the second character of S, inserting the suffix S[1:]+$, and so on.
    • Continue this process until you've inserted the last suffix, which is just the end symbol "$".
  • Node Creation: While inserting each suffix, for each character in the suffix, check if there exists a child node corresponding to the current character:
    • If such a child exists, move to this child and continue with the next character of the suffix.
    • If such a child does not exist, create a new child node for the current character, link it to the current node, and proceed with the next character.
  • Marking the End of a Suffix: When the end of a suffix is reached (i.e., after inserting the "$"), mark the current node. This marking can be a special flag or storing the index of the starting position of the suffix in the original string S.