DS Menu


Standard Trie




A standard trie, also known as a prefix tree, is a hierarchical data structure that stores a dynamic set of strings. Unlike binary search trees, no node in the trie stores the complete key. Instead, a path from the root node to a leaf or a node with a unique value represents a key. Each node in a trie represents a single character from the set of input strings.

Properties of Standard Trie

  • Nodes: Each node of a trie represents a single character of a key. A node can have as many children as there are characters in the alphabet (in the simplest case, binary tries are considered, but for strings, this can be the number of characters in the charset, e.g., 26 for lowercase English letters).
  • Root: The root node doesn’t contain any character or represents an empty string.
  • Markers: Each node has a marker (e.g., a boolean flag) to indicate whether the node corresponds to the end of a key.

Standard Trie Structure

Structure
struct TrieNode {
    struct TrieNode* children[ALPHABET_SIZE];
    bool isEndOfWord;
}
//TrieNode Structure: Defines a trie node that has an array of pointers to its children (ALPHABET_SIZE is 26 for lowercase English letters) and a boolean flag isEndOfWord to mark the end of a word.

Operations on Standard Trie

Inserting Keys into Standard Trie

  • Start at the Root: Begin at the root node.
  • Iterate Over the Key Characters: For each character in the key to be inserted, do the following:
    • Check if the current node has a child node corresponding to the current character.
    • If it does not exist, create a new node representing this character and link it to the current node as a child.
    • Move to this child node.
  • Mark the End of the Key: Once the last character of the key has been processed, mark the current node as an end of a key to indicate that a complete key has been inserted. This is typically done by setting a boolean flag in the node.
Pseudo-code
TrieNode* getNode(void) {
    TrieNode* pNode = (TrieNode*)malloc(sizeof(TrieNode));
    if (pNode) {
        int i;
        pNode->isEndOfWord = false;
        for (i = 0; i < ALPHABET_SIZE; i++)
            pNode->children[i] = NULL;
    }
    return pNode;
}
//getNode: Dynamically allocates a new trie node and initializes its children to NULL and isEndOfWord to false.

void insert(TrieNode* root, const char* key) {
    TrieNode* pCrawl = root;
    int index;
    int level;
    int length = strlen(key);
    for (level = 0; level < length; level++) {
        index = key[level] - 'a'; // Map the character to an index
        if (!pCrawl->children[index])
            pCrawl->children[index] = getNode();
        pCrawl = pCrawl->children[index];
    }
    // mark last node as leaf (end of word)
    pCrawl->isEndOfWord = true;
}
//insert: Inserts a word into the trie. It calculates the child index for each character (assuming 'a' to 'z') and moves down the trie, creating new nodes as necessary.

Searching for Keys in Standard Trie

  • Start at the Root: Begin at the root node.
  • Iterate Over the Key Characters: For each character in the key being searched, do the following:
    • Check if the current node has a child node corresponding to the current character.
    • If such a child node exists, move to this child node and continue.
    • If such a child node does not exist, then the key does not exist in the trie.
  • Check the End of the Key: After the last character of the key is processed, check if the current node is marked as an end of a key.
    • If it is marked, the key exists in the trie.
    • If it is not marked, the key does not exist in the trie as it might be a prefix of another longer key.
Pseudo-code
bool search(TrieNode* root, const char* key) {
    TrieNode* pCrawl = root;
    int index;
    int level;
    int length = strlen(key);
    for (level = 0; level < length; level++) {
        index = key[level] - 'a';
        if (!pCrawl->children[index])
            return false;
        pCrawl = pCrawl->children[index];
    }

    return (pCrawl != NULL && pCrawl->isEndOfWord);
}
//search: Searches for a word in the trie. It moves down the trie based on the characters in the word. If it can traverse the trie exactly according to the word and the last node is marked as the end of a word, the word is found.

4. Deletion on Standard Trie

  • Mark the leaf node for the target string as non-leaf.
  • Recursively walk down the trie and explore scenarios:
    • If a node becomes childless after removal, remove it and update its parent's edge accordingly.
    • If a node still has other child nodes (indicating other strings share prefixes), retain it.
  • Continue until you reach a stable state or the root node.

Next Topic :Compressed tries