Hello Logout
 

Projects

BINARY SEARCH TREE

You are not Member of this Project.
Project Owner : Shyam.C
Created Date : Wed, 14/03/2012 - 22:35
Project Description :

 

In computer science, a binary search tree (BST), which may sometimes also be called an ordered or sorted binary tree, is a node-based binary tree data structure which has the following properties:

  • The left subtree of a node contains only nodes with keys less than the node's key.
  • The right subtree of a node contains only nodes with keys greater than the node's key.
  • Both the left and right subtrees must also be binary search trees.

Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.

The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient.

Binary search trees are a fundamental data structure used to construct more abstract data structures such as setsmultisets, and associative arrays.

 

 

Operations

Operations on a binary search tree require comparisons between nodes. These comparisons are made with calls to a comparator, which is a subroutine that computes the total order (linear order) on any two values. This comparator can be explicitly or implicitly defined, depending on the language in which the BST is implemented.

[edit]Searching

Searching a binary search tree for a specific value can be a recursive or iterative process. This explanation covers a recursive method.

We begin by examining the root node. If the tree is null, the value we are searching for does not exist in the tree. Otherwise, if the value equals the root, the search is successful. If the value is less than the root, search the left subtree. Similarly, if it is greater than the root, search the right subtree. This process is repeated until the value is found or the indicated subtree is null. If the searched value is not found before a null subtree is reached, then the item must not be present in the tree.

Here is the search algorithm in the Python programming language:

		# 'node' refers to the parent-node in this case
 def search_binary_tree(node, key):
     if node is None:
         return None  # key not found
     if key < node.key:
         return search_binary_tree(node.leftChild, key)
     elif key > node.key:
         return search_binary_tree(node.rightChild, key)
     else:  # key is equal to node key
         return node.value  # found key

… or equivalent Haskell:

		searchBinaryTree _   NullNode = Nothing
 searchBinaryTree key (Node nodeKey nodeValue (leftChild, rightChild)) =
     case compare key nodeKey of
       LT -> searchBinaryTree key leftChild
       GT -> searchBinaryTree key rightChild
       EQ -> Just nodeValue

This operation requires O(log n) time in the average case, but needs O(n) time in the worst case, when the unbalanced tree resembles a linked list (degenerate tree).

Assuming that BinarySearchTree is a class with a member function search(int) and a pointer to the root node, the algorithm is also easily implemented in terms of an iterative approach. The algorithm enters a loop, and decides whether to branch left or right depending on the value of the node at each parent node.

		bool BinarySearchTree::search(int val)
{
    Node *next = this->root();
 
    while (next != NULL) {
        if (val == next->value()) {
            return true;
        } else if (val < next->value()) {
            next = next->left();   
        } else {
            next = next->right();
        }
    } 
 
    //not found
    return false;
}

Insertion

Insertion begins as a search would begin; if the root is not equal to the value, we search the left or right subtrees as before. Eventually, we will reach an external node and add the value as its right or left child, depending on the node's value. In other words, we examine the root and recursively insert the new node to the left subtree if the new value is less than the root, or the right subtree if the new value is greater than or equal to the root.

Here's how a typical binary search tree insertion might be performed in C++:

				 /* Inserts the node pointed to by "newNode" into the subtree rooted at "treeNode" */
 void InsertNode(Node* &treeNode, Node *newNode)
 {
     if (treeNode == NULL)
       treeNode = newNode;
     else if (newNode->key < treeNode->key)
       InsertNode(treeNode->left, newNode);
     else
       InsertNode(treeNode->right, newNode);
 }

The above "destructive" procedural variant modifies the tree in place. It uses only constant space, but the previous version of the tree is lost. Alternatively, as in the following Pythonexample, we can reconstruct all ancestors of the inserted node; any reference to the original tree root remains valid, making the tree a persistent data structure:

				 def binary_tree_insert(node, key, value):
     if node is None:
         return TreeNode(None, key, value, None)
     if key == node.key:
         return TreeNode(node.left, key, value, node.right)
     if key < node.key:
         return TreeNode(binary_tree_insert(node.left, key, value), node.key, node.value, node.right)
     else:
         return TreeNode(node.left, node.key, node.value, binary_tree_insert(node.right, key, value))

The part that is rebuilt uses Θ(log n) space in the average case and O(n) in the worst case (see big-O notation).

In either version, this operation requires time proportional to the height of the tree in the worst case, which is O(log n) time in the average case over all trees, but O(n) time in the worst case.

Another way to explain insertion is that in order to insert a new node in the tree, its value is first compared with the value of the root. If its value is less than the root's, it is then compared with the value of the root's left child. If its value is greater, it is compared with the root's right child. This process continues, until the new node is compared with a leaf node, and then it is added as this node's right or left child, depending on its value.

There are other ways of inserting nodes into a binary tree, but this is the only way of inserting nodes at the leaves and at the same time preserving the BST structure.

Here is an iterative approach to inserting into a binary search tree in Java:

				private Node m_root;
 
public void insert(int data) {
    if (m_root == null) {
        m_root = new TreeNode(data, null, null);
        return;
    }
    Node root = m_root;
    while (root != null) {
        // Not the same value twice
        if (data == root.getData()) {
            return;
        } else if (data < root.getData()) {
            // insert left
            if (root.getLeft() == null) {
                root.setLeft(new TreeNode(data, null, null));
                return;
            } else {
                root = root.getLeft();
            }
        } else {
            // insert right
            if (root.getRight() == null) {
                root.setRight(new TreeNode(data, null, null));
                return;
            } else {
                root = root.getRight();
            }
        }
    }
}

Below is a recursive approach to the insertion method.

				private Node m_root;
 
public void insert(int data){
    if (m_root == null) {
        m_root = TreeNode(data, null, null);    
    }else{
        internalInsert(m_root, data);
    }
}
 
private static void internalInsert(Node node, int data){
    // Not the same value twice
    if (data == node.getValue()) {
        return;
    } else if (data < node.getValue()) {
        if (node.getLeft() == null) {
            node.setLeft(new TreeNode(data, null, null));
        }else{
            internalInsert(node.getLeft(), data);
        }
    }else{
        if (node.getRight() == null) {
            node.setRight(new TreeNode(data, null, null));
        }else{
            internalInsert(node.getRight(), data);
        }       
    }
}

Traversal

Once the binary search tree has been created, its elements can be retrieved in-order by recursively  traversing the left subtree of the root node, accessing the node itself, then recursively traversing the right subtree of the node, continuing this pattern with each node in the tree as it's recursively accessed. As with all binary trees, one may conduct a pre-order traversal or apost-order traversal, but neither are likely to be useful for binary search trees.

Types

There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap ("tree heap"), each node also holds a (randomly chosen) priority and the parent node has higher priority than its children. Tango Trees are trees optimized for fast searches.

Two other titles describing binary search trees are that of a complete and degenerate tree.

A complete tree is a tree with n levels, where for each level d <= n - 1, the number of existing nodes at level d is equal to 2d. This means all possible nodes exist at these levels. An additional requirement for a complete binary tree is that for the nth level, while every node does not have to exist, the nodes that do exist must fill from left to right.

A degenerate tree is a tree where for each parent node, there is only one associated child node. What this means is that in a performance measurement, the tree will essentially behave like a linked list data structure.


Performance comparisons

D. A. Heger (2004)[2] presented a performance comparison of binary search trees. Treap was found to have the best average performance, while red-black tree was found to have the smallest amount of performance fluctuations.


Optimal binary search trees

If we don't plan on modifying a search tree, and we know exactly how often each item will be accessed, we can construct an optimal binary search tree, which is a search tree where the average cost of looking up an item (the expected search cost) is minimized.

Even if we only have estimates of the search costs, such a system can considerably speed up lookups on average. For example, if you have a BST of English words used in a spell checker, you might balance the tree based on word frequency in text corpora, placing words like "the" near the root and words like "agerasia" near the leaves. Such a tree might be compared with Huffman trees, which similarly seek to place frequently-used items near the root in order to produce a dense information encoding; however, Huffman trees only store data elements in leaves and these elements need not be ordered.

If we do not know the sequence in which the elements in the tree will be accessed in advance, we can use splay trees which are asymptotically as good as any static search tree we can construct for any particular sequence of lookup operations.

Alphabetic trees are Huffman trees with the additional constraint on order, or, equivalently, search trees with the modification that all elements are stored in the leaves. Faster algorithms exist for optimal alphabetic binary trees (OABTs).

You are not authorized to access this content.
You are not authorized to access this content.
You are not authorized to access this content.
You are not authorized to access this content.
You are not authorized to access this content.