5
Radix Sort

Radix Sort

First thing first what is sorting ?

Sorting is any process of arranging items systematically, and has two common, yet distinct meanings: ordering: arranging items in a sequence ordered by some criterion; categorizing: grouping items with similar properties.
[source-Wikipedia]

About the sorting Algorithms

The lower bound of all the comparison based sorting algorithms( Merge sort, Heap sort, Quick sort, ...etc) is Ω(nLogn), i.e.., cannot be better than nLogn.

Counting sort/Pigeon sort is a linear time sorting algorithm that sort in O(n+k) time when elements are in range from 1 to k.

Where it(counting-sort) Fails?

when elements reaches in the range from 1 to n2 as in that case it will take O(n2) which is worst than the above mentioned sorting algorithms.

can we do better than O(nLogn) for the range 1 to n2?
The answer is Radix sort.

The Idea of Radix sort

The idea is to sort digit by digit starting from the least significant digit and moving to the most significant digit. here counting-sort is used as a subroutine to sort.

The Radix sort Algorithm

  1. For all i where i is from the least significant to the most significant digit of the number do the following
    • sort the input array using counting sort according to its i'th digit.

Example


Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66

Sorting by least significant digit (1s place) gives: [Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]
170, 90, 802, 2, 24, 45, 75, 66

Sorting by next digit (10s place) gives: [
Notice that 802 again comes before 2 as 802 comes before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90

Sorting by most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802

What is the running time of Radix Sort?
Let there be d digits in input integers. Radix Sort takes O(d*(n+b)) time where b is the base for representing numbers, for example, for decimal system, b is 10. What is the value of d? If k is the maximum possible value, then d would be O(logb(k)). So overall time complexity is O((n+b) * logb(k)). Which looks more than the time complexity of comparison based sorting algorithms for a large k. Let us first limit k. Let k <= nc where c is a constant. In that case, the complexity becomes O(nLogb(n)). But it still doesn’t beat comparison based sorting algorithms.
What if we make value of b larger?. What should be the value of b to make the time complexity linear? If we set b as n, we get the time complexity as O(n). In other words, we can sort an array of integers with range from 1 to nc if the numbers are represented in base n (or every digit takes log2(n) bits).

Is Radix Sort preferable to Comparison based sorting algorithms like Quick-Sort?
If we have log2n bits for every digit, the running time of Radix appears to be better than Quick Sort for a wide range of input numbers. The constant factors hidden in asymptotic notation are higher for Radix Sort and Quick-Sort uses hardware caches more effectively. Also, Radix sort uses counting sort as a subroutine and counting sort takes extra space to sort numbers.

Implementation of Radix Sort

Following is a simple C implementation of Radix Sort. For simplicity, the value of d is assumed to be 10.

#include<stdio.h>
#include<stdlib.h>
int getMax(int arr[], int n){    // A function to get maximum value in array arr[]; 
    int max = arr[0];      //don't forget to initialise the value of max, either this way or initialize it with a number lower than the minimum of the given array
    for (int i = 1; i < n; i++){
        if (arr[i] > max){        
            max = arr[i];
         }
    }
return max;
}

void countSort(int arr[], int n, int exp){    // A function to do counting sort of arr[] according to
                                                                   // the digit represented by exp.
    int output[n];                                         // output array
    int i, count[10] = {0};

    for (i = 0; i < n; i++){                           // Store count of occurrences in count[]
        count[ (arr[i]/exp)%10 ]++;
   }

     for (i = 1; i < 10; i++){                        // Change count[i] so that count[i] now contains actual position of this digit in output[]
        count[i] += count[i - 1];
     }

    for (i = n - 1; i >= 0; i--){                     // Build the output array
        output[count[ (arr[i]/exp)%10 ] - 1] = arr[i];                 //read carefully how the index is manipulated relating arr[], ccount, output.
        count[ (arr[i]/exp)%10 ]--;
    }

    for (i = 0; i < n; i++){                               // Copy the output array to arr[], so that arr[] now  contains sorted numbers according to current digit
        arr[i] = output[i];
    }
}

void RadixSort(int arr[], int n){                    // The function that sorts arr[] of size n using Radix Sort
      int m = getMax(arr, n);                           // Find the maximum number to know number of digits
                                                                       // number, exp is passed. exp is 10^i where i is current digit number                     
   for (int exp = 1; m/exp > 0; exp *= 10){
        countSort(arr, n, exp);
    }
}

// function to print our final sorted array
void print(int arr[], int n){
    for (int i = 0; i < n; i++){
      printf("%d ",arr[i]);               
    }
  printf("\n");
}

int main(){
    int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};   
    int n = sizeof(arr)/sizeof(arr[0]);               //    sizeof(arr)=8*4 Bytes ( (size of the array) * (size of each element) )
                                                                        //sizeof(arr[0])=4 Bytes (size of an int variable).
    RadixSort(arr, n);
    print(arr, n);
    return 0;
}


Output:

2 24 45 66 75 90 170 802

for more sorting Algorithms Follow up Prateek's Sorting note.

Author

Notifications

?