Recently, during an interview, I got into an argument defending Radix sort as the best way to sort. The reason – it runs in nearly constant time. Unbelievable? Well, let me explain –

Radix sort is linear in its operations.

Its computation time depends on –

The highest power of 10 (the base of the decimal number system) that the largest number reaches

And, of course, the base of the number system we’re operating in (which is decimal in our case).

As Wikipedia puts it, the time taken for Radix sort is O(wN), where w is the key length in machine words – a word being the largest unit of data a CPU register can hold (the CPU registers store temporary data during computation). And w tends to be constant even for a large N.
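To make the O(wN) shape concrete, here is a minimal LSD (least-significant-digit) radix sort sketch for non-negative integers – my own illustration, not from any particular library. The outer `while` loop runs once per digit of the largest number (that’s the w), and each pass touches all N elements:

```python
# Minimal LSD radix sort sketch for non-negative integers, base 10.
# The outer loop runs once per digit of the largest number -- the "w"
# in O(wN); each pass visits every element -- the "N".
def radix_sort(nums, base=10):
    if not nums:
        return nums
    nums = list(nums)
    place = 1  # 1, 10, 100, ... -- the digit position being sorted on
    while max(nums) // place > 0:
        # Stable counting-sort pass on the current digit -- note that
        # no two elements are ever compared against each other.
        buckets = [[] for _ in range(base)]
        for n in nums:
            buckets[(n // place) % base].append(n)
        nums = [n for bucket in buckets for n in bucket]
        place *= base
    return nums

print(radix_sort([9, 179, 139, 38, 10, 5, 36]))
# [5, 9, 10, 36, 38, 139, 179]
```

For this collection the `while` loop executes exactly 3 times, since 179 has 3 digits.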

In the video below:

the collection contains the numbers [ 9, 179, 139, 38, 10, 5, 36 ], the highest being 179. The loop runs 3 times, accessing each array element on every pass.

Explaining the operations in the video above:

There are 3 iterations in total

The highest power of 10 that fits within the highest number, 179, is 2

179 lies between 10^2 = 100 and 10^3 = 1000

And, maxIterations = power + 1, which in this case is 3.

Had the maximum number been in the range of the 1000s – e.g. for a collection like [ 9, 179, 139, 38, 10, 5, 36, 1037 ] – the loop would have run 4 times (3 from 10^3, plus 1).
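The pass-count rule above can be written down directly. This is a small sketch of my own (the name `num_passes` is just for illustration); note that `math.log10` is a floating-point operation, so exact powers of 10 near the boundary deserve care in real code:

```python
import math

# Number of digit passes = floor(log10(max)) + 1, i.e. power + 1.
def num_passes(nums):
    return math.floor(math.log10(max(nums))) + 1

print(num_passes([9, 179, 139, 38, 10, 5, 36]))        # 3 (179 -> 10^2)
print(num_passes([9, 179, 139, 38, 10, 5, 36, 1037]))  # 4 (1037 -> 10^3)
```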

And why do we choose powers of 10?

Because that’s the base of the number system we used, i.e. decimal.

Had we used binary, this would have been computed in terms of powers of 2 – powers of 8 for octal, 16 for hexadecimal, and so on.

Observations

The number of iterations does not depend on the number of elements

The time complexity depends entirely on

the base of the number system

and the magnitude of the highest number in the collection

So, considering the fact that we only ever deal with numbers up to a certain maximum and not beyond, wouldn’t it be safe to conclude that it is nearly constant time?

Further,

No comparison operations – in Radix sort, you only ever examine the digits, from LSB to MSB

The number of input elements doesn’t matter

In the average case, the complexity is linear in n (n appears only to the first power), where n is the number of input elements

Major flaws

Added space complexity – a separate collection has to be maintained to hold the numbers as they are processed

Doesn’t exploit hardware caches (Quicksort is the most efficient in this respect, since it works on data held close to the registers – although I’ve not verified this claim so far)