"Everyday I’m shufflin." LMFAO
Mixing up some cards. Easy, right?
Physical Shuffles
Most people, when given a deck of cards to shuffle, will use either the riffle or the overhand shuffle.
Riffle
This shuffle involves splitting the deck into two groups and interweaving the cards from each group to reorder the deck. For an $n$ card deck there are $n!$ possible permutations, about $8 \times 10^{67}$ for a standard 52 card deck. A single riffle can only result in at most $2^{52}$ possible outcomes, so multiple shuffles are needed to get a truly randomized deck. Bayer and Diaconis showed that about seven shuffles are needed to get a standard deck close to uniformly random, or $\frac{3}{2}\log_2 n$ shuffles for an $n$ card deck.
The riffle is only effective because of the entropy introduced by the imperfect cut and merge. A faro shuffle, a perfect riffle in which the deck is split exactly in half and one card is taken from each half at a time, introduces no randomness at all: for a standard deck, eight out-faro shuffles (or 52 in-faro shuffles) return the deck to its original order.
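To make the mechanics concrete, here’s a rough model of a single riffle (a sketch only; the cut here is a crude approximation, while the merge follows the Gilbert-Shannon-Reeds rule of dropping from each packet with probability proportional to its remaining size):

```js
// One riffle (sketch): cut near the middle, then merge by dropping the
// next card from either packet with probability proportional to size.
function riffle(deck) {
  const offset = Math.floor(Math.random() * 7) - 3; // imperfect cut: ±3 cards
  const cut = Math.floor(deck.length / 2) + offset;
  const left = deck.slice(0, cut);
  const right = deck.slice(cut);
  const merged = [];
  while (left.length || right.length) {
    const pickLeft = Math.random() * (left.length + right.length) < left.length;
    merged.push(pickLeft ? left.shift() : right.shift());
  }
  return merged;
}
```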
Overhand
An overhand shuffle is accomplished by repeatedly cutting small groups of cards off the top of the deck and stacking them on the previously cut groups, reversing the order of the groups while keeping each group intact. While the motion is simple, the number of iterations required to achieve randomization is high because groups of cards stay together. Pemantle shows that between $n^2$ and $n^2 \log n$ shuffles are needed. That’s at least ~2,700 shuffles for a standard deck.
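A single overhand pass might be modeled like this (a sketch; the packet size of one to five cards is an arbitrary assumption):

```js
// One overhand pass (sketch): pull small packets off the top and stack
// each on the previously pulled cards, reversing the packet order.
function overhand(deck) {
  let shuffled = [];
  let top = 0;
  while (top < deck.length) {
    const size = 1 + Math.floor(Math.random() * 5); // packet of 1-5 cards
    shuffled = deck.slice(top, top + size).concat(shuffled);
    top += size;
  }
  return shuffled;
}
```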
Modeling these physical shuffles in a program is costly because of the number of iterations required to obtain an acceptable amount of randomization, especially for collections with large $n$.
The Goal
Instead of reproducing standard card shuffles, let’s approach the problem by talking about the goal.
Shuffling, in some respects, is the logical inverse of sorting. The desired outcome of sorting is an arrangement of the items into a specific ordered sequence. Shuffling’s goal is to arrange the items into one of the $n!$ possible permutations selected at random, with each possible outcome having the same probability of selection.
In fact, sorting can be used to shuffle by assigning a unique random value to each element and sorting based on the assigned value. The time complexity then depends on the sort used and on the ability to generate and assign unique values to each item.
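A sketch of that sort-based approach (it assumes the Math.random keys are distinct, which fails only with negligible probability):

```js
// Shuffle by sorting: tag each card with a random key, sort by the key,
// then strip the tags. O(n log n) with a comparison sort.
function sortShuffle(deck) {
  return deck
    .map((card) => ({ card, key: Math.random() }))
    .sort((a, b) => a.key - b.key)
    .map((tagged) => tagged.card);
}
```

Now, how about swapping each card with another card chosen at random instead of sorting?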
Helpers
First let’s define some helper functions.
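A minimal version might look like this (a sketch: a random integer picker built on Math.random, plus an in-place swap):

```js
// Returns a random integer in [0, max).
function randomInt(max) {
  return Math.floor(Math.random() * max);
}

// Swaps two elements of an array in place.
function swap(arr, i, j) {
  [arr[i], arr[j]] = [arr[j], arr[i]];
}
```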
Exchange
Now let’s try looping through a deck and swapping each element with a randomly chosen element (possibly itself).
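A direct translation of that idea, using the helpers above (a sketch):

```js
// Exchange shuffle: swap every position with a random position.
function exchangeShuffle(deck) {
  for (let i = 0; i < deck.length; i++) {
    swap(deck, i, randomInt(deck.length)); // may swap an element with itself
  }
  return deck;
}
```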
We did it! Or did we? This approach seems to work, but upon closer examination it turns out the results aren’t uniformly distributed as we would expect from a good shuffle. The exchange shuffle makes $n$ swaps with $n$ choices each, giving $n^n$ equally likely swap sequences that must map onto only $n!$ permutations; since $n^n$ is not evenly divisible by $n!$ for $n > 2$, some outcomes are necessarily over-represented.
The problem with this naive approach is discussed in detail in Jeff Atwood’s blog post. Also interesting: Goldstein’s paper shows that, for large $n$, the identity permutation is the most likely outcome of the exchange shuffle.
Using Removal
The exchange shuffle didn’t come to mind when I tried to come up with my own shuffling algorithm. I thought of something a bit more along the lines of how a child might do it: throwing all the cards in the air and picking them up one at a time to get a new order. The throwing part isn’t necessary as long as you pick the cards at random. We can call it the removal shuffle, for lack of a better name.
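A sketch of the removal shuffle: pick a random card, splice it out of the deck, and append it to a new array until the deck is empty.

```js
// Removal shuffle: consumes the input deck, returns a new shuffled array.
function removalShuffle(deck) {
  const shuffled = [];
  while (deck.length > 0) {
    const i = randomInt(deck.length);
    shuffled.push(deck.splice(i, 1)[0]); // splice shifts later elements: O(n)
  }
  return shuffled;
}
```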
Turns out this is pretty close to the Fisher-Yates shuffle, the basis of most modern shuffle implementations. Instead of removing elements, the original Fisher-Yates shuffle strikes out elements as they are selected.
Fisher-Yates
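A sketch of the strike-out method: choose among the elements not yet struck, record the choice, and mark it instead of removing it.

```js
// Original Fisher-Yates: strike out elements as they are selected.
function strikeOutShuffle(deck) {
  const struck = new Array(deck.length).fill(false);
  const shuffled = [];
  for (let remaining = deck.length; remaining > 0; remaining--) {
    let count = randomInt(remaining); // position among the unstruck elements
    for (let i = 0; i < deck.length; i++) {
      if (struck[i]) continue; // skipping struck entries costs O(n) per pick
      if (count === 0) {
        struck[i] = true;
        shuffled.push(deck[i]);
        break;
      }
      count--;
    }
  }
  return shuffled;
}
```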
The results of using these shuffles are uniformly distributed over the $n!$ possible permutations. However, the removal shuffle performs $n$ splices (averaging $O(\frac{n}{2})$ time each) and the original Fisher-Yates performs $n$ strike-out scans (also averaging $O(\frac{n}{2})$ time each), making both shuffles $O(n^2)$ overall. We can do better than that.
The Solution
The missing piece of the puzzle is most obvious when looking at how the sizes of the initial and shuffled arrays change during the removal shuffle. Let’s take another look:
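Here’s a hypothetical run with five cards (the random picks are arbitrary):

```
initial: [A, B, C, D, E]   shuffled: []
initial: [A, B, D, E]      shuffled: [C]
initial: [A, B, D]         shuffled: [C, E]
initial: [B, D]            shuffled: [C, E, A]
initial: [B]               shuffled: [C, E, A, D]
initial: []                shuffled: [C, E, A, D, B]
```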
We can see that the size of the shuffled array grows as the size of the initial array shrinks. Instead of removing or striking out selected elements, if we swap them into a ‘processed’ region of the initial array, the time complexity is reduced to $O(n)$ and the shuffle can be done in place with $O(1)$ space. This modernized version of the Fisher-Yates shuffle is often referred to as the Knuth shuffle.
Knuth
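A sketch of the in-place form: walk backward through the array, swapping each position with a randomly chosen position at or before it.

```js
// Knuth shuffle: in place, O(n) time, O(1) extra space.
function knuthShuffle(deck) {
  for (let i = deck.length - 1; i > 0; i--) {
    swap(deck, i, randomInt(i + 1)); // pick from the unprocessed region [0, i]
  }
  return deck;
}
```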
There are only $n-1$ operations because the final element would always be swapped with itself, a no-op.
Usages
The modernized Fisher-Yates shuffle and its variants are used in all sorts of libraries and languages. Here are some examples:
Python’s random library uses the standard in-place form.
Lodash’s _.shuffle uses the “inside-out” form to initialize and shuffle a new array (a generic sketch of this form follows these examples).
Java’s util.Collections.shuffle uses the standard form. It is interesting to note that for large lists that do not implement RandomAccess, the shuffle method first copies the list into an array to prevent quadratic time complexity.
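For reference, here’s a generic sketch of the “inside-out” form (not Lodash’s actual source): it builds the shuffled copy while it walks the input.

```js
// "Inside-out" Fisher-Yates: fills a new array without mutating the source.
function insideOutShuffle(source) {
  const result = [];
  for (let i = 0; i < source.length; i++) {
    const j = Math.floor(Math.random() * (i + 1)); // 0 <= j <= i
    result[i] = result[j]; // move the current occupant of j up to i
    result[j] = source[i]; // place the new element at j
  }
  return result;
}
```

Building into a fresh array leaves the source untouched, which is handy when callers expect a new collection back.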