Answers JordiB 8th October, 2009 08:02 (UTC)
It seems to me that both your algorithms are exactly the same. The difference would be that n
Answers JordiB 8th October, 2009 08:16 (UTC)

Sorry about that, I pressed an unfortunate combination of buttons that accidentally made me reply mid-sentence.

It seems to me that both your algorithms are exactly the same. The difference would be that in the first `n` is decremented after determining the random number `k`, and in the second it is decremented before this. However, this is compensated by the fact that you add 1 to `n` there.

Are these implementations capable of producing any of the permutations?

Is there any difference (significant — in terms of output or bias — or maybe just stylistic etc.) between the two implementations?

Seems to me like you are using classic Fisher-Yates shuffle, so yes.

Is there a better alternative?

I don't think so.

What limits are there to the length of the array/list that is to be shuffled?

The algorithm has linear complexity, so speed should probably not be an issue. Wikipedia says something the shuffle not being able to have more states than the number of states the random number generator can generate. This means that using 64-bit precision (I have no idea what JavaScript has), you can get all states of a 20 element shuffle. However, is this true, your shuffle will almost never be unbiased as the number of states the RNG can take is rarely a multiple of N!. However, I'm not sure these problems apply to your algorithm, because it seems to me that it would only apply if you only take one random number to generate the entire shuffle. However, you get a new random number for every element, so I think this should not be a problem.

Answers Kirit Sælensminde 8th October, 2009 08:51 (UTC)
JordiB said

It seems to me that both your algorithms are exactly the same. The difference would be that in the first `n` is decremented after determining the random number `k`, and in the second it is decremented before this. However, this is compensated by the fact that you add 1 to `n` there.

To paraphrase Knuth, you only proved their equivalence? :)


To join in the discussion you should register or log in
Answers JordiB 8th October, 2009 13:05 (UTC)
Kirit Sælensminde said
JordiB said

It seems to me that both your algorithms are exactly the same. The difference would be that in the first `n` is decremented after determining the random number `k`, and in the second it is decremented before this. However, this is compensated by the fact that you add 1 to `n` there.

To paraphrase Knuth, you only proved their equivalence? :)

What else do you want me to prove? Or do you mean that they are not actually (as I said) “the same”, because for instance they have different textual representation? What I meant is that they will produce identical outcomes (in as far as this is possible, since both have random number in them).

Answers Kirit Sælensminde 9th October, 2009 01:27 (UTC)
JordiB said

What I meant is that they will produce identical outcomes.

I don't think that they do. I think one of them is a correct implementation of Fisher/Yates, the other has a subtle bug which leads to two problems: It does an extra loop which will introduce a subtle bias in the output, and even worse it will occasionally produce an array subscript which is out of bounds.


To join in the discussion you should register or log in
Answers JordiB 9th October, 2009 09:09 (UTC)
Kirit Sælensminde said
JordiB said

What I meant is that they will produce identical outcomes.

I don't think that they do. I think one of them is a correct implementation of Fisher/Yates, the other has a subtle bug which leads to two problems: It does an extra loop which will introduce a subtle bias in the output, and even worse it will occasionally produce an array subscript which is out of bounds.

I'll refactor both methods a little bit so that they still have the same behavior (I think), but we can analyze the differences more easily (I replaced the square brackets with angle brackets, because otherwise everything would be formatted on one line):

1.  Array.prototype.shuffle1 = function() {
2.      for ( var n = this.length; n > 1; ) {
3.          var old_n = n;
4.          var k = Math.floor(Math.random()*old_n);
5.          n—;
6.          var t = this<k>;
7.          this<k> = this<n>;
8.          this<n> = t;
9.      }
10.     return this;
11. }
 1.  Array.prototype.shuffle2 = function() {
 2.      for ( var n = this.length; n > 1; ) {
 3.          n—;
 4.          var old_n = n+1;
 5.          var k = Math.floor(Math.random()*old_n);
6.          var t = this<k>;
7.          this<k> = this<n>;
8.          this<n> = t;
9.      }
10.     return this;
11. }

Lines 1 and 2 are now the same. In shuffle2, we now decrement `n` and then make `old_n` by adding one to `n` again. In both versions `old_n` should have the same value. Agreed? Now, we take the random number `k`, which should now also be the same (because both versions use `old_n`). At this point `n` is still one smaller in the second version, but this is now fixed in line 5 of shuffle1. After line 5 all variables should have the same values and since line 6-11 are also the same in both versions, the behavior should be the same. Where am I going wrong?

Answers Kirit Sælensminde 9th October, 2009 09:29 (UTC)

I think for shuffle2 you are decrementing n in the wrong place, but I haven't worked yours out. This is what I did with a pen and paper yesterday. For both I'm going to consider a list of length 2.

For shuffle 1

  1. At the loop set up we get n = 2 to start with.
  2. We enter the loop (n > 1) and get a random number in the range 0..<2 (i.e. 0 or 1 after the floor op) and n becomes 1
  3. We swap the element at 1 with either the element at 0 or 1.
  4. The termination condition on the loop fails as n is 1

We have gone through the array and either left the last element where it was or swapped it with the first item. There's a 50% chance of either outcome so it's unbiased.

For shuffle 2

  1. At the loop set up we get n = 2 to start with.
  2. We enter the loop (n > 1) and get a random number in the range 0..<n+1 which is 0..<3 (uh oh)
  3. We swap the element at 2 with either the element at 0, 1 or 2 (double uh oh — my earlier analysis was wrong as you will always get an array de-index which is out of bounds)
  4. We decrement n to give us n = 1, this time we don't re-enter the loop as n > 1 is not met (I got that wrong before, I thought it went around again)

The analysis I did yesterday was slightly off — I thought that shuffle2 would go once more around the loop and I didn't spot the out of bounds de-index on the swap target, but I think what I have written here is right.

The difference is in where n decremented, either at the top of the loop or at the bottom of the loop. I think shuffle2 can be fixed starting n off at the length - 1 and then changing the condition to n >= 1, but I haven't checked that through either.

It was something close to shuffle2 that I wrote initially and I only noticed because I happened to be looking for bias in the output (Flash doesn't give an error when you do anything out of bounds on an array, more's the pity). And why was I getting a bias? I think it's because of the order of the assignments that do the swap. This g'tees that for an array of length 2 the array is always left unchanged — thankfully a very obvious bias to spot.


To join in the discussion you should register or log in
Answers JordiB 9th October, 2009 10:00 (UTC)
Kirit Sælensminde said

I think for shuffle2 you are decrementing n in the wrong place, but I haven't worked yours out. This is what I did with a pen and paper yesterday. For both I'm going to consider a list of length 2.

For shuffle 1

  1. At the loop set up we get n = 2 to start with.
  2. We enter the loop (n > 1) and get a random number in the range 0..<2 (i.e. 0 or 1 after the floor op) and n becomes 1
  3. We swap the element at 1 with either the element at 0 or 1.
  4. The termination condition on the loop fails as n is 1

We have gone through the array and either left the last element where it was or swapped it with the first item. There's a 50% chance of either outcome so it's unbiased.

For shuffle 2

  1. At the loop set up we get n = 2 to start with.
  2. We enter the loop (n > 1) and get a random number in the range 0..<n+1 which is 0..<3 (uh oh)
  3. We swap the element at 2 with either the element at 0, 1 or 2 (double uh oh — my earlier analysis was wrong as you will always get an array de-index which is out of bounds)
  4. We decrement n to give us n = 1, this time we don't re-enter the loop as n > 1 is not met (I got that wrong before, I thought it went around again)

The analysis I did yesterday was slightly off — I thought that shuffle2 would go once more around the loop and I didn't spot the out of bounds de-index on the swap target, but I think what I have written here is right.

The difference is in where n decremented, either at the top of the loop or at the bottom of the loop. I think shuffle2 can be fixed starting n off at the length - 1 and then changing the condition to n >= 1, but I haven't checked that through either. It was something close to shuffle2 that I wrote initially and I only noticed because I happened to be looking for bias in the output (Flash doesn't give an error when you do anything out of bounds on an array, more's the pity). And why was I getting a bias? I think it's because of the order of the assignments that do the swap. This g'tees that for an array of length 2 the array is always left unchanged — thankfully a very obvious bias to spot.

Oh my god, I can't believe I made such a dumb error. You are right, of course `n` is not decremented at the top… In this case I think you are always going out of bounds. The random number might be 2, but `n` always starts out at 2 (out of bounds).