Daily Archive for February 4th, 2010

Data & Dice

The need for truly random data is quite common, but as any engineer can tell you, true random data is hard to come by. Computers are incapable of generating truly random data. Although some systems do a pretty good job of it, e.g. by collecting entropy from chaotic processes like keyboard and mouse usage, disk access, error corrections, etc., even these methods often fall short of cryptographic randomness.

There is, however, one class of number generator that is rather easy to come by and is about as close to truly random as one can come without using quantum events like radioactive decay. That device is the humble die.

Dice have been used for thousands of years to provide entropy for games, and they can also be used to provide entropy for computational needs. One particular task I often have is to generate a PIN for Second Life scripts to be used for security. This PIN is in the space of all 32-bit integers, meaning it is in the domain [-2^{32} \cdots 2^{32}-1].

The easiest way I have found to do this is using four eight-sided (octohedral) dice, commonly used for Dungeons & Dragons, in which they are referred to as “d8″. The d8 I have are, as are most, number from 1 to 8, so I just treat 8 as 0 and use the other digits as they are, meaning each d8 gives me one octal digit. Four of these then give me twelve bits, which is three nibbles, so if I roll these 4d8 three times, I’ll get 36 bits, giving me the 32 bits I need plus four extra, which I just cut off the end.

In order to guarantee good random rolls, I keep the dice in a clear plastic tub (previously used for almonds) and just give it a good shake, then hold it at an angle so the dice line up along one of the bottom edges, then read them off left-to-right. Here’s an example:

Roll 1 Roll 2 Roll 3
0 3 4 1 0 1 4 7 1 1 1 2
0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 1 0
0 E 1 0 6 7 2 4 A

Note that the last four bits are discarded, meaning onle the two most significant bits of the second-to-last d8 only count, and the entire last bit doesn’t count. The output value, in this case 0x0E106724, represents a full 32 bits of random data. I also use this method to generate chat channels, which are typically kept negative to prevent client eavesdropping. In these cases, I just force the most significant bit to 1, since that will guarantee a negative number.

That’s all there is to it! I have also seen 16-sided dice, which would generate 4 bits of data, so you could just roll 8d16 (or 4d16 twice (or 2d16 four times (or 1d16 eight times))) to generate the 32-bit integer, but d16 are an awful lot harder to find than d8.

If you’re stuck with six-sided dice, you can get two bits out of them by assigning 6 to 00 and re-rolling on 4 or 5. This would require twice as many dice (or rolls) as using d8, and could also be done with d4 (where 4 would be 0 and all other values would be their normal values). This does not skew results as the chance of landing on each of the valid sides is equal (1/6). d6 would be less efficient than d4, though, since 1/3 of rolls would be invalid (4 or 5). The reason there’s no way to make use of 4 or 5 is it would introduce skew, e.g. using the values directly would mean 6 and 7 would never be possible, and trying to do something like use the 1/3 bit of entropy provided by the extra faces would be more work than it would be worth, and I’m not certain it’s probabalistically even either.

Stay tuned for more random crap!

(pun very much intended)