Subject: [Phrack] Linux and Random Source Bleaching
---[ Phrack Magazine Volume 8, Issue 54 Dec 25th, 1998, article 05 of 12
-------------------------[ Linux and Random Source Bleaching
--------[ Phunda Menta <phundie@usa.net>
----[ Introduction
Random numbers are often used in cryptography, but good random bits can be
hard to come by. Linux has two useful pseudo-devices called /dev/random and
/dev/urandom. Catting /dev/random yields a small pool of random bits obtained
from internal system state. If you cat this output to your terminal and bang
on some keys, you'll notice that you get more random bits. Disk drive
accesses, IRQ timings, and key presses; all of this stuff gets hashed into
a small pool of entropy that can be accessed directly from /dev/random.
/dev/urandom is a stream that hashes /dev/random, and gives you that hash
value; then it hashes the last hash and the pool forever. Both give a
decent source of random bits. By default, /dev/urandom uses SHA (I know
the source comments claim MD5, but if you look at the code, it is SHA).
So /dev/urandom is a decent source of pseudo-random bits. /dev/random
is better, but it is of limited size.
These are very useful, but what we really want is a hardware source of random
bits.
----[ The Hardware Solution
Most computers have sound cards these days, and a sound card is a
great source of potential entropy.
Unplug the microphone from your soundcard and cat /dev/audio to a file.
Sample maybe 2 or 300k of data. Now play it back, if it sounds like static,
you can skip ahead to cleaning up the source. You can also try plugging a
1/8th jack (or whatever you use for input) that has dead-end leads into
the mic port. Try both of these methods and find one that gives a clean
static hiss.
Chances are that on playback all you have is silence, but we want static.
Static is random, and randomness is our goal here, so grab an FM radio and
tune it to the high end, around 106 or 107 MHz. Find a frequency that gives a
good clean hiss, an analog tuner is best for this. If you have a digital
tuner and can't get the precision needed to tune-in a good static source then
get the best static you can, but you might have a harder time cleaning up this
source. If your signal has a high-pitched tone present you can clean this out
in a few different ways. The easiest is to use software to strip out that
frequency. There is a family of programs for Linux that can help with this
(Bio, Mammut, and Ceres). These programs allow very good visualization of the
signal and they also allow you to pull the signal apart and isolate different
frequencies. Chances are you will have a bunch of junk in the 60 Hz region,
probably due to EMI (electro-magnetic interference) from power supplies, along
with whatever is giving you that tone.
In either case you should shield your FM receiver and the audio cable to avoid
EMI. You may be able you shield your soundcard, but I am skeptical of the
worth of this. A lot of electronics supply houses sell shielding wrap and
preshielded cables. You can also try aluminum foil. I haven't had much luck
with aluminum foil, but some people swear by it.
Once you have your source set up, jack it into your sound card and sample it
at 44 kHz. Run the results through the Diehard testing package (a battery of
tests to evaluate the strength of random number generators). Your source
won't pass the test.
Clean up your source bytes however you need to. Strip out any 60 Hz junk with
Mammut by using the Transform|Filter options, you can then use the
Transform|Phase Shift option to slide the wave form back into place so that
there is no gap at 60 Hz. If your static source has a small amplitude, crank
it up by increasing the hardware gain, or use Mammut to change the derivative
or the effective gain, whichever you like. I have found no empirical evidence
to suggest that one way works better than the others, but, theoretically,
changing the slope may be a Bad Thing (tm). You may also want to use the
Phase Shift and Threshold options to chop up your signal. You can
resynthesize the parts and save them back out. Listening to these parts, and
graphing them can help give you an idea of what other things your source
signal is doing.
If push comes to shove, and you can't weed out all of the bias, or if you need
a more hands-free way to clean up the source (and don't have the time or skill
to write custom filters) you can just use a cryptographic hash.
After you clean up your source, take a look at it with ceres or bio, if the
output looks like video static with no noticeable patterns or hot/cold areas
then you have sufficiently cleaned up the signal, now you can move on to
bleaching the static for use as a random number stream.
As a side note, if you ever want to see what a good random distribution is
supposed to look like, you can also use output from /dev/urandom. Use sox
(stock with Redhat distros) to convert the output stream of /dev/urandom
(use a type of 'ul') to AIFF for mammut, or ceres or whatever. The
distribution given by /dev/urandom is statistically random so it will tell us
what to look for, but /dev/urandom (SHA, basically) is still pseudo-random
since complete knowledge of the previous inputs allows us to calculate all
future outputs. This is not so with static.
----[ Bleaching the data stream
The static coming out of your FM source is skewed white noise. We need to
clean it up, so we bleach it.
RFC1750 gives a slew of methods to clean up your source. One of the simplest,
effective methods of whitening a source is to XOR all the bits in a byte
together, yielding one output bit. These bits are then reconstructed into
a byte and output. This method has a few advantages. The first big advantage
is that you know precisely how many bytes you need to sample in order obtain a
certain number of output bytes. XORing is also fast, and easy to implement.
Another method of deskewing data is attributed to John von Neumann in RFC1750.
This method is called transition mapping. Transition mapping is a relatively
simple process. We take two bits from our input. If this bit sequence is 01
or 10 we output a 0 or a 1, respectively. The sequences 00 and 11 are
discarded. This method completely deskews a stream of data at the expense
of needing an unknown number of input bits. Transition mapping is also a
very fast process, and on a lightly skewed input transition mapping can yield
more output bits than XOR.
Both XOR and transition mapping are fast processes that are good enough to
deskew a set of bits such that they will pass the Diehard suite of tests,
if the input is suitably clean and random. If the input is somehow correlated,
you will have a harder time getting it to pass Diehard. I have found that
correlated sources can be cleaned up by XORing the output of an XOR
distillation with the output of a transition mapped distillation.
Slower constructions can be created out of cryptographic hash functions,
but may be trusted more by the paranoid. Hash functions are also recommended
if an attacker has the means to somehow affect your random source. If you
are worried about this attack, a good way to solve it is with appeal to
/dev/random. Use a block cipher such as 3DES to encrypt your random
source with a key and initialization vector obtained from /dev/random. If an
attacker can bias your source in a predictable way, he still has no idea
what bytes you may be using for your actual random numbers. Skew that the
attack may introduce into your hardware can first be cleaned with a process
like transition mapping and then pumped through a looped hash function or a
block cipher.
The output of a (decent) hash function or block cipher will pass the
Diehard tests.
In a heavily used machine, where the entropy pool used by /dev/random will be
updated frequently, the output from the above processes can be XORed byte
for byte with the stream from /dev/urandom. This is a simple method to mix
the streams together for added security. Another method would be to hash
N/2 bytes from /dev/urandom and N/2 bytes from your source together, where
N is the number of bytes that your hash function will yield.
All of these methods are suitable to deskew a data set, but they should not be
used blindly. Before putting the resulting bits to use, examine several
samples with Diehard and graphic or spectral tests.
I have included code to do XOR, transition mapping along with hashing
mechanisms.. I have plenty of code to do other hash and block cipher based
stuff too, but I did not include that here because the code is not
self-contained (it needs some crypto libs).
If you want to contact me about the code or if you have some comments or
suggestions, I can be reached at phundie@usa.net.
----[ References and Related stuff:
RFC1750 Randomness Recommendations for Security
http://www.kobira.co.jp/document/rfc/RFC1750.txt
Diehard Test Suite
http://stat.fsu.edu/~geo/diehard.html
Pseudo-Random Number Conditioning
http://www.clark.net/pub/cme/html/ranno.html
Linux MIDI & Sound Applications (has links to Mammut, Bio and Ceres)
http://www.bright.net/~dlphilp/linux_soundapps.html
<++> bleach/md5/md5.c
/*
***********************************************************************
** md5.c -- the source code for MD5 routines **
** RSA Data Security, Inc. MD5 Message-Digest Algorithm **
** Created: 2/17/90 RLR **
** Revised: 1/91 SRD,AJ,BSK,JT Reference C ver., 7/10 constant corr. **
***********************************************************************
*/
/*
***********************************************************************
** Copyright (C) 1990, RSA Data Security, Inc. All rights reserved. **
** **
** License to copy and use this software is granted provided that **
** it is identified as the "RSA Data Security, Inc. MD5 Message- **
** Digest Algorithm" in all material mentioning or referencing this **
** software or this function. **
** **
** License is also granted to make and use derivative works **
** provided that such works are identified as "derived from the RSA **
** Data Security, Inc. MD5 Message-Digest Algorithm" in all **
** material mentioning or referencing the derived work. **
** **
** RSA Data Security, Inc. makes no representations concerning **
** either the merchantability of this software or the suitability **
** of this software for any particular purpose. It is provided "as **
** is" without express or implied warranty of any kind. **
** **
** These notices must be retained in any copies of any part of this **
** documentation and/or software. **
***********************************************************************
*/
#include "md5.h"
/*
***********************************************************************
** Message-digest routines: **
** To form the message digest for a message M **
** (1) Initialize a context buffer mdContext using MD5Init **
** (2) Call MD5Update on mdContext and M **
** (3) Call MD5Final on mdContext **
** The message digest is now in mdContext->digest[0...15] **
***********************************************************************
*/
/* The routine MD5Init initializes the message-digest context
mdContext. All fields are set to zero.
*/
void MD5Init (mdContext)
MD5_CTX *mdContext;
{
mdContext->i[0] = mdContext->i[1] = (UINT4)0;
/* The routine MD5Update updates the message-digest context to
account for the presence of each of the characters inBuf[0..inLen-1]
in the message whose digest is being computed.
*/
void MD5Update (mdContext, inBuf, inLen)
MD5_CTX *mdContext;
unsigned char *inBuf;
unsigned int inLen;
{
UINT4 in[16];
int mdi;
unsigned int i, ii;
/* compute number of bytes mod 64 */
mdi = (int)((mdContext->i[0] >> 3) & 0x3F);
/* update number of bits */
if ((mdContext->i[0] + ((UINT4)inLen << 3)) < mdContext->i[0])
mdContext->i[1]++;
mdContext->i[0] += ((UINT4)inLen << 3);
mdContext->i[1] += ((UINT4)inLen >> 29);
while (inLen--) {
/* add new character to buffer, increment mdi */
mdContext->in[mdi++] = *inBuf++;
/* transform if necessary */
if (mdi == 0x40) {
for (i = 0, ii = 0; i < 16; i++, ii += 4)
in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
(((UINT4)mdContext->in[ii+2]) << 16) |
(((UINT4)mdContext->in[ii+1]) << 8) |
((UINT4)mdContext->in[ii]);
Transform (mdContext->buf, in);
mdi = 0;
}
}
}
/* The routine MD5Final terminates the message-digest computation and
ends with the desired message digest in mdContext->digest[0...15].
*/
void MD5Final (mdContext)
MD5_CTX *mdContext;
{
UINT4 in[16];
int mdi;
unsigned int i, ii;
unsigned int padLen;
/* save number of bits */
in[14] = mdContext->i[0];
in[15] = mdContext->i[1];
/* compute number of bytes mod 64 */
mdi = (int)((mdContext->i[0] >> 3) & 0x3F);
/* pad out to 56 mod 64 */
padLen = (mdi < 56) ? (56 - mdi) : (120 - mdi);
MD5Update (mdContext, PADDING, padLen);
/* append length in bits and transform */
for (i = 0, ii = 0; i < 14; i++, ii += 4)
in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
(((UINT4)mdContext->in[ii+2]) << 16) |
(((UINT4)mdContext->in[ii+1]) << 8) |
((UINT4)mdContext->in[ii]);
Transform (mdContext->buf, in);
/* store buffer in digest */
for (i = 0, ii = 0; i < 4; i++, ii += 4) {
mdContext->digest[ii] = (unsigned char)(mdContext->buf[i] & 0xFF);
mdContext->digest[ii+1] =
(unsigned char)((mdContext->buf[i] >> 8) & 0xFF);
mdContext->digest[ii+2] =
(unsigned char)((mdContext->buf[i] >> 16) & 0xFF);
mdContext->digest[ii+3] =
(unsigned char)((mdContext->buf[i] >> 24) & 0xFF);
}
}
/* Basic MD5 step. Transforms buf based on in.
*/
static void Transform (buf, in)
UINT4 *buf;
UINT4 *in;
{
UINT4 a = buf[0], b = buf[1], c = buf[2], d = buf[3];
/*
***********************************************************************
** End of md5.c **
******************************** (cut) ********************************
*/
<-->
<++> bleach/md5/md5c.h
/*
***********************************************************************
** md5.h -- header file for implementation of MD5 **
** RSA Data Security, Inc. MD5 Message-Digest Algorithm **
** Created: 2/17/90 RLR **
** Revised: 12/27/90 SRD,AJ,BSK,JT Reference C version **
** Revised (for MD5): RLR 4/27/91 **
** -- G modified to have y&~z instead of y&z **
** -- FF, GG, HH modified to add in last register done **
** -- Access pattern: round 2 works mod 5, round 3 works mod 3 **
** -- distinct additive constant for each step **
** -- round 4 added, working mod 7 **
***********************************************************************
*/
/*
***********************************************************************
** Copyright (C) 1990, RSA Data Security, Inc. All rights reserved. **
** **
** License to copy and use this software is granted provided that **
** it is identified as the "RSA Data Security, Inc. MD5 Message- **
** Digest Algorithm" in all material mentioning or referencing this **
** software or this function. **
** **
** License is also granted to make and use derivative works **
** provided that such works are identified as "derived from the RSA **
** Data Security, Inc. MD5 Message-Digest Algorithm" in all **
** material mentioning or referencing the derived work. **
** **
** RSA Data Security, Inc. makes no representations concerning **
** either the merchantability of this software or the suitability **
** of this software for any particular purpose. It is provided "as **
** is" without express or implied warranty of any kind. **
** **
** These notices must be retained in any copies of any part of this **
** documentation and/or software. **
***********************************************************************
*/
/* typedef a 32-bit type */
typedef unsigned long int UINT4;
/* Data structure for MD5 (Message-Digest) computation */
typedef struct {
UINT4 i[2]; /* number of _bits_ handled mod 2^64 */
UINT4 buf[4]; /* scratch buffer */
unsigned char in[64]; /* input buffer */
unsigned char digest[16]; /* actual digest after MD5Final call */
} MD5_CTX;
/*
* NIST proposed Secure Hash Standard.
*
* Written 2 September 1992, Peter C. Gutmann.
* This implementation placed in the public domain.
*
* Comments to pgut1@cs.aukuni.ac.nz
*/
#include <string.h>
#include "shs.h"
/* The SHS f()-functions */
#define f1(x,y,z) ( ( x & y ) | ( ~x & z ) ) /* Rounds 0-19 */
#define f2(x,y,z) ( x ^ y ^ z ) /* Rounds 20-39 */
#define f3(x,y,z) ( ( x & y ) | ( x & z ) | ( y & z ) ) /* Rounds 40-59 */
#define f4(x,y,z) ( x ^ y ^ z ) /* Rounds 60-79 */
/*
* Perform the SHS transformation. Note that this code, like MD5, seems to
* break some optimizing compilers - it may be necessary to split it into
* sections, eg based on the four subrounds
*/
void shsTransform (shsInfo)
SHS_INFO *shsInfo;
{
LONG W [80], temp;
int i;
/* Step A. Copy the data buffer into the local work buffer */
for (i = 0; i < 16; i++)
W [i] = shsInfo->data [i];
/* Step C. Set up first buffer */
A = shsInfo->digest [0];
B = shsInfo->digest [1];
C = shsInfo->digest [2];
D = shsInfo->digest [3];
E = shsInfo->digest [4];
local void byteReverse (buffer, byteCount)
LONG *buffer;
int byteCount;
{
LONG value;
int count;
/*
* Find out what the byte order is on this machine.
* Big endian is for machines that place the most significant byte
* first (eg. Sun SPARC). Little endian is for machines that place
* the least significant byte first (eg. VAX).
*
* We figure out the byte order by stuffing a 2 byte string into a
* short and examining the left byte. '@' = 0x40 and 'P' = 0x50
* If the left byte is the 'high' byte, then it is 'big endian'.
* If the left byte is the 'low' byte, then the machine is 'little
* endian'.
*
* -- Shawn A. Clifford (sac@eng.ufl.edu)
*/
/*
* Several bugs fixed -- Pat Myrto (pat@rwing.uucp)
*/
if ((*(unsigned short *) ("@P") >> 8) == '@')
return;
/*
* Update SHS for a block of data. This code assumes that the buffer size is
* a multiple of SHS_BLOCKSIZE bytes long, which makes the code a lot more
* efficient since it does away with the need to handle partial blocks
* between calls to shsUpdate()
*/
void shsUpdate (shsInfo, buffer, count)
SHS_INFO *shsInfo;
BYTE *buffer;
int count;
{
/* Update bitcount */
if ((shsInfo->countLo + ((LONG) count << 3)) < shsInfo->countLo)
shsInfo->countHi++; /* Carry from low to high bitCount */
shsInfo->countLo += ((LONG) count << 3);
shsInfo->countHi += ((LONG) count >> 29);
/* Process data in SHS_BLOCKSIZE chunks */
while (count >= SHS_BLOCKSIZE) {
memcpy (shsInfo->data, buffer, SHS_BLOCKSIZE);
byteReverse (shsInfo->data, SHS_BLOCKSIZE);
shsTransform (shsInfo);
buffer += SHS_BLOCKSIZE;
count -= SHS_BLOCKSIZE;
}
/*
* Handle any remaining bytes of data.
* This should only happen once on the final lot of data
*/
memcpy (shsInfo->data, buffer, count);
}
void shsFinal (shsInfo)
SHS_INFO *shsInfo;
{
int count;
LONG lowBitcount = shsInfo->countLo, highBitcount = shsInfo->countHi;
/* Compute number of bytes mod 64 */
count = (int) ((shsInfo->countLo >> 3) & 0x3F);
/*
* Set the first char of padding to 0x80.
* This is safe since there is always at least one byte free
*/
((BYTE *) shsInfo->data) [count++] = 0x80;
/* Pad out to 56 mod 64 */
if (count > 56) {
/* Two lots of padding: Pad the first block to 64 bytes */
memset ((BYTE *) shsInfo->data + count, 0, 64 - count);
byteReverse (shsInfo->data, SHS_BLOCKSIZE);
shsTransform (shsInfo);
/* Now fill the next block with 56 bytes */
memset (shsInfo->data, 0, 56);
} else
/* Pad block to 56 bytes */
memset ((BYTE *) shsInfo->data + count, 0, 56 - count);
byteReverse (shsInfo->data, SHS_BLOCKSIZE);
/* Append length in bits and transform */
shsInfo->data [14] = highBitcount;
shsInfo->data [15] = lowBitcount;
/*
* NIST proposed Secure Hash Standard.
*
* Written 2 September 1992, Peter C. Gutmann.
* This implementation placed in the public domain.
*
* Comments to pgut1@cs.aukuni.ac.nz
*/
/* Useful defines/typedefs */
#ifndef SHS_H
#define SHS_H
typedef unsigned char BYTE;
typedef unsigned long LONG;
/* The SHS block size and message digest sizes, in bytes */
typedef struct {
LONG digest [5]; /* Message digest */
LONG countLo, countHi; /* 64-bit bit count */
LONG data [16]; /* SHS data buffer */
} SHS_INFO;
/* Turn off prototypes if requested */
#if (defined(NOPROTO) && defined(PROTO))
# undef PROTO
#endif
/* Used to remove arguments in function prototypes for non-ANSI C */
#ifdef PROTO
# define OF(a) a
#else /* !PROTO */
# define OF(a) ()
#endif /* ?PROTO */
<++> bleach/transmap.c
/*
Implementation of von Neumann's transistion mapping scheme to de-skew
a series of random bits. See 5.2.2 of RFC1750 for more information.
*/
#include <stdio.h>
char reconstruct_byte(char *byte_ary);
main ()
{
char c, b1, b2, i, j;
char byte[7];
j=0;
while ( !feof(stdin) )
{
fread(&c, 1,1,stdin);
for (i=7; i>=0; i-=2)
{
b1=((c>>i)&1); /* integer representation of bit i */
b2=((c>>(i-1))&1);
if ( (b1==1) && (b2==0) ) /* translation of 10 */
{
byte[j]=1;
j++;
}
if ( (b1==0) && (b2==1) ) /* translation of 01 */
{
byte[j]=0;
j++;
}
}
if (j>7)
{
putc(reconstruct_byte(byte),stdout);
j=0;
}
}
}
char reconstruct_byte(char *byte_ary)
{
char i;
char r = 0;
for (i=0; i<=7; i++)
{
r<<=1;
r|=byte_ary[i];
}
return r;
}
<-->
<++> bleach/xor_distill.c
/* Distills entropy from a stream of skewed random bits by XORing
each bit in a byte against each other to obtain 1 output bit per
input byte. 8 such bits are reconstructed into a byte.
*/
#include <stdio.h>
char reconstruct_byte(char *byte_ary);
char xor_bits(char c);
main ()
{
char byte[7];
char c[7];
char i;
while (fread(c,1,8,stdin) == 8)
{
for (i=0; i<=7; i++)
byte[i]=xor_bits(c[i]);
putc(reconstruct_byte(byte), stdout);
}
}
char xor_bits(char c)
{
char i, f;
f=(c>>i)&1;
for (i=6; i>=0; i--)
f^=(c>>i)&1;
return f;
}
char reconstruct_byte(char *byte_ary)
{
char i;
char r = 0;
for (i=0; i<=7; i++)
{
r<<=1;
r|=byte_ary[i];
}
return r;
}
<-->