 c21c195eca
			
		
	
	c21c195eca
	
	
	
		
			
			git-svn-id: svn://svn.h5l.se/heimdal/trunk/heimdal@3559 ec53bebd-3082-4978-b11e-865c3cabbd6b
		
			
				
	
	
		
			1684 lines
		
	
	
		
			72 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			1684 lines
		
	
	
		
			72 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Network Working Group                                   D. Eastlake, 3rd
 | ||
| Request for Comments: 1750                                           DEC
 | ||
| Category: Informational                                       S. Crocker
 | ||
|                                                                Cybercash
 | ||
|                                                              J. Schiller
 | ||
|                                                                      MIT
 | ||
|                                                            December 1994
 | ||
| 
 | ||
| 
 | ||
|                 Randomness Recommendations for Security
 | ||
| 
 | ||
| Status of this Memo
 | ||
| 
 | ||
|    This memo provides information for the Internet community.  This memo
 | ||
|    does not specify an Internet standard of any kind.  Distribution of
 | ||
|    this memo is unlimited.
 | ||
| 
 | ||
| Abstract
 | ||
| 
 | ||
|    Security systems today are built on increasingly strong cryptographic
 | ||
|    algorithms that foil pattern analysis attempts. However, the security
 | ||
|    of these systems is dependent on generating secret quantities for
 | ||
|    passwords, cryptographic keys, and similar quantities.  The use of
 | ||
|    pseudo-random processes to generate secret quantities can result in
 | ||
|    pseudo-security.  The sophisticated attacker of these security
 | ||
|    systems may find it easier to reproduce the environment that produced
 | ||
|    the secret quantities, searching the resulting small set of
 | ||
|    possibilities, than to locate the quantities in the whole of the
 | ||
|    number space.
 | ||
| 
 | ||
|    Choosing random quantities to foil a resourceful and motivated
 | ||
|    adversary is surprisingly difficult.  This paper points out many
 | ||
|    pitfalls in using traditional pseudo-random number generation
 | ||
|    techniques for choosing such quantities.  It recommends the use of
 | ||
|    truly random hardware techniques and shows that the existing hardware
 | ||
|    on many systems can be used for this purpose.  It provides
 | ||
|    suggestions to ameliorate the problem when a hardware solution is not
 | ||
|    available.  And it gives examples of how large such quantities need
 | ||
|    to be for some particular applications.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 1]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| Acknowledgements
 | ||
| 
 | ||
|    Comments on this document that have been incorporated were received
 | ||
|    from (in alphabetic order) the following:
 | ||
| 
 | ||
|         David M. Balenson (TIS)
 | ||
|         Don Coppersmith (IBM)
 | ||
|         Don T. Davis (consultant)
 | ||
|         Carl Ellison (Stratus)
 | ||
|         Marc Horowitz (MIT)
 | ||
|         Christian Huitema (INRIA)
 | ||
|         Charlie Kaufman (IRIS)
 | ||
|         Steve Kent (BBN)
 | ||
|         Hal Murray (DEC)
 | ||
|         Neil Haller (Bellcore)
 | ||
|         Richard Pitkin (DEC)
 | ||
|         Tim Redmond (TIS)
 | ||
|         Doug Tygar (CMU)
 | ||
| 
 | ||
| Table of Contents
 | ||
| 
 | ||
|    1. Introduction........................................... 3
 | ||
|    2. Requirements........................................... 4
 | ||
|    3. Traditional Pseudo-Random Sequences.................... 5
 | ||
|    4. Unpredictability....................................... 7
 | ||
|    4.1 Problems with Clocks and Serial Numbers............... 7
 | ||
|    4.2 Timing and Content of External Events................  8
 | ||
|    4.3 The Fallacy of Complex Manipulation..................  8
 | ||
|    4.4 The Fallacy of Selection from a Large Database.......  9
 | ||
|    5. Hardware for Randomness............................... 10
 | ||
|    5.1 Volume Required...................................... 10
 | ||
|    5.2 Sensitivity to Skew.................................. 10
 | ||
|    5.2.1 Using Stream Parity to De-Skew..................... 11
 | ||
|    5.2.2 Using Transition Mappings to De-Skew............... 12
 | ||
|    5.2.3 Using FFT to De-Skew............................... 13
 | ||
|    5.2.4 Using Compression to De-Skew....................... 13
 | ||
|    5.3 Existing Hardware Can Be Used For Randomness......... 14
 | ||
|    5.3.1 Using Existing Sound/Video Input................... 14
 | ||
|    5.3.2 Using Existing Disk Drives......................... 14
 | ||
|    6. Recommended Non-Hardware Strategy..................... 14
 | ||
|    6.1 Mixing Functions..................................... 15
 | ||
|    6.1.1 A Trivial Mixing Function.......................... 15
 | ||
|    6.1.2 Stronger Mixing Functions.......................... 16
 | ||
|    6.1.3 Diff-Hellman as a Mixing Function.................. 17
 | ||
|    6.1.4 Using a Mixing Function to Stretch Random Bits..... 17
 | ||
|    6.1.5 Other Factors in Choosing a Mixing Function........ 18
 | ||
|    6.2 Non-Hardware Sources of Randomness................... 19
 | ||
|    6.3 Cryptographically Strong Sequences................... 19
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 2]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    6.3.1 Traditional Strong Sequences....................... 20
 | ||
|    6.3.2 The Blum Blum Shub Sequence Generator.............. 21
 | ||
|    7. Key Generation Standards.............................. 22
 | ||
|    7.1 US DoD Recommendations for Password Generation....... 23
 | ||
|    7.2 X9.17 Key Generation................................. 23
 | ||
|    8. Examples of Randomness Required....................... 24
 | ||
|    8.1  Password Generation................................. 24
 | ||
|    8.2 A Very High Security Cryptographic Key............... 25
 | ||
|    8.2.1 Effort per Key Trial............................... 25
 | ||
|    8.2.2 Meet in the Middle Attacks......................... 26
 | ||
|    8.2.3 Other Considerations............................... 26
 | ||
|    9. Conclusion............................................ 27
 | ||
|    10. Security Considerations.............................. 27
 | ||
|    References............................................... 28
 | ||
|    Authors' Addresses....................................... 30
 | ||
| 
 | ||
| 1. Introduction
 | ||
| 
 | ||
|    Software cryptography is coming into wider use.  Systems like
 | ||
|    Kerberos, PEM, PGP, etc. are maturing and becoming a part of the
 | ||
|    network landscape [PEM].  These systems provide substantial
 | ||
|    protection against snooping and spoofing.  However, there is a
 | ||
|    potential flaw.  At the heart of all cryptographic systems is the
 | ||
|    generation of secret, unguessable (i.e., random) numbers.
 | ||
| 
 | ||
|    For the present, the lack of generally available facilities for
 | ||
|    generating such unpredictable numbers is an open wound in the design
 | ||
|    of cryptographic software.  For the software developer who wants to
 | ||
|    build a key or password generation procedure that runs on a wide
 | ||
|    range of hardware, the only safe strategy so far has been to force
 | ||
|    the local installation to supply a suitable routine to generate
 | ||
|    random numbers.  To say the least, this is an awkward, error-prone
 | ||
|    and unpalatable solution.
 | ||
| 
 | ||
|    It is important to keep in mind that the requirement is for data that
 | ||
|    an adversary has a very low probability of guessing or determining.
 | ||
|    This will fail if pseudo-random data is used which only meets
 | ||
|    traditional statistical tests for randomness or which is based on
 | ||
|    limited range sources, such as clocks.  Frequently such random
 | ||
|    quantities are determinable by an adversary searching through an
 | ||
|    embarrassingly small space of possibilities.
 | ||
| 
 | ||
|    This informational document suggests techniques for producing random
 | ||
|    quantities that will be resistant to such attack.  It recommends that
 | ||
|    future systems include hardware random number generation or provide
 | ||
|    access to existing hardware that can be used for this purpose.  It
 | ||
|    suggests methods for use if such hardware is not available.  And it
 | ||
|    gives some estimates of the number of random bits required for sample
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 3]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    applications.
 | ||
| 
 | ||
| 2. Requirements
 | ||
| 
 | ||
|    Probably the most commonly encountered randomness requirement today
 | ||
|    is the user password. This is usually a simple character string.
 | ||
|    Obviously, if a password can be guessed, it does not provide
 | ||
|    security.  (For re-usable passwords, it is desirable that users be
 | ||
|    able to remember the password.  This may make it advisable to use
 | ||
|    pronounceable character strings or phrases composed on ordinary
 | ||
|    words.  But this only affects the format of the password information,
 | ||
|    not the requirement that the password be very hard to guess.)
 | ||
| 
 | ||
|    Many other requirements come from the cryptographic arena.
 | ||
|    Cryptographic techniques can be used to provide a variety of services
 | ||
|    including confidentiality and authentication.  Such services are
 | ||
|    based on quantities, traditionally called "keys", that are unknown to
 | ||
|    and unguessable by an adversary.
 | ||
| 
 | ||
|    In some cases, such as the use of symmetric encryption with the one
 | ||
|    time pads [CRYPTO*] or the US Data Encryption Standard [DES], the
 | ||
|    parties who wish to communicate confidentially and/or with
 | ||
|    authentication must all know the same secret key.  In other cases,
 | ||
|    using what are called asymmetric or "public key" cryptographic
 | ||
|    techniques, keys come in pairs.  One key of the pair is private and
 | ||
|    must be kept secret by one party, the other is public and can be
 | ||
|    published to the world.  It is computationally infeasible to
 | ||
|    determine the private key from the public key [ASYMMETRIC, CRYPTO*].
 | ||
| 
 | ||
|    The frequency and volume of the requirement for random quantities
 | ||
|    differs greatly for different cryptographic systems.  Using pure RSA
 | ||
|    [CRYPTO*], random quantities are required when the key pair is
 | ||
|    generated, but thereafter any number of messages can be signed
 | ||
|    without any further need for randomness.  The public key Digital
 | ||
|    Signature Algorithm that has been proposed by the US National
 | ||
|    Institute of Standards and Technology (NIST) requires good random
 | ||
|    numbers for each signature.  And encrypting with a one time pad, in
 | ||
|    principle the strongest possible encryption technique, requires a
 | ||
|    volume of randomness equal to all the messages to be processed.
 | ||
| 
 | ||
|    In most of these cases, an adversary can try to determine the
 | ||
|    "secret" key by trial and error.  (This is possible as long as the
 | ||
|    key is enough smaller than the message that the correct key can be
 | ||
|    uniquely identified.)  The probability of an adversary succeeding at
 | ||
|    this must be made acceptably low, depending on the particular
 | ||
|    application.  The size of the space the adversary must search is
 | ||
|    related to the amount of key "information" present in the information
 | ||
|    theoretic sense [SHANNON].  This depends on the number of different
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 4]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    secret values possible and the probability of each value as follows:
 | ||
| 
 | ||
|                       -----
 | ||
|                        \
 | ||
|         Bits-of-info =  \  - p   * log  ( p  )
 | ||
|                         /     i       2    i
 | ||
|                        /
 | ||
|                       -----
 | ||
| 
 | ||
|    where i varies from 1 to the number of possible secret values and p
 | ||
|    sub i is the probability of the value numbered i.  (Since p sub i is
 | ||
|    less than one, the log will be negative so each term in the sum will
 | ||
|    be non-negative.)
 | ||
| 
 | ||
|    If there are 2^n different values of equal probability, then n bits
 | ||
|    of information are present and an adversary would, on the average,
 | ||
|    have to try half of the values, or 2^(n-1) , before guessing the
 | ||
|    secret quantity.  If the probability of different values is unequal,
 | ||
|    then there is less information present and fewer guesses will, on
 | ||
|    average, be required by an adversary.  In particular, any values that
 | ||
|    the adversary can know are impossible, or are of low probability, can
 | ||
|    be initially ignored by an adversary, who will search through the
 | ||
|    more probable values first.
 | ||
| 
 | ||
|    For example, consider a cryptographic system that uses 56 bit keys.
 | ||
|    If these 56 bit keys are derived by using a fixed pseudo-random
 | ||
|    number generator that is seeded with an 8 bit seed, then an adversary
 | ||
|    needs to search through only 256 keys (by running the pseudo-random
 | ||
|    number generator with every possible seed), not the 2^56 keys that
 | ||
|    may at first appear to be the case. Only 8 bits of "information" are
 | ||
|    in these 56 bit keys.
 | ||
| 
 | ||
| 3. Traditional Pseudo-Random Sequences
 | ||
| 
 | ||
|    Most traditional sources of random numbers use deterministic sources
 | ||
|    of "pseudo-random" numbers.  These typically start with a "seed"
 | ||
|    quantity and use numeric or logical operations to produce a sequence
 | ||
|    of values.
 | ||
| 
 | ||
|    [KNUTH] has a classic exposition on pseudo-random numbers.
 | ||
|    Applications he mentions are simulation of natural phenomena,
 | ||
|    sampling, numerical analysis, testing computer programs, decision
 | ||
|    making, and games.  None of these have the same characteristics as
 | ||
|    the sort of security uses we are talking about.  Only in the last two
 | ||
|    could there be an adversary trying to find the random quantity.
 | ||
|    However, in these cases, the adversary normally has only a single
 | ||
|    chance to use a guessed value.  In guessing passwords or attempting
 | ||
|    to break an encryption scheme, the adversary normally has many,
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 5]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    perhaps unlimited, chances at guessing the correct value and should
 | ||
|    be assumed to be aided by a computer.
 | ||
| 
 | ||
|    For testing the "randomness" of numbers, Knuth suggests a variety of
 | ||
|    measures including statistical and spectral.  These tests check
 | ||
|    things like autocorrelation between different parts of a "random"
 | ||
|    sequence or distribution of its values.  They could be met by a
 | ||
|    constant stored random sequence, such as the "random" sequence
 | ||
|    printed in the CRC Standard Mathematical Tables [CRC].
 | ||
| 
 | ||
|    A typical pseudo-random number generation technique, known as a
 | ||
|    linear congruence pseudo-random number generator, is modular
 | ||
|    arithmetic where the N+1th value is calculated from the Nth value by
 | ||
| 
 | ||
|         V    = ( V  * a + b )(Mod c)
 | ||
|          N+1      N
 | ||
| 
 | ||
|    The above technique has a strong relationship to linear shift
 | ||
|    register pseudo-random number generators, which are well understood
 | ||
|    cryptographically [SHIFT*].  In such generators bits are introduced
 | ||
|    at one end of a shift register as the Exclusive Or (binary sum
 | ||
|    without carry) of bits from selected fixed taps into the register.
 | ||
| 
 | ||
|    For example:
 | ||
| 
 | ||
|       +----+     +----+     +----+                      +----+
 | ||
|       | B  | <-- | B  | <-- | B  | <--  . . . . . . <-- | B  | <-+
 | ||
|       |  0 |     |  1 |     |  2 |                      |  n |   |
 | ||
|       +----+     +----+     +----+                      +----+   |
 | ||
|         |                     |            |                     |
 | ||
|         |                     |            V                  +-----+
 | ||
|         |                     V            +----------------> |     |
 | ||
|         V                     +-----------------------------> | XOR |
 | ||
|         +---------------------------------------------------> |     |
 | ||
|                                                               +-----+
 | ||
| 
 | ||
| 
 | ||
|        V    = ( ( V  * 2 ) + B .xor. B ... )(Mod 2^n)
 | ||
|         N+1         N         0       2
 | ||
| 
 | ||
|    The goodness of traditional pseudo-random number generator algorithms
 | ||
|    is measured by statistical tests on such sequences.  Carefully chosen
 | ||
|    values of the initial V and a, b, and c or the placement of shift
 | ||
|    register tap in the above simple processes can produce excellent
 | ||
|    statistics.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 6]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    These sequences may be adequate in simulations (Monte Carlo
 | ||
|    experiments) as long as the sequence is orthogonal to the structure
 | ||
|    of the space being explored.  Even there, subtle patterns may cause
 | ||
|    problems.  However, such sequences are clearly bad for use in
 | ||
|    security applications.  They are fully predictable if the initial
 | ||
|    state is known.  Depending on the form of the pseudo-random number
 | ||
|    generator, the sequence may be determinable from observation of a
 | ||
|    short portion of the sequence [CRYPTO*, STERN].  For example, with
 | ||
|    the generators above, one can determine V(n+1) given knowledge of
 | ||
|    V(n).  In fact, it has been shown that with these techniques, even if
 | ||
|    only one bit of the pseudo-random values is released, the seed can be
 | ||
|    determined from short sequences.
 | ||
| 
 | ||
|    Not only have linear congruent generators been broken, but techniques
 | ||
|    are now known for breaking all polynomial congruent generators
 | ||
|    [KRAWCZYK].
 | ||
| 
 | ||
| 4. Unpredictability
 | ||
| 
 | ||
|    Randomness in the traditional sense described in section 3 is NOT the
 | ||
|    same as the unpredictability required for security use.
 | ||
| 
 | ||
|    For example, use of a widely available constant sequence, such as
 | ||
|    that from the CRC tables, is very weak against an adversary. Once
 | ||
|    they learn of or guess it, they can easily break all security, future
 | ||
|    and past, based on the sequence [CRC].  Yet the statistical
 | ||
|    properties of these tables are good.
 | ||
| 
 | ||
|    The following sections describe the limitations of some randomness
 | ||
|    generation techniques and sources.
 | ||
| 
 | ||
| 4.1 Problems with Clocks and Serial Numbers
 | ||
| 
 | ||
|    Computer clocks, or similar operating system or hardware values,
 | ||
|    provide significantly fewer real bits of unpredictability than might
 | ||
|    appear from their specifications.
 | ||
| 
 | ||
|    Tests have been done on clocks on numerous systems and it was found
 | ||
|    that their behavior can vary widely and in unexpected ways.  One
 | ||
|    version of an operating system running on one set of hardware may
 | ||
|    actually provide, say, microsecond resolution in a clock while a
 | ||
|    different configuration of the "same" system may always provide the
 | ||
|    same lower bits and only count in the upper bits at much lower
 | ||
|    resolution.  This means that successive reads on the clock may
 | ||
|    produce identical values even if enough time has passed that the
 | ||
|    value "should" change based on the nominal clock resolution. There
 | ||
|    are also cases where frequently reading a clock can produce
 | ||
|    artificial sequential values because of extra code that checks for
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 7]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    the clock being unchanged between two reads and increases it by one!
 | ||
|    Designing portable application code to generate unpredictable numbers
 | ||
|    based on such system clocks is particularly challenging because the
 | ||
|    system designer does not always know the properties of the system
 | ||
|    clocks that the code will execute on.
 | ||
| 
 | ||
|    Use of a hardware serial number such as an Ethernet address may also
 | ||
|    provide fewer bits of uniqueness than one would guess.  Such
 | ||
|    quantities are usually heavily structured and subfields may have only
 | ||
|    a limited range of possible values or values easily guessable based
 | ||
|    on approximate date of manufacture or other data.  For example, it is
 | ||
|    likely that most of the Ethernet cards installed on Digital Equipment
 | ||
|    Corporation (DEC) hardware within DEC were manufactured by DEC
 | ||
|    itself, which significantly limits the range of built in addresses.
 | ||
| 
 | ||
|    Problems such as those described above related to clocks and serial
 | ||
|    numbers make code to produce unpredictable quantities difficult if
 | ||
|    the code is to be ported across a variety of computer platforms and
 | ||
|    systems.
 | ||
| 
 | ||
| 4.2 Timing and Content of External Events
 | ||
| 
 | ||
|    It is possible to measure the timing and content of mouse movement,
 | ||
|    key strokes, and similar user events.  This is a reasonable source of
 | ||
|    unguessable data with some qualifications.  On some machines, inputs
 | ||
|    such as key strokes are buffered.  Even though the user's inter-
 | ||
|    keystroke timing may have sufficient variation and unpredictability,
 | ||
|    there might not be an easy way to access that variation.  Another
 | ||
|    problem is that no standard method exists to sample timing details.
 | ||
|    This makes it hard to build standard software intended for
 | ||
|    distribution to a large range of machines based on this technique.
 | ||
| 
 | ||
|    The amount of mouse movement or the keys actually hit are usually
 | ||
|    easier to access than timings but may yield less unpredictability as
 | ||
|    the user may provide highly repetitive input.
 | ||
| 
 | ||
|    Other external events, such as network packet arrival times, can also
 | ||
|    be used with care.  In particular, the possibility of manipulation of
 | ||
|    such times by an adversary must be considered.
 | ||
| 
 | ||
| 4.3 The Fallacy of Complex Manipulation
 | ||
| 
 | ||
|    One strategy which may give a misleading appearance of
 | ||
|    unpredictability is to take a very complex algorithm (or an excellent
 | ||
|    traditional pseudo-random number generator with good statistical
 | ||
|    properties) and calculate a cryptographic key by starting with the
 | ||
|    current value of a computer system clock as the seed.  An adversary
 | ||
|    who knew roughly when the generator was started would have a
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 8]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    relatively small number of seed values to test as they would know
 | ||
|    likely values of the system clock.  Large numbers of pseudo-random
 | ||
|    bits could be generated but the search space an adversary would need
 | ||
|    to check could be quite small.
 | ||
| 
 | ||
|    Thus very strong and/or complex manipulation of data will not help if
 | ||
|    the adversary can learn what the manipulation is and there is not
 | ||
|    enough unpredictability in the starting seed value.  Even if they can
 | ||
|    not learn what the manipulation is, they may be able to use the
 | ||
|    limited number of results stemming from a limited number of seed
 | ||
|    values to defeat security.
 | ||
| 
 | ||
|    Another serious strategy error is to assume that a very complex
 | ||
|    pseudo-random number generation algorithm will produce strong random
 | ||
|    numbers when there has been no theory behind or analysis of the
 | ||
|    algorithm.  There is a excellent example of this fallacy right near
 | ||
|    the beginning of chapter 3 in [KNUTH] where the author describes a
 | ||
|    complex algorithm.  It was intended that the machine language program
 | ||
|    corresponding to the algorithm would be so complicated that a person
 | ||
|    trying to read the code without comments wouldn't know what the
 | ||
|    program was doing.  Unfortunately, actual use of this algorithm
 | ||
|    showed that it almost immediately converged to a single repeated
 | ||
|    value in one case and a small cycle of values in another case.
 | ||
| 
 | ||
|    Not only does complex manipulation not help you if you have a limited
 | ||
|    range of seeds but blindly chosen complex manipulation can destroy
 | ||
|    the randomness in a good seed!
 | ||
| 
 | ||
| 4.4 The Fallacy of Selection from a Large Database
 | ||
| 
 | ||
|    Another strategy that can give a misleading appearance of
 | ||
|    unpredictability is selection of a quantity randomly from a database
 | ||
|    and assume that its strength is related to the total number of bits
 | ||
|    in the database.  For example, typical USENET servers as of this date
 | ||
|    process over 35 megabytes of information per day.  Assume a random
 | ||
|    quantity was selected by fetching 32 bytes of data from a random
 | ||
|    starting point in this data.  This does not yield 32*8 = 256 bits
 | ||
|    worth of unguessability.  Even after allowing that much of the data
 | ||
|    is human language and probably has more like 2 or 3 bits of
 | ||
|    information per byte, it doesn't yield 32*2.5 = 80 bits of
 | ||
|    unguessability.  For an adversary with access to the same 35
 | ||
|    megabytes the unguessability rests only on the starting point of the
 | ||
|    selection.  That is, at best, about 25 bits of unguessability in this
 | ||
|    case.
 | ||
| 
 | ||
|    The same argument applies to selecting sequences from the data on a
 | ||
|    CD ROM or Audio CD recording or any other large public database.  If
 | ||
|    the adversary has access to the same database, this "selection from a
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                    [Page 9]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    large volume of data" step buys very little.  However, if a selection
 | ||
|    can be made from data to which the adversary has no access, such as
 | ||
|    system buffers on an active multi-user system, it may be of some
 | ||
|    help.
 | ||
| 
 | ||
| 5. Hardware for Randomness
 | ||
| 
 | ||
|    Is there any hope for strong portable randomness in the future?
 | ||
|    There might be.  All that's needed is a physical source of
 | ||
|    unpredictable numbers.
 | ||
| 
 | ||
|    A thermal noise or radioactive decay source and a fast, free-running
 | ||
|    oscillator would do the trick directly [GIFFORD].  This is a trivial
 | ||
|    amount of hardware, and could easily be included as a standard part
 | ||
|    of a computer system's architecture.  Furthermore, any system with a
 | ||
|    spinning disk or the like has an adequate source of randomness
 | ||
|    [DAVIS].  All that's needed is the common perception among computer
 | ||
|    vendors that this small additional hardware and the software to
 | ||
|    access it is necessary and useful.
 | ||
| 
 | ||
| 5.1 Volume Required
 | ||
| 
 | ||
|    How much unpredictability is needed?  Is it possible to quantify the
 | ||
|    requirement in, say, number of random bits per second?
 | ||
| 
 | ||
|    The answer is not very much is needed.  For DES, the key is 56 bits
 | ||
|    and, as we show in an example in Section 8, even the highest security
 | ||
|    system is unlikely to require a keying material of over 200 bits.  If
 | ||
|    a series of keys are needed, it can be generated from a strong random
 | ||
|    seed using a cryptographically strong sequence as explained in
 | ||
|    Section 6.3.  A few hundred random bits generated once a day would be
 | ||
|    enough using such techniques.  Even if the random bits are generated
 | ||
|    as slowly as one per second and it is not possible to overlap the
 | ||
|    generation process, it should be tolerable in high security
 | ||
|    applications to wait 200 seconds occasionally.
 | ||
| 
 | ||
|    These numbers are trivial to achieve.  It could be done by a person
 | ||
|    repeatedly tossing a coin.  Almost any hardware process is likely to
 | ||
|    be much faster.
 | ||
| 
 | ||
| 5.2 Sensitivity to Skew
 | ||
| 
 | ||
|    Is there any specific requirement on the shape of the distribution of
 | ||
|    the random numbers?  The good news is the distribution need not be
 | ||
|    uniform.  All that is needed is a conservative estimate of how non-
 | ||
|    uniform it is to bound performance.  Two simple techniques to de-skew
 | ||
|    the bit stream are given below and stronger techniques are mentioned
 | ||
|    in Section 6.1.2 below.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 10]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| 5.2.1 Using Stream Parity to De-Skew
 | ||
| 
 | ||
|    Consider taking a sufficiently long string of bits and map the string
 | ||
|    to "zero" or "one".  The mapping will not yield a perfectly uniform
 | ||
|    distribution, but it can be as close as desired.  One mapping that
 | ||
|    serves the purpose is to take the parity of the string.  This has the
 | ||
|    advantages that it is robust across all degrees of skew up to the
 | ||
|    estimated maximum skew and is absolutely trivial to implement in
 | ||
|    hardware.
 | ||
| 
 | ||
|    The following analysis gives the number of bits that must be sampled:
 | ||
| 
 | ||
|    Suppose the ratio of ones to zeros is 0.5 + e : 0.5 - e, where e is
 | ||
|    between 0 and 0.5 and is a measure of the "eccentricity" of the
 | ||
|    distribution.  Consider the distribution of the parity function of N
 | ||
|    bit samples.  The probabilities that the parity will be one or zero
 | ||
|    will be the sum of the odd or even terms in the binomial expansion of
 | ||
|    (p + q)^N, where p = 0.5 + e, the probability of a one, and q = 0.5 -
 | ||
|    e, the probability of a zero.
 | ||
| 
 | ||
|    These sums can be computed easily as
 | ||
| 
 | ||
|                          N            N
 | ||
|         1/2 * ( ( p + q )  + ( p - q )  )
 | ||
|    and
 | ||
|                          N            N
 | ||
|         1/2 * ( ( p + q )  - ( p - q )  ).
 | ||
| 
 | ||
|    (Which one corresponds to the probability the parity will be 1
 | ||
|    depends on whether N is odd or even.)
 | ||
| 
 | ||
|    Since p + q = 1 and p - q = 2e, these expressions reduce to
 | ||
| 
 | ||
|                        N
 | ||
|         1/2 * [1 + (2e) ]
 | ||
|    and
 | ||
|                        N
 | ||
|         1/2 * [1 - (2e) ].
 | ||
| 
 | ||
|    Neither of these will ever be exactly 0.5 unless e is zero, but we
 | ||
|    can bring them arbitrarily close to 0.5.  If we want the
 | ||
|    probabilities to be within some delta d of 0.5, i.e. then
 | ||
| 
 | ||
|                             N
 | ||
|         ( 0.5 + ( 0.5 * (2e)  ) )  <  0.5 + d.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 11]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    Solving for N yields N > log(2d)/log(2e).  (Note that 2e is less than
 | ||
|    1, so its log is negative.  Division by a negative number reverses
 | ||
|    the sense of an inequality.)
 | ||
| 
 | ||
|    The following table gives the length of the string which must be
 | ||
|    sampled for various degrees of skew in order to come within 0.001 of
 | ||
|    a 50/50 distribution.
 | ||
| 
 | ||
|                        +---------+--------+-------+
 | ||
|                        | Prob(1) |    e   |    N  |
 | ||
|                        +---------+--------+-------+
 | ||
|                        |   0.5   |  0.00  |    1  |
 | ||
|                        |   0.6   |  0.10  |    4  |
 | ||
|                        |   0.7   |  0.20  |    7  |
 | ||
|                        |   0.8   |  0.30  |   13  |
 | ||
|                        |   0.9   |  0.40  |   28  |
 | ||
|                        |   0.95  |  0.45  |   59  |
 | ||
|                        |   0.99  |  0.49  |  308  |
 | ||
|                        +---------+--------+-------+
 | ||
| 
 | ||
|    The last entry shows that even if the distribution is skewed 99% in
 | ||
|    favor of ones, the parity of a string of 308 samples will be within
 | ||
|    0.001 of a 50/50 distribution.
 | ||
| 
 | ||
| 5.2.2 Using Transition Mappings to De-Skew
 | ||
| 
 | ||
|    Another technique, originally due to von Neumann [VON NEUMANN], is to
 | ||
|    examine a bit stream as a sequence of non-overlapping pairs. You
 | ||
|    could then discard any 00 or 11 pairs found, interpret 01 as a 0 and
 | ||
|    10 as a 1.  Assume the probability of a 1 is 0.5+e and the
 | ||
|    probability of a 0 is 0.5-e where e is the eccentricity of the source
 | ||
|    and described in the previous section.  Then the probability of each
 | ||
|    pair is as follows:
 | ||
| 
 | ||
|             +------+-----------------------------------------+
 | ||
|             | pair |            probability                  |
 | ||
|             +------+-----------------------------------------+
 | ||
|             |  00  | (0.5 - e)^2          =  0.25 - e + e^2  |
 | ||
|             |  01  | (0.5 - e)*(0.5 + e)  =  0.25     - e^2  |
 | ||
|             |  10  | (0.5 + e)*(0.5 - e)  =  0.25     - e^2  |
 | ||
|             |  11  | (0.5 + e)^2          =  0.25 + e + e^2  |
 | ||
|             +------+-----------------------------------------+
 | ||
| 
 | ||
|    This technique will completely eliminate any bias but at the expense
 | ||
|    of taking an indeterminate number of input bits for any particular
 | ||
|    desired number of output bits.  The probability of any particular
 | ||
|    pair being discarded is 0.5 + 2e^2 so the expected number of input
 | ||
|    bits to produce X output bits is X/(0.25 - e^2).
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 12]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    This technique assumes that the bits are from a stream where each bit
 | ||
|    has the same probability of being a 0 or 1 as any other bit in the
 | ||
|    stream and that bits are not correlated, i.e., that the bits are
 | ||
|    identical independent distributions.  If alternate bits were from two
 | ||
|    correlated sources, for example, the above analysis breaks down.
 | ||
| 
 | ||
|    The above technique also provides another illustration of how a
 | ||
|    simple statistical analysis can mislead if one is not always on the
 | ||
|    lookout for patterns that could be exploited by an adversary.  If the
 | ||
|    algorithm were mis-read slightly so that overlapping successive bits
 | ||
|    pairs were used instead of non-overlapping pairs, the statistical
 | ||
|    analysis given is the same; however, instead of provided an unbiased
 | ||
|    uncorrelated series of random 1's and 0's, it instead produces a
 | ||
|    totally predictable sequence of exactly alternating 1's and 0's.
 | ||
| 
 | ||
| 5.2.3 Using FFT to De-Skew
 | ||
| 
 | ||
|    When real world data consists of strongly biased or correlated bits,
 | ||
|    it may still contain useful amounts of randomness.  This randomness
 | ||
|    can be extracted through use of the discrete Fourier transform or its
 | ||
|    optimized variant, the FFT.
 | ||
| 
 | ||
|    Using the Fourier transform of the data, strong correlations can be
 | ||
|    discarded.  If adequate data is processed and remaining correlations
 | ||
|    decay, spectral lines approaching statistical independence and
 | ||
|    normally distributed randomness can be produced [BRILLINGER].
 | ||
| 
 | ||
| 5.2.4 Using Compression to De-Skew
 | ||
| 
 | ||
|    Reversible compression techniques also provide a crude method of de-
 | ||
|    skewing a skewed bit stream.  This follows directly from the
 | ||
|    definition of reversible compression and the formula in Section 2
 | ||
|    above for the amount of information in a sequence.  Since the
 | ||
|    compression is reversible, the same amount of information must be
 | ||
|    present in the shorter output than was present in the longer input.
 | ||
|    By the Shannon information equation, this is only possible if, on
 | ||
|    average, the probabilities of the different shorter sequences are
 | ||
|    more uniformly distributed than were the probabilities of the longer
 | ||
|    sequences.  Thus the shorter sequences are de-skewed relative to the
 | ||
|    input.
 | ||
| 
 | ||
|    However, many compression techniques add a somewhat predicatable
 | ||
|    preface to their output stream and may insert such a sequence again
 | ||
|    periodically in their output or otherwise introduce subtle patterns
 | ||
|    of their own.  They should be considered only a rough technique
 | ||
|    compared with those described above or in Section 6.1.2.  At a
 | ||
|    minimum, the beginning of the compressed sequence should be skipped
 | ||
|    and only later bits used for applications requiring random bits.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 13]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| 5.3 Existing Hardware Can Be Used For Randomness
 | ||
| 
 | ||
|    As described below, many computers come with hardware that can, with
 | ||
|    care, be used to generate truly random quantities.
 | ||
| 
 | ||
| 5.3.1 Using Existing Sound/Video Input
 | ||
| 
 | ||
|    Increasingly computers are being built with inputs that digitize some
 | ||
|    real world analog source, such as sound from a microphone or video
 | ||
|    input from a camera.  Under appropriate circumstances, such input can
 | ||
|    provide reasonably high quality random bits.  The "input" from a
 | ||
|    sound digitizer with no source plugged in or a camera with the lens
 | ||
|    cap on, if the system has enough gain to detect anything, is
 | ||
|    essentially thermal noise.
 | ||
| 
 | ||
|    For example, on a SPARCstation, one can read from the /dev/audio
 | ||
|    device with nothing plugged into the microphone jack.  Such data is
 | ||
|    essentially random noise although it should not be trusted without
 | ||
|    some checking in case of hardware failure.  It will, in any case,
 | ||
|    need to be de-skewed as described elsewhere.
 | ||
| 
 | ||
|    Combining this with compression to de-skew one can, in UNIXese,
 | ||
|    generate a huge amount of medium quality random data by doing
 | ||
| 
 | ||
|         cat /dev/audio | compress - >random-bits-file
 | ||
| 
 | ||
| 5.3.2 Using Existing Disk Drives
 | ||
| 
 | ||
|    Disk drives have small random fluctuations in their rotational speed
 | ||
|    due to chaotic air turbulence [DAVIS].  By adding low level disk seek
 | ||
|    time instrumentation to a system, a series of measurements can be
 | ||
|    obtained that include this randomness. Such data is usually highly
 | ||
|    correlated so that significant processing is needed, including FFT
 | ||
|    (see section 5.2.3).  Nevertheless experimentation has shown that,
 | ||
|    with such processing, disk drives easily produce 100 bits a minute or
 | ||
|    more of excellent random data.
 | ||
| 
 | ||
|    Partly offsetting this need for processing is the fact that disk
 | ||
|    drive failure will normally be rapidly noticed.  Thus, problems with
 | ||
|    this method of random number generation due to hardware failure are
 | ||
|    very unlikely.
 | ||
| 
 | ||
| 6. Recommended Non-Hardware Strategy
 | ||
| 
 | ||
|    What is the best overall strategy for meeting the requirement for
 | ||
|    unguessable random numbers in the absence of a reliable hardware
 | ||
|    source?  It is to obtain random input from a large number of
 | ||
|    uncorrelated sources and to mix them with a strong mixing function.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 14]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    Such a function will preserve the randomness present in any of the
 | ||
|    sources even if other quantities being combined are fixed or easily
 | ||
|    guessable.  This may be advisable even with a good hardware source as
 | ||
|    hardware can also fail, though this should be weighed against any
 | ||
|    increase in the chance of overall failure due to added software
 | ||
|    complexity.
 | ||
| 
 | ||
| 6.1 Mixing Functions
 | ||
| 
 | ||
|    A strong mixing function is one which combines two or more inputs and
 | ||
|    produces an output where each output bit is a different complex non-
 | ||
|    linear function of all the input bits.  On average, changing any
 | ||
|    input bit will change about half the output bits.  But because the
 | ||
|    relationship is complex and non-linear, no particular output bit is
 | ||
|    guaranteed to change when any particular input bit is changed.
 | ||
| 
 | ||
|    Consider the problem of converting a stream of bits that is skewed
 | ||
|    towards 0 or 1 to a shorter stream which is more random, as discussed
 | ||
|    in Section 5.2 above.  This is simply another case where a strong
 | ||
|    mixing function is desired, mixing the input bits to produce a
 | ||
|    smaller number of output bits.  The technique given in Section 5.2.1
 | ||
|    of using the parity of a number of bits is simply the result of
 | ||
|    successively Exclusive Or'ing them which is examined as a trivial
 | ||
|    mixing function immediately below.  Use of stronger mixing functions
 | ||
|    to extract more of the randomness in a stream of skewed bits is
 | ||
|    examined in Section 6.1.2.
 | ||
| 
 | ||
| 6.1.1 A Trivial Mixing Function
 | ||
| 
 | ||
|    A trivial example for single bit inputs is the Exclusive Or function,
 | ||
|    which is equivalent to addition without carry, as show in the table
 | ||
|    below.  This is a degenerate case in which the one output bit always
 | ||
|    changes for a change in either input bit.  But, despite its
 | ||
|    simplicity, it will still provide a useful illustration.
 | ||
| 
 | ||
|                    +-----------+-----------+----------+
 | ||
|                    |  input 1  |  input 2  |  output  |
 | ||
|                    +-----------+-----------+----------+
 | ||
|                    |     0     |     0     |     0    |
 | ||
|                    |     0     |     1     |     1    |
 | ||
|                    |     1     |     0     |     1    |
 | ||
|                    |     1     |     1     |     0    |
 | ||
|                    +-----------+-----------+----------+
 | ||
| 
 | ||
|    If inputs 1 and 2 are uncorrelated and combined in this fashion then
 | ||
|    the output will be an even better (less skewed) random bit than the
 | ||
|    inputs.  If we assume an "eccentricity" e as defined in Section 5.2
 | ||
|    above, then the output eccentricity relates to the input eccentricity
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 15]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    as follows:
 | ||
| 
 | ||
|         e       = 2 * e        * e
 | ||
|          output        input 1    input 2
 | ||
| 
 | ||
|    Since e is never greater than 1/2, the eccentricity is always
 | ||
|    improved except in the case where at least one input is a totally
 | ||
|    skewed constant.  This is illustrated in the following table where
 | ||
|    the top and left side values are the two input eccentricities and the
 | ||
|    entries are the output eccentricity:
 | ||
| 
 | ||
|      +--------+--------+--------+--------+--------+--------+--------+
 | ||
|      |    e   |  0.00  |  0.10  |  0.20  |  0.30  |  0.40  |  0.50  |
 | ||
|      +--------+--------+--------+--------+--------+--------+--------+
 | ||
|      |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |
 | ||
|      |  0.10  |  0.00  |  0.02  |  0.04  |  0.06  |  0.08  |  0.10  |
 | ||
|      |  0.20  |  0.00  |  0.04  |  0.08  |  0.12  |  0.16  |  0.20  |
 | ||
|      |  0.30  |  0.00  |  0.06  |  0.12  |  0.18  |  0.24  |  0.30  |
 | ||
|      |  0.40  |  0.00  |  0.08  |  0.16  |  0.24  |  0.32  |  0.40  |
 | ||
|      |  0.50  |  0.00  |  0.10  |  0.20  |  0.30  |  0.40  |  0.50  |
 | ||
|      +--------+--------+--------+--------+--------+--------+--------+
 | ||
| 
 | ||
|    However, keep in mind that the above calculations assume that the
 | ||
|    inputs are not correlated.  If the inputs were, say, the parity of
 | ||
|    the number of minutes from midnight on two clocks accurate to a few
 | ||
|    seconds, then each might appear random if sampled at random intervals
 | ||
|    much longer than a minute.  Yet if they were both sampled and
 | ||
|    combined with xor, the result would be zero most of the time.
 | ||
| 
 | ||
| 6.1.2 Stronger Mixing Functions
 | ||
| 
 | ||
|    The US Government Data Encryption Standard [DES] is an example of a
 | ||
|    strong mixing function for multiple bit quantities.  It takes up to
 | ||
|    120 bits of input (64 bits of "data" and 56 bits of "key") and
 | ||
|    produces 64 bits of output each of which is dependent on a complex
 | ||
|    non-linear function of all input bits.  Other strong encryption
 | ||
|    functions with this characteristic can also be used by considering
 | ||
|    them to mix all of their key and data input bits.
 | ||
| 
 | ||
|    Another good family of mixing functions are the "message digest" or
 | ||
|    hashing functions such as The US Government Secure Hash Standard
 | ||
|    [SHS] and the MD2, MD4, MD5 [MD2, MD4, MD5] series.  These functions
 | ||
|    all take an arbitrary amount of input and produce an output mixing
 | ||
|    all the input bits. The MD* series produce 128 bits of output and SHS
 | ||
|    produces 160 bits.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 16]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    Although the message digest functions are designed for variable
 | ||
|    amounts of input, DES and other encryption functions can also be used
 | ||
|    to combine any number of inputs.  If 64 bits of output is adequate,
 | ||
|    the inputs can be packed into a 64 bit data quantity and successive
 | ||
|    56 bit keys, padding with zeros if needed, which are then used to
 | ||
|    successively encrypt using DES in Electronic Codebook Mode [DES
 | ||
|    MODES].  If more than 64 bits of output are needed, use more complex
 | ||
|    mixing.  For example, if inputs are packed into three quantities, A,
 | ||
|    B, and C, use DES to encrypt A with B as a key and then with C as a
 | ||
|    key to produce the 1st part of the output, then encrypt B with C and
 | ||
|    then A for more output and, if necessary, encrypt C with A and then B
 | ||
|    for yet more output.  Still more output can be produced by reversing
 | ||
|    the order of the keys given above to stretch things. The same can be
 | ||
|    done with the hash functions by hashing various subsets of the input
 | ||
|    data to produce multiple outputs.  But keep in mind that it is
 | ||
|    impossible to get more bits of "randomness" out than are put in.
 | ||
| 
 | ||
|    An example of using a strong mixing function would be to reconsider
 | ||
|    the case of a string of 308 bits each of which is biased 99% towards
 | ||
|    zero.  The parity technique given in Section 5.2.1 above reduced this
 | ||
|    to one bit with only a 1/1000 deviance from being equally likely a
 | ||
|    zero or one.  But, applying the equation for information given in
 | ||
|    Section 2, this 308 bit sequence has 5 bits of information in it.
 | ||
|    Thus hashing it with SHS or MD5 and taking the bottom 5 bits of the
 | ||
|    result would yield 5 unbiased random bits as opposed to the single
 | ||
|    bit given by calculating the parity of the string.
 | ||
| 
 | ||
| 6.1.3 Diffie-Hellman as a Mixing Function
 | ||
| 
 | ||
|    Diffie-Hellman exponential key exchange is a technique that yields a
 | ||
|    shared secret between two parties that can be made computationally
 | ||
|    infeasible for a third party to determine even if they can observe
 | ||
|    all the messages between the two communicating parties.  This shared
 | ||
|    secret is a mixture of initial quantities generated by each of them
 | ||
|    [D-H].  If these initial quantities are random, then the shared
 | ||
|    secret contains the combined randomness of them both, assuming they
 | ||
|    are uncorrelated.
 | ||
| 
 | ||
| 6.1.4 Using a Mixing Function to Stretch Random Bits
 | ||
| 
 | ||
|    While it is not necessary for a mixing function to produce the same
 | ||
|    or fewer bits than its inputs, mixing bits cannot "stretch" the
 | ||
|    amount of random unpredictability present in the inputs.  Thus four
 | ||
|    inputs of 32 bits each where there is 12 bits worth of
 | ||
|    unpredicatability (such as 4,096 equally probable values) in each
 | ||
|    input cannot produce more than 48 bits worth of unpredictable output.
 | ||
|    The output can be expanded to hundreds or thousands of bits by, for
 | ||
|    example, mixing with successive integers, but the clever adversary's
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 17]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    search space is still 2^48 possibilities.  Furthermore, mixing to
 | ||
|    fewer bits than are input will tend to strengthen the randomness of
 | ||
|    the output the way using Exclusive Or to produce one bit from two did
 | ||
|    above.
 | ||
| 
 | ||
|    The last table in Section 6.1.1 shows that mixing a random bit with a
 | ||
|    constant bit with Exclusive Or will produce a random bit.  While this
 | ||
|    is true, it does not provide a way to "stretch" one random bit into
 | ||
|    more than one.  If, for example, a random bit is mixed with a 0 and
 | ||
|    then with a 1, this produces a two bit sequence but it will always be
 | ||
|    either 01 or 10.  Since there are only two possible values, there is
 | ||
|    still only the one bit of original randomness.
 | ||
| 
 | ||
| 6.1.5 Other Factors in Choosing a Mixing Function
 | ||
| 
 | ||
|    For local use, DES has the advantages that it has been widely tested
 | ||
|    for flaws, is widely documented, and is widely implemented with
 | ||
|    hardware and software implementations available all over the world
 | ||
|    including source code available by anonymous FTP.  The SHS and MD*
 | ||
|    family are younger algorithms which have been less tested but there
 | ||
|    is no particular reason to believe they are flawed.  Both MD5 and SHS
 | ||
|    were derived from the earlier MD4 algorithm.  They all have source
 | ||
|    code available by anonymous FTP [SHS, MD2, MD4, MD5].
 | ||
| 
 | ||
|    DES and SHS have been vouched for the the US National Security Agency
 | ||
|    (NSA) on the basis of criteria that primarily remain secret.  While
 | ||
|    this is the cause of much speculation and doubt, investigation of DES
 | ||
|    over the years has indicated that NSA involvement in modifications to
 | ||
|    its design, which originated with IBM, was primarily to strengthen
 | ||
|    it.  No concealed or special weakness has been found in DES.  It is
 | ||
|    almost certain that the NSA modification to MD4 to produce the SHS
 | ||
|    similarly strengthened the algorithm, possibly against threats not
 | ||
|    yet known in the public cryptographic community.
 | ||
| 
 | ||
|    DES, SHS, MD4, and MD5 are royalty free for all purposes.  MD2 has
 | ||
|    been freely licensed only for non-profit use in connection with
 | ||
|    Privacy Enhanced Mail [PEM].  Between the MD* algorithms, some people
 | ||
|    believe that, as with "Goldilocks and the Three Bears", MD2 is strong
 | ||
|    but too slow, MD4 is fast but too weak, and MD5 is just right.
 | ||
| 
 | ||
|    Another advantage of the MD* or similar hashing algorithms over
 | ||
|    encryption algorithms is that they are not subject to the same
 | ||
|    regulations imposed by the US Government prohibiting the unlicensed
 | ||
|    export or import of encryption/decryption software and hardware.  The
 | ||
|    same should be true of DES rigged to produce an irreversible hash
 | ||
|    code but most DES packages are oriented to reversible encryption.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 18]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| 6.2 Non-Hardware Sources of Randomness
 | ||
| 
 | ||
|    The best source of input for mixing would be a hardware randomness
 | ||
|    such as disk drive timing affected by air turbulence, audio input
 | ||
|    with thermal noise, or radioactive decay.  However, if that is not
 | ||
|    available there are other possibilities.  These include system
 | ||
|    clocks, system or input/output buffers, user/system/hardware/network
 | ||
|    serial numbers and/or addresses and timing, and user input.
 | ||
|    Unfortunately, any of these sources can produce limited or
 | ||
|    predicatable values under some circumstances.
 | ||
| 
 | ||
|    Some of the sources listed above would be quite strong on multi-user
 | ||
|    systems where, in essence, each user of the system is a source of
 | ||
|    randomness.  However, on a small single user system, such as a
 | ||
|    typical IBM PC or Apple Macintosh, it might be possible for an
 | ||
|    adversary to assemble a similar configuration.  This could give the
 | ||
|    adversary inputs to the mixing process that were sufficiently
 | ||
|    correlated to those used originally as to make exhaustive search
 | ||
|    practical.
 | ||
| 
 | ||
|    The use of multiple random inputs with a strong mixing function is
 | ||
|    recommended and can overcome weakness in any particular input.  For
 | ||
|    example, the timing and content of requested "random" user keystrokes
 | ||
|    can yield hundreds of random bits but conservative assumptions need
 | ||
|    to be made.  For example, assuming a few bits of randomness if the
 | ||
|    inter-keystroke interval is unique in the sequence up to that point
 | ||
|    and a similar assumption if the key hit is unique but assuming that
 | ||
|    no bits of randomness are present in the initial key value or if the
 | ||
|    timing or key value duplicate previous values.  The results of mixing
 | ||
|    these timings and characters typed could be further combined with
 | ||
|    clock values and other inputs.
 | ||
| 
 | ||
|    This strategy may make practical portable code to produce good random
 | ||
|    numbers for security even if some of the inputs are very weak on some
 | ||
|    of the target systems.  However, it may still fail against a high
 | ||
|    grade attack on small single user systems, especially if the
 | ||
|    adversary has ever been able to observe the generation process in the
 | ||
|    past.  A hardware based random source is still preferable.
 | ||
| 
 | ||
| 6.3 Cryptographically Strong Sequences
 | ||
| 
 | ||
|    In cases where a series of random quantities must be generated, an
 | ||
|    adversary may learn some values in the sequence.  In general, they
 | ||
|    should not be able to predict other values from the ones that they
 | ||
|    know.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 19]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    The correct technique is to start with a strong random seed, take
 | ||
|    cryptographically strong steps from that seed [CRYPTO2, CRYPTO3], and
 | ||
|    do not reveal the complete state of the generator in the sequence
 | ||
|    elements.  If each value in the sequence can be calculated in a fixed
 | ||
|    way from the previous value, then when any value is compromised, all
 | ||
|    future values can be determined.  This would be the case, for
 | ||
|    example, if each value were a constant function of the previously
 | ||
|    used values, even if the function were a very strong, non-invertible
 | ||
|    message digest function.
 | ||
| 
 | ||
|    It should be noted that if your technique for generating a sequence
 | ||
|    of key values is fast enough, it can trivially be used as the basis
 | ||
|    for a confidentiality system.  If two parties use the same sequence
 | ||
|    generating technique and start with the same seed material, they will
 | ||
|    generate identical sequences.  These could, for example, be xor'ed at
 | ||
|    one end with data being send, encrypting it, and xor'ed with this
 | ||
|    data as received, decrypting it due to the reversible properties of
 | ||
|    the xor operation.
 | ||
| 
 | ||
| 6.3.1 Traditional Strong Sequences
 | ||
| 
 | ||
|    A traditional way to achieve a strong sequence has been to have the
 | ||
|    values be produced by hashing the quantities produced by
 | ||
|    concatenating the seed with successive integers or the like and then
 | ||
|    mask the values obtained so as to limit the amount of generator state
 | ||
|    available to the adversary.
 | ||
| 
 | ||
|    It may also be possible to use an "encryption" algorithm with a
 | ||
|    random key and seed value to encrypt and feedback some or all of the
 | ||
|    output encrypted value into the value to be encrypted for the next
 | ||
|    iteration.  Appropriate feedback techniques will usually be
 | ||
|    recommended with the encryption algorithm.  An example is shown below
 | ||
|    where shifting and masking are used to combine the cypher output
 | ||
|    feedback.  This type of feedback is recommended by the US Government
 | ||
|    in connection with DES [DES MODES].
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 20]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|       +---------------+
 | ||
|       |       V       |
 | ||
|       |  |     n      |
 | ||
|       +--+------------+
 | ||
|             |      |           +---------+
 | ||
|             |      +---------> |         |      +-----+
 | ||
|          +--+                  | Encrypt | <--- | Key |
 | ||
|          |           +-------- |         |      +-----+
 | ||
|          |           |         +---------+
 | ||
|          V           V
 | ||
|       +------------+--+
 | ||
|       |      V     |  |
 | ||
|       |       n+1     |
 | ||
|       +---------------+
 | ||
| 
 | ||
|    Note that if a shift of one is used, this is the same as the shift
 | ||
|    register technique described in Section 3 above but with the all
 | ||
|    important difference that the feedback is determined by a complex
 | ||
|    non-linear function of all bits rather than a simple linear or
 | ||
|    polynomial combination of output from a few bit position taps.
 | ||
| 
 | ||
|    It has been shown by Donald W. Davies that this sort of shifted
 | ||
|    partial output feedback significantly weakens an algorithm compared
 | ||
|    will feeding all of the output bits back as input.  In particular,
 | ||
|    for DES, repeated encrypting a full 64 bit quantity will give an
 | ||
|    expected repeat in about 2^63 iterations.  Feeding back anything less
 | ||
|    than 64 (and more than 0) bits will give an expected repeat in
 | ||
|    between 2**31 and 2**32 iterations!
 | ||
| 
 | ||
|    To predict values of a sequence from others when the sequence was
 | ||
|    generated by these techniques is equivalent to breaking the
 | ||
|    cryptosystem or inverting the "non-invertible" hashing involved with
 | ||
|    only partial information available.  The less information revealed
 | ||
|    each iteration, the harder it will be for an adversary to predict the
 | ||
|    sequence.  Thus it is best to use only one bit from each value.  It
 | ||
|    has been shown that in some cases this makes it impossible to break a
 | ||
|    system even when the cryptographic system is invertible and can be
 | ||
|    broken if all of each generated value was revealed.
 | ||
| 
 | ||
| 6.3.2 The Blum Blum Shub Sequence Generator
 | ||
| 
 | ||
|    Currently the generator which has the strongest public proof of
 | ||
|    strength is called the Blum Blum Shub generator after its inventors
 | ||
|    [BBS].  It is also very simple and is based on quadratic residues.
 | ||
|    It's only disadvantage is that is is computationally intensive
 | ||
|    compared with the traditional techniques give in 6.3.1 above.  This
 | ||
|    is not a serious draw back if it is used for moderately infrequent
 | ||
|    purposes, such as generating session keys.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 21]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    Simply choose two large prime numbers, say p and q, which both have
 | ||
|    the property that you get a remainder of 3 if you divide them by 4.
 | ||
|    Let n = p * q.  Then you choose a random number x relatively prime to
 | ||
|    n.  The initial seed for the generator and the method for calculating
 | ||
|    subsequent values are then
 | ||
| 
 | ||
|                    2
 | ||
|         s    =  ( x  )(Mod n)
 | ||
|          0
 | ||
| 
 | ||
|                    2
 | ||
|         s    = ( s   )(Mod n)
 | ||
|          i+1      i
 | ||
| 
 | ||
|    You must be careful to use only a few bits from the bottom of each s.
 | ||
|    It is always safe to use only the lowest order bit.  If you use no
 | ||
|    more than the
 | ||
| 
 | ||
|                   log  ( log  ( s  ) )
 | ||
|                      2      2    i
 | ||
| 
 | ||
|    low order bits, then predicting any additional bits from a sequence
 | ||
|    generated in this manner is provable as hard as factoring n.  As long
 | ||
|    as the initial x is secret, you can even make n public if you want.
 | ||
| 
 | ||
|    An intersting characteristic of this generator is that you can
 | ||
|    directly calculate any of the s values.  In particular
 | ||
| 
 | ||
|                      i
 | ||
|                ( ( 2  )(Mod (( p - 1 ) * ( q - 1 )) ) )
 | ||
|       s  = ( s                                          )(Mod n)
 | ||
|        i      0
 | ||
| 
 | ||
|    This means that in applications where many keys are generated in this
 | ||
|    fashion, it is not necessary to save them all.  Each key can be
 | ||
|    effectively indexed and recovered from that small index and the
 | ||
|    initial s and n.
 | ||
| 
 | ||
| 7. Key Generation Standards
 | ||
| 
 | ||
|    Several public standards are now in place for the generation of keys.
 | ||
|    Two of these are described below.  Both use DES but any equally
 | ||
|    strong or stronger mixing function could be substituted.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 22]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| 7.1 US DoD Recommendations for Password Generation
 | ||
| 
 | ||
|    The United States Department of Defense has specific recommendations
 | ||
|    for password generation [DoD].  They suggest using the US Data
 | ||
|    Encryption Standard [DES] in Output Feedback Mode [DES MODES] as
 | ||
|    follows:
 | ||
| 
 | ||
|         use an initialization vector determined from
 | ||
|              the system clock,
 | ||
|              system ID,
 | ||
|              user ID, and
 | ||
|              date and time;
 | ||
|         use a key determined from
 | ||
|              system interrupt registers,
 | ||
|              system status registers, and
 | ||
|              system counters; and,
 | ||
|         as plain text, use an external randomly generated 64 bit
 | ||
|         quantity such as 8 characters typed in by a system
 | ||
|         administrator.
 | ||
| 
 | ||
|    The password can then be calculated from the 64 bit "cipher text"
 | ||
|    generated in 64-bit Output Feedback Mode.  As many bits as are needed
 | ||
|    can be taken from these 64 bits and expanded into a pronounceable
 | ||
|    word, phrase, or other format if a human being needs to remember the
 | ||
|    password.
 | ||
| 
 | ||
| 7.2 X9.17 Key Generation
 | ||
| 
 | ||
|    The American National Standards Institute has specified a method for
 | ||
|    generating a sequence of keys as follows:
 | ||
| 
 | ||
|         s  is the initial 64 bit seed
 | ||
|          0
 | ||
| 
 | ||
|         g  is the sequence of generated 64 bit key quantities
 | ||
|          n
 | ||
| 
 | ||
|         k is a random key reserved for generating this key sequence
 | ||
| 
 | ||
|         t is the time at which a key is generated to as fine a resolution
 | ||
|             as is available (up to 64 bits).
 | ||
| 
 | ||
|         DES ( K, Q ) is the DES encryption of quantity Q with key K
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 23]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|         g    = DES ( k, DES ( k, t ) .xor. s  )
 | ||
|          n                                  n
 | ||
| 
 | ||
|         s    = DES ( k, DES ( k, t ) .xor. g  )
 | ||
|          n+1                                n
 | ||
| 
 | ||
|    If g sub n is to be used as a DES key, then every eighth bit should
 | ||
|    be adjusted for parity for that use but the entire 64 bit unmodified
 | ||
|    g should be used in calculating the next s.
 | ||
| 
 | ||
| 8. Examples of Randomness Required
 | ||
| 
 | ||
|    Below are two examples showing rough calculations of needed
 | ||
|    randomness for security.  The first is for moderate security
 | ||
|    passwords while the second assumes a need for a very high security
 | ||
|    cryptographic key.
 | ||
| 
 | ||
| 8.1  Password Generation
 | ||
| 
 | ||
|    Assume that user passwords change once a year and it is desired that
 | ||
|    the probability that an adversary could guess the password for a
 | ||
|    particular account be less than one in a thousand.  Further assume
 | ||
|    that sending a password to the system is the only way to try a
 | ||
|    password.  Then the crucial question is how often an adversary can
 | ||
|    try possibilities.  Assume that delays have been introduced into a
 | ||
|    system so that, at most, an adversary can make one password try every
 | ||
|    six seconds.  That's 600 per hour or about 15,000 per day or about
 | ||
|    5,000,000 tries in a year.  Assuming any sort of monitoring, it is
 | ||
|    unlikely someone could actually try continuously for a year.  In
 | ||
|    fact, even if log files are only checked monthly, 500,000 tries is
 | ||
|    more plausible before the attack is noticed and steps taken to change
 | ||
|    passwords and make it harder to try more passwords.
 | ||
| 
 | ||
|    To have a one in a thousand chance of guessing the password in
 | ||
|    500,000 tries implies a universe of at least 500,000,000 passwords or
 | ||
|    about 2^29.  Thus 29 bits of randomness are needed. This can probably
 | ||
|    be achieved using the US DoD recommended inputs for password
 | ||
|    generation as it has 8 inputs which probably average over 5 bits of
 | ||
|    randomness each (see section 7.1).  Using a list of 1000 words, the
 | ||
|    password could be expressed as a three word phrase (1,000,000,000
 | ||
|    possibilities) or, using case insensitive letters and digits, six
 | ||
|    would suffice ((26+10)^6 = 2,176,782,336 possibilities).
 | ||
| 
 | ||
|    For a higher security password, the number of bits required goes up.
 | ||
|    To decrease the probability by 1,000 requires increasing the universe
 | ||
|    of passwords by the same factor which adds about 10 bits.  Thus to
 | ||
|    have only a one in a million chance of a password being guessed under
 | ||
|    the above scenario would require 39 bits of randomness and a password
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 24]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    that was a four word phrase from a 1000 word list or eight
 | ||
|    letters/digits.  To go to a one in 10^9 chance, 49 bits of randomness
 | ||
|    are needed implying a five word phrase or ten letter/digit password.
 | ||
| 
 | ||
|    In a real system, of course, there are also other factors.  For
 | ||
|    example, the larger and harder to remember passwords are, the more
 | ||
|    likely users are to write them down resulting in an additional risk
 | ||
|    of compromise.
 | ||
| 
 | ||
| 8.2 A Very High Security Cryptographic Key
 | ||
| 
 | ||
|    Assume that a very high security key is needed for symmetric
 | ||
|    encryption / decryption between two parties.  Assume an adversary can
 | ||
|    observe communications and knows the algorithm being used.  Within
 | ||
|    the field of random possibilities, the adversary can try key values
 | ||
|    in hopes of finding the one in use.  Assume further that brute force
 | ||
|    trial of keys is the best the adversary can do.
 | ||
| 
 | ||
| 8.2.1 Effort per Key Trial
 | ||
| 
 | ||
|    How much effort will it take to try each key?  For very high security
 | ||
|    applications it is best to assume a low value of effort.  Even if it
 | ||
|    would clearly take tens of thousands of computer cycles or more to
 | ||
|    try a single key, there may be some pattern that enables huge blocks
 | ||
|    of key values to be tested with much less effort per key.  Thus it is
 | ||
|    probably best to assume no more than a couple hundred cycles per key.
 | ||
|    (There is no clear lower bound on this as computers operate in
 | ||
|    parallel on a number of bits and a poor encryption algorithm could
 | ||
|    allow many keys or even groups of keys to be tested in parallel.
 | ||
|    However, we need to assume some value and can hope that a reasonably
 | ||
|    strong algorithm has been chosen for our hypothetical high security
 | ||
|    task.)
 | ||
| 
 | ||
|    If the adversary can command a highly parallel processor or a large
 | ||
|    network of work stations, 2*10^10 cycles per second is probably a
 | ||
|    minimum assumption for availability today.  Looking forward just a
 | ||
|    couple years, there should be at least an order of magnitude
 | ||
|    improvement.  Thus assuming 10^9 keys could be checked per second or
 | ||
|    3.6*10^11 per hour or 6*10^13 per week or 2.4*10^14 per month is
 | ||
|    reasonable.  This implies a need for a minimum of 51 bits of
 | ||
|    randomness in keys to be sure they cannot be found in a month.  Even
 | ||
|    then it is possible that, a few years from now, a highly determined
 | ||
|    and resourceful adversary could break the key in 2 weeks (on average
 | ||
|    they need try only half the keys).
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 25]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| 8.2.2 Meet in the Middle Attacks
 | ||
| 
 | ||
|    If chosen or known plain text and the resulting encrypted text are
 | ||
|    available, a "meet in the middle" attack is possible if the structure
 | ||
|    of the encryption algorithm allows it.  (In a known plain text
 | ||
|    attack, the adversary knows all or part of the messages being
 | ||
|    encrypted, possibly some standard header or trailer fields.  In a
 | ||
|    chosen plain text attack, the adversary can force some chosen plain
 | ||
|    text to be encrypted, possibly by "leaking" an exciting text that
 | ||
|    would then be sent by the adversary over an encrypted channel.)
 | ||
| 
 | ||
|    An oversimplified explanation of the meet in the middle attack is as
 | ||
|    follows: the adversary can half-encrypt the known or chosen plain
 | ||
|    text with all possible first half-keys, sort the output, then half-
 | ||
|    decrypt the encoded text with all the second half-keys.  If a match
 | ||
|    is found, the full key can be assembled from the halves and used to
 | ||
|    decrypt other parts of the message or other messages.  At its best,
 | ||
|    this type of attack can halve the exponent of the work required by
 | ||
|    the adversary while adding a large but roughly constant factor of
 | ||
|    effort.  To be assured of safety against this, a doubling of the
 | ||
|    amount of randomness in the key to a minimum of 102 bits is required.
 | ||
| 
 | ||
|    The meet in the middle attack assumes that the cryptographic
 | ||
|    algorithm can be decomposed in this way but we can not rule that out
 | ||
|    without a deep knowledge of the algorithm.  Even if a basic algorithm
 | ||
|    is not subject to a meet in the middle attack, an attempt to produce
 | ||
|    a stronger algorithm by applying the basic algorithm twice (or two
 | ||
|    different algorithms sequentially) with different keys may gain less
 | ||
|    added security than would be expected.  Such a composite algorithm
 | ||
|    would be subject to a meet in the middle attack.
 | ||
| 
 | ||
|    Enormous resources may be required to mount a meet in the middle
 | ||
|    attack but they are probably within the range of the national
 | ||
|    security services of a major nation.  Essentially all nations spy on
 | ||
|    other nations government traffic and several nations are believed to
 | ||
|    spy on commercial traffic for economic advantage.
 | ||
| 
 | ||
| 8.2.3 Other Considerations
 | ||
| 
 | ||
|    Since we have not even considered the possibilities of special
 | ||
|    purpose code breaking hardware or just how much of a safety margin we
 | ||
|    want beyond our assumptions above, probably a good minimum for a very
 | ||
|    high security cryptographic key is 128 bits of randomness which
 | ||
|    implies a minimum key length of 128 bits.  If the two parties agree
 | ||
|    on a key by Diffie-Hellman exchange [D-H], then in principle only
 | ||
|    half of this randomness would have to be supplied by each party.
 | ||
|    However, there is probably some correlation between their random
 | ||
|    inputs so it is probably best to assume that each party needs to
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 26]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    provide at least 96 bits worth of randomness for very high security
 | ||
|    if Diffie-Hellman is used.
 | ||
| 
 | ||
|    This amount of randomness is beyond the limit of that in the inputs
 | ||
|    recommended by the US DoD for password generation and could require
 | ||
|    user typing timing, hardware random number generation, or other
 | ||
|    sources.
 | ||
| 
 | ||
|    It should be noted that key length calculations such at those above
 | ||
|    are controversial and depend on various assumptions about the
 | ||
|    cryptographic algorithms in use.  In some cases, a professional with
 | ||
|    a deep knowledge of code breaking techniques and of the strength of
 | ||
|    the algorithm in use could be satisfied with less than half of the
 | ||
|    key size derived above.
 | ||
| 
 | ||
| 9. Conclusion
 | ||
| 
 | ||
|    Generation of unguessable "random" secret quantities for security use
 | ||
|    is an essential but difficult task.
 | ||
| 
 | ||
|    We have shown that hardware techniques to produce such randomness
 | ||
|    would be relatively simple.  In particular, the volume and quality
 | ||
|    would not need to be high and existing computer hardware, such as
 | ||
|    disk drives, can be used.  Computational techniques are available to
 | ||
|    process low quality random quantities from multiple sources or a
 | ||
|    larger quantity of such low quality input from one source and produce
 | ||
|    a smaller quantity of higher quality, less predictable key material.
 | ||
|    In the absence of hardware sources of randomness, a variety of user
 | ||
|    and software sources can frequently be used instead with care;
 | ||
|    however, most modern systems already have hardware, such as disk
 | ||
|    drives or audio input, that could be used to produce high quality
 | ||
|    randomness.
 | ||
| 
 | ||
|    Once a sufficient quantity of high quality seed key material (a few
 | ||
|    hundred bits) is available, strong computational techniques are
 | ||
|    available to produce cryptographically strong sequences of
 | ||
|    unpredicatable quantities from this seed material.
 | ||
| 
 | ||
| 10. Security Considerations
 | ||
| 
 | ||
|    The entirety of this document concerns techniques and recommendations
 | ||
|    for generating unguessable "random" quantities for use as passwords,
 | ||
|    cryptographic keys, and similar security uses.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 27]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
| References
 | ||
| 
 | ||
|    [ASYMMETRIC] - Secure Communications and Asymmetric Cryptosystems,
 | ||
|    edited by Gustavus J. Simmons, AAAS Selected Symposium 69, Westview
 | ||
|    Press, Inc.
 | ||
| 
 | ||
|    [BBS] - A Simple Unpredictable Pseudo-Random Number Generator, SIAM
 | ||
|    Journal on Computing, v. 15, n. 2, 1986, L. Blum, M. Blum, & M. Shub.
 | ||
| 
 | ||
|    [BRILLINGER] - Time Series: Data Analysis and Theory, Holden-Day,
 | ||
|    1981, David Brillinger.
 | ||
| 
 | ||
|    [CRC] - C.R.C. Standard Mathematical Tables, Chemical Rubber
 | ||
|    Publishing Company.
 | ||
| 
 | ||
|    [CRYPTO1] - Cryptography: A Primer, A Wiley-Interscience Publication,
 | ||
|    John Wiley & Sons, 1981, Alan G. Konheim.
 | ||
| 
 | ||
|    [CRYPTO2] - Cryptography:  A New Dimension in Computer Data Security,
 | ||
|    A Wiley-Interscience Publication, John Wiley & Sons, 1982, Carl H.
 | ||
|    Meyer & Stephen M. Matyas.
 | ||
| 
 | ||
|    [CRYPTO3] - Applied Cryptography: Protocols, Algorithms, and Source
 | ||
|    Code in C, John Wiley & Sons, 1994, Bruce Schneier.
 | ||
| 
 | ||
|    [DAVIS] - Cryptographic Randomness from Air Turbulence in Disk
 | ||
|    Drives, Advances in Cryptology - Crypto '94, Springer-Verlag Lecture
 | ||
|    Notes in Computer Science #839, 1984, Don Davis, Ross Ihaka, and
 | ||
|    Philip Fenstermacher.
 | ||
| 
 | ||
|    [DES] -  Data Encryption Standard, United States of America,
 | ||
|    Department of Commerce, National Institute of Standards and
 | ||
|    Technology, Federal Information Processing Standard (FIPS) 46-1.
 | ||
|    - Data Encryption Algorithm, American National Standards Institute,
 | ||
|    ANSI X3.92-1981.
 | ||
|    (See also FIPS 112, Password Usage, which includes FORTRAN code for
 | ||
|    performing DES.)
 | ||
| 
 | ||
|    [DES MODES] - DES Modes of Operation, United States of America,
 | ||
|    Department of Commerce, National Institute of Standards and
 | ||
|    Technology, Federal Information Processing Standard (FIPS) 81.
 | ||
|    - Data Encryption Algorithm - Modes of Operation, American National
 | ||
|    Standards Institute, ANSI X3.106-1983.
 | ||
| 
 | ||
|    [D-H] - New Directions in Cryptography, IEEE Transactions on
 | ||
|    Information Technology, November, 1976, Whitfield Diffie and Martin
 | ||
|    E. Hellman.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 28]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    [DoD] - Password Management Guideline, United States of America,
 | ||
|    Department of Defense, Computer Security Center, CSC-STD-002-85.
 | ||
|    (See also FIPS 112, Password Usage, which incorporates CSC-STD-002-85
 | ||
|    as one of its appendices.)
 | ||
| 
 | ||
|    [GIFFORD] - Natural Random Number, MIT/LCS/TM-371, September 1988,
 | ||
|    David K. Gifford
 | ||
| 
 | ||
|    [KNUTH] - The Art of Computer Programming, Volume 2: Seminumerical
 | ||
|    Algorithms, Chapter 3: Random Numbers. Addison Wesley Publishing
 | ||
|    Company, Second Edition 1982, Donald E. Knuth.
 | ||
| 
 | ||
|    [KRAWCZYK] - How to Predict Congruential Generators, Journal of
 | ||
|    Algorithms, V. 13, N. 4, December 1992, H. Krawczyk
 | ||
| 
 | ||
|    [MD2] - The MD2 Message-Digest Algorithm, RFC1319, April 1992, B.
 | ||
|    Kaliski
 | ||
|    [MD4] - The MD4 Message-Digest Algorithm, RFC1320, April 1992, R.
 | ||
|    Rivest
 | ||
|    [MD5] - The MD5 Message-Digest Algorithm, RFC1321, April 1992, R.
 | ||
|    Rivest
 | ||
| 
 | ||
|    [PEM] - RFCs 1421 through 1424:
 | ||
|    - RFC 1424, Privacy Enhancement for Internet Electronic Mail: Part
 | ||
|    IV: Key Certification and Related Services, 02/10/1993, B. Kaliski
 | ||
|    - RFC 1423, Privacy Enhancement for Internet Electronic Mail: Part
 | ||
|    III: Algorithms, Modes, and Identifiers, 02/10/1993, D. Balenson
 | ||
|    - RFC 1422, Privacy Enhancement for Internet Electronic Mail: Part
 | ||
|    II: Certificate-Based Key Management, 02/10/1993, S. Kent
 | ||
|    - RFC 1421, Privacy Enhancement for Internet Electronic Mail: Part I:
 | ||
|    Message Encryption and Authentication Procedures, 02/10/1993, J. Linn
 | ||
| 
 | ||
|    [SHANNON] - The Mathematical Theory of Communication, University of
 | ||
|    Illinois Press, 1963, Claude E. Shannon.  (originally from:  Bell
 | ||
|    System Technical Journal, July and October 1948)
 | ||
| 
 | ||
|    [SHIFT1] - Shift Register Sequences, Aegean Park Press, Revised
 | ||
|    Edition 1982, Solomon W. Golomb.
 | ||
| 
 | ||
|    [SHIFT2] - Cryptanalysis of Shift-Register Generated Stream Cypher
 | ||
|    Systems, Aegean Park Press, 1984, Wayne G. Barker.
 | ||
| 
 | ||
|    [SHS] - Secure Hash Standard, United States of American, National
 | ||
|    Institute of Science and Technology, Federal Information Processing
 | ||
|    Standard (FIPS) 180, April 1993.
 | ||
| 
 | ||
|    [STERN] - Secret Linear Congruential Generators are not
 | ||
|    Cryptograhically Secure, Proceedings of IEEE STOC, 1987, J. Stern.
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 29]
 | ||
| 
 | ||
| RFC 1750        Randomness Recommendations for Security    December 1994
 | ||
| 
 | ||
| 
 | ||
|    [VON NEUMANN] - Various techniques used in connection with random
 | ||
|    digits, von Neumann's Collected Works, Vol. 5, Pergamon Press, 1963,
 | ||
|    J. von Neumann.
 | ||
| 
 | ||
| Authors' Addresses
 | ||
| 
 | ||
|    Donald E. Eastlake 3rd
 | ||
|    Digital Equipment Corporation
 | ||
|    550 King Street, LKG2-1/BB3
 | ||
|    Littleton, MA 01460
 | ||
| 
 | ||
|    Phone:   +1 508 486 6577(w)  +1 508 287 4877(h)
 | ||
|    EMail:   dee@lkg.dec.com
 | ||
| 
 | ||
| 
 | ||
|    Stephen D. Crocker
 | ||
|    CyberCash Inc.
 | ||
|    2086 Hunters Crest Way
 | ||
|    Vienna, VA 22181
 | ||
| 
 | ||
|    Phone:   +1 703-620-1222(w)  +1 703-391-2651 (fax)
 | ||
|    EMail:   crocker@cybercash.com
 | ||
| 
 | ||
| 
 | ||
|    Jeffrey I. Schiller
 | ||
|    Massachusetts Institute of Technology
 | ||
|    77 Massachusetts Avenue
 | ||
|    Cambridge, MA 02139
 | ||
| 
 | ||
|    Phone:   +1 617 253 0161(w)
 | ||
|    EMail:   jis@mit.edu
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| 
 | ||
| Eastlake, Crocker & Schiller                                   [Page 30]
 | ||
| 
 |