Analysis

The problem I have chosen to solve is pseudorandom number generation (PRNG) using the LMC instruction set. I have chosen this problem because the LMC supports the two most basic mathematical operations- addition and subtraction of integers- which should be substantial enough to construct a basic PRNG algorithm. PRNG algorithms are incredibly important in computer security, being used to derive encryption and decryption keys and secret keys that are important to be unpredictable given data either side of the key or data encoded via the key.

A number of different algorithms are available for random number generation. The first is a more mathematical approach called Multiplicative Congruential Generators.

An LCG is an algorithm that applies a modulo function of divisor ‘m’ across a dividend consisting of a ‘seed’ that has been multiplied by a constant ‘a’. The value of this operation is then used as the next seed.

A varient of this known as a mixed congruential generator, where a certain constant ‘c’ is added on to the dividend. Both the mixed and multiplicative generators are examples of linear congruential generators (LCGs).

Another algorithm is called the middle-squares method. The middle-squares method works by squaring a number, and then stripping the least significant digit and as many most significant digits as required until you have a number the same digits as you started. This is then repeated on the output of the function to generate another number.

The method used fairly typically on Linux systems is ‘/dev/random’- this is a file that updates with random noise sampled from the real world, typically things like network timings, power fluctuations and other minute, inevitable and uncontrollable environmental variables.

Of these 3 methods, the LCG is most ideal for the LMC instruction set. Multiplication and remainder can both be performed using just addition and subtraction, whereas doing manipulation required for the middle-squares method will be considerably more difficult. The middle-squares method will also be very limited on the LMC due to the restrictions on the size of a single piece of data being low (-999 to 999). An implementation of /dev/random would not be possible, since the LMC does not allow access to monitoring required for noise collection.

The problem can be therefore broken down into 3 stages: multiplying the seed by the constant ‘a’, performing remainder division on the result of this and the constant ‘m’, and then putting the result of this back into the algorithm.

However, this first raises the question of what values to use for ‘m’ and ‘a’. Using a prime as ‘m’ is a good idea, because it allows every number less than ‘m’ to be selected given the right value of ‘a’. ‘a’ must be a primitive element of ‘m’ for the LCG to have this property. 997 is the largest prime number usable in the LCG, and 505 is the smallest primitive root of 997 that multiplies by any value (other than 1) to a value greater than 997. Therefore, our ‘m’ value will be 997 and our ‘a’ 505.

Given a seed of 1, we should expect a period of 996 values (every number appears once, then 1 appears again and the cycle repeats), in seemingly random order. This isn’t true randomness, and a clear flaw in this system is that numbers will only appear once per cycle, however this could be resolved using another remainder operation on the result to generate numbers inside a given range:

( (a * seed) MOD m ) MOD maximum

This would generate random numbers within the given range, but may express a bias

Any seed should yield a period of 996 values before the seed reappears. The order should be pseudorandom, and this could be tested by comparing the intervals of the numbers generated. Another more visual test however could be to map the results and watch for patterns in the path produced (the first result moves upward, the second across, the third upward and so on). This visual approach makes data that is practically random or non-random easily distinguishable.

It must be tested with different seeds to check the result always appears as random.

A successful random number generator will generate numbers with an unpredictable interval between the numbers and an unpredictable pattern for repetitions below 996 (or the period length)

Implementation

Implementation can be split into 3 parts of functionality: remainder, multiplication and then iteration.

A method to calculate the remainder involves subtracting a value until the result is non-positive, and then adding one of the result on:

DIVISOR ← INPUT

DIVIDEND ← INPUT

ASSERT DIVIDEND ≥ DIVISOR

diff ← DIVISOR

WHILE diff ≥ 0:

The next algorithm required is a method to multiply, since the LMC doesn’t natively implement this. Integer multiplication can be replicated by repeated addition of values, or the sum of a set of length ‘n’ containing only the value ‘m’ (equal n ✕ m). This can be replicated in pseudocode as below:

Seed | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |

Value | 505 | 13 | 518 | 21 | 526 | 34 | 539 | 42 | 547 | 55 | 560 | 63 |

Change | +504 | +11 | +515 | +17 | +521 | +28 | +532 | +36 | +538 | +45 | +549 | +51 |

Seed | 1 | 505 | 155 | 314 | 649 | 911 | 285 | 2 | 13 | 568 | 983 | 663 |

Value | 505 | 155 | 314 | 649 | 911 | 285 | 2 | 13 | 568 | 983 | 663 | 982 |

Change | +504 | -350 | +159 | 335 | 262 | -626 | -283 | +11 | +555 | +415 | -320 | +319 |

gap correlation by lagging the results from each other:

Two different random number LCG algorithms with ‘m’ values of 997 produce the following sequences:

1 | 505 | 155 | 314 | 649 | 911 | 285 | 2 | 13 | 568 | 983 | 663 | 982 |

1 | 987 | 161 | 155 | 143 | 120 | 12 | 927 | 160 | 154 | 144 | 120 | 12 |

The second algorithm is modified to produce the following sequence:

1 | 527 | 361 | 455 | 343 | 322 | 415 | 360 | 554 | 344 | 417 | 399 | 464 |

Part of the algorithm responsible for the top sequence looks like so:

Jump to the specified instruction if the accumulator is holding a positive value.