Skip to content
switham edited this page Jan 29, 2014 · 47 revisions

The treeprng module generates sets of random numbers repeatably, that is, such that all of a set, or any needed parts of a set, can be regenerated exactly later. Applications include

  • Software testing
  • Games with virtual worlds
  • Monte Carlo simulations
  • Computer graphics and art

WARNING:You may have noticed that this code calls a cryptographic hash function. That does NOT mean this code is secure. treeprng is NOT a cryptographic pseudorandom number generator. Do NOT use it in a crypto or security-sensitive application.

treeprng provides a systematic way to seed a pseudorandom number generator (PRNG) according to the logical structure of a task. A TreePRNG object is a virtual tree of nested Python dictionaries, matching the task, parts of the task, parts of parts, and so forth, down to the individual random numbers. The programmer gives addresses to the places in the task space, as if laying out directories and files in a filesystem or URLs on a web server.

Quick Start
Keys
Random Methods
Recommendations

Seeding the Top of the Tree
Lists
Cross-Products
Don't Mix treeprng with random

The TreePRNG Life Cycle

Why the Dict and Spent States?

Other Methods

sequence()
hash()

Quick Start

from treeprng import TreePRNG

story = TreePRNG()["The Lord of the Rings"]
frodo = story["hobbits"]["Frodo"]
hair_colors = ["blond", "red", "brown", "black", "gray", "white"]
hair_color = frodo["hair_color"].choice(hair_colors)
print "Frodo has", hair_color, "hair."

This illustrates the basics of TreePRNG: every random number has a path of dictionary keys down the tree. In this case the path to our result is the sequence of keys "The Lord of the Rings" -> "hobbits" -> "Frodo" -> "hair_color". Keys can be strings, numbers, or tuples. See the Keys section.

At the end of this path we didn't get a random number per se, but used the choice() method. All of the randomness-making methods of the Python random module are allowed. See the section on Random Methods.

We saved two TreePRNG objects in variables, story and frodo, representing subtrees. The "hobbits" subtree was used on the way to "Frodo", but not saved. A subtree can be accessed through a saved TreePRNG object, or by starting at a higher subtree and following the path between them.

Keys

Keys can be strings, ints, longs, floats, True, False, None, or tuples or lists, or nested tuples or lists of those types. As with the Python hash() function, numeric values that compare equal have the same effect (even if they are of different types, as is the case for 1 and 1.0). (Complex numbers should be handled, see this issue.)

Random Methods

All of the methods of the random.Random class that produce random numbers or random effects are available as methods of TreePRNG objects, including .getrandbits(), .choice(), .shuffle(), and .sample().

Methods that have to do with random.Random's way of seeding are disabled: .seed(), .jumpahead(), .getstate() and .setstate().

The latter restriction doesn't apply to the random.Random instance returned by .sequence().

Recommendations

Seeding the Top of the Tree

The top keys of the tree should be a combination of things like

  • A likely-unique application name, for instance in Java's com.mac-guyver.switham.games.LOTR style.
  • an ID for the version of the indexing scheme, in case you change it.
  • a user ID if relevant.
  • a long, truly-random seed for a given run (to be saved in order to re-run or continue a run).
  • a "task id," for, say, parallel runs of distributed computing applications. This should be a logical id, not an actual machine, process or thread ID, so that a task can be restarted in a different environment.

Lists

There are two ways to generate lists. The most flexible and repeatable is to follow the one-item-one-path philosophy, indexing a TreePRNG object with ints:

names_list = hobbits["names"]
n_hobbits = hobbits["how_many"].randrange(10, 20)
names = [generate_name(names_list[i]) for i in range(n_hobbits)]

This passes a TreePRNG to generate_names() for each item in the list, giving it a chance to use as many random numbers as it needs for each. It also allows individual items in the list to be accessed by index. The list could also be left entirely virtual, generating names (in this case) only when needed:

print "Greetings,", generate_name(names_list[i]) + "!"

The second way is to use a single PRNG created with the .sequence() method.

prng = hobbits["names"].sequence()
names = [generate_name(prng) for i in range(n_hobbits)]

With this method, the list must always be generated from the start, since each element depends on how many random numbers were used for the elements before it. For this reason, the .sequence() method is recommended only for long lists of uniform data that won't be accessed at random.

Ordinary multidimensional arrays can be created as lists of lists:

tictactoe = []
for y in range(3):
    row = [game["squares"][y][x].choice("XO.") for x in range(3)]
    tictactoe.append(row)

Cross-Products

If you want to make an array indexed by objects in completely different parts of the tree, you could invent a special section of the tree for every kind of pairing you might come up with. Hobbit-orc interactions, key-door interactions, sword-branch interactions, etc., then index each pair with the appropriate individual object keys. But TreePRNG provides a .hash() method to allow more direct object identification.

galadriel = elves["Galadriel"]
gandalf = wizards["Gandalf"]
gandalf_opinion_of_galadriel = gandalf["opinion_of"][galadriel.hash()]
galadriel_opinion_of_gandalf = galadriel["opinion_of"][gandalf.hash()]

This gives relationship information from two points of view. To create an address that is the same from both points of view, you can use a sorted tuple the two hashes:

def chemistry(story, a, b):
    return story["chemistry"][sorted( (a.hash(), b.hash()) )]

print chemistry(story, gandalf, galadriel).hash()
print chemistry(story, galadriel, gandalf).hash()

Don't Mix treeprng with random

A call to a random function looks right at home in code that's otherwise converted to treeprng. Not only that, it would work fine at first. That would be a slow-burning bug, be careful. A safeguard would be to remove any import for/of random, and unpack any import * from any module that imports from random.

The TreePRNG Life Cycle

You may have noticed in our examples, that some TreePRNG objects are used like dicts, but others are used like random.Random objects. A TreePRNG object can be used

  • any number of times as a dict, or
  • once with a random.Random method or the .sequence() method,

but not both.

A TreePRNG is created in an uncommitted state. If you use it as dict, it returns a new TreePRNG. The child is uncommitted, but the parent becomes committed to being a dict of TreePRNGs.

If you use an uncommitted TreePRNG by calling one of the Random methods, it does what you would expect that method to do, and then goes into a spent state, where it won't do anything else.

If you call the .sequence() method, it returns a seeded instance of (a subclass of) the random.Random class, which you can use as long as you like. But the TreePRNG object goes into the spent state.

Why the Dict and Spent States?

To try to catch code that would use the same set of random numbers for two different purposes, by mistake. Or would get different results by calling the same methods, but in different order.

TreePRNG wants you to give a different address (path of dict keys) to every single random number (actually, to every Random method call). So after setting up...

frodo = TreePRNG("Lord of the Rings")["hobbits"]["Frodo"]
hair_colors = ["brown", "black", "blond", "red", "gray", "white"]

...this works:

frodo_hair_color = frodo["hair_color"].choice(hair_colors)
frodo_height = frodo["height"].random() * 2.0 + 3.0

...and gives the same results as:

frodo_height = frodo["height"].random() * 2.0 + 3.0
frodo_hair_color = frodo["hair_color"].choice(hair_colors)

...but this is prevented by raising an exception:

frodo_hair_color = frodo.choice(hair_colors)
frodo_height = frodo.random() * 2.0 + 3.0

Other Methods

sequence()

The .sequence() method returns a PRNG that is seeded by the place in the tree. .sequence is intended to generate long, uniform sequences that don't need to be accessed randomly. For short and/or random-access lists, see Lists.

The type of PRNG can be specified with the optional prng_class parameter of the .sequence() method, or with the sequence_class parameter when initializing the TreePRNG tree.

The default prng_class is HashPRNG, a class internal to treeprng that is fast to seed but slower in generating long sequences. HashPRNG doesn't (as of this writing) support the .jumpahead(), .getstate() and .setstate() methods of random.Random. The random.Random class is slower to seed, but much quicker to generate long sequences, than HashPRNG.

hash()

hash() returns a large int to be used as a token representing a point in the tree. hash() commits the TreePRNG object to being a dict. The int is the output of the hash algorithm used in the tree. See Cross-Products for hash()'s intended use.


View or add comments on treeprng and its documentation here.

TreePRNG banner

--SteveWitham

Clone this wiki locally