Nov 21, 2011

Random Name Generator

The graphic on the right shows the 100 most common male names in America, with the size of each name proportional to its popularity. The graphic is far more compelling as an interactive 3D document, but the picture conveys the main idea of how name popularity is distributed.
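As a rough illustration of "size proportional to popularity", here is a minimal Python sketch. It is not the program used for the graphic; the function name `font_sizes`, the `max_size` parameter, and the linear scaling rule are all assumptions made for the example.

```python
def font_sizes(counts, max_size=72.0):
    """Map each name to a font size that scales linearly with its count.

    The most popular name gets max_size; every other name gets a size
    in proportion to its share of that top count. (Assumed rule: the
    original graphic may have scaled names differently.)
    """
    top = max(counts.values())
    return {name: max_size * count / top for name, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts, made up for illustration only.
    example = {"A": 100, "B": 50, "C": 25}
    for name, size in font_sizes(example).items():
        print(name, size)
```

A real word-cloud layout would also have to place the names so they do not overlap, which this sketch leaves out.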

The difficulty in making an image like this, or anything like it, really isn't in programming it. The program used to generate it is very simple. The harder part is acquiring the data. In this case, I was able to get the data from Wolfram|Alpha in a computable format with very little work.

For exploratory programming, acquiring data to work with is probably one of the most important issues. It is not well addressed by most higher-level programming languages. They treat the data that the code acts on as a secondary concern rather than as an integral part of the language.

Once I have access to the data, I can do many things that would otherwise be difficult. For example, I can create a random name generator that returns realistic names in the same proportion I would likely find them in the real world.
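The idea behind such a generator is weighted random sampling: each name is drawn with probability equal to its share of the total count. The Python sketch below shows one way to do it with the standard library's `random.choices`. The `NAME_COUNTS` table is a made-up placeholder, not real popularity data, and `random_names` is a hypothetical helper name.

```python
import random

# Placeholder counts invented for illustration; real use would load
# actual name frequencies from a data source.
NAME_COUNTS = {"James": 50, "John": 30, "Robert": 15, "Michael": 5}


def random_names(counts, k=1, rng=None):
    """Draw k names, each with probability proportional to its count."""
    rng = rng or random.Random()
    names = list(counts)
    weights = [counts[name] for name in names]
    # random.choices samples with replacement using the given weights.
    return rng.choices(names, weights=weights, k=k)


if __name__ == "__main__":
    print(random_names(NAME_COUNTS, k=5))
```

Over many draws, the frequency of each returned name approaches its share of the total count, which is what makes the output look realistic.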

Although the internet has brought us a huge amount of data to work with, finding what you want in a computable format is still very difficult. Wasn't the semantic web supposed to fix that by assigning meaning to the data?