I would apologise for the long silence, but that would imply that I thought anyone was hanging on my every word and desperately missing me in my absence :)
Just a short one for now, which will hopefully be followed up on tomorrow.
The delightful Amy Buchanan-Hughes (who, if you didn't already know, is President of a most impressive charity) and I unintentionally fell into a nerdy word game over IM today. Like all the best quick nerd-wordy games, it evolved spontaneously and without explicit communication of the structure. The rules were simple: in turn, each player would state a five-letter word, which was a one-letter mutation of the previous player's word. So, if Amy had played "TASTE", then "WASTE" would be a legal play for me, as would "TASTY", but "TASER" would not. I don't know if she was playing by this rule as well, but I made a further limitation that I couldn't re-use a word that had already been played.
Being of a graph-theoretical bent, this got me thinking. What would the graph of all 5-letter words, with edges between 1-mutation-linked words, look like?
The answer, it seems, is; quite crowded.
(Direct link to image)
The image above was generated using the Scrabble-legal wordlist found here, some simple Python logic, and the excellent graphing facilities python-graph and Graphviz. It was whipped up reasonably quickly on my laptop, though I'm currently in the process of setting up my Raspberry Pi (did I mention they are awesome? Expect a lovingly ranty blog post soon) to generate graphs of varying configurations, and upload them to Dropbox automagically.
Some things that I learnt from this process:
- Release early, release often - sending an early draft of this picture to Amy and Laura, my ever-reliable fact-checkers, alerted me to the fact that my logic was initially suffering from an off-by-one index error, linking words to the word preceding the proper target.
- Python modules install *so* much easier on Linux than on Windows.
- ssh makes you feel like an uber-cool hacker.
- There are a hell of a lot of Scrabble-legal words that no-one in their right mind (except maybe Aaron or Abbie - for entirely differing reasons) would ever use.
- A glut of data can be a very bad thing without sensible structure and the ability to pare away extraneous data points to reveal patterns.
- Graphviz, so far, has thwarted my efforts to label nodes with graph-theoretic terms; "node", "edge", "digraph", etc. Bonus points to anyone who can spot the node in L5 above which is labelled '3099' instead of 'graph'!
- Incidentally, I named the graph structure here Ln as a mathematical in-joke. A common graph-theoretical object is Kn, the complete graph on n vertices (which, of course, you already know). I figured it would be appropriate, given the context of the graph setup, to change the letter by a single place :)
- Everything looks better with some colour:
(The above is the graph of all 4-letter words, with edges coloured by which letter is being changed - (red, blue, green, yellow), for (first, second, third, fourth). There's little semantic information to this colouration choice, but the monochrome mess was getting me down!)
Hopefully I'll blog again soon with some actual analysis of the graphs (if I can find anything interesting to say, other than "Look! Non-trivial components! This game may be very short-lived!") - or just some pretty pictures! Either way, setting up this process on my RPi is teaching me a lot, which can't be bad!
For anyone who is interested, here are the files I used/created. Nomenclature is as follows:
- Ln.dot (where n is a number) is the .dot file for the graph on n-letter words (.dot files are handled with the Graphviz suite, linked above)
- Ln.png is an image of the graph
- Lcn.dot and Lcn.png are as above, but with edges coloured by index of changed letter
- wordmap.py and wordmap.pyc are the Python files (uncompiled and compiled) that I used to create these beautiful monstrosities
Have fun, and make something more beautiful than I did!
No comments:
Post a Comment