Python Script to Generate Frequency Counts of Words in a Text

The following Python script takes a text file as input and writes an unsorted list of word frequency counts to an output text file. It’s short and simple, and uses only the regular expressions module re, which is part of the standard library, so it will run on any system with a standard Python 2 installation (it relies on raw_input and the print >> syntax).

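from re import compile
# Lowercase the text, then collect tokens that end in a word character and may
# contain commas, periods and apostrophes (\x92 is the Windows-1252 curly apostrophe).
l=compile(r"([\w,.'\x92]*\w)").findall(open(raw_input('Input file: '),'r').read().lower())
f=open(raw_input('Output file: '),'w')
# Write one line per distinct token: the token, a tab, and its count (arbitrary order).
for word in set(l):
	print>>f, word, '\t', l.count(word)
f.close()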

Note that ‘words’ here doesn’t mean dictionary words (a script this small can’t check against a dictionary). Instead, a ‘word’ is whatever the regular expression matches: a run of letters, digits, and underscores that may also contain commas, periods, and apostrophes, as long as it ends in a letter, digit, or underscore. So if the text contains something like “365b1”, that’ll also be listed in the output.
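
To see what counts as a ‘word’, here is a small sketch that applies the same pattern to a made-up sentence. It is written for Python 3 rather than the Python 2 of the script above, so two things are assumptions on my part: the re.ASCII flag, which keeps \w limited to ASCII letters, digits, and underscores, and \u2019, which stands in for the original \x92 byte. The tokenize name is just for illustration.

```python
import re

# Python 3 adaptation of the script's pattern (an assumption, not the original code):
# re.ASCII keeps \w to [a-zA-Z0-9_]; \u2019 replaces the Windows-1252 byte \x92.
TOKEN = re.compile(r"[\w,.'\u2019]*\w", re.ASCII)

def tokenize(text):
    """Lowercase the text and return every token the pattern matches, in order."""
    return TOKEN.findall(text.lower())

print(tokenize("Room 365b1 isn't 1,000 sq. ft."))
# ['room', '365b1', "isn't", '1,000', 'sq', 'ft']
```

Trailing punctuation is dropped because every match has to end in a word character, while the comma in “1,000” and the apostrophe in “isn't” survive because they sit inside the token.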

Here’s an example.

Input file contains:

This is text. This is written with intentionally repeated words. This is repeated, intentionally, to produce short output. This text — with intentionally repeated words — is written to produce short text output.
This output is text. Intentionally short text.

Output file will contain:

short 			3
this 			5
text 			5
is 				5
repeated 		3
intentionally 	4
to 				2
written 		2
produce 		2
words 			2
output 			3
with 			2

The columns are tab-separated, so you can copy the output into a spreadsheet, which should detect the two columns and paste accordingly. Then you can draw frequency plots or whatever.
If you want, you can write a bit more code to sort the output alphabetically or by frequency count. I didn’t bother.
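
For anyone who does want sorted output, here is one possible sketch. It assumes Python 3 (unlike the script above) and swaps the repeated l.count() calls for collections.Counter. The names word_counts, sorted_rows, and write_counts are made up for this example, and the pattern is adapted as an assumption: re.ASCII keeps \w to ASCII, and \u2019 replaces the \x92 byte.

```python
import re
from collections import Counter

# Python 3 adaptation of the script's pattern (an assumption, not the original code).
TOKEN = re.compile(r"[\w,.'\u2019]*\w", re.ASCII)

def word_counts(text):
    """Count every lowercased token in the text."""
    return Counter(TOKEN.findall(text.lower()))

def sorted_rows(counts, by="frequency"):
    """Return (word, count) pairs sorted alphabetically or by descending count."""
    if by == "alphabetical":
        return sorted(counts.items())
    # Highest count first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def write_counts(counts, path, by="frequency"):
    """Write one 'word<TAB>count' line per distinct word to the given path."""
    with open(path, "w", encoding="utf-8") as out:
        for word, count in sorted_rows(counts, by):
            out.write(f"{word}\t{count}\n")

counts = word_counts("This is text. This is short text.")
print(sorted_rows(counts))
# [('is', 2), ('text', 2), ('this', 2), ('short', 1)]
print(sorted_rows(counts, by="alphabetical"))
# [('is', 2), ('short', 1), ('text', 2), ('this', 2)]
```

Counter goes through the word list once, so this should also be noticeably faster than calling count() for every distinct word on a long text.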

Engraved Text!

While working on the Inquivesta website, I decided that I wanted to get rid of as many of the elements of the original template as possible, and replace them with my own stuff. Among them were the buttons. I tried a few things. The first idea was the Mandelbrot set, in a pillow sort of bevel:

[Images: mandelbrot1, mandelbrot2, mandelbrot3]

But when I opened the page the next day, I didn’t like the end result. So after a lot of work, I finally came up with this:




Yes! ENGRAVED TEXT! Drool on it!

Photoshop is impossible. I can’t believe it was first released on a floppy.