Automatic Concordance File Creation

Wannabe technical writers search in vain for this pink unicorn. You see, the ugliest part of writing a technical book is creating the index. Normally this involves a lengthy read-through where you select each word with the mouse and do a bunch of manual clicking and keying to create hidden codes in the document file. These hidden codes are then processed when the user chooses “create index” and, in theory, a perfectly formatted index appears at that point in the document.

Let me be the first to tell you that your first three or four cuts at the index will be trash. If you are not a professional and do not keep a pile of scrap paper showing at what level and under what word you want each item to appear, your index will end up pretty badly mangled.

A concordance file is a magic thing. Each word processor has its own format, but they all use it the same way and most provide an editor to create the entries. All you have to do is get your list of words together and key them in. Sounds simple, right? Well, if the open source and lower-end word processors would bother to include a “unique word list” function, it would be. If you happen to be on a platform where TEA is available, then you have it made. This is one of the many tasks TEA was designed to help with.

For an editor developed by a journalist without any formal C++ training, TEA rocks. It has a bit of a quirky interface but, if you remember the old IBM DOS editors, it’s rather nostalgic. Simply save your document as a TXT file, open it with TEA, and choose:

Functions->Analyze->Extract words

You should end up in a new buffer which has one word per line and probably a lot of blank lines. Use <CTRL><A> or the menu to select all.

Functions->Sort->case insensitively

Use <CTRL><A> or the menu to select all.

Functions->Filter->Remove duplicates

Now, manually delete the words you do not want in your index and save your new TXT file.
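If TEA is not available on your platform, the same three steps (extract words, sort case insensitively, remove duplicates) can be roughly approximated with a few lines of Python. This is only a sketch, not what TEA does internally. The function name `unique_words`, the rule for what counts as a word (ASCII letters, digits, apostrophes and hyphens), and the choice to treat words differing only in case as duplicates are all my assumptions.

```python
import re
import sys


def unique_words(text):
    """Return the distinct words in text, sorted case insensitively.

    Assumption: a "word" starts with an ASCII letter or digit and may
    contain apostrophes and hyphens. Adjust the pattern for other scripts.
    """
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9'-]*", text)
    seen = set()
    unique = []
    # sorted() is stable, so among spellings that differ only in case
    # the one that appears first in the document is the one kept.
    for word in sorted(words, key=str.lower):
        key = word.lower()
        if key not in seen:
            seen.add(key)
            unique.append(word)
    return unique


# Hypothetical command line use: python unique_words.py book.txt words.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as source:
        result = unique_words(source.read())
    with open(sys.argv[2], "w", encoding="utf-8") as target:
        target.write("\n".join(result) + "\n")
```

The output file has one word per line, so the manual weeding described above works the same way on it.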

You can now use the concordance editor provided with your word processor to create one entry for each of these words. You will have to fudge a bit to create multiword index entries, but the bulk of it is done for you. A little work with the mouse and you will be all set. When you generate your index this time, it should really be the way you want it.