Auto generate an Index

Do I really need to go over everything and add \index{}?

Unfortunately, yes. Even if you write a script to automate it, the best you will end up with is a concordance, and a concordance is not an index.

In my opinion it is actually better to postpone writing the index until the book is almost ready. Writing an index is an art, and most publishers employ human indexers to produce an index that is useful and serves its function.

Luckily, since you skipped it the first time round, this is a good time to give it some thought and planning before you delve into it.

The most important points to consider when developing an index are categorization and consistency. Think of the likely readers of your book (or even the older you, who will have forgotten what the younger you wrote) and provide headings that they are likely to use when searching for information. Consider, for example, a historical book describing early ships and their trade routes. Indexing a ship under its one-word name alone can be meaningless to the reader. Consider the following MWE:

\documentclass{article}
\usepackage{makeidx}
\makeindex
\DeclareRobustCommand{\ship}[1]{\textit{#1}\index{Steam ships!#1}}
\DeclareRobustCommand{\AUports}[1]{\textit{#1}\index{Australian ports!#1}}
\begin{document}
  One of the early steam ships to sail to \AUports{Melbourne} 
  was the \ship{Africa}. Its maiden trip was on the 1.1.1870 and 
  its last trip ten years later on the 13.12.1880. 
\printindex
\end{document}

I have used a heading to categorize the ship as a steam ship (you can add macros as necessary) and provided a second one to classify Melbourne as a port. Creating a number of commonly used categories around your topic gives you a good classification system and keeps the entries consistent. As you have probably noticed, the ship's name is typeset in italics to comply with the Oxford Style Guide; with a macro, both the indexing and the typesetting are done correctly and efficiently.
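
The same macro approach can carry more of the indexing conventions. The following is only a sketch, not part of the MWE above: it assumes you also want the ship's name italicised inside the index itself, and it adds a cross-reference from a made-up alternative heading, "Steamers", to the "Steam ships" category. It relies on the standard makeindex `sortkey@displayed text` syntax and on the `\see` command provided by makeidx.

```latex
% Typical build: run pdflatex, then makeindex on the generated .idx file,
% then pdflatex again so that \printindex picks up the sorted index.
\documentclass{article}
\usepackage{makeidx}
\makeindex
% Variant of \ship: "#1@\textit{#1}" asks makeindex to sort the entry
% under the plain name but to print it in italics in the index.
\DeclareRobustCommand{\ship}[1]{%
  \textit{#1}\index{Steam ships!#1@\textit{#1}}}
\begin{document}
  One of the early steam ships was the \ship{Africa}.
  % Hypothetical cross-reference: a reader who looks up "Steamers"
  % is pointed to the "Steam ships" heading instead of a page number.
  \index{Steamers|see{Steam ships}}
\printindex
\end{document}
```

Because every ship goes through the one macro, a later change of mind about the category name or the formatting only has to be made in a single place.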


I know this is an old thread, but I want to mention that I have just released a piece of software to address this problem. It isn't perfect, but as a "semi-automatic" solution it is a lot faster than going through the file with a text editor and adding \index{} tags one at a time. It contains one program that reads your LaTeX file and uses various heuristics to suggest terms that should be in an index, and another program to rapidly browse through your LaTeX file and insert the terms. It is licensed under the GPL and you can download it for free at https://sourceforge.net/projects/indexmeister/

The same page also has a link to a short YouTube video tutorial I made showing how the software is used. Hopefully, it will be useful to someone. I'm already using it in-house in my small publishing company.