Dynamically count and return number of words in a section

You can use texcount to count the words. It automatically produces subcounts for the sections.

Here's a new macro that calls texcount, extracts the subcount for the current section, and inserts the word count into the document. It requires \write18 (shell escape) to be enabled, and texcount must be on your PATH (otherwise include the full path to the executable in the macro). Because the macro pipes the texcount output through grep and sed, it also needs a Unix-like shell.

\documentclass{article}
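\newcommand\wordcount{%
    % Run texcount with per-section subcounts, keep only the text count of the current section, and save it to count.txt
    \immediate\write18{texcount -sub=section \jobname.tex  | grep "Section" | sed -e 's/+.*//' | sed -n \thesection p > 'count.txt'}%
(\input{count.txt}words)}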

\begin{document}
\section{Introduction}
In publishing and graphic design, lorem ipsum is placeholder text (filler text) commonly used to demonstrate the graphics elements of a document or visual presentation, such as font, typography, and layout. The lorem ipsum text is typically a section of a Latin text by Cicero with words altered, added and removed that make it nonsensical in meaning and not proper Latin.

\wordcount
\section{Main Stuff}
Even though "lorem ipsum" may arouse curiosity because of its resemblance to classical Latin, it is not intended to have meaning. Where text is comprehensible in a document, people tend to focus on the textual content rather than upon overall presentation, so publishers use lorem ipsum when displaying a typeface or design elements and page layout in order to direct the focus to the publication style and not the meaning of the text. In spite of its basis in Latin, use of lorem ipsum is often referred to as greeking, from the phrase "it's all Greek to me," which indicates that this is not meant to be readable text.

 \wordcount
\section{Conclusion}
Today's popular version of lorem ipsum was first created for Aldus Corporation's first desktop publishing program Aldus PageMaker in the mid-1980s for the Apple Macintosh. Art director Laura Perry adapted older forms of the lorem text from typography samples — it was, for example, widely used in Letraset catalogs in the 1960s and 1970s (anecdotes suggest that the original use of the "Lorem ipsum" text was by Letraset, which was used for print layouts by advertising agencies as early as the 1970s.) The text was frequently used in PageMaker templates.

\wordcount
\end{document}
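
The shell pipeline inside the macro can also be tried on its own, outside of TeX. The following is a minimal sketch, not part of the original answer: the function names `sample_output` and `section_count` are hypothetical, and the sample lines only imitate the layout of the subcount lines that `texcount -sub=section` prints, with made-up numbers.

```shell
# Assumed sample of the subcount lines printed by `texcount -sub=section`.
# The numbers and the exact layout are illustrative, not real output.
sample_output() {
cat <<'EOF'
Subcounts:
  text+headers+captions (#headers/#floats/#inlines/#displayed)
  59+1+0 (1/0/0/0) Section: Introduction
  104+2+0 (1/0/0/0) Section: Main Stuff
  88+1+0 (1/0/0/0) Section: Conclusion
EOF
}

# section_count N: print the text word count of the N-th section.
section_count() {
    sample_output |
        grep "Section" |      # keep only the per-section lines
        sed -e 's/+.*//' |    # drop everything from the first "+" onwards
        sed -n "${1}p" |      # keep line N (the macro passes \thesection here)
        tr -d ' '             # strip the indentation (TeX ignores it on \input)
}

section_count 2   # prints 104
```

To use this against a real file, replace `sample_output` with a call such as `texcount -sub=section yourfile.tex`. Note that `grep "Section"` would also match any other output line containing that word, so the line numbering only matches `\thesection` as long as no such extra lines appear. When compiling the LaTeX example, shell escape has to be switched on, for instance with `pdflatex -shell-escape` on TeX Live.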

You asked for a LaTeX solution, but for completeness I'll provide a ConTeXt solution as well; it might be helpful to someone. It does not rely on external programs.

\startluacode
    userdata = userdata or { }

    function userdata.wordcount(listname)
        local filename = file.addsuffix(tex.jobname,"words") -- local, so it does not leak into the global namespace
        if lfs.isfile(filename) then
            local w = dofile(filename)
            if w then
                if type(w.categories[listname]) == "table" then
                    context(w.categories[listname].total)
                else
                    context(w.total)
                end
                context.par()
            end
        end
    end
\stopluacode

\def\wordcount{%
    \dosingleempty\dowordcount}

\def\dowordcount[#1]{%
    \ctxlua{userdata.wordcount("#1")}}

\starttext

% Set up the word count
\ctxlua{languages.words.threshold=2}
\setupspellchecking [state=start, method=2]

\setupspellchecking [list=foo]
\startsection [title=Foo]
    Foo Bar
\stopsection

\setupspellchecking [list=lorem]
\startsection [title=Lorem]
    Lorem ipsum dolor sit
\stopsection

\setupspellchecking [list=stop]

\startsubject [title=Statistics]
    Words in Foo:   \wordcount [foo]
    Words in Lorem: \wordcount [lorem]
\stopsubject

\stoptext

The result: [image of the typeset output]

The Lua function reads the external file \jobname.words, which is created by the \setupspellchecking command. That file returns a Lua table with some statistical information, from which the function extracts the relevant data. \wordcount is just a thin wrapper to keep the interface “contextish”.

By default, only words with at least four characters are counted. In the example, I set the threshold to two characters.

Notes: In the example the text in the section heading is counted as well. Unlike the usual ConTeXt behaviour, this word count implementation needs two context runs.

The code is a modified version of the one found in the ConTeXt distribution (s-lan-03).

If you are on Windows, you can install WinEdt. It has a built-in word count feature (Document -> Word Count). To count the words in a paragraph or section, select that text and run the count.
