Using an exercises package to build lots of Math/Calculus exercise lists and tests

Since your question is about storing, maintaining and referencing a large set of exercises (potentially in the order of 10,000), I'm going to concentrate on that, so the style here is very basic.

It's possible to define conditionals using \newif (or through commands provided by packages such as etoolbox). For example:

\newif\ifsolutions
\newif\ifcomplete

These default to false, but can be switched on:

\solutionstrue
\completetrue
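
As a sketch of the etoolbox route, the same two switches could be declared as toggles. \newtoggle, \toggletrue and \iftoggle are etoolbox commands; the toggle names simply mirror the conditionals above:

\usepackage{etoolbox}

\newtoggle{solutions}% false by default, like \newif
\newtoggle{complete}

\toggletrue{solutions}% \togglefalse{solutions} switches it off again

% \iftoggle takes the true branch and the false branch as arguments
\iftoggle{solutions}{Solutions are shown.}{Solutions are hidden.}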

It's also useful to provide syntactic commands to mark the solution. For example:

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

As has been mentioned in one of the other answers, it's also possible to use environments and the comment package. For multilingual support, the caption hooks can be used to redefine \solutionname as appropriate. For example:

\usepackage[USenglish]{babel}

\addto\captionsUSenglish{%
  \renewcommand\solutionname{Solution}%
}
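
A second language could be handled the same way. The sketch below assumes Portuguese is the other language (as the field names later in this answer suggest), that \solutionname has already been defined with \newcommand as above, and that this babel line replaces the one shown above:

\usepackage[portuguese,USenglish]{babel}% the last language listed is the main one

\addto\captionsportuguese{%
  \renewcommand\solutionname{Solução}%
}

% later, in the document body:
\selectlanguage{portuguese}% \solutionname now expands to Solução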

Now an exercise can be written using these commands. For example:

$y = \sin(2x)$
\ifsolutions
 \solution
 \ifcomplete
  Intermediate steps, further details etc.
 \fi
 $y' = 2\cos(2x)$
\fi
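
For comparison, an environment-based version using the comment package might look like the sketch below. The environment name solutionblock is made up for this example; \specialcomment and \excludecomment are provided by the comment package, and \solution is the command defined above.

% preamble
\usepackage{comment}
% show solutions, starting each one with the \solution heading:
\specialcomment{solutionblock}{\solution}{\par}
% to hide them, use this line instead:
%\excludecomment{solutionblock}

% document body
$y = \sin(2x)$
\begin{solutionblock}
$y' = 2\cos(2x)$
\end{solutionblock}

The comment package needs \begin{solutionblock} and \end{solutionblock} to be on lines of their own.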

Environments provide a more LaTeXy feel, but let's concentrate on storing and accessing the questions.

The simple method, which has already been suggested, is to put each question in a separate file and load it with \input. For example, if this exercise is in the file exercises/calculus/easy/dsin.tex then the following MWE works:

\documentclass{article}

\newif\ifsolutions
\newif\ifcomplete

\solutionstrue
\completetrue

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

\begin{document}
\begin{enumerate}
\item \input{exercises/calculus/easy/dsin}

\end{enumerate}
\end{document}

This is a relatively generic method, which can easily be translated to other TeX formats. For example, the Plain TeX equivalent is:

\newif\ifsolutions
\newif\ifcomplete

\solutionstrue
\completetrue

\def\solutionname{Solution}
\long\def\solution{\par{\bf\solutionname}:\par}

\newcount\questionnum

\long\def\question{%
 \par
 \advance\questionnum by 1\relax
 \number\questionnum.
}

\question \input exercises/calculus/easy/dsin

\bye

The problem is that, although this structure is fine for a small number of questions, it can become unmanageable for 10,000. I mentioned datatooltk in the comments, which can read and write .dbtex files (datatool's internal format), but I don't recommend using this format directly. These files just contain LaTeX code that defines the internal registers and control sequences used by datatool to store the required data. There's no compression and it takes up a huge amount of resources. The datatooltk application works better as an intermediary that can pull filtered, shuffled or sorted data from external sources in a way that can easily be input in the document. (See the datatool performance page that compares build times for large databases.)

There are switches, such as --shuffle or --sort, which instruct datatooltk to shuffle or sort the data after it's been pulled from the data source. This uses Java, which is more efficient than TeX, but if the data is stored in a SQL database, it's even more efficient to include these steps in the actual --sql switch. (Currently, datatooltk is only configured for MySQL, but it may be possible to use something else if the necessary .jar file can be added to the class path.)

SQL databases can be optimized to improve performance. Suppose you want to randomly select 20 questions from 500. How do you perform that selection in LaTeX? First you'd need to use the shell to find out all the available files (or have an index file that can be parsed). Then you need to shuffle the list. That will take a while to do with TeX. It's more efficient to do this with SQL. (See, for example, MySQL select 10 random rows from 600K rows fast.)

If you decide to use SQL, the next thing to consider is the table structure.

  • You'll need a unique id field. With this you'll be able to specifically select certain questions rather than have a random selection. (An auto increment primary key is best.)
  • A field containing the question. (Let's call it Question.)
  • A field containing the brief answer. (Let's call it Answer.)
  • A field containing the extended answer. (Let's call it ExtendedAnswer.)
  • A field identifying the difficulty level. (Let's call it Level.) This could be an integer (1 = easy) or an enumeration (easy, medium, hard).
  • A field identifying the topic. (Let's call it Topic.) An enumeration is probably the simplest type (for example, calculus, settheory).

I'm not quite sure about the language. There are two approaches that I can think of: have fields for the other language (for example, QuestionPortuges, AnswerPortuges and ExtendedAnswerPortuges) or have a separate entry for the question in a different language with an extra field for the language.

So the above exercise example could have

  • Question => $y = \sin(2x)$
  • Answer => $y' = 2\cos(2x)$
  • ExtendedAnswer => Intermediate steps, further details etc. \[y' = 2\cos(2x)\]
  • Level => 1
  • Topic => calculus
  • Language => english or ExtendedAnswerPortuges => Passos intermédios, etc. \[y' = 2\cos(2x)\]
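
To try out this layout without a database server, the same record could be sketched directly with datatool's own commands (\DTLnewdb, \DTLnewrow and \DTLnewdbentry). The database label exercises is an assumption made for this example, and this is only meant for small tests, not for thousands of entries:

\usepackage{datatool}

\DTLnewdb{exercises}% hypothetical database label

\DTLnewrow{exercises}% one row per exercise
\DTLnewdbentry{exercises}{Question}{$y = \sin(2x)$}
\DTLnewdbentry{exercises}{Answer}{$y' = 2\cos(2x)$}
\DTLnewdbentry{exercises}{ExtendedAnswer}{Intermediate steps, further details etc. \[y' = 2\cos(2x)\]}
\DTLnewdbentry{exercises}{Level}{1}
\DTLnewdbentry{exercises}{Topic}{calculus}
\DTLnewdbentry{exercises}{Language}{english}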

Note that this doesn't include the syntactic command \solution or the conditionals \ifsolutions and \ifcomplete, which makes it easier to arrange the various parts of the question and answer.

It may be that some exercises require a particular package (such as amsmath or graphicx), so perhaps there could also be a field for the required packages. For example Packages => graphicx,amsmath.

Any images or verbatim text must be stored outside the database somewhere on the file system. They could be on TeX's path or the database table could have a field with a list of external resources or the question/answer could simply use the full path.

The datatooltk call can be done before the LaTeX run or using the shell escape. There's also a datatooltk rule for arara users. Let's suppose I use datatooltk to pull a random selection of questions and save the results in a file called exercises.dbtex. This can then be loaded in the document using:

\DTLloaddbtex{\exercisedb}{exercises.dbtex}
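
If the shell-escape route is preferred, the datatooltk call could be placed just before \DTLloaddbtex, roughly as sketched below. The table name exercises, the database name exercisedb and the user name texuser are hypothetical. The option names --output, --sqldb and --sqluser are written from memory of the datatooltk manual, so check them against datatooltk --help. How the database password is supplied is left out here.

% requires shell escape to be enabled (for example, pdflatex --shell-escape)
\immediate\write18{datatooltk --output exercises.dbtex
  --sql "SELECT * FROM exercises ORDER BY RAND() LIMIT 20"
  --sqldb exercisedb --sqluser texuser}

% the file written by the call above is then loaded as before
\DTLloaddbtex{\exercisedb}{exercises.dbtex}

Here the random selection of 20 rows is done by MySQL itself (ORDER BY RAND() with LIMIT), so no --shuffle switch is needed.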

If the data includes the Packages field, you can make sure all the required packages are loaded by adding the following to the preamble:

\DTLforeach*{\exercisedb}{\Packages=Packages}
{\DTLifnullorempty{\Packages}{}{\usepackage{\Packages}}}

In the main part of the document:

\begin{enumerate}
\DTLforeach*{\exercisedb}% database
{\Question=Question,\Answer=Answer,\ExtendedAnswer=ExtendedAnswer}% assignment list
{%
  \item \Question
  \ifsolutions
   \solution
   \ifcomplete
    \ExtendedAnswer
   \else
     \Answer
   \fi
  \fi
}
\end{enumerate}
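
If the pulled data still mixes difficulty levels, the optional condition argument of \DTLforeach could filter rows on the LaTeX side. This is only a sketch, assuming the Level field holds 1 for easy as suggested above. Filtering in the --sql statement remains the more efficient choice:

\begin{enumerate}
\DTLforeach*[\DTLiseq{\Level}{1}]% only rows whose Level is 1
{\exercisedb}%
{\Question=Question,\Level=Level}%
{%
  \item \Question
}
\end{enumerate}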

Further reading: Using the datatool Package for Exams or Assignment Sheets


What I did for my students was this.

  1. I separated all my exercises into different folders by subject. Each folder is named like this: "1_package", "2_package" etc.
  2. Each exercise in any folder is numbered by the name of the folder and the number of the exercise. For example "3_294" is the 294th exercise in the 3rd folder.
  3. In order to remember the content of each folder I made a list that shows what's inside. Let's say 1-> Integration, 2->Limits etc.
  4. I created the following (MWE) Test.tex template:
  \documentclass[a4paper]{article}
  \usepackage{amsmath}
  \usepackage{enumitem}
  % \exercise{<folder number>}{<exercise number>} inputs the file
  % <folder number>_package/<folder number>_<exercise number>.tex
  \newcommand{\exercise}[2]{\input{#1_package/#1_#2.tex}}
  \begin{document}
  \begin{enumerate}
  \item \exercise{1}{3}
  \item \exercise{2}{28}
  \item \exercise{3}{294}
  \end{enumerate}
  \end{document}

The result could look like this, for example: [image: compiled test sheet]

But I was also concerned about how I could select the right exercise from among 500+ exercises. So I created a list of them using the following code in a Database.tex file (\foreach comes from the pgffor package, which therefore needs to be loaded):

\begin{enumerate}[label=Integration.\arabic*.]
\foreach \t in {1,...,"number of the last ex. in folder 1"}{\item\exercise{1}{\t}}
\end{enumerate}
\begin{enumerate}[label=Differential Equations.\arabic*.]
\foreach \x in {1,...,"number of the last ex. in folder 2"}{\item\exercise{2}{\x}}
\end{enumerate}
\begin{enumerate}[label=Limits.\arabic*.]
\foreach \y in {1,...,"number of the last ex. in folder 3"}{\item\exercise{3}{\y}}
\end{enumerate}

And so on for the remaining folders. The output is a list of all the exercises in all of my folders. Printing that list may make it easier to choose the exercises.
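
A complete Database.tex along these lines might look like the sketch below. It is not the exact file used above: the count stored in \lastIntegration is a made-up placeholder, the loop variable is called \n, and \exercise is extended with \IfFileExists so that a missing number prints a note instead of stopping the run.

\documentclass[a4paper]{article}
\usepackage{amsmath}
\usepackage{enumitem}% for the label= option of enumerate
\usepackage{pgffor}% provides \foreach

% input the exercise file if it exists, otherwise print a short note
\newcommand{\exercise}[2]{%
  \IfFileExists{#1_package/#1_#2.tex}%
    {\input{#1_package/#1_#2.tex}}%
    {\emph{(no file for exercise #1\_#2)}}%
}

% hypothetical number of the last exercise in folder 1
\newcommand{\lastIntegration}{50}

\begin{document}
\begin{enumerate}[label=Integration.\arabic*.]
\foreach \n in {1,...,\lastIntegration}{\item \exercise{1}{\n}}
\end{enumerate}
\end{document}

The other folders would each get their own count macro and enumerate block in the same way.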