Converting Markdown to LaTeX, in LaTeX

Here's a Pandoc-based solution. You will have to enable --shell-escape for this to work, since it uses \write18 to run Pandoc. Depending on what you want, you may need to customize the Pandoc options.

\documentclass{article}

\usepackage{fancyvrb}

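% Write the environment body verbatim to tmp.markdown, convert it to LaTeX
% with Pandoc via \write18 (requires --shell-escape), then \input the result.
\newenvironment{markdown}%
    {\VerbatimEnvironment\begin{VerbatimOut}{tmp.markdown}}%
    {\end{VerbatimOut}%
        \immediate\write18{pandoc tmp.markdown -t latex -o tmp.tex}%
        \input{tmp.tex}}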

\begin{document}

\begin{markdown}
# Section

Some text that goes on for a while.

A list:

* Item
* Another item 

\end{markdown}

\end{document}
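
With newer Pandoc versions, the fragment written to tmp.tex can rely on macros that Pandoc's own LaTeX template normally provides, so the bare article preamble above may stop with "undefined control sequence" errors. The following is a minimal preamble sketch, assuming a Pandoc version that emits \tightlist for compact lists and, in some versions, wraps headings in \hypertarget. Inspect the generated tmp.tex to see which of these your version actually needs.

```latex
% Possible additions to the preamble of the example above.
% Assumption: your Pandoc version emits \tightlist and/or \hypertarget.

% \tightlist is emitted by Pandoc for compact lists; this definition
% mirrors the one in Pandoc's default LaTeX template.
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}

% Some Pandoc versions wrap headings in \hypertarget{...}{...},
% which is provided by hyperref.
\usepackage{hyperref}
```

Using \providecommand keeps the definition harmless if some other package already defines \tightlist.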

I'm not sure why the Pandoc-based answer on this page collected so many upvotes, given that it is overly complicated.

If you use Pandoc, the method is far easier:

  1. Just write Markdown.
  2. Wherever you want to apply LaTeX features in your final document, just sprinkle the Markdown with your LaTeX snippets.

Here is a working (not so minimal) example, where the main content is written in Markdown, with occasional LaTeX code sprinkled in between:

% Proof of Existing Pandoc Feature
% Kurt Pfeifle
% May 25th, 2015

# Basics

This is not just a *proof of concept*, but a basic utilization of [Pandoc][2]'s behavior
when it comes to [Markdown][1] processing.

* Pandoc, by default, passes any \LaTeX\ code snippets it identifies within the Markdown
  source file to the target document, if that target document is a \LaTeX\ one (this
  includes e.g. Beamer or PDF output).
  (It does not pass these snippets to any other output formats, but instead drops them.)
* Pandoc, by default, also passes any HTML code snippets it identifies within the
  Markdown sources to the target document, should that be HTML based (this includes e.g.
  EPUB or RevealJS output).
  (It does not pass these snippets to any other output formats, but instead drops them.)
* This allows for any specific formatting to be achieved in the target document format:
  ***(1)*** Insert two versions of the snippet, one as HTML, one as \LaTeX. ***(2)***
  The HTML one will make it to the HTML-based targets, while the \LaTeX\ one is dropped;
  the \LaTeX\ one will make it into \LaTeX-based targets, while the HTML one is dropped.

It also supports linked references. **[Click here](#tab:fsttable)** to jump to a page with a
table.


## How it works

It works out of the box:

1. Just write a Markdown document, and sprinkle your \LaTeX\ code snippets in between.
1. Save the document with an `.md` suffix.
1. Run the Pandoc conversion:

    ```` {.bash}
     pandoc --from=markdown --output=my.tex my.md --to=latex --standalone
    ````

    or

    ```` {.bash}
     pandoc --from=markdown --output=my.pdf my.md                                   \
            --variable=geometry:"margin=0.5cm, paperheight=421pt, paperwidth=595pt" \
            --highlight-style=espresso
    ````

I want a few words to appear as \textcolor{red}{red} or in a \textcolor{green}{different}
\textcolor{blue}{color}. Here is the Markdown source code of the previous sentence; it uses
no syntax highlighting (unlike the previous two code blocks):

     [...] as \textcolor{red}{red} or in a \textcolor{green}{different}
     \textcolor{blue}{color}.

After this paragraph, I want to insert a page break. I'll add `\newpage{}` beneath, on
a line of its own.

\newpage{}

This was the Markdown source code (with context) for the page break preceding this:

```` {.markdown}
After this paragraph, I want to insert a page break. I'll add `\newpage{}` beneath, on
a line of its own.

\newpage{}
````

## Inserting a \LaTeX\ table

Here comes a table. Its code is inserted as \LaTeX\ code into the Markdown source document:

\begin{table}[h]
\centering
\begin{tabular}{|r|l|}
  \hline
  7C0 & hexadecimal \\
  3700 & octal \\ \cline{2-2}
  11111000000 & binary \\
  \hline \hline
  1984 & decimal \\
  \hline
\end{tabular}
\caption{\small\textit{\textcolor{magenta}{This table shows some data}}}
\label{tab:fsttable}
\end{table}

This is the Markdown code for the previous table, including its textual context:

```` {.latex}
Here comes a table. Its code is inserted as \LaTeX\ code into the Markdown source document:

\begin{table}[h]
\centering
\begin{tabular}{|r|l|}
  \hline
  7C0 & hexadecimal \\
  3700 & octal \\ \cline{2-2}
  11111000000 & binary \\
  \hline \hline
  1984 & decimal \\
  \hline
\end{tabular}
\caption{\small\textit{\textcolor{magenta}{This table shows some data}}}
\label{tab:fsttable}
\end{table}
````

## Inserting a \LaTeX\ Formula

To include a mathematical formula in Markdown, enclose it with **`$`** characters like this:

```` {.latex}
 $\frac{n!}{k!(n-k)!} = \binom{n}{k}$
````

The result:

$\frac{n!}{k!(n-k)!} = \binom{n}{k}$

# Status

There are no known (to me) bugs for this feature.

[1]: http://daringfireball.net/projects/markdown/
[2]: http://pandoc.org/

Convert it to PDF with this command:

pandoc --from=markdown --output=my.pdf my.md                                   \
       --variable=geometry:"margin=0.5cm, paperheight=421pt, paperwidth=595pt" \
       --highlight-style=espresso

The result:

Screenshot of my.pdf


A proper Markdown parser is too complex a task for LaTeX. Not because TeX is not a Turing-complete language (it is), but because such a parser would be very difficult to implement and would probably perform very poorly.

One idea that immediately comes to mind is to use LuaTeX and write the Markdown parser in Lua. This certainly sounds feasible.

According to Wikipedia, there are only three Markdown parsers implemented in Lua:

  • markdown.lua. It is simple and would probably be easy to integrate into a LuaLaTeX package, but it currently outputs only HTML, so the output part would need a complete rewrite to generate LaTeX code instead.

  • lunamark. This is very complete. It can output a variety of formats, including LaTeX, and supports a lot of non-standard extra modules. However, it depends on a number of other Lua packages, so to integrate it with LuaTeX it would have to be stripped down (for example, template support could be removed, since the document that uses it serves as the template, as could all output formats except LaTeX).

  • lua-discount. This is a Lua binding to a parser written in C, so it can be discarded because it would not be embeddable in LuaTeX.

So the Lua approach basically reduces to one of those two:

  1. Change markdown.lua to make it output LaTeX
  2. Strip down lunamark to remove all unnecessary stuff.

I'm not sure which one will be easier. Option 1 will probably produce something usable sooner, but option 2 looks like the better long-term solution.

Also, the interface between LaTeX and Lua should be defined. A sensible approach would be:

% Latex stuff....
\begin{markdown}
# First section
etc..
\end{markdown}

However, I'm very new to the LuaTeX world, and (currently) I don't know how to pass all the text inside the markdown environment to a Lua function. We would also have to prevent TeX from tokenising all that text.

One possible idea is to use some verbatim-like tricks to write the contents of the environment to a file, then use Lua to process that file and insert the resulting TeX back into the main document (a tex.sprint() would do, wouldn't it?).
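
The file-based idea can be sketched as follows for LuaLaTeX. This is a minimal sketch under assumptions, not a tested package: the function name `md_file_to_tex` and the file name `\jobname.markdown` are made up for illustration, and the Lua code is a toy placeholder that only converts `# Title` lines into `\section{Title}`. A real parser (for example a stripped-down lunamark) would have to take its place. It uses `tex.print()` rather than `tex.sprint()` so that each string is treated as a separate input line and empty lines still end paragraphs.

```latex
% Compile with lualatex. Sketch only: the Lua function is a toy
% placeholder, NOT a Markdown parser.
\documentclass{article}
\usepackage{fancyvrb}  % provides VerbatimOut
\usepackage{luacode}   % provides the luacode* environment

\begin{luacode*}
-- Hypothetical helper: read the file line by line and hand the
-- (trivially converted) lines back to TeX.
function md_file_to_tex(path)
  local f = assert(io.open(path, "r"))
  for line in f:lines() do
    local title = line:match("^#%s+(.*)$")
    if title then
      tex.print("\\section{" .. title .. "}")
    else
      tex.print(line)  -- pass every other line through unchanged
    end
  end
  f:close()
end
\end{luacode*}

% Write the body verbatim to a file, then let Lua process that file.
\newenvironment{markdown}%
    {\VerbatimEnvironment\begin{VerbatimOut}{\jobname.markdown}}%
    {\end{VerbatimOut}%
     \directlua{md_file_to_tex("\jobname.markdown")}}

\begin{document}

\begin{markdown}
# First section

Some text.
\end{markdown}

\end{document}
```

Because the body is written out by `VerbatimOut`, TeX never tokenises the Markdown text with its normal category codes; only the lines that Lua prints back are read as ordinary TeX input.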

I would like to hear the opinion of the LuaTeX experts here...

PS: Does this count as an answer? Or should I delete it and re-post as a question instead?