quickly converting djot to html+mathml
I write djot instead of markdown for my site. I was enticed by the promises of speed and an actually consistent syntax. check out how djot scales with file size compared to markdown, using the two fastest featureful parsers that I personally know of (jotdown for djot and lowdown for markdown):
| number of headings | lowdown | jotdown |
|---|---|---|
| 1000 | 0.631s | 0.023s |
| 2000 | 4.584s | 0.085s |
| 3000 | 15.403s | 0.188s |
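you can see that lowdown seemingly scales polynomially (close to cubically here) while jotdown scales far more gently as more headings exist, though on these three data points it isn't strictly linear either: tripling the heading count makes lowdown about 24 times slower and jotdown about 8 times slower. it's probably similar with other kinds of environments. though, for my blog, I'm not really working with files of this size. the build time difference between lowdown and jotdown is negligible for me and probably always will be. oh well.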
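if you want a rough number for "how badly does it scale", you can fit a power law to the first and last rows of the table. this is just a back-of-the-envelope sketch: it assumes time grows like `c * n^k` and solves for `k`, and three data points is thin evidence, so take the result with a grain of salt. the function name is made up for this example.

```rust
/// Estimate the exponent k in t ≈ c * n^k from two (n, seconds) measurements.
fn scaling_exponent(a: (f64, f64), b: (f64, f64)) -> f64 {
    // t_b / t_a = (n_b / n_a)^k  =>  k = ln(t_b / t_a) / ln(n_b / n_a)
    (b.1 / a.1).ln() / (b.0 / a.0).ln()
}

fn main() {
    // (headings, seconds) pairs copied from the table above
    let lowdown = [(1000.0, 0.631), (2000.0, 4.584), (3000.0, 15.403)];
    let jotdown = [(1000.0, 0.023), (2000.0, 0.085), (3000.0, 0.188)];

    for (name, rows) in [("lowdown", lowdown), ("jotdown", jotdown)] {
        // compare the smallest and largest inputs
        let k = scaling_exponent(rows[0], rows[2]);
        println!("{name}: time grows roughly like n^{k:.1}");
    }
}
```

on these numbers the exponent comes out around 2.9 for lowdown and around 1.9 for jotdown, so jotdown wins by a wide margin, but a strictly linear parser would land near 1.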
in any case, I still find djot nicer to write. there are fewer rules and less ambiguity in the syntax. it's a bit awkward how I have to make a new paragraph when writing nested lists, but that's about my only reservation. one of the things I found neat about djot is that it has a definite math syntax, whereas markdown parsers with a LaTeX math extension need to fiddle with that awkward double dollar sign syntax. I've experienced it exploding once when I was writing physics notes in markdown some years ago.
here’s how jotdown parses a math block:
```
$ cat math.djot
## check out this matrix

$$`
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}
`

cool, right?
$ jotdown math.djot
<section id="check-out-this-matrix">
<h2>check out this matrix</h2>
<p><span class="math display">\[
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}
\]</span></p>
<p>cool, right?</p>
</section>
```
ah. it parses the math and spits it out, but it doesn't exactly do anything with it except surround it with the proper LaTeX display/inline math delimiters. when making my blog post about shearing, I externalized all the mathml conversion to LaTeXML and raw-inlined the resulting math into the djot. this was really annoying for two reasons:
- whenever I wanted to edit an equation after editing a bunch of other stuff or having closed my editor, I had to reconstruct the LaTeX from the annotation that LaTeXML makes. the program minifies and slightly changes the styling of the math, so it took a bit of time to mentally parse and edit.
- the file became incredibly difficult to navigate in general. I use soft wrapping in kakoune, not hard wrapping, and the entire blob of mathml for every equation made navigation clunky.
it wasn’t until about two days ago that I learned that pandoc is able to do this fine:
```
$ pandoc -f djot --mathml math.djot
<h2 id="check-out-this-matrix">check out this matrix</h2>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">[</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}
</annotation></semantics></math></p>
<p>cool, right?</p>
```
…but if I built my site with pandoc instead of jotdown, my build time would septuple, since I call it once for every djot file. currently, it takes about 0.17 seconds to build my site when using jotdown and 1.3 seconds if I use pandoc instead. not great, especially if I were to add a bunch more djot files, I'd imagine!
so I quickly hacked together a little rust program that combines the jotdown library with the math-core crate for rendering mathml from LaTeX. I’ve never written rust before, so it’s not exactly perfect, but I don’t need it to be. it works for me! maybe it’ll work for you too.
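to give a feel for the idea without pulling in any crates, here's a minimal std-only sketch. note that it is not my actual program and swaps in a different technique: instead of hooking into the jotdown library, it post-processes the html that the `jotdown` command prints, finds the math spans, and hands the LaTeX inside to a closure. the names `replace_math` and `unescape` are invented for this example, the closure is a stand-in for a real LaTeX-to-mathml converter such as math-core, and I'm assuming that inline math comes out as `<span class="math inline">\(…\)</span>` (by analogy with the display form shown above) and that the LaTeX inside the span is html-escaped.

```rust
/// Undo basic HTML escaping of the LaTeX source inside a math span.
fn unescape(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&amp;", "&") // done last so "&amp;lt;" becomes "&lt;", not "<"
}

/// Replace every math span in `html` with whatever `render` returns.
/// `render` gets the raw LaTeX and a flag that is true for display math.
fn replace_math(html: &str, render: impl Fn(&str, bool) -> String) -> String {
    // (opening text incl. LaTeX delimiter, closing text, is_display)
    const SPANS: [(&str, &str, bool); 2] = [
        ("<span class=\"math display\">\\[", "\\]</span>", true),
        ("<span class=\"math inline\">\\(", "\\)</span>", false),
    ];
    let mut out = String::new();
    let mut rest = html;
    loop {
        // pick whichever kind of span opens first in the remaining text
        let next = SPANS
            .iter()
            .filter_map(|&(open, close, display)| {
                rest.find(open).map(|i| (i, open, close, display))
            })
            .min_by_key(|&(i, ..)| i);
        let Some((start, open, close, display)) = next else { break };
        let body_start = start + open.len();
        // an unclosed span is left untouched
        let Some(len) = rest[body_start..].find(close) else { break };
        out.push_str(&rest[..start]);
        let tex = unescape(&rest[body_start..body_start + len]);
        out.push_str(&render(&tex, display));
        rest = &rest[body_start + len + close.len()..];
    }
    out.push_str(rest);
    out
}

fn main() {
    let html = "<p><span class=\"math display\">\\[\nx &lt; y\n\\]</span></p>";
    // placeholder renderer: a real one would call a LaTeX-to-MathML converter
    let out = replace_math(html, |tex, display| {
        let mode = if display { "block" } else { "inline" };
        format!("<math display=\"{mode}\"><mtext>{}</mtext></math>", tex.trim())
    });
    println!("{out}");
}
```

string-matching on html like this is fragile compared to working on the parser's events directly, which is why the real thing uses jotdown as a library, but the shape is the same: find the math, convert it, and leave everything else alone.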