Revised blog tech
© 2023-05-29 Luther Tychonievich
All rights reserved.
How and why I reworked my blogging software.

My previous blog

In 2010 I was a graduate student with an interest in learning all the major computing technologies of the time. Among that year’s explorations for me were XSLT and CSS, and as an exercise I decided to build a weblog system using them. I wrote my first post in January 2011 and posted multiple times a week from May 2011 through August 2017.

For that system I wrote posts in XML, using a custom dialect that included much of XHTML plus several tags of my own design: some blog-specific, others for particular elements such as pseudocode blocks or fractions. A lengthy XSLT described how to translate that into pure XHTML, and a Python script wrapped the result into an Atom feed and index page. I then added a somewhat kludgy commenting system and hosted the whole thing on web space provided by my university.

I also spent some effort on making the blog look the way I wanted, or at least as close to that as I could get it. Hyphenation was then not natively supported by any browser; Firefox and Safari would add it the next year and other browsers not for another decade, so I used a JavaScript library reimplementation of Knuth’s heuristic approach. Margin notes, which I much prefer over footnotes, I had to engineer myself as I could not find a good solution online, nor indeed did I find a second implementation until Tufte-CSS was started three years later. (I generally like what Tufte-CSS is doing, but have different visual tastes than Edward Tufte.) Font quality, consistency, and rendering accuracy were not as nice then as they are now, so I spent some effort finding open-licensed fonts I thought worked well, only to pull them a few years later when I discovered how very browser-dependent their appearance was.

Authoring: XML or Markdown

My blog posting frequency dropped dramatically after post 647 when I began teaching a five-day-a-week early-morning religion class in addition to my full-time job at the university. Around that time I also reduced the frequency with which I wrote TeX (I started keeping my personal journal in various TeX dialects in 2004) and increased the frequency with which I wrote in Markdown (Markdown was, and still is, the authoring format of choice for FHISO and later GEDCOM). When I would write one of my now-infrequent blog posts, I often found myself annoyed with XML.

XML was one of the earliest successful human-readable text markup languages. It works well, and has been adapted to many other applications, including purely data-oriented applications with little if any text. But it’s not very forgiving as a language to type in. Something as common as creating a new XHTML paragraph requires typing seven characters (<p></p>): four with the shift key, one without, and two that work either way. The text of the paragraph goes in the middle, not the end, so the concept “new paragraph” is split into two separate moments in the conceptual authoring process. I found myself handling many common authoring tasks with various tricks and workarounds, such as auto-completing the second half of a new paragraph in my editor, copying past posts to edit into new ones instead of starting over, and copy-pasting dozens of templates, filling some in, then deleting the extras.

LaTeX was definitely better, less verbose and more author-friendly, but the toolchain for using it as a blogging tool was not very mature. The TeX toolchain has paged media as its end goal, and while TeX-to-HTML tools existed, I did not find them to be very flexible or user-friendly. Further, TeX is a Turing-complete language with an extensive library of packages, so implementing my own TeX-to-HTML tool was more work than I wanted to undertake.

Markdown, on the other hand, was very author-friendly. It’s easy to type, difficult to get wrong, and relatively customizable. (Perhaps too customizable: there are dozens of Markdown dialects in wide use today and learning the dialect of any given implementation is a barrier to their effective use.) The longer I went between blog posts, the more my experience writing one was unpleasant because of a lingering desire to be writing in Markdown instead of XML.

Better CSS and Math

When designing my new Markdown-based blogging system I also revisited the CSS I used for it. I’ve become much more proficient with CSS in the past 13 years, and browsers have added support for various additions that have made it more useful.

For sidenotes, I originally used two nested elements for each note, one to establish the allowable space and the other to fill the needed part of that space. Page margins were created ignoring the sidenotes, so on narrow windows they might not be fully visible even if extra space was present on the other side of the page. Since I wrote that implementation, three new features have been added to CSS: arithmetic on units (calc()), viewport-relative units (such as vw), and root-element-relative font-size units (rem). Together these allow much more precise margin computation and with them a simpler single-element sidenote.
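
As a rough illustration of the margin arithmetic those three features make possible, here is a sketch in JavaScript rather than CSS. The column width, sidenote width, gap, and minimum margin are made-up values, not the ones this blog uses, and the function name sidenoteLayout is hypothetical. In a stylesheet the same computation would be expressed with calc(), vw, and rem.

```javascript
// Hypothetical layout constants, in rem; not the values this blog uses.
const TEXT_REM = 36;      // maximum width of the main text column
const NOTE_REM = 14;      // width reserved for sidenotes
const GAP_REM = 1;        // gap between text and sidenotes
const MIN_MARGIN_REM = 1; // smallest margin allowed on either side

// Given viewport width and root font size in pixels, compute margins that
// keep text plus sidenotes centered as a unit when they fit.
function sidenoteLayout(viewportPx, rootFontPx = 16) {
  const rem = (n) => n * rootFontPx; // what "n rem" means in pixels
  const needed = rem(TEXT_REM + GAP_REM + NOTE_REM);
  const spare = viewportPx - needed - 2 * rem(MIN_MARGIN_REM);
  if (spare < 0) {
    // Too narrow for a side column: notes must be shown some other way.
    return { fits: false, leftPx: rem(MIN_MARGIN_REM), rightPx: rem(MIN_MARGIN_REM) };
  }
  // Split leftover space evenly; the right margin also holds gap + sidenote.
  const side = rem(MIN_MARGIN_REM) + spare / 2;
  return { fits: true, leftPx: side, rightPx: side + rem(GAP_REM + NOTE_REM) };
}
```

With these assumed constants, sidenoteLayout(1200) gives a 192px left margin and a 432px right margin, which together with the 576px text column fill the 1200px window exactly. The point is only that the sidenote width takes part in the margin computation instead of being ignored.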

Hyphenation of English text is now available in every major browser, with dictionary-based hyphenation for common words that previous heuristics got wrong.

Other visual elements have also improved, for example through wider distribution of well-kerned fonts, improvements in browsers’ text rendering algorithms, and better systems for per-user customization.

Additionally, the KaTeX project has almost solved the problem of rendering mathematics for a browser: there remains a copy-paste problem, but otherwise it works well. My old blog’s custom hacks for rendering fractions are thus no longer needed. Likewise, rendering code has improved on multiple fronts with both server-side and browser-side solutions.

My current toolchain

This post was created using pandoc, which can do many things; I use it to parse Markdown and perform code syntax highlighting. I use two custom filters and custom templates to modify its default behavior.

One filter is a Lua filter for changing Markdown footnotes into sidenotes:

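-- Running count of notes, used as the visible marker number.
local notenum = 0

-- Replace each Markdown footnote with a numbered marker followed by
-- an inline sidenote span that repeats the marker before the note text.
function Note(elem)
  notenum = notenum + 1
  local marker = pandoc.Span(pandoc.Superscript(pandoc.Str(tostring(notenum))), {class="notemarker"})
  -- Flatten the note's blocks to inlines so they fit inside a span.
  local body = pandoc.utils.blocks_to_inlines(elem.content)
  pandoc.List.insert(body, 1, pandoc.Space())
  pandoc.List.insert(body, 1, marker)
  local wrapped = pandoc.Span(body, {class="sidenote"})
  return pandoc.List({marker, wrapped})
end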

The other is a simple-looking JavaScript filter for server-side rendering of mathematical notation that depends on katex for all the interesting parts:

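// pandoc filter: render TeX math to HTML on the server using KaTeX.
const pandoc = require('pandoc-filter')
const katex = require('katex')

async function action(el, _format, _meta) {
  if (el.t === 'Math') {
    // A Math node's content is [math type, TeX source].
    const [tag, math] = el.c
    return pandoc.RawInline("html", katex.renderToString(math,
      {displayMode: tag.t === "DisplayMath"}))
  }
  // Returning undefined leaves all other elements unchanged.
}
pandoc.stdio(action);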

Custom templates handle putting the authored text into appropriate containers for the web view, the Atom feed, and the index view. These take advantage of pandoc’s support for YAML metadata to provide titles and brief summaries, and are run with a small shell script that populates the file modification time as an additional metadata value.
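
The wrapper script itself is not shown in this post. As a hedged sketch of the idea, written in JavaScript for Node rather than shell, the following builds a pandoc command line that passes a file’s modification time as metadata. The file names (sidenote.lua, katex-filter.js, post-template.html) and the metadata key modified are placeholders made up for illustration; --lua-filter, --filter, --template, --metadata, and --output are standard pandoc options.

```javascript
const fs = require('fs');

// Build the argument list for one pandoc run.
// All file names and the "modified" key are hypothetical placeholders.
function pandocArgs(source, output, mtime) {
  return [
    source,
    '--from=markdown',
    '--to=html',
    '--lua-filter=sidenote.lua',      // footnotes -> sidenotes
    '--filter=katex-filter.js',       // server-side math rendering
    '--template=post-template.html',
    `--metadata=modified:${mtime.toISOString()}`,
    `--output=${output}`,
  ];
}

// Read the modification time from disk and build the arguments.
function argsForFile(source, output) {
  return pandocArgs(source, output, fs.statSync(source).mtime);
}
```

The resulting array could be handed to child_process.execFileSync('pandoc', ...). A template can then refer to the value through the metadata key, here the assumed name modified.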