A Markdown Workflow
This was originally posted to Detritus. Something I didn’t mention in the original post, but am adding now because I’ve just thought of it, is that I have done exactly zero additional work to get this post ready to go on the blog. The text below was copied and pasted directly from the markdown document into a markdown block in my WordPress post editor. All I’ve done is make a graphic to go at the top of the post. What you’re reading on this site came from the same source file as the PDF that I link at the end of the post.
I’ve been talking a lot recently about different versions of accessibility in the products we release. I wanted to take a few minutes to write something up about my workflow, why I’m doing what I’m doing, and the things I’ve learned along the way. I also want to acknowledge the potential drawbacks of this process and look at some things that I’d like to explore or figure out in the future.
None of this is "my" work. I’m building on things I’ve learned from friends, from endless Googling to figure stuff out, and from best guesses about how things might work. This also isn’t a tutorial of any kind. I’m not a teacher, I haven’t taken good notes while I’ve been learning this stuff, I’m just sharing some thoughts and my own ways of doing things. I don’t care if you use this yourself or not and I’m not making any judgements either way.
I’ve been helped massively along the way by Yubi, Luke Gearing, and Michael T. Lombardi. Yubi first opened my eyes to the sorry state of accessibility in TTRPGs and helped me make sure that d36 was as accessible as we knew how to make it at the time. Luke’s post about using markdown and Pandoc to make documents got me started working in markdown, and both Luke and Michael have helped me when I’ve had questions or needed to talk something through or figure things out. I also recently discovered The Annotated Archive of Game Design Resources and in particular their Accessibility section, which is a fantastic resource I’ll be digging into more over the next few weeks.
Yubi also spent a lot of time with me and a few others in my Gather this week, running things through screenreaders for us and generally showing us how the PDFs etc. that Pandoc produces (and a few methods that Matt Sanders is working on in this direction) are falling down. I had already written this and posted it to Detritus, but I’ve updated this public-facing post to reflect what we learned. I stand by this method as a means of producing multiple formats quickly and easily, but there’s still work to be done in making them as accessible as we’d like them to be.
My Workflow
Right now my workflow doesn’t look much different to the one laid out in Luke’s post. I draft in Ghostwriter using markdown (which is where I’m writing this post) and I output using Pandoc to HTML, epub, and plain text PDF. A few people have asked why I output in HTML, and the answer is so that I can do things like this.
I also use a script to import markdown to InDesign for when I do my print layouts. This applies styles in InDesign based on the styling in my markdown document and makes life very, very simple for me. You can see my whole process from start to finish here.
There were a couple of things that bugged me about the base output from Pandoc that I spent some time trying to fix, so let’s cover them quickly.
Fonts
I wanted to change the fonts in my PDF, because of course I did. This isn’t supported in basic markdown but you can do it with LaTeX, and you can mix some simple LaTeX commands into your markdown without any issues. You’ll need to tell Pandoc which PDF engine to use in order for this to work, and your choice of fonts is limited, but there is some customisation available here without using external style sheets.
After some trial and error the PDF engine that works best for me is LuaTeX. I don’t know why, so don’t ask me.
At the beginning of my markdown documents I now have a preamble/YAML header that looks like this:
---
author: Chris Bissette
title: A Markdown Workflow
mainfont: AlegreyaSans
---
In order for this to work you need to tell Pandoc which PDF engine to use when converting, which means adding something to the command when you run it. My Pandoc command looks like this:
pandoc -s source-file.md -o destination-file.pdf --pdf-engine=lualatex
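If typing that command three times gets old, the same call can be generated for each format. This is only a rough sketch: the function name and the `TARGETS` table are made up for illustration, the output names are assumed to match the source file, and nothing is actually run here.

```python
from pathlib import Path

# Formats mentioned in this post. Only the PDF needs the engine flag.
# (Hypothetical table, not part of Pandoc itself.)
TARGETS = {
    "html": [],
    "epub": [],
    "pdf": ["--pdf-engine=lualatex"],
}


def pandoc_commands(source: str) -> list[list[str]]:
    """Return one pandoc argument list per target format (nothing is executed)."""
    stem = Path(source).with_suffix("")  # "source-file.md" -> "source-file"
    commands = []
    for ext, extra in TARGETS.items():
        commands.append(["pandoc", "-s", source, "-o", f"{stem}.{ext}", *extra])
    return commands


if __name__ == "__main__":
    for cmd in pandoc_commands("source-file.md"):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once Pandoc and a LaTeX install with LuaLaTeX are on the machine.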
There are ways to do more font customisation using CSS but I haven’t looked into that myself. For me personally, having everything within one source file is important purely because it’s lower cognitive load. If you’re interested in working with external stylesheets for this stuff, James Chip has a tutorial here.
I’ve been using this list of fonts. They will, obviously, need to be installed on your system for you to be able to use them.
With LaTeX you can dictate different fonts for headers, body text, etc. I haven’t yet figured out how to do that with markdown without using external stylesheets, but I’d like to.
I don’t dictate fonts in my epubs. The entire purpose of an epub is to allow readers to set their own font, letter size, etc. so I just leave it at default.
Internal Links
Markdown handles internal links to headers natively. You make a header like this:
# A Header
And you link to it with a link that looks like this:
[Link text here](#a-header)
Nice and easy.
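The part after the # is an identifier that Pandoc derives from the header text. As a rough illustration, the rules in the Pandoc manual can be approximated like this. It is a simplified sketch with a made-up function name: it takes plain header text only, and it ignores inline formatting, footnotes, and the numeric suffixes Pandoc adds when two headers share the same text.

```python
import re


def pandoc_style_identifier(heading: str) -> str:
    """Approximate the identifier Pandoc derives from a plain-text heading."""
    # Drop everything except letters, digits, underscores, hyphens,
    # periods and whitespace.
    text = re.sub(r"[^\w\s.-]", "", heading)
    # Each run of spaces or newlines becomes a single hyphen.
    text = re.sub(r"\s+", "-", text.strip())
    text = text.lower()
    # Identifiers may not start with a digit or punctuation mark,
    # so remove everything before the first letter.
    text = re.sub(r"^[\W\d_]+", "", text)
    # If nothing is left, fall back to "section".
    return text or "section"


if __name__ == "__main__":
    print(pandoc_style_identifier("A Header"))
```

Running a header through something like this before writing the link is a quick way to check what the target should be.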
Unfortunately when you convert with Pandoc it just doesn’t work 95% of the time. If you look at my PDFs for Reivdene you’ll see that these links are a mess.
I figured out that the reason they aren’t working is because the file itself doesn’t tell Pandoc what it wants to be. The second I added an output format to my YAML header, the internal links worked perfectly. Now if I want to output a PDF my header looks like this:
---
author: Chris Bissette
title: A Markdown Workflow
mainfont: AlegreyaSans
output: pdf
linkcolor: blue
---
By default links don’t look any different to normal text in the PDF output, so I also define the link colour here.
This does add a couple of extra steps to my process, because if I want internal links to work I need to change the output to read "epub" before I make an epub and "html_document" before I make an HTML page, but it’s the work of 30 seconds to do it.
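That 30 second edit could be scripted. The sketch below is a minimal, hypothetical helper (the name `set_output_format` is my own): it assumes the document starts with a YAML header closed by a `---` line, and it only touches the `output:` field, adding one if it is missing.

```python
def set_output_format(markdown: str, output: str) -> str:
    """Return the document with the `output:` line in its YAML header replaced."""
    lines = markdown.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a YAML header")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Reached the end of the header without finding the field: add it.
            lines.insert(i, f"output: {output}")
            break
        if lines[i].startswith("output:"):
            lines[i] = f"output: {output}"
            break
    else:
        # The loop never hit a closing "---" line.
        raise ValueError("YAML header is never closed")
    return "\n".join(lines)


if __name__ == "__main__":
    doc = "---\ntitle: A Markdown Workflow\noutput: pdf\nlinkcolor: blue\n---\n\nBody text\n"
    for fmt in ("pdf", "epub", "html_document"):
        print(set_output_format(doc, fmt).split("\n")[2])
```

The three values in the loop are the ones named above. The result would be written to a temporary file and passed to Pandoc, which keeps the source file itself unchanged.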
Table Of Contents
I mentioned earlier that you can use a few LaTeX commands in your markdown without issue. Luke mentioned in his post that at some point he’ll write up his "little list of useful LaTeX you can drop straight into markdown to make slightly nicer PDFs" and he hasn’t done that yet, but he has shared them with me and so I’m going to share them with you and hope he doesn’t mind.
At the beginning of my documents, immediately after the YAML header, I include these commands:
\maketitle
\tableofcontents
This means that my document preamble looks like this:
---
title: A Markdown Workflow
author: Chris Bissette
mainfont: AlegreyaSans
output: pdf
linkcolor: blue
---
\maketitle
\tableofcontents
\maketitle creates a title page. \tableofcontents generates a TOC from your headers and populates it with internal links. You can also follow this with a command to start a new page after the TOC, if you want:
---
title: A Markdown Workflow
author: Chris Bissette
mainfont: AlegreyaSans
output: pdf
linkcolor: blue
---
\maketitle
\tableofcontents
\newpage
Right now that’s basically everything I’m doing in markdown to make these PDFs. It’s very simple and I spend less time fucking around with styling etc. than when I used to write in GDocs or traditional word processors, and I get output that looks just as good as any plain text file I’d save in Word or Docs.
Why?
"Why bother?" is a big question and the main and most honest answer is that it doesn’t take any meaningful effort from me to produce these additional formats, and plain text formats are much more accessible than PDFs. The epubs and plain text PDFs this method produces still aren’t perfect – and I’ll touch on that a little later – but they’re certainly friendlier for screenreaders than your average print-ready PDF that hasn’t had any post-layout accessibility work done on it, and they don’t suffer from any colour contrast issues as they’re just black text on white backgrounds.
PDFs are an inherently inaccessible format. There’s a reason that novels aren’t published in PDF form, and it’s that PDFs aren’t designed for people to actually read from. They’re designed to speak to printers, to ensure that when you’ve made your print-ready file you can send it to your printer (printer as in "the company doing your printing" not "your HP Deskjet") and be assured that the physical product will look exactly as you intended it. For some reason RPGs have decided that these will be our primary mechanism of delivering non-print books.
Reading PDFs from screens is miserable. This is why epubs exist. Every website in the history of websites runs off HTML. The closer you get to plain text, the more you can be assured that anybody who needs to use your document will be able to access it in a way that suits them best. Screenreaders also use HTML, and part of tagging a PDF is in making sure that screenreaders pick up on things like Header tags, lists, etc. and recognise them for what they are.
The other "why?" is a more financially-motivated one, and that’s that it allows me to put out a bare-bones "ashcan" release of something very quickly and easily. I don’t need to do any typesetting whatsoever, I can just write a book the way I write books and then export it to several formats in literal seconds. I can throw that up online, ask for money for it, and gauge interest. And if there’s interest, then I can expend time and money on making a "nice" layout and doing a print run. It’s faster and more efficient, and that lets me be more productive.
Learnings
I have a few main takeaways from all of this. The first is that there’s no one size fits all solution to "accessibility". It’s a process and a mindset, not a checklist. There will always be use cases you couldn’t account for and that you don’t know how to address. The best things you can do are to give a shit in the first place, and to make it as easy as possible for people to engage with your work. That’s the purpose of multiple formats.
Every time I learn something new, I also learn how much I don’t know.
This week, the biggest lesson I learned was that accessibility starts with good, clear writing. I spent a long time trying to figure out how to write suitable alt text for the maps in In The Bluelight before I realised that actually, if I just wrote the room entries in a way that’s unambiguous about how the spaces link together, I don’t need the map at all. (Obviously the maps are still included for sighted readers but now they’re not necessary, and that’s the important bit).
Which brings me to…
Drawbacks
The PDFs Pandoc spits out play nicer with screenreaders than the ones that e.g. Google Docs makes, but they’re not perfect. I still haven’t found a way to add functional alt text to images inside the documents, and I’ve been making do with captions instead. This is a "fix", but it’s not ideal.
I’ve done a lot of research about how to make this work and I don’t think any way to do it exists, which means this is always going to be a compromise if I’m going to include images in my PDFs. What this means is that I’m going to be more mindful of how I make books in future, and make sure that my work is accessible on a textual level and that images are mainly for ornament rather than being a necessary part of the work.
The other large issue is that because Pandoc uses LaTeX to generate the PDFs, and LaTeX is – for some reason – unable to generate tagged PDFs, the files Pandoc creates aren’t tagged either. This means that screenreaders don’t know that they’re looking at headers, lists, etc. This is, obviously, not ideal. A fix for this in the short term is to export the HTML and then bring that into LibreOffice to produce the PDFs and epubs, because LibreOffice respects the HTML and will produce tagged files. I’m still looking for a fix that will allow me to continue using a workflow where I only use one program (preferably a markdown editor) and can produce multiple formats easily, but in the meantime at least this method does exist.
Future
I have a few things that I want to learn to do that I haven’t actually looked into yet, so I don’t know how easy (or not) these are. But my list looks like this:
- Output mobi format. Kindles are, by far, the most common ereader, so it makes sense to provide books in their native format
- Different header and body fonts in the same PDF from one file (i.e. no external CSS)
- TOC in HTML files. The \tableofcontents command works great in PDFs and epubs but doesn’t do anything for HTML, even though some flavours of Markdown support this. I haven’t figured out where I’m going wrong yet, but I had to manually build the TOC for In The Bluelight and it was very time consuming. I don’t want to do that again.
- LaTeX supports changing the paper size. So far I haven’t managed to make it work in pure markdown. I’d like to be able to output A5 documents. Don’t ask me why, I just want to.
- Pandoc can output to ICML, the language InCopy uses to talk to InDesign. I’m going to see about introducing that into my workflow for print rather than using the script to convert markdown to styles and see if my results are any better.
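For three of the items on that list, Pandoc’s own command-line options look like possible starting points. What follows is an untested sketch, not a worked-out answer: `--toc`, `-V papersize=...` and `-t icml` are real Pandoc options, but the function names and file names are made up, and I’m assuming the default templates are in use.

```python
def html_with_toc(source: str, dest: str) -> list[str]:
    # --toc asks Pandoc to build the table of contents itself;
    # it only shows up in standalone (-s) output.
    return ["pandoc", "-s", "--toc", source, "-o", dest]


def a5_pdf(source: str, dest: str) -> list[str]:
    # papersize is a template variable that Pandoc passes on to LaTeX.
    return ["pandoc", "-s", source, "-o", dest,
            "--pdf-engine=lualatex", "-V", "papersize=a5"]


def icml_story(source: str, dest: str) -> list[str]:
    # -t icml selects the ICML writer explicitly instead of guessing
    # the format from the file extension.
    return ["pandoc", "-s", source, "-t", "icml", "-o", dest]


if __name__ == "__main__":
    for cmd in (
        html_with_toc("source-file.md", "destination-file.html"),
        a5_pdf("source-file.md", "destination-file.pdf"),
        icml_story("source-file.md", "destination-file.icml"),
    ):
        print(" ".join(cmd))
```

The functions only build the argument lists, so they can be printed and checked by eye before anything is passed to `subprocess.run`.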
And that’s it. Thanks for reading.
You can find this post in PDF format here and look at the raw markdown file here.
This was originally posted to Detritus, where I post WIP drafts, musings on the craft, and general behind the scenes stuff. You can also support this site and my work on Ko-Fi.