Markdown and docs style checks

This has the potential to turn into one of these long bike shed discussions, but I would like to solicit opinions before we introduce something more formal.

We have a style guide and automated checks for C, python, xml, and most other languages used in the seL4 repositories. We do not yet have something explicit for markdown documents and documentation in general.

Markdown specifically is deceptively easy to get slightly wrong, which leads to not getting the desired output. I would therefore propose to add a style check for markdown documents in the seL4 repos.

This affects mostly README files, but also the tutorials and other longer-form documentation. A set style will help making these more consistent and avoid repeated discussion on what line wrap should be or which kind of heading style etc.

For simplicity, I would go with most of the defaults of https://github.com/DavidAnson/markdownlint which is easy to integrate into CI.

I would remove the following checks (some of these are already removed by default):

  • MD002 first-heading-h1/first-header-h1 - First heading should be a top-level heading
    (some markdown documents are included in others and are not top-level documents)
  • MD041 first-line-heading/first-line-h1 - First line in a file should be a top-level heading
    (same as above)
  • MD012 no-multiple-blanks - Multiple consecutive blank lines
    (multiple blank lines are not rendered and can improve readability in the source)
  • MD024 no-duplicate-heading/no-duplicate-header - Multiple headings with the same content
    (this is useful for anchors, but many of our pages don’t need anchors to every heading, and esp things like API description have legitimate header duplication in different chapters)

Default line wrap for markdownlint is 80, which I would be happy with. The longer 120 and 100 we have for C and python/Isabelle are not very readable for longer text.

Line wrap should then also apply to LaTeX docs (we might convert these to markdown or RST in the future, but let’s leave that for a separate discussion).

Any opinions/suggestions?

What problems should a style check solve? Just catching objective markdown syntax errors to prevent documents from being wrong? Does the tool provide a way to automatically apply fixes?

I think the intent is to save time on further discussions about appearance, bring consolidation. Definitely support that idea, but am also interested in what interesting options for error detection and correction there are.

What problems should a style check solve? Just catching objective markdown syntax errors to prevent documents from being wrong?

As Raf says, consistency and avoiding repeated discussions is the main objective, but there are actually quite a few things that you can get wrong in markdown, esp if you consider multiple different renderers (e.g. GitHub, jekyll, pandoc). For instance not having empty lines around lists will sometimes cause them to not be rendered as lists, similar for code fence blocks (we actually had one of these in the setup instructions). The linter can’t do that much, but it helps.

Does the tool provide a way to automatically apply fixes?

If run in an IDE like VSCode it does (for some of the problems), but I don’t think the command line version has an auto-fix mode. CI would only flag the positions.