9 Tidy Text

We create interconnected, interoperable (web) resources: we want to ensure that our research results are findable, accessible, and reusable. The World Wide Web has been a source of high interoperability and findability over the last 30 years, thanks to the introduction of the HTTP protocol and the standardization of the HTML markup language.

Figure 9.1: Our entire website is created from simple Markdown text. See Chapter [13](https://manual.dataobservatory.eu/publishing).

All our output needs to be converted to HTML, but that does not mean that we need to work in an HTML editor. However, the need for interoperability among operating systems (Windows, macOS, Linux) and software packages (at least from Word, LibreOffice, and Google Docs to HTML, and preferably to PDF, too) requires a simple, common notation.

Markdown is a much-simplified text notation for HTML, intended to work well with word processors.

A Markdown file is normally rendered to HTML. If you want Word output instead, the same file is rendered to Word; likewise for PDF or EPUB.
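To make the relationship between Markdown and HTML concrete, here is a toy sketch in Python that converts a very small subset of Markdown (headings, bold, italics, plain paragraphs) into HTML. The function names `render_line` and `render` are hypothetical and exist only for this illustration. Real renderers support far more syntax and handle many edge cases that this sketch ignores.

```python
import html
import re


def render_line(line: str) -> str:
    """Convert one line of a tiny Markdown subset to HTML (toy example)."""
    # Escape &, < and > so that plain text cannot be mistaken for HTML tags.
    text = html.escape(line.strip(), quote=False)
    # **bold** must be handled before *italic*, because both use asterisks.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    # Lines starting with one to six '#' characters become headings.
    heading = re.match(r"(#{1,6})\s+(.*)", text)
    if heading:
        level = len(heading.group(1))
        return f"<h{level}>{heading.group(2)}</h{level}>"
    # Everything else is treated as a one-line paragraph.
    return f"<p>{text}</p>"


def render(markdown_text: str) -> str:
    """Render every non-empty line of the input, one HTML element per line."""
    lines = [ln for ln in markdown_text.splitlines() if ln.strip()]
    return "\n".join(render_line(ln) for ln in lines)


sample = "# Tidy Text\n\nMarkdown is *simple* and **portable**."
print(render(sample))
```

The source text stays readable as plain text, while the HTML is produced mechanically from it. This is why the same Markdown file can be sent to different output formats.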

9.1 Try it out

There are countless Markdown editors. Because Markdown is so simple, you can, if you want to, edit Markdown files in Notepad or WordPad (Windows) or Vim (Linux).

Most word processors support Markdown. For example, Google Docs has a free extension that converts any document from Docs to Markdown.

There are several online Markdown editors that you can use to try writing in Markdown. Dillinger is one of the best online Markdown editors. Just open the site and start typing in the left pane. A preview of the rendered document appears in the right pane.

9.1.1 Communicate using Markdown

Communicate using Markdown is a course for new GitHub users. You need to log into your (free) GitHub account to use it.

  • Who is this for: New developers, new GitHub users, and students.
  • What you’ll learn: Use Markdown to add lists, images, and links in a comment or text file.
  • What you’ll build: We’ll update a plain text file and add Markdown formatting, and you can use this file to start your own GitHub Pages site.
  • Prerequisites: In this course you will work with pull requests as well as edit files. If these things aren’t familiar to you, we recommend you take the Introduction to GitHub course, first!
  • How long: This course is five steps long and takes less than one hour to complete.

  1. Right-click Start course and open the link in a new tab.
  2. In the new tab, follow the prompts to create a new repository.
    • For owner, choose your personal account or an organization to host the repository.
    • We recommend creating a public repository, because private repositories will use Actions minutes.
  3. After your new repository is created, wait about 20 seconds, then refresh the page. Follow the step-by-step instructions in the new repository’s README.

9.1.2 Make it look good

You can simplify a Word document, for example, by uploading it to Google Docs and passing it through the free extension to get a Markdown document. Usually, though, you will want to work the other way around. It is better practice to write the text in Markdown and, when it is ready, place it into a nice PDF, Word, or website (blog) HTML template. This way you keep your text (and citations) simple and interoperable, and you can reuse the same text many times over.
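As a minimal sketch of the "write once, render many times" workflow, the Python snippet below builds one conversion command per output format from a single Markdown source. It assumes that Pandoc is the converter, which this chapter does not state; the helper name `pandoc_commands` and the file name `chapter.md` are hypothetical. The snippet only constructs the commands and does not run them.

```python
from pathlib import Path


def pandoc_commands(source: str, extensions: list[str]) -> list[list[str]]:
    """Return one Pandoc command (as an argument list) per output extension."""
    # Drop the ".md" suffix so that each output file reuses the same base name.
    stem = Path(source).with_suffix("")
    commands = []
    for ext in extensions:
        target = f"{stem}.{ext}"
        # Pandoc infers the output format from the extension given to -o.
        # --standalone requests a complete document rather than a fragment.
        commands.append(["pandoc", source, "--standalone", "-o", target])
    return commands


# Assumption: the text lives in a single Markdown file called chapter.md.
for command in pandoc_commands("chapter.md", ["html", "docx", "epub"]):
    print(" ".join(command))
```

If Pandoc is installed, each argument list could be passed to `subprocess.run(command, check=True)`. PDF output would additionally require a PDF engine such as a LaTeX distribution, so it is left out of the example.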

9.2 Markdown syntax

9.3 Use your favorite application

Working with tidy texts will not separate you from your favorite word processor. You can still use Grammarly in Google Docs. You only have to make sure that your text remains simple: refrain from adding any formatting that does not adhere to a common standard shared across Windows, Mac, Unix, Word, Vim, and Posit.

  • Simple text has clearly defined headings: title, subtitle, heading level 1, heading level 2, heading level 3.
  • Simple text has standardized bibliographic references and footnotes.
  • Simple text has standard section breaks and page breaks.
  • Simple text allows the automatic insertion of tidy data (as defined earlier) or interoperable graphics (preferably PNG or, for web use, WebP).
  • Simple text uses only bold, italics, underline, and perhaps strikethrough highlighting.
  • Simple text has no table of contents because the table of contents is automatically generated from headings.
  • Simple text has no bibliography because it is automatically generated from standardized bibliographic entries.
  • Simple text does not use footers, headers, watermarks, or color boxes, because these things need to be added differently in Word, PDF, or HTML.
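The point about the automatically generated table of contents can be illustrated with a short Python sketch. It scans a Markdown text for `#`-style headings (levels 1 to 3) and builds a nested list of links. The function names and the anchor scheme (lowercase words joined by hyphens) are assumptions made for this example; real publishing tools use their own rules, and this sketch does not skip headings that appear inside code blocks.

```python
import re


def slugify(title: str) -> str:
    """Turn a heading title into a lowercase, hyphen-separated anchor."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def table_of_contents(markdown_text: str) -> str:
    """Build a nested Markdown list of links from level 1-3 headings."""
    entries = []
    for line in markdown_text.splitlines():
        match = re.match(r"(#{1,3})\s+(.+)", line)
        if match:
            level = len(match.group(1))
            title = match.group(2).strip()
            # Indent by two spaces for each level below the top heading.
            indent = "  " * (level - 1)
            entries.append(f"{indent}- [{title}](#{slugify(title)})")
    return "\n".join(entries)


document = """# Tidy Text
Some introduction.
## Try it out
### Make it look good
## Use your favorite application
"""
print(table_of_contents(document))
```

Because the headings are marked consistently, the table of contents never has to be typed or maintained by hand. The same logic applies to the bibliography, which is generated from standardized bibliographic entries.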