How are popular open source projects documented?

This is a survey of the state of the art in developer documentation - how the most popular open source tools manage their documentation.

I'm currently updating the documentation of an open source tool, and I thought I'd start with surveying the prior art of how successful projects are doing documentation today.

I'm particularly interested in how these projects invite contributions, comments, and bug reports, and how they handle versioning and scaling down to mobile UIs.

I thought I'd write this up in case it's useful to others. Our projects are typically not this popular, so the tradeoffs the popular projects make sometimes won't be right for us. Their solutions might be too heavyweight. But they've certainly been successful at growing a developer audience, and probably hit a lot of the pitfalls and iterated a few times, so it's useful to see how they solve documentation.

We'll be looking at the docs of these projects, chosen somewhat arbitrarily from among the most-starred projects on GitHub:

  • TensorFlow
  • Visual Studio Code
  • Ansible
  • Vue.js
  • React
  • Angular
  • Keras
  • Flutter
  • Kubernetes
  • Home Assistant

This methodology misses important projects that aren't hosted on GitHub, e.g. Python's Docs, or the Linux Kernel's Docs, but the point of this is to take a quick sample, not to be exhaustive or representative.

Finally, remember the technology used to build docs is perhaps the least important part of the docs: the words that are written, and the information hierarchy presented to the user, are far more important than whether the site generator uses Go or Ruby. That said, let's dive in!

TensorFlow

TensorFlow, Google's popular machine learning framework, has 145k stars on GitHub.

Example Page: https://www.tensorflow.org/api_docs/python/tf/concat

TensorFlow API docs for the tf.concat method, mobile UI. The API docs are versioned, and there is a survey for rating the docs from 1 to 5 stars.
TensorFlow has an attractive mobile UI.

A fully custom generate2.py script generates the TensorFlow docs from three sources: Markdown files in the TensorFlow documentation Git repo, plus Python doc comments and Javadoc in the main TensorFlow repo. There is a Contributor Guidelines page to walk people through how to do this.

TensorFlow has no "Edit this page" button to take you to the source of the page, though there is a link to the Issue Tracker.

There is different API documentation per-version.

There is an "Is this page helpful" survey, allowing one-click voting from 1 to 5 stars.

Visual Studio Code

VS Code, Microsoft's popular code editor, has 98k stars on GitHub.

Example Page: https://code.visualstudio.com/api/extension-capabilities/theming

Visual Studio Code docs for theming. There is a pencil-edit-this-page button and nav dropdown.
Visual Studio Code's docs scale down nicely to mobile.

VS Code's docs include an "Edit this Page" link, which takes you to the source Markdown file on GitHub. The Markdown has front-matter, like that used by the Jekyll and Hugo static site generators.
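To make "front-matter" concrete, here is a minimal sketch of how a build script might split a page into its metadata and its Markdown body. It is not how VS Code, Jekyll, or Hugo actually do it: the function name split_front_matter and the sample page are made up for illustration, and it only understands flat key: value pairs, where a real generator would use a full YAML parser.

```python
def split_front_matter(text):
    """Split a Markdown document into (metadata dict, body string).

    Returns ({}, text) when there is no front-matter block.
    Assumption: only flat `key: value` lines are supported.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text

    # Look for the closing '---' delimiter after the opening one.
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        # Opening delimiter without a closing one: treat as plain Markdown.
        return {}, text

    meta = {}
    for line in lines[1:end]:
        if ":" not in line or line.lstrip().startswith("#"):
            continue  # skip blank lines, comments, and anything not key: value
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip("\"'")

    body = "\n".join(lines[end + 1:])
    return meta, body


# Hypothetical page, not copied from any real docs repo.
page = """---
title: Theming
layout: docs
---
# Theming

Body text.
"""

meta, body = split_front_matter(page)
print(meta)
print(body)
```

The generator then uses the metadata (title, layout, and so on) to pick a template, and renders the body as Markdown.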

VS Code builds its docs with the Gulp build tool, which invokes a scripts/build.sh that doesn't appear to be present in the repository – probably a Microsoft-internal file. So I'm not sure if contributors can run the site locally! Microsoft's guidance is, naturally, to edit the Markdown file using Visual Studio Code.

There is a Contributor Guide, and a survey on each page:

Was this documentation helpful? Yes / No
VS Code's feedback mechanism.

Ansible

Ansible, IBM's popular IT automation tool, has 43k stars on GitHub.

Example Page: https://docs.ansible.com/ansible/latest/modules/copy_module.html

Ansible docs for 'copy' module, mobile UI. Table of contents, and synopsis, with a search button.
I've been seriously impressed with Ansible's examples.

Ansible has extremely good documentation, in that I can usually copy/paste from their examples to solve my problems. I wish all sites had examples as good as Ansible's.

Two YAML examples of Ansible configuration of the 'copy' module in the docs page.
This page has 9(!) examples for using the copy module. Seriously impressive.

Ansible has edit links at the bottom of each doc page. Some of the links 404; I've filed an issue about that.

If you notice any issues in this documentation, you can edit this document to improve it.
Nice call to action.

There isn't a button to file a documentation bug – perhaps the doc bugs are low-quality? And there's no survey either.

Ansible generates its documentation with the popular Sphinx doc system, which is also used by Python and the Linux Kernel. Sphinx docs are written in reStructuredText (.rst) files, a markup format that is like Markdown but less popular.

Ansible's docs are part of the main Ansible git repository, not a separate repository. This makes it easier to update the code at the same time as the docs.

The docs are versioned, with a dropdown to select older versions.

Vue.js

Vue.js, a popular JavaScript user interface library, has 166k stars on GitHub.

Example Page: https://vuejs.org/v2/guide/conditional.html#v-else-if

Vue Docs on mobile UI. Conditional Rendering, many code examples for the v-if directive.
Vue has attractive code highlighting colors.

Props to the Vue team for the #BlackLivesMatter banner!

Vue has an edit button at the bottom of each page, which takes you to Markdown source with YAML front-matter.

Caught a mistake or want to contribute to the documentation? Edit this on GitHub! Deployed on Netlify.

The docs are in a separate Git repository from the main vuejs project. The README.md indicates the site is built with Hexo, a Node.js-based static site generator I hadn't heard of before, and deployed automatically on merge using Netlify. I suppose it makes sense for a JS project to use a JS-based static site generator, so contributors don't have to learn another language ecosystem.

Translations exist, and are organised as forks of the original docs repository. That would make it difficult to keep them up to date, but does optimise for ease of starting a translation.

There are no surveys or links to file a bug on the documentation page. Versioned docs are accessible:

Dropdown to select 2.x, 1.0, 0.12, or 0.11 versions of Vue.js docs.
Choose your version of the Vue.js docs.

React

React, Facebook's popular web user interface library, has 150k stars on GitHub.

Example Page: https://reactjs.org/docs/forwarding-refs.html

React's mobile UI docs. Top navbar, heading, and a floating action button to show a menu.
React's documentation mobile UI.

Each React doc page has an "Edit this page" link at the bottom, which takes you to a Markdown file on GitHub with YAML Front-Matter. The docs are stored in a reactjs.org repository, separate from the main React code.
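The "Edit this page" pattern is mostly a URL convention: GitHub opens its web editor for a file at /edit/ followed by the branch and the file path. Here is a small sketch of how a site template could build that link. The helper name github_edit_url and the owner, repo, branch, and path values are placeholders I've made up, not the real values used by React or any other project here.

```python
from urllib.parse import quote


def github_edit_url(owner, repo, branch, path):
    """Build the GitHub web-editor URL for one source file.

    `path` is the file's location relative to the repository root.
    """
    # quote() keeps '/' intact but escapes spaces and other unsafe characters.
    return f"https://github.com/{owner}/{repo}/edit/{quote(branch)}/{quote(path)}"


# Hypothetical repository and file, for illustration only.
url = github_edit_url(
    "example-org", "example-docs", "main", "content/docs/forwarding refs.md"
)
print(url)
```

A static site generator typically knows each page's source path at build time, so the link can be generated for every page without any per-page configuration.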

The README.md indicates that you can run the site locally with yarn dev, which uses Gatsby, a JavaScript-based static site generator. Gatsby is built on React, so the choice makes sense.

Docs are available for some old versions of React, but strangely not all. There's no link to file a bug about the documentation, and no survey.

Angular

Angular, Google's popular user interface framework, has 62k stars on GitHub.

Example Page: https://angular.io/guide/user-input

Angular docs on mobile UI. Nav bar, search box, heading, table of contents, and some docs.
Angular's docs look great on mobile.

Angular has a very subtle 'Pencil' icon at the top of each page, which takes you to a Markdown file in the main angular GitHub repository. Angular's docs are built with yarn using dgeni, a custom doc generator used by the Angular and Protractor projects.

Judging by the presence of a firebase.json file, the docs are probably deployed to Firebase.

I'm not seeing any survey on the page.

Keras

Keras, a popular high-level library for deep learning, has 49k stars on GitHub.

Example Page: https://keras.io/api/layers/initializers/

A mobile-UI-sized Keras docs, but all we see is a nav menu because they haven't optimized for mobile.
Keras is unique in this set of projects for not serving a mobile-scaled version of their site

Keras has no "Edit this page" link. This would be a neat opportunity for someone to add one, for an extremely popular library.

Keras' docs live in a subdirectory of the main keras project, as Markdown files built by the MkDocs static site generator.

Flutter

Flutter, Google's popular UI Toolkit for mobile, web, and desktop, has 94k stars on GitHub.

Example Page: https://flutter.dev/docs/cookbook/design/drawer

Flutter docs page on mobile. Nav bar, black lives matter header, prev/next buttons, bug link, edit link, header, table of contents.
Flutter's docs scale down nicely to mobile

Flutter has a subtle 'source code' button that takes you to Markdown files with YAML front-matter hosted on a flutter/website repository on GitHub, separate from the main Flutter code.

The README.md indicates the site is built with the Jekyll static site generator.

Flutter also has a "bug" link to create an issue for their documentation, which pre-fills information about the doc page.
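Pre-filled bug links like this can be built with plain query parameters: GitHub's new-issue page accepts title and body in the URL. Below is a rough sketch of the idea, not Flutter's actual template. The helper name new_issue_url, the issue wording, and the example-org repository and URLs are all assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def new_issue_url(owner, repo, page_url, source_path):
    """Build a GitHub 'new issue' link pre-filled with details of a docs page."""
    title = f"Docs issue on {page_url}"
    body = (
        f"Page URL: {page_url}\n"
        f"Page source: {source_path}\n\n"
        "Description of issue:\n"
    )
    # urlencode() percent-escapes the values so they survive inside a URL.
    query = urlencode({"title": title, "body": body})
    return f"https://github.com/{owner}/{repo}/issues/new?{query}"


# Hypothetical docs site and repository.
url = new_issue_url(
    "example-org",
    "example-docs",
    "https://docs.example.org/cookbook/drawer",
    "src/cookbook/drawer.md",
)

# Decode the query string again to check what the issue form would receive.
params = parse_qs(urlsplit(url).query)
print(params["title"][0])
print(params["body"][0])
```

Pre-filling the page URL and source path saves the reporter from describing where the problem is, which may help with the triage cost of doc bugs.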

Next/previous links exist, unlike many of the other doc sites.

No survey is presented.

Kubernetes

Kubernetes, a popular container orchestration system, has 67k stars on GitHub.

Example Page: https://kubernetes.io/docs/concepts/#kubernetes-objects

Kubernetes mobile docs. A Black Lives Matter banner at the top calling out that racism is against core values of Kubernetes, followed by header, nav bar, search, docs.
Kubernetes docs look good on Mobile. I appreciate the explicit notice of not tolerating racism.

Kubernetes docs have a strong statement about not tolerating racism in their community. I appreciate it.

Kubernetes docs have a big edit button at the top. At the bottom they have an extremely comprehensive set of feedback loops: a Yes/No helpfulness survey, Edit This Page, and Create an Issue, and even last-modified-time for the page with a link to the latest commit! I particularly appreciate the last-modified-time, as it can be a signal to whether the page is up to date.

Bottom of Kubernetes docs: feedback survey, bug filing button, edit this page button, and last-modified-time for the page.
Wow! Survey, Bug Filer, Edit Button, and Last-Modified-Time.

Kubernetes docs are hosted out of the kubernetes/website repository, separate from the main kubernetes/kubernetes repository.

All translations are part of the same repository, but they don't seem to be updated at the same time.

The docs are Markdown with YAML front matter. The README.md says the site is built using the Hugo Go-based static site generator. Kubernetes is also written in Go: again, we see a project choosing a static site generator written in the same language as the rest of the project.

Interestingly, it looks like the site was converted from Jekyll to Hugo two years ago, in a +123k line diff. That looks like a big job, though I hear Hugo has an automated converter. Looks like the rationale was: Hugo offers better multi-language support and faster build times.

We chose Hugo after months of research and conversations with other open source translation projects. [...] Hugo's multilingual support is built in and easy.
Another advantage of Hugo is that build performance scales well at size. At 250+ pages, the Kubernetes site's build times suffered significantly with Jekyll. We're excited about removing the barrier to contribution created by slow site build times.

It's always interesting to see people migrate from one platform to another, and their reasons for it! It wouldn't have been easy to migrate.

Kubernetes has a Contributors Guide for documentation. Continuous Deployment seems to be handled by Netlify.

Home Assistant

Home Assistant, a popular smart home automation framework, has 33k stars on GitHub.

Example Page: https://www.home-assistant.io/docs/

Home Assistant has a prominent "Edit this page on GitHub" link, which leads to Markdown with YAML front-matter in a documentation Git repository separate from the main Home Assistant code.

The README.md says to run bundle exec rake preview to preview the site locally, and the Rakefile calls jekyll build, so this is a Jekyll-based website. Ruby-based Jekyll is the most popular static site generator.

Continuous Deploys are handled by Netlify. There's no survey on the docs page, or direct link to file issues about the page, but there is a "Need Help?" page which links to the GitHub Issues page for the docs.

Notably, Home Assistant separates User Docs from Developer Docs. I think this is a good pattern to segment your audience.

Survey Conclusions

This has been a short tour of how the most popular open source projects manage their documentation. What have we learned?

  • Everyone's checking their docs into version control as their source of truth. Nobody is using a database-backed Content Management System like WordPress for their docs.
  • Everyone's using a static site generator, and it doesn't really matter if it's custom or a standard one. Jekyll is most popular, but a lot of projects opt for a static site generator written in the same language as their library.
  • Markdown remains extremely popular as a doc format, with all projects using Markdown except for Ansible, which uses reStructuredText, Sphinx's default markup. None of the projects surveyed are using raw HTML for doc content, nor Wiki Markup, BBCode, AsciiDoc, or LaTeX. Of course there are different flavours of Markdown though...
  • Static site generators seem to have mostly converged on "YAML front matter" as being the way to configure pages.
  • "Edit This Page" buttons are popular to try to convert users into documentation contributors.
  • "File a bug about this page" is not very popular – perhaps the bugs are low-quality and triaging them is a pain?
  • Some teams are putting feedback surveys on their docs, but only some of the really big corporate-backed players who can pay people to do the analysis.
  • Netlify is a popular way to do Continuous Deployment and hosting.
  • Almost all sites care about responsive design, scaling down to mobile-phone size. I wonder how many people are reading API docs on their phone. It's probably not a trivial number! I'd love to see some audience numbers on this.
  • Some projects keep their docs in the same git repository, and many don't. A best practice hasn't emerged here.
  • Many projects allow you to view API docs for old versions, though this is not consistently supported.
  • Nobody seems to have comments on their docs any more. Comments used to be very popular. PHP's user-contributed comments on docs were often famous for being more useful than the docs themselves. Perhaps nobody wants to moderate comments any more? Maybe everyone had Disqus comments then Disqus started putting ads on their page?
  • Everyone is doing translations differently: some not at all, some as forks of the main repo, some as subdirectories.

Further Work / Questions not answered here

  • How do these sites handle search? Static Site Generators don't typically output a search index.
  • Do different patterns emerge for smaller, or less popular open source projects?
  • What about the projects that aren't on GitHub, for whatever reason? Perhaps they're too big, or older, or just prefer being hosted elsewhere?
  • If documentation is in a separate repository from the main code, how are they kept in sync?
  • What do professional Tech Writers think are the best practices?
  • Is there a way we could standardise these patterns, so every project doesn't have to reinvent best practices? What would that look like?
  • What's the next big thing in documentation, that all of these sites are missing?
Mark Hansen

I'm a Software Engineering Manager working on Google Maps in Sydney, Australia. I write about software {engineering, management, profiling}, data visualisation, and transport.