Docusaurus

Interview with Sebastien Lorber: Open Source Developer at Facebook, Docusaurus

By Ben Rometsch on September 9, 2021

A lot of things were inspired by Gatsby. We tried to build a similar tool, but with a focus on documentation.

Ben Rometsch, Host
Sebastien Lorber, Open Source Developer at Facebook

Show Transcript:

Thank you, Sebastien Lorber, for joining us. He is working on Docusaurus, which is a bit of a heavyweight in its field. Sebastien, welcome. Thank you for your time. Can you give us a brief overview of who you are and what Docusaurus is?

Thanks for inviting me. I’m a React developer. I’ve been using React for several years. I was an early adopter of the technology. I’ve been a Java/Scala developer in enterprise projects. I didn’t like frontend development in the beginning, but with React, everything changed. I’ve been a freelancer for a few years. I’ve had the opportunity to work with Facebook Open Source on the Docusaurus project, which is a static site generator that powers many websites for Facebook projects, but also a lot of other websites now. For example, the sites of Babel, Jest, React Native and Redux are all implemented in Docusaurus. It’s mostly a tool for documentation. It may not be obvious, but the name is a play on that.

Talk a little bit about the genesis of the project. Did it come about as an internal Facebook tool or was there another thing that gave birth to it?

I was not there in the beginning, but a few years ago, I think Facebook had a problem. They published a lot of open-source projects. The thing is they were copying the Jekyll template over and over again from one project to another to create the documentation website. They felt it was time to create a tool so that you don’t have to do this every time you want to create a new project. It should reduce the maintenance cost of creating those documentation websites. At the beginning of a project, we don’t care too much about how the documentation website looks.

If they all look somewhat the same, it’s not a big deal. At least you are able to change the theme of the site a bit so that you can set your own colors. The more interesting part is the documentation. People are browsing the site to be able to check the documentation. To reduce the maintenance cost of the documentation websites, the best way is to write the content, have everything built for you, and be able to publish a production-ready website in a few minutes by just writing Markdown files.

Was there a lot of direction from the React side? Because there are React-based static site generators in existence as well. Is it more because it’s specifically around solving the problem of documentation?

First, there are two versions of Docusaurus. Both are using React, but they have quite different architectures. Version 1 is being deprecated soon, and most users are already using version 2. One important difference between the two is that version 1 is using React only on the server side. It’s using React, but as a templating language in the Node process, not in the browser. Version 2 now hydrates React on the client. You are browsing a single-page application, which was not the case before. To answer your question, we can see how Docusaurus compares to Gatsby or Next.js, which are also static site generators. Next.js is more than a static site generator but you can still do a comparison.

Docusaurus is very similar to Gatsby conceptually, but it is simpler because it’s removing parts that we don’t need. For example, we don’t need as much customization as is possible there. We don’t need a very complex plugin system. We don’t plan to build a Docusaurus cloud to be able to host Docusaurus sites efficiently. It’s out of the scope of Docusaurus. We just want to build a static site. Also, we don’t need the GraphQL layer. For example, in Gatsby sites you know that there is some indirection between how you put data in Gatsby and how you consume it in the pages. You have to write the GraphQL queries. It’s not mandatory that people do this, and there are some good reasons to do so.

For documentation websites where you are mostly based on Markdown, you don’t need that because most of your content is static. It’s not coming from a CMS. We don’t plan to support incremental builds and things like that. This indirection layer is a bit overkill for such a use case. Docusaurus is Gatsby with many parts removed to make it simpler, and also with features that are oriented toward documentation content. For now, the focus is mostly a set of plugins that read Markdown files. We don’t have as much focus on CMS integrations because most people that are using Docusaurus are developers and are used to contributing through a Git workflow, pull requests and editing Markdown files.

What has been the trajectory of the project? Coming out of Facebook, it had a bit of a head start over most people’s open-source projects. Was there anything in particular that accelerated its popularity?

Being adopted by a Facebook project like React Native and other open-source projects like Jest and Babel is somehow what made the tool popular. Those sites adopted Docusaurus 1, but we now see much more adoption with Docusaurus 2 because we have a better user experience with the usage of React on the client. We just released the beta and we already saw a lot of traction for the project. I think Docusaurus 2 is used five times more than Docusaurus 1.

Has it been in beta for quite a long time?

It has been in alpha for two years, and we just released the beta.

It’s quite a big deal.

Many people in the community don’t understand why we were in alpha for so long. We wanted to reach feature parity with version 1, which has a set of features, so that all version 1 users will be able to upgrade to Docusaurus 2. One of those features is internationalization support. If you want to translate your site, it should be supported. The problem is it took a bit of time to implement because it was not so easy due to the modular nature of Docusaurus 2, which now has a plugin architecture.

Once it was released, we wanted to try it out on a few sites to make sure that it was okay, and also write a migration plan for version 1 sites to be able to upgrade if those sites were translated. A few of them were. The Jest website was translated. Once we did this, we decided that we wanted to be in beta, but we wanted to upgrade to webpack 5 before the beta. Now that we are in beta, the next step is to release the official version 2.

I would like to improve the theming system of Docusaurus 2 because now, we have many more customization options to be able to achieve a custom design and customize the user experience a bit. The thing is this part is a bit fragile. If you have customized a lot on your Docusaurus site, when you upgrade from one version to another, the API surface of the theme customization system is not very clear. You copy files from the existing theme, the copy overrides the theme files, and you can modify the copy.

If we do some breaking changes on our internal code and you upgrade Docusaurus, maybe the code that you provide will no longer work and you will have to copy the file again and re-apply your customizations to the new copy. This is a bit annoying because it adds friction to the upgrade. Docusaurus users are not all front-end developers who are experienced with those things. Some are academics and maybe they did some customizations a few years ago and they try to upgrade. They’re a bit annoyed by the fact that they can’t upgrade easily because of some errors that get reported.

Looking at GitHub, it’s a Facebook-owned project. I think you’re the first guest we’ve had on the show who has been working on a project that has a very large benefactor. I guess they’re using the project internally pretty heavily, so that makes them immediate end-users of the platform. In terms of the direction of the project, how’s that managed?

I have a lot of freedom. Initially, when I was contacted by Facebook to work on this project, I had the documents, so I knew about the major features that I had to work on for my contract with Facebook. Basically, it was the internationalization system, stabilization of existing APIs, and ensuring that every version 1 user will be able to upgrade to version 2. My Facebook manager put a lot of trust in my work. If I feel that there is something that must be worked on, I am able to work on it. I don’t know if it’s very usual. I think this is specific to this project and to the fact that I am a freelancer and not a Facebook internal employee, but at least I feel I have a lot of trust from Facebook to do the right thing. They are not micro-managing me in any way to tell me what exactly should be on my to-do list.

Is it just yourself that looks after the community side? It’s a pretty massive open-source project. You’ve got hundreds and hundreds of contributors. How do you manage that?

Some users rush into implementing something. When this happens, I try to ask the user to submit a request for comments and write a detailed API proposal for what they plan to implement. It depends. Some features are small, so we just do the review on the pull request and we are able to merge. If a pull request is close to being mergeable, I like to edit the contributor’s pull request myself. I’ve gotten used to working this way now. For example, some review feedback is not so easy to communicate to the contributor. It’s fine if we both work on the same pull request and I push some commits on the branch. It’s faster to merge the pull request that way than to submit a review comment and wait for the contributor to come back and find time to fix the issues before merging. It depends, I think.

Has anything ever happened around people wanting or trying to commercialize the platform other than Facebook?

I don’t think there is a commercial project around Docusaurus so far. As I am a contractor and I’m not a Facebook employee, it’s maybe an opportunity to see if there is potential for it. There are some things that we could build around Docusaurus, maybe a platform so that you can give your GitHub repository and everything is built and deployed for you in the cloud. It’s not like tools such as Netlify and Vercel are very complicated to use. Some users are just not used to these workflows, mostly because they’re not frontend developers and those tools are mostly used by frontend developers. Even a service that you can use by pasting a URL in a form and submitting it, and then you have your site online, would be interesting. In that way, you can have zero configuration, just a folder with Markdown files, and put your site online in a few minutes.

There is a need that we have seen internally at Facebook. For example, there were teams that were using wiki tools before, where you had an edit button and you could modify the documentation in real-time. The thing is when you are using a static site generator, you need some preview. Having to submit a pull request and wait for the deploy to see your changes online is not a very good editing experience. Most static site tools like Next.js and Gatsby have services around them to be able to display a preview of your edits. The CMS allows you to edit in a form input and view a preview of your new content in real-time. There is an opportunity to build a tool like that around Docusaurus so that the editing friction is reduced, and users that are not technical will not have to submit a pull request to edit the content of a project.

We use MkDocs, which is a Python tool, so it felt more natural to me having written the API in Python. Do you guys talk to each other and share ideas?

We talked about that. I don’t remember his name, but there is a guy on Twitter that is working on a Material theme for MkDocs. We talked a bit, but I didn’t know about MkDocs a few months ago. There are so many doc tools that it’s hard to know all of them. I discovered MkDocs a few weeks ago but it’s quite similar to the philosophy of Docusaurus. Basically, you write the content and you put the site online. We have some differences. Maybe we offer more customization options regarding the theme, while most users of MkDocs are using the templates and do some customizations on top of them.

We are a static site generator allowing you to build custom pages in React. For example, I’ve seen that you have interviewed someone from QuestDB. They have a beautiful Docusaurus site. If you look at it, you’ll see that there is a page that is implemented in Docusaurus. They also have pages for Java files that are in Docusaurus. You can add a lot of things inside that are not related to it if you want to build. In the end, it’s React. In the same way, you can put anything in Gatsby, Next.js and Docusaurus including a part of your site that will require authentication, or a part of your site that will be very dynamic like integrating the full product in your Docusaurus site if you want.

MkDocs is not really made for that. Another difference is that it’s not building a single-page application. When you click on a link in the sidebar, you will have a full page reload. It limits the ability for MkDocs to be stateful because every time you click on a link, the state of the things on the page will reset. This is a different experience in terms of navigation. Some people will say that they prefer the approach of MkDocs because it’s more lightweight. You are just generating pages and creating links between those pages. This is reported by tools to be more performant. If you try MkDocs on Lighthouse, you will have a better score than Docusaurus.

Personally, I don’t think it’s fair because we are loading React on the client-side in Docusaurus. This is more than just HTML and CSS and a few lines of JavaScript. At the same time, it enhances the experience. It’s not just the first-page load, it’s the whole experience. Once you have loaded React in Docusaurus, the experience is always nice. Something important to mention is that before React is loaded on the client-side in Docusaurus, the page is still available. Somehow tooling like Lighthouse doesn’t give us the best score because it waits for React to be available on the client before it considers the page complete. You can disable JavaScript on a Docusaurus site and most of the site will work. Some things are not working, but it’s my goal to make it possible to use Docusaurus with minimal JavaScript. Even for the search widget, I would like to make it work without JavaScript by using some fallback that will display a page with search results that are created independently.

I hadn’t considered how much benefit you get from having a single-page application where static site generation is concerned because it gives you way more flexibility in terms of what you can achieve.

You can put a lot of dynamic features in your application. You can have state that is kept between navigations. If you want to do that with a tool that reloads the page every time you navigate, you will have to use local storage and things like that to be able to keep the state around. There are some in-progress browser APIs that may be able to improve that. Maybe someday we will not need the single-page application anymore. Honestly, I don’t know but there are some interesting proposals to look at to see how this evolves.

People know that implementing a single-page application is more expensive than doing server-side rendering. You have to create an API. You have to use React and you have to use a lot more JavaScript in the browser. I think people do that because somehow, they want the experience that single-page application navigation brings. For example, when you click on a link, you don’t see a blank screen and a page that reloads. We also are able to use some optimization techniques.

For example, if you’re hovering over a link, then you can prefetch the JavaScript that will be required on the next page, so that when you click, the code is already loaded and you just have to run it on the client and it feels quite instantaneous. Maybe future versions of React, with features like server components, will even let us reduce the code size for content-centric websites like Docusaurus. With concurrent mode, you will be able to start rendering the next page even before the user clicks on the link because you predict that the click will happen and you start to do some work ahead of time so that when the user clicks, it can happen immediately.
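The hover-prefetch idea described above can be sketched in a few lines. This is a minimal illustration, not how Docusaurus implements it: `RoutePrefetcher`, `ChunkLoader` and the fake loader are hypothetical names invented for this example. The point is only that the load starts on hover and the click reuses the same in-flight work.

```typescript
// Hypothetical names throughout: this is an illustration, not Docusaurus code.
type ChunkLoader = (path: string) => Promise<string>;

class RoutePrefetcher {
  // One in-flight or completed load per route path.
  private cache = new Map<string, Promise<string>>();
  private load: ChunkLoader;

  constructor(load: ChunkLoader) {
    this.load = load;
  }

  // Call on link hover: starts the load once and remembers the promise.
  prefetch(path: string): Promise<string> {
    let pending = this.cache.get(path);
    if (pending === undefined) {
      pending = this.load(path);
      this.cache.set(path, pending);
    }
    return pending;
  }

  // Call on click: reuses the prefetched promise if there is one.
  navigate(path: string): Promise<string> {
    return this.prefetch(path);
  }
}

// Usage with a fake loader that counts how often it is called.
let loadCount = 0;
const fakeLoader: ChunkLoader = (path) => {
  loadCount += 1;
  return Promise.resolve(`chunk for ${path}`);
};

const prefetcher = new RoutePrefetcher(fakeLoader);
prefetcher.prefetch("/docs/intro"); // the user hovers over the link
prefetcher.navigate("/docs/intro"); // the user clicks: no second load
```

Caching the promise rather than the result means a click that arrives while the prefetch is still running waits on the same request instead of starting a new one.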

We’re talking about being able to view the page and have it usefully available without JavaScript. Is that something that you’ve built yourself? You said the page partially renders without JavaScript being available on the browser.

This is not related to React. Before React is loaded, there is already some markup and CSS on the page. If you are careful when you implement the features of your UI, it will be able to work without JavaScript. For example, back in the old days, you didn’t use any JavaScript and you were still able to submit forms because it was supported natively by the browser. We are at a tipping point somehow where we see React becoming a progressive enhancement of the pages instead of being a requirement.

If you are on a train and you try to load a web page, if your network goes down, it should still work even if some JavaScript didn’t load because the HTML that is statically served will be able to work as a fallback. You will be able to submit forms. Most content sites will keep working because those sites mostly have links and they don’t have a lot of dynamic features. An HTML link works even if the navigation is not single-page application navigation. You will still be able to click on the link and move from one page to another.

Were there any particular projects that you look to for inspiration on those sorts of design patterns? In terms of this idea of having a combination of static site generation and a React JavaScript single-page application, were there other projects that you took inspiration from?

I’m not the one that created the initial version of Docusaurus 2, so I’m not the best to answer this question, but I think a lot of things were inspired by Gatsby. We tried to build a similar tool, but with a focus on documentation. Gatsby has a theme system. I think Next.js has a plugin system too. It’s possible to create an experience like Docusaurus on top of another static site generator. We discussed that a lot of times internally, and we decided to create our own static site generator because we are able to modify the core of it to our needs without waiting for features to be implemented in those projects. As they are much bigger, it takes time and we don’t have full control over the priority. It’s not impossible that someday Docusaurus will be implemented on top of another existing tool because the value of Docusaurus is not to implement the best static site generator. Maybe Gatsby and Next.js are doing a better job at that.

We are mostly focusing on the documentation features. Implementing our own static site generator somehow duplicates the work that has already been done by other projects. This is a trade-off but for now, this is what we chose to do. It’s hard to be sure that it is a good decision because we only see the result of what we have built, and we don’t know exactly what problems we would have had if we had chosen a different path.

How do you recognize the success goals for the project? What elements of the project are you tracking to measure your progress?

For example, the stars on GitHub are not very relevant in my opinion because most of the stars were earned when Docusaurus version 1 was released. Docusaurus 2 is quite a different project. Those stars don’t mean anything. I like to track the npm stats because they give a better overview of how much the project is used. I’m not sure that’s too relevant either because there are a lot of things that can skew the stats. I’ve seen someone publish a blog post on how to fake downloads of open-source projects to let users think that a project was widely used while it was not.

This is something you can do, and I don’t think the stats of Docusaurus are affected by it, but it’s possible that someone does something that has an impact on the stats. It’s difficult to be sure that the metrics are very relevant. It’s still interesting to watch the trend and see how Docusaurus 2 is used compared to other tools. For example, comparing to Docusaurus 1, we can see now that it’s used five times more than Docusaurus 1. Something else I like to pursue is that we have a showcase page where we accept any Docusaurus site that was built. I added a way to filter the Docusaurus sites by feature.

If your site is online with a custom domain name and some production content, you can add your site there. If we receive a lot of submissions and we have a lot of sites here, it means the project is used by a lot of sites. I’ve started to monitor the usage of Docusaurus on Twitter. It’s nice to have a name like Docusaurus because when you search for it on Twitter, you only find relevant results that are about the project and not something else. I’ve been monitoring the word Docusaurus for a while now, and I’ve seen a lot of users that are trying Docusaurus and writing blog posts. I can see that there are a lot of users in Japan that are writing articles about Docusaurus on developer platforms that I had never heard of.

I’m using TweetDeck, and there is an option to translate the tweets. I try to translate them and see what they said in Japanese or Chinese to see if the feedback is good. I also see a lot of Docusaurus sites that are published and feedback from the community here. Not all the sites are on the showcase page but we don’t want to force the users to add their site here. If they don’t submit them, we don’t have them automatically. If you’re reading and you have a Docusaurus site, please add it because it’s nice to see the sites built with it.

I’ve been looking through these and it’s amazing how much variety of styles and designs there are. When we chose MkDocs, we wanted to get the documentation out there. With the combination of that tool and the Material theme, I was like, “That’s fine. That’s good enough. We just need to get the content out there.” I have been thinking about moving to something that is a bit more flexible so we can make it look a little bit more like our own. We redesigned our site in Gatsby. The benefit of MkDocs is just like, “Here’s some Markdown. Go away. I don’t have time to worry about that.”

I don’t have a lot of practical experience with it, but I feel the tool is simple to get started with. If you want to start a dev server or build a production website, it’s quite fast. Docusaurus is a bit slower because it’s doing a lot of additional things, but you also have more flexibility. MkDocs is a nice tool and you have to choose the trade-offs that you want. Hopefully, someday we will be as fast at starting the dev server and installing dependencies, even though we need many more things.

MkDocs doesn’t even have a webpack type pre-process. There’s nothing like that. It’s like, “I would like to start the development server,” and it takes a tenth of a second. What’s next for the project? Are there big features that you would like to get into version 3?

The first thing is we are going to release version 2. I don’t even know what version 3 would look like, but we should first figure out how we will handle breaking changes. Do we bump the major version of the project or the minor? It’s not totally clear because it took so much time to release version 2 of Docusaurus, which has a different architecture. Maybe we don’t want to be at Docusaurus 11 or 12 in one year because we did some breaking changes. I think we will adjust incrementally and stay on version 2 for a while. For the version 2 launch, I would like to make the theme system more robust so that users can easily customize the site. For now, it’s possible that it may require a bit more maintenance over time when you try to upgrade.

How do people normally fork the project when they want to make big customizations to it?

It’s not a full fork of the project, it’s more like they can create plugins to create additional pages. The plugin system allows them to inject props and data that may come from external sources to fill those pages. The sources can be Markdown files, but maybe also a CMS. This is not something that we have documented very well so far. I don’t have a lot of experience with CMS tools, but it is definitely possible to query data from a CMS and display it on your own page. If you want to customize an existing page, for example, we have three main plugins. There is the pages plugin, which will create a page for each React or Markdown file. There is also the docs plugin, which creates a set of documentation pages and tries to bring them together through a sidebar and previous/next navigation.
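As a rough sketch of the plugin idea described above, a plugin can be thought of as two steps: load content from some source, then turn that content into pages. The types below are deliberately simplified and hypothetical (`SketchPlugin`, `Route`, `runPlugin` and the release data are invented for this example); the real Docusaurus plugin API is asynchronous and has a different shape.

```typescript
// Simplified, hypothetical types: NOT the real Docusaurus plugin API.
interface Route {
  path: string;
  component: string; // name of the React component that would render the page
  props: Record<string, string>;
}

interface SketchPlugin<Content> {
  name: string;
  // Step 1: read content from any source (Markdown files, a CMS, ...).
  loadContent(): Content;
  // Step 2: turn the loaded content into pages.
  contentLoaded(content: Content, addRoute: (route: Route) => void): void;
}

interface Release {
  version: string;
  notes: string;
}

// In-memory data stands in for an external source such as a CMS.
const releasesPlugin: SketchPlugin<Release[]> = {
  name: "sketch-releases-plugin",
  loadContent: () => [
    { version: "1.0", notes: "First release" },
    { version: "1.1", notes: "Bug fixes" },
  ],
  contentLoaded: (releases, addRoute) => {
    for (const release of releases) {
      addRoute({
        path: `/releases/${release.version}`,
        component: "ReleasePage",
        props: { version: release.version, notes: release.notes },
      });
    }
  },
};

// Runs both steps of one plugin and collects the routes it registers.
function runPlugin<Content>(plugin: SketchPlugin<Content>): Route[] {
  const routes: Route[] = [];
  plugin.contentLoaded(plugin.loadContent(), (route) => routes.push(route));
  return routes;
}

const routes = runPlugin(releasesPlugin);
```

Keeping the data loading separate from the page creation is what lets the same page component be fed from Markdown files, a CMS, or any other source.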

There is the blog plugin, which will allow you to create a blog index, browse the blog posts independently and add tags to the blog posts. These are the three main plugins. We have the concept of a theme that implements the UI for the content plugins. The users are able to customize the theme by overriding some parts of it. For example, the theme is a set of React components and you can say, “I don’t want the default implementation for this component. I want to write my own component for this specific part of the theme.” You are able to get a copy of the source code, modify this copy to your needs, and then use your customization on your site. The problem is that when users do that, we sometimes make breaking changes in the components of the theme. When they upgrade, the breaking changes are annoying for them and they have to copy the content of the file again and re-apply their changes.
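The override mechanism described above can be pictured as a lookup where a site-provided component shadows the default one with the same name. This is a minimal sketch under that assumption; `Theme`, `resolveTheme` and the string-returning components are hypothetical stand-ins for real React components, not the actual Docusaurus implementation.

```typescript
// Hypothetical stand-ins: real theme components are React components.
type ThemeComponent = (props: { siteTitle: string }) => string;

interface Theme {
  Navbar: ThemeComponent;
  Footer: ThemeComponent;
}

// Components shipped by the theme.
const defaultTheme: Theme = {
  Navbar: ({ siteTitle }) => `<nav>${siteTitle}</nav>`,
  Footer: ({ siteTitle }) => `<footer>${siteTitle}</footer>`,
};

// The site copies one component and modifies its copy.
const siteOverrides: Partial<Theme> = {
  Footer: ({ siteTitle }) => `<footer>${siteTitle} | custom footer</footer>`,
};

// A site override wins over the default component of the same name.
function resolveTheme(defaults: Theme, overrides: Partial<Theme>): Theme {
  return { ...defaults, ...overrides };
}

const theme = resolveTheme(defaultTheme, siteOverrides);
const navbarHtml = theme.Navbar({ siteTitle: "My Docs" });
const footerHtml = theme.Footer({ siteTitle: "My Docs" });
```

The fragility mentioned in the interview follows from this design: the copied `Footer` is written against the props the theme passed at the time of the copy, so if a later theme version changes those props, the override can break on upgrade.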

What other projects are you inspired by? Have you got any members of your community who you’d like to give a thank you to that have been working hard on it?

We have a lot of people in the community that we can thank. For example, Alex, who has been working on many aspects of the UI of Docusaurus, and also the Facebook employees that have worked on the project and supported it. We have a lot of external contributors that are helpful and that are not working full-time on the project. It’s hard to be able to understand everything about the project. I myself have a lot of things to figure out. People are reporting that something is not working with Yarn 2. If you don’t know anything about Yarn 2, it’s a bit complicated to get into that. Things like webpack are quite complicated also. When other contributors that have some experience with webpack try to help, it’s always welcome. We have a lot of people that have helped with TypeScript, webpack and Yarn 2. It is nice to have contributions like that because it’s hard to be an expert on all those tools at once. Sometimes having the feedback of someone that has more niche expertise than you is very nice.

What was the feedback like on the beta release? I was reading the post on Hacker News and there was a lot of positive feedback on that.

In general, we have a lot of positive feedback. I see on Twitter that a lot of people are happy users of Docusaurus. They have some pain points and we try to solve them. We are still in beta, so I think it’s normal to have some. It’s hard to compare with other tools. I’m monitoring other tools like MkDocs, and I also see a lot of positive feedback. It’s hard to say whether Docusaurus or MkDocs is the best tool for a certain use case. A lot of tools are great and you have to choose according to your trade-offs.

They work so well with the tooling that you can generally get free of charge. The combination of GitHub, GitHub Actions, Netlify or GitHub Pages: those tools work really well together. Once you’ve got it set up, it stays out of your way. It doesn’t break and does what you need it to do. They seem to be perfect for the tools that most people use.

I’m not a fan of GitHub Pages because I feel it’s a bit more complicated than a modern tool like Vercel or Netlify. With those, it’s very easy to deploy a static site with a nice workflow. For example, you open a pull request with a documentation change and you get a preview in your pull request where you can inspect that everything looks fine. This is not possible with GitHub Pages, so this is not the solution I would recommend now. Maybe a few years ago, it was very nice because it was free and it’s easily integrated into GitHub, but it’s not the most performant integration. If you try to run it in an automated way in CI, you have to set up an SSH key while other tools don’t require that.

Also, it’s not the most performant hosting platform. I don’t know the exact performance of GitHub Pages, but maybe you could put Cloudflare on top of it to get better performance. Other platforms like Netlify and Vercel may be a better choice by default because it’s very easy. You fill in a form with a few fields, and you have an app online on a very good CDN and all the workflows in the pull request are there for you. What I don’t like about GitHub Pages is that you don’t have deploy previews in the pull request. Also, you don’t have any server-side features. Say you want to create a server-side redirect, for example, because you want to change the URL of a page and move it from one address to another. You can’t create a server-side redirect on GitHub Pages, so you have to put Cloudflare on top of it and write a redirect there at the proxy level. You can’t do that in GitHub Pages itself.
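On a host without server-side redirects, one common workaround (besides the proxy-level redirect mentioned above) is to publish a small static HTML page at the old URL that redirects in the browser. The sketch below generates such a page; `renderRedirectPage` and `escapeHtml` are hypothetical helper names, and this is not presented as what any particular tool emits.

```typescript
// Escapes characters that are unsafe inside HTML attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds a static page that sends the browser to `toUrl` immediately.
// The canonical link tells crawlers which URL is the real one, and the
// plain link is a fallback if the meta refresh is ignored.
function renderRedirectPage(toUrl: string): string {
  const target = escapeHtml(toUrl);
  return [
    "<!DOCTYPE html>",
    "<html>",
    "  <head>",
    '    <meta charset="utf-8">',
    `    <meta http-equiv="refresh" content="0; url=${target}">`,
    `    <link rel="canonical" href="${target}">`,
    "  </head>",
    "  <body>",
    `    <a href="${target}">This page has moved.</a>`,
    "  </body>",
    "</html>",
  ].join("\n");
}

// The output would be written to the old path, e.g. docs/old-page/index.html.
const redirectHtml = renderRedirectPage("/docs/new-page");
```

Unlike a server-side redirect, this still serves a page with a normal success status first, so it is a fallback rather than an equivalent.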

We use Vercel, and it is an amazing product. We’re a big fan of it. The branch previews and stuff and rolling back the different versions is so good. I feel like GitHub goes through a bit of a pattern of building a feature and releasing it, and forgetting about it and coming back to it.

I don’t think it has evolved a lot. It remains something that was useful for Jekyll projects. Even Docusaurus sites that deploy on it have to add a file like .nojekyll.

I think you have to put your domain name in a flat file or something. It’s weird. It feels a few years old now, whereas with Vercel or Netlify, you’re basically done right away, which is an amazing experience. Sebastien, thank you so much for your time, and thanks for your work on the project. Maybe we’ll convert ours over. I appreciate it. Congratulations on the feedback for version 2.

Thank you.

Have a good day. Take care.

You, too. Bye.

About Sebastien Lorber

As a contractor, I help companies be more productive with React and React Native.