Check out our open-source feature flagging system, Flagsmith, on GitHub! I'd appreciate your feedback ❤️
In this episode of The Craft of Open Source, I was able to catch up with Tom Moor, Founder of Outline. As I've met with more and more founders, I've found that lots of projects started as closed projects and were later switched over to open source. That's how things began with Outline, an open source knowledge base that helps companies organize materials internally. With Atlassian recently shutting down their self-hosted products, lots of projects in this space have seen growth and demand from teams looking for alternatives.
Another trend that we are noticing is teams choosing to run BSL licenses. While there is a lot of debate on this in the open source community, it has proved a good strategy for projects that are looking to ultimately monetize and become a company over time. For the Outline team, they are still running as a side project and keeping their options open.
My favorite quote of Tom's from the episode came when we were talking about the connection between teams using a product and those who are willing to pay for it:
"The reality is the crossover between those teams and the ones that would pay for your hosted product is relatively small a lot of the time. They have a good reason for it whether it's a security or they're in a country where your hosted service is prohibitively expensive. It’s honestly a lot. We have a lot of installations in China and South America. That's a big reason for the support that we get. The monetary conversion doesn't make sense sometimes."
Hope you enjoy!
Tell me a little bit about you, your background and the genesis of Outline.
The genesis of Outline is similar to a lot of software, I would say: scratching a personal itch, and an itch of a friend of mine who was working at Coinbase. It was originally written to be used internally at Coinbase. I was not working at Coinbase but I was also writing a knowledge base, one that was more focused on the external-facing knowledge base for support and help. We were having that problem on our team and we were bouncing from product to product with none of them fulfilling our needs. We got together and realized that we were building similar software. We thought, “We should join forces on this. Let’s do a better side project together.” That's what we did. We worked on that together until 2019. He ended up starting another company. I took it on full-time. Not too long before that point, we decided to make the thing open-source and it grew from there.
How long had you been working on it while it was closed?
I would say it was closed-source for under a year, if I had to guess. It wasn't as good then either, being early code. It was very messy code. Not that it's perfect now or anything, but it was not open-source from day one.
You were working on it together. Did you originally think that it would be a commercial business? Was it something that you wanted to use in your day jobs?
We both had hope that it was something that we could monetize. I should clarify, because I've said open-source several times already: the main license for Outline is the BSL license, which is open-source in spirit but not open-source technically. It's not an OSI license.
Talk about that a bit more. I've not heard of that license.
It came out of one of the databases that Amazon cloned and then sold as a product.
Not Elastic, was it?
I don't think it was Elastic. It says it in the license name because the license has the name of the database in it.
That's where it started. More projects have taken it on since then. Why I say it's open-source in spirit is because it is the same as an Apache License or something like that in terms of freedom of use except for one single provision which is you cannot run a competing, hosted service. That's the only way that the project sustains itself. You can modify it and run it yourself. You can do anything, just don't run a competing hosted service.
Could I run an on-premise support business if I wanted to?
As long as you're not running a single instance that many other people are using.
If it was on a private intranet somewhere that one organization was accessing.
That is the use case for even publishing the code to begin with. We want to encourage people to run it internally.
You'd got a product, you thought it had some legs and the code was good enough to see the public light of day. Did you put a lot of energy into launching it as an open-source project? Did you quietly put it on GitHub? How did that work?
Mainly quiet, honestly. One of the reasons that we publish the code, to begin with, is it’s a small and quiet go-to-market strategy. It lets us get traffic in a way that we're comfortable with and understand as developers. We haven't done any big launch or anything like that. Even still, it's embarrassing for me but one day maybe when it reaches version one.
In terms of the metrics that you can see on GitHub, they're strong, right?
We hit 10,000 stars. It’s the most popular project I've ever had and it keeps ticking up. It's a couple of hundred a week now.
Are you finding that there's a correlation between how many stars you've got and how many decent quality pull requests you get and things like that? How much of your time are you spending looking after those things rather than pushing code?
I hadn't thought about it but I do suppose the quality of pull requests has increased a little bit as the project has become more popular. With any project, you do find that there are a lot of drive-by pull requests. If you can categorize contributions well, you can almost immediately tell what type of contribution something is. There is the person that wants to up their stats. They'll find something that’s one line, do a one-liner and be a contributor, which is fine. That could be a way into open-source. A lot of the time, I have to suggest, “Maybe that one-line you wrote should be in this other line and then that’s long arms for myself.”
You have folks that will dump an entire feature without talking it through. They don't even talk to you beforehand as the main maintainer even though we try to make it very clear. Some of the issues are like, “Here's a thing we're thinking we should do but it needs thought and design. Please talk to the team.” At some point in the future, an entire feature will be dumped as a PR that’s 3,000 lines. The design is like the database schema dumped onto the page, which doesn't work.
What's your method for dealing with that? I've been noticing this is something that I've pursued and I’ve been learning rapidly but you have this strange feeling. You’ll see that someone put a lot of time and effort into something. It's not exactly how you have done it. What do you do in that situation?
I like to think that I'm firm but fair when it comes to open-source, there being some things we won't accept into the project. There's a high bar for the quality of design and implementation. We make it clear what these are. Issues are labeled well, there's a contributing guide, the tests are extensive and the formatting is automatic. We have these rules and expectations but we make it easy to see what they are and as easy as possible to reach them.
When people are far away from that in terms of what they contribute, I’ll note it. I will give them an opportunity to try and work through it but often they'll disappear and won't reply to anything. It ends up getting closed. Other people are more than happy to spend a little time going back and forth and getting it to the point where it can be merged. Those are great. We've accepted a lot of contributions that way too. It comes down to the person on the other end, how much time they have and their willingness to meet in the middle.
What percentage of new features are being contributed by the community now?
As the product is maturing more, it's less about raw features that get added. At the moment, we're working on 2 or 3 main things. We're working on adding collaborative editing, which is mainly myself, so that multiple authors can edit a document at the same time. We have a translations PR open with translations into German and Portuguese, adding the framework for translation like hooking up all of the strings and so forth. That's mainly driven by a community member. We have another large PR for changing the authentication libraries so that they're more pluggable and that's also driven by the community. We have a couple of smaller things in. Thirty percent of that stuff is driven by the community although I made it sound like 60%. Collaborative editing is a lot. It was about 10 to 20 PRs in one.
Are you leaning on some library to do all the conflict resolution for you? That's well-regarded as a hard problem.
I have several branches of code for this specific feature where I've tried three different solutions. I’m trying to find which one is going to work best. We're using a library called Yjs, which is a variation of CRDT conflict resolution. It does the mathematically hard part of it but there's a lot of plumbing, as it were, to get it reliable. With that type of feature, it's the last 10% of reliability that makes it or kills it. Once you have the algorithm there, how does the thing connect and stay connected on that side of the house?
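For readers unfamiliar with CRDTs, the property they provide can be shown with a much simpler structure than a text document. The following is a minimal sketch of a grow-only counter, a textbook CRDT. It is not Yjs's algorithm or API, and the class and method names are hypothetical. It only illustrates the idea that replicas can change independently, merge in any order, and still converge on the same state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: each replica only ever increments its own slot."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica may only raise its own entry, never anyone else's.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so merges can arrive in any order, or more than once.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas edit while disconnected, then exchange state in both directions.
laptop = GCounter("laptop")
phone = GCounter("phone")
laptop.increment(2)
phone.increment(3)
laptop.merge(phone)
phone.merge(laptop)
phone.merge(laptop)  # a duplicate delivery changes nothing
```

After the exchange both replicas report the same value. The "plumbing" described in the answer above (connecting, reconnecting, delivering updates) sits entirely outside this merge logic.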
Have you got any instrumentation in terms of knowing how many self-hosted installations there are or anything like that? One of the things that we're working on is some telemetry, sent when you start the server, that you can opt out of, to give us an idea about what footprint there is. I've been talking to Paul Dix of InfluxDB and he said when they introduced that, it caused a bit of a rumpus.
I wonder how late did they introduce that.
It wasn't straight away. Have you got anything like that?
We do have something like that. We have a beacon that you can turn off. That came at the recommendation of one of the founders of Sentry. He recommended that when we met with them. It was early doors. It might have even been before we open-sourced it or it was very soon after. There wasn't any community pushback or anything. Nobody ever mentioned it. It posts anonymous counts of users and documents.
It’s some aggregated metadata.
It’s a unique installation ID. If it changes IP address or something, we know it's the same installation so we're not double counting. That's the only reason. It's an installation and then vaguely how big that installation is. We've got some idea, not of who's using it, but of how many folks are using it. I’m glad we did that early as well.
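A beacon of the kind described here can be sketched in a few lines. This is an illustration under stated assumptions, not Outline's implementation: the function names, the `TELEMETRY_DISABLED` variable and the payload field names are all hypothetical. It shows the three ideas from the conversation: a random installation ID that persists across restarts so an installation is not double counted, aggregate counts only, and an opt-out.

```python
import json
import os
import uuid
from pathlib import Path
from typing import Mapping, Optional


def load_installation_id(id_file: Path) -> str:
    """Return a stable random ID, creating it on first run (hypothetical storage)."""
    if id_file.exists():
        return id_file.read_text().strip()
    installation_id = str(uuid.uuid4())  # random, carries no identifying data
    id_file.write_text(installation_id)
    return installation_id


def build_beacon(
    id_file: Path,
    user_count: int,
    document_count: int,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Build the JSON body a server might post on startup, or None if opted out."""
    env = os.environ if env is None else env
    # Hypothetical opt-out switch; the real variable name is not given in the interview.
    if env.get("TELEMETRY_DISABLED", "").lower() in ("1", "true", "yes"):
        return None
    payload = {
        "installation_id": load_installation_id(id_file),
        "users": user_count,          # aggregate counts only, no names or content
        "documents": document_count,
    }
    return json.dumps(payload, sort_keys=True)
```

Sending the payload is left out on purpose. The design point is that the ID stays the same if the server's IP address changes, which is what keeps the installation count honest.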
How are you managing the community now? Is it getting difficult to manage?
I wouldn't say difficult but it uses time. This is where I’m thinking of it as marketing and a go-to-market strategy. You would spend time and money on any of those things. It's another way to spend that time and money. It's a choice. We could be spending on Google Ads or spending time making advertisements but we're replying to issues and helping people with pull requests, which as engineers is a way you would prefer to spend your time anyway. Maintaining a technical community has been not too bad. The tools that have been provided have got better in the last few years or so.
I noticed you guys were using the discussions tab which I haven't seen much before.
It's still in beta so you still have to get in touch with a GitHub PM to get access to it.
Talk about that. I think I'd never seen it before. It's like the issues tracker but different.
Behind the scenes, it’s using the same table for issues, pull requests and discussions. It's very similar to issues except for a slightly different UI. It's categorized a little differently. You have the ability to mark something as an answer in a thread, Stack Overflow style. It's a cross between Stack Overflow and a more traditional forum. The beauty of that is it's integrated into the issue system so you can turn an issue into a discussion. If you go to file a new issue then we are able to give custom options there.
There's a screen of options and one of them is like, “Are you filing a bug? Do you want to go here? Are you asking a question? You should go to the Discussion Forum. Do you need support on this? You should go to Contact Us.” It gives them this fork in the road to choose from instead of everybody ending up filing issues. A lot of what makes it less difficult and more manageable is these small tools and systems that you put in place to try and make sure that you're not dealing with twenty issues to tackle every day and then having to figure out which are real issues. We still have that but it's much less.
I've noticed we're getting more people raising thoughts on how to integrate the platform into their development environment or their workflow or whatever. You feel bad closing them because they're never closed. It’s the listing.
That comes back to firm but fair. Before we had discussions, I would have a label that was a self-hosted question, immediately close it and be specific about, “Let's discuss this but it's not an open issue. It's not a problem that needs to be solved in code or something like that.” Often, it was a question on how to set up the software. Those cases we would close out but make it clear that we can continue the discussion on the issue even though it's closed.
How do you guys deal with questions about getting the self-hosted platform up and running? They can be complicated questions that can require a bit of your time to try to help on.
We generally end up in the discussions, which is good. I do get the impression that once you answer something, 50% of the people find it and you don't have to answer it again. Answering those things in public is a good start. There are certain things that we can't do in that forum. If you run the software as described and you follow the README, you're not going to have any issues. The issues arise when people try to do custom things and they want to run it differently. It’s down to them to figure that out anyway. One came in asking to turn off encryption. The answer is no. You can't turn off encryption. You could go and modify the software to turn off encryption but we don't have an option to do that. It doesn't seem like a terribly wise option to have either.
In terms of the statistics, other than stars and your beacon, are you using any other techniques to track your progress and growth? One thing that I've found is hard. I wish you could put a GA tag or something on your README page. The traffic data from GitHub is limited.
I don't tend to look at that data because it is limited. In some ways, I'm amazed they have anything. Occasionally, you'll see a referring site in there and I’m like, “That's interesting,” but they only show the top ten referrers. The top one is GitHub.com for me. The second one is Google. It’s not useful information.
Is that not something you spend much energy looking at?
I see the stars and the forks tick up. More pull requests and issues are filed. In terms of metrics, since we do offer a hosted SaaS version, that's where I focus my energy and I use that as a proxy for the open-source side. We have MRR, MAU and standard SaaS metrics.
Are there any closed components to the project at all?
Little. I don't know if this is necessarily the best way but this is a question I had when I started, so I'll tell you how we do it: how do you manage that code split for the bits that are private? The way that I do it is we have a private fork of the main project that has the billing page on it, a couple of hooks to check that the billing is up-to-date and some extra messages that tell people, “Your billing is out-of-date,” and then it shows a banner.
All in all, it's a couple of hundred lines and less than 1,000 lines of code. We could get rid of this with Stripe's hosted billing pages that they've come out with, which are well-made. We could reduce it down to under 100 lines of code, I would imagine, but that's the only thing that exists. We pull the public project into the private fork and 9 times out of 10 there's no conflict because there's little code in there. That's how we keep the hosted stuff up-to-date.
We came up with the same solution. We've got our billing code in the public repo but we use Chargebee, which is similar to the Stripe product. Occasionally, we get people asking us like, “What should I put for my Chargebee API key?” It's like, “You don't need to worry about that.”
I don't think there's anything wrong with publishing the billing code but it does make it slightly easier to break the license. That was one of the reasons that that bit is private. Other than that, there isn't a huge reason that it should be private. You could do it either way.
You started a couple of closed source companies before this project. Do you think that the simple act of pushing the code to GitHub is enough of a go-to-market activity? Do you think that in and of itself gives you a big advantage over having the platform completely closed?
I don't think in and of itself. You still need to do all the normal things that a company needs to do in terms of marketing. I'm a little embarrassed to say that we don't do that yet. It's a matter of time and size. Time will come. I think 2021 is the year of marketing Outline.
You've got a huge wave without that. It’s what I'm getting at.
It's certainly enough to get started and enough to prove your concept and that people are interested. One thing that I found interesting is once we hit milestones in stars, the first one was 1,000, we started having people, in particular Venture Capitalists, reach out who were tracking which projects are up and coming on GitHub. We had a lot of cold outreach as the stars ticked up. That's interesting particularly if you're outside of the Silicon Valley bubble. Having a semi-popular project on GitHub could be a good way to get the attention of these VCs if that's a route that you're interested in.
We've had several projects in terms of visibility. It’s much smaller than yours but I've been surprised at how many people have contacted us. The market for development is strong at the moment. There are certainly people buying some service that gives them a list of the top 100 growing commercial like open-source projects.
How do you get onto those pages? Is it by virtue of more activity the week before?
I'm not 100% clear. I wish I was because we could fiddle it. It's often an amplifier. Somebody might mention it somewhere else and that was enough traffic that you got more stars that week, then you get onto that page and then you get more stars the next few weeks. It works out as an amplifier. That's the most likely from my experience. I was going to say, they're not only tracking that stuff. They're also going much further and tracking who is following and which companies those people work for. A lot of the time, these firms will know a lot more about your project than you do. They'll come to you and be like, “Did you know that there are eight Microsoft employees who are following your project?” Not a real example.
One of the first things that happened after we open-sourced the code was we got forked by DFT. I started talking to them and asking if they wanted to sponsor it and stuff like that. At that time, we had 120 stars or something. We went through and tried to find everyone that had starred our project on LinkedIn to get an idea of who was looking at it. It was interesting. I started contacting a bunch of them to say, “Would you be interested in chatting or whatever?” A lot of people thought that I was a bot.
It’s a default expectation nowadays.
On LinkedIn anyway. I was like, “It's me.” Eighty percent of them were like, “Sorry, I starred it but we're not going to use it.” There is loads of interesting and valuable information buried in different bits of GitHub but you don't get any of it for free. You've got to do a lot of work to try and enrich that data and get value out of it. It started making me think that there was an opportunity for a project. Even tracking stars or whatever.
I think there's a project or a startup to get that data. A lot of these houses are doing it internally rather than buying something. It's a way that they can get an edge. They also do lots of other things like scraping Pinterest. Scraping all things all the time is one of them.
Are you doing any marketing to push either the commercial homepage or the GitHub repository?
We started playing with advertising but nothing serious yet. On Bing Ads, which is not something I ever thought I would say. The reason is that I wanted to get ads on DuckDuckGo. I like DuckDuckGo and I wanted to support it. I noticed that the keywords around our product, which are team knowledge base and team wiki or these types of things, were underserved on DuckDuckGo and the results were poor compared to Google. I was like, “If we can get a high hitting ad on that, it’s cheap,” because nobody else was advertising on those, so that would be a way to go. I noticed that the way you do that is through Microsoft Advertising Networks. I was playing around with that but not taking anything too seriously yet because mainly we don't have the attribution tracking set up all the way through the funnel. Even if we did advertise, I wouldn't know how well I’ve invested.
We're working on that. It's always a load more work than you imagined.
One interesting thing with open-source is it does push you to rely on third parties a lot less, I found. Third-party products, that is. At other places I've worked, you would be like, “We'll use Pusher for that. We'll put in this third-party script for that.” When the source is public, I'm a lot less likely to do that and make some marketing API a requirement of using the product, or make having a Pusher account a requirement of using the product. It seems to be against the grain.
Are you conscious of not wanting to raise that barrier of what you have to do before you can get the thing running locally?
This is one of the reasons there's a PR to make the authentication methods more pluggable. The authentication methods are all based off Slack and Google, so we do have a requirement of a third-party account for authentication for creating a team. Beyond that, there's nothing that you need. Even that one is a big barrier for people. It's one of the most requested things from the open-source community to update.
You need to generate some credential or something.
In my experience of building products, the team management aspect is vastly overlooked in terms of its difficulty. There's the difficulty of email-based accounts with forgotten passwords, bouncing emails and the hackability, and then you'll have two people from the same team sign up and create three different accounts that you then have to merge, plus SSO and two-factor authentication. You hook in with Google, or any of these authentication providers, and you get all of that. That's the reason we went that route from the beginning. It was several months of work that we could save and put into the core product. As time has gone on, it would be nice to make it more pluggable and allow people to use GitHub to sign in if they wish. That's something that hopefully will be closed soon. Hopefully, we'll do it. There's technical work to be knocked down that is in the way.
Where do you guys stand on having one-click deployments into Heroku or something? That's something that we've been struggling to get an answer to. How easy do you want to make it?
We have a Heroku one-click. The button isn't on the README for no particular reason but somebody did contribute the app.json to make that button work. You have to go to the Heroku website and then you can do it from there. I've been accused of making it difficult on purpose.
Do you mean from an infrastructure point of view?
I’ve dug into taking that back. Compared to a lot of services. If you look at what it takes to run Sentry, for example, there's a lot of documentation but what it takes to get that running is still a lot of work.
It's a complex product.
In comparison, with Outline we've gone to a lot of trouble to make it a simple product to run. You need a database and Redis, a Slack OAuth account or Google OAuth account and that's it. We publish a Docker image that you can run, so you download the Docker image. You fill out ten environment variables and you've got a working product. There isn’t a huge amount of difference between that and a Heroku button. It's the same thing. You’re skipping the downloading-the-Docker-image part. Whether it's difficult or easy to run, we're not trying or thinking about making it more difficult to run.
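The "fill out ten environment variables" step is where self-hosted setups most often go wrong, so a startup check that reports every missing setting at once is a common pattern. The sketch below is an assumption-laden illustration, not Outline's code: the variable names are hypothetical placeholders for the database, Redis and OAuth settings mentioned above, and the real names should be taken from the project's own documentation.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical names standing in for the database, Redis and OAuth settings.
REQUIRED_SETTINGS = (
    "DATABASE_URL",
    "REDIS_URL",
    "SECRET_KEY",
    "OAUTH_CLIENT_ID",
    "OAUTH_CLIENT_SECRET",
)


def missing_settings(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required settings that are unset or blank, in declared order."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


def check_environment(env: Optional[Mapping[str, str]] = None) -> None:
    """Fail fast with one message listing everything that still needs filling in."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Reporting all gaps in one error, rather than failing on the first, saves a self-hoster several restart cycles.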
I would say at the beginning, I would have felt the same as you if I'm truly honest with myself. It was a bit of a worry. You would think that if we make it too easy then we will get fewer paying customers. After working on this for years now, I don't feel that way anymore. I truly think that if we had tens of thousands of people running the software themselves, that's only going to pay dividends for everybody down the road and that's not going to happen without a lot of benefits.
It takes a little bit of courage and not being afraid of that. The way we're coming around to this idea is if you've got 10,000 installations, that's 100,000 people using the platform who are going to move jobs and say, “Let's use this.”
The reality is the crossover between those teams and the ones that would pay for your hosted product is relatively small a lot of the time. They have a good reason for it, whether it's security or they're in a country where your hosted service is prohibitively expensive. It’s honestly a lot. We have a lot of installations in China and South America. That's a big reason for the support that we get. The monetary conversion doesn't make sense sometimes.
We have loads of people in India. We don't have telemetry running yet but we can tell from the requests coming in around self-hosting and stuff. Other than the collaborative editing, what else are you working on? It sounds like you alluded to the fact that you're feature-complete but fully featured.
I wouldn't say feature-complete either. I like to think that we spend as much time on improving and bug fixing as on features, is what I was getting at. There's still lots to do. We have also added custom domains. To loop that back into our last conversation, one of the reasons to self-host is to have it on your own domain. The SaaS product offers custom domains now. It's one less reason to host it yourself if that was going to be your reason.
I hadn't thought of that.
We’re improving that side of the house. We're trying to get feature parity with some of the competing closed-source products. The main thing after real-time editing is commenting. I'm also excited to work on an approval workflow. I realize that sentence just came out of my mouth. It’s terribly boring but it seems to be a pattern in my career at the moment that building workflows for remote teams is something I've been doing for years. It feels like you're codifying cultural best practices into the software. There's a lot of good stuff we can do around making it easy to know if the information is approved and up-to-date, and helping people on that front.
Is there anyone you want to give a shout-out to who's been a big help or done some big pulls or anything like that?
The colleague that I mentioned at the beginning is a guy called Jori Lallo. He's the guy that I started it with and he now works on an issue tracking tool called Linear, which I don't know if you've come across but is unfortunately not open-source. I like to think that he hates Atlassian and he went from trying to kill Confluence to trying to kill JIRA. It's a pattern I've noticed there. A number of friends have been helping out with the project on their own time, with no expectations, and it's appreciated.
Tim Van Damme has been doing a lot of design work. He did most of the icons that you see. It’s a customized concept that is truly an open-source icon set that he did for it. Another friend called Nun built the entire groups feature, which is complicated. Speaking of non-fun features to build, group management is high on that list and he built the entirety of that, which is appreciated. I could go on. There are a lot of folks that have contributed at this point.
If people want to help out, what's the best thing that they could do?
The GitHub repo is outline/outline. We managed to bag the Outline organization name. We do have issues that are labeled Good First Issue. They tend to get taken fast. For the slightly bigger issues, drop a note in there like, “I'm interested in tackling this. Let's talk about what it would take.” We will have a bit of back and forth, I'll leave my thoughts and we'll go from there.
Tom, thanks for your time. That's interesting. I'm going to fire up Docker.
Give it a try. It should be easy, fingers crossed, as easy as Docker things get. Thank you for having me.
Thanks for your time. It's great to chat.
I appreciate it.
Outline is building software for the modern workplace, where collaboration is valued, ideas don’t sit in silos and speed and quality are paramount.
The value of Outline comes through sharing knowledge, information, and workflows with the rest of your team – we hope you’ll come to think of it as a sort of communal long-term memory.