Building a Rust Web Browser
I have done something very foolish. I've started building a new web browser. From scratch. Not a new wrapper around Chromium or WebKit or Gecko. No, an actual new browser. Why have I done such a thing?!
For a long time I have wanted to learn Rust. I tried it some years ago and liked it, but I could tell it wasn't ready for wide use, nor did I have the time to really learn it. Fast forward a few years and Rust is an excellent systems language with a huge ecosystem of libraries and friendly developers. It has also been the most loved programming language for the fourth year in a row, according to Stack Overflow. Clearly it's time to revisit the language.
When learning a new language I first read the docs. Rust has an excellent getting started tutorial called 'The Book'. I learned a lot and started to understand how things go together. However, you can't really get a feel for a language and become effective in it without writing something big. In the past I've used a raytracer as my standard test for fast / parallel languages. However, I'd already done that and, quite frankly, there are already a ton of Rust raytracers. So instead I chose something crazy. A web browser.
A New Web Browser?
A real web browser is out of the question, of course. Real browsers are massive undertakings requiring thousands of man-years of work. Over a thousand people work on Chrome alone. Making a browser isn't enough. You have to make it fast. And you have to make it work with all of the broken pages out there from the past 25 years. Plus you have to keep up with the new standards and privacy changes and an arms race against hacks and trackers. There is a reason Servo is the only new browser engine created in the past 15 years. All other browsers trace their origin to WebKit/KHTML or Gecko.
But I should not let myself be stopped by the impossible. I know the specs of HTML and CSS pretty well. If I compromise on features like "speed" and "working on broken pages" and "the full spec", then making something that can browse the web in a limited way might actually be feasible for a single individual.
Here's what I propose building: a simple desktop application that can render and navigate pages with the following restrictions:
- It only renders pages that are well-formed and spec compliant. It won't handle all of the broken pages out there.
- Supports standard block and inline layout. It won't support newer specs like flexbox, grid, multi-columns, etc.
- Render to the screen or a PNG.
- No threading or error handling. A broken page will crash. It won't load resources in the background. It just loads a page and draws it synchronously.
- It will be slow.
- No GUI (menu bars, bookmarks list, etc.)
Most of the restrictions above are common sense. It will work with well-formed, non-broken pages. It will do its best to render, but we won't demand pixel-level accuracy compared to professional browsers like Firefox. And of course without threading and optimization it will be slow. The lack of a GUI has less to do with the difficulty than with the fact that there isn't a standard cross-platform GUI toolkit for Rust. At the same time, Rust has a ton of 2D graphics APIs, and picking one feels premature. The GUI (and to some extent threading) is something I might revisit, but for now they are out of scope.
Where am I?
I started this project about two weeks ago, just working on it in my spare time. After about a week I got this far. On the left is Firefox. On the right is my mini-browser. Not bad.
The app above is written in pure Rust. I started with Matt Brubeck's excellent tutorial from six years ago "Let's build a browser engine". His tutorial only gets as far as laying out blocks vertically. I've added rendering text with an inline box layout (though it's still not working correctly).
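To give a feel for what "laying out blocks vertically" means, here is a minimal sketch of the idea. It is not the code from the tutorial or from my browser: the `Rect` and `BlockBox` names and the fixed `own_height` field are assumptions made up for illustration. Each box is placed at the current vertical cursor, and its height pushes the cursor down for the next sibling.

```rust
/// A rectangle in pixels. All names in this sketch are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

/// A block box: a leaf with some content height, or a container of child blocks.
pub struct BlockBox {
    /// Height of the box's own content (0 for a pure container).
    pub own_height: f32,
    pub children: Vec<BlockBox>,
    /// Final position and size, filled in by `layout`.
    pub rect: Rect,
}

impl BlockBox {
    pub fn leaf(height: f32) -> Self {
        BlockBox { own_height: height, children: Vec::new(), rect: Rect::default() }
    }

    pub fn container(children: Vec<BlockBox>) -> Self {
        BlockBox { own_height: 0.0, children, rect: Rect::default() }
    }

    /// Place this box at (x, y) with the given width and return its total height.
    /// Children take the full width and are stacked top to bottom.
    pub fn layout(&mut self, x: f32, y: f32, width: f32) -> f32 {
        // The cursor tracks where the next child should start.
        let mut cursor = y + self.own_height;
        for child in &mut self.children {
            cursor += child.layout(x, cursor, width);
        }
        self.rect = Rect { x, y, width, height: cursor - y };
        self.rect.height
    }
}
```

Laying out a container that holds a 20 px leaf and a 30 px leaf at the origin gives a total height of 50, with the second leaf starting at y = 20. Real block layout also has to deal with margins, padding, borders, and widths computed from CSS, all of which this sketch ignores.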
The original tutorial wrote the HTML and CSS parsers using a traditional tokenizer approach. For mine I used a parser library called pom which uses PEGs (parsing expression grammars). I like PEGs because they express your intent in a way that is closer to the underlying grammar without having to write in a new syntax like LEX and YACC. PEGs also let you build up your parser incrementally with lots of small unit tests. Check out my CSS parser. It's at least 50% tests.
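To show what I mean by code that stays close to the grammar, here is a small hand-rolled sketch in the PEG style. It deliberately does not use pom's API, and it is not my actual CSS parser: the grammar, the `Value` and `Declaration` types, and all function names are assumptions invented for this example. Each function corresponds to one grammar rule, written in the comment above it, and each rule can be tested on its own.

```rust
/// Result of one parse step: the parsed value plus the remaining input,
/// or None if the rule does not match. (Hypothetical helper type.)
type ParseResult<'a, T> = Option<(T, &'a str)>;

#[derive(Debug, PartialEq)]
enum Value {
    Keyword(String),
    HexColor(String),
}

#[derive(Debug, PartialEq)]
struct Declaration {
    name: String,
    value: Value,
}

/// identifier <- [a-zA-Z-]+
fn identifier(input: &str) -> ParseResult<'_, String> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphabetic() || c == '-'))
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    Some((input[..end].to_string(), &input[end..]))
}

/// literal <- the exact string `expected`
fn literal<'a>(input: &'a str, expected: &str) -> ParseResult<'a, ()> {
    input.strip_prefix(expected).map(|rest| ((), rest))
}

/// space <- whitespace*   (always succeeds, so it just returns the rest)
fn space(input: &str) -> &str {
    input.trim_start()
}

/// hex_color <- '#' followed by exactly six hex digits
fn hex_color(input: &str) -> ParseResult<'_, Value> {
    let rest = input.strip_prefix('#')?;
    let digits = rest.get(..6)?;
    if !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some((Value::HexColor(digits.to_string()), &rest[6..]))
}

/// value <- hex_color / identifier   (ordered choice: the first match wins)
fn value(input: &str) -> ParseResult<'_, Value> {
    hex_color(input)
        .or_else(|| identifier(input).map(|(name, rest)| (Value::Keyword(name), rest)))
}

/// declaration <- identifier space ':' space value space ';'
fn declaration(input: &str) -> ParseResult<'_, Declaration> {
    let (name, rest) = identifier(input)?;
    let ((), rest) = literal(space(rest), ":")?;
    let (val, rest) = value(space(rest))?;
    let ((), rest) = literal(space(rest), ";")?;
    Some((Declaration { name, value: val }, rest))
}
```

Because every rule is just a function from input to an optional result, a unit test for `hex_color` or `identifier` is a one-liner, which is how a parser can end up being half tests. A library like pom gives you the same building blocks as reusable combinators instead of hand-written functions.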
For opening a window I'm using minifb which gives me a minimal abstraction for windows and input. For parsing images I'm using the standard Rust image crate. For fonts I'm using font-kit, which can fetch fonts from the OS. The glyph rasterization is not very good, however, so I might change to rusttype in the future. Rusttype is a pure Rust alternative to libraries like FreeType. For drawing I'm using raqote, though it's more powerful than I need. It can do full paths but I really only need text and rectangles for 99% of my browser.
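To illustrate how little drawing machinery rectangles actually require, here is a rough sketch of filling a clipped rectangle into a flat pixel buffer. This is not raqote's API and not code from my browser. The `Framebuffer` type, the `rgb` helper, and the one-`u32`-per-pixel 0RGB layout are assumptions for this example (as far as I know, that layout is the kind of buffer minifb displays).

```rust
/// A flat, row-major framebuffer with one u32 per pixel in 0RGB order.
/// (Hypothetical type for this sketch.)
pub struct Framebuffer {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u32>,
}

/// Pack 8-bit red, green, and blue channels into a single 0RGB pixel.
pub fn rgb(r: u8, g: u8, b: u8) -> u32 {
    ((r as u32) << 16) | ((g as u32) << 8) | (b as u32)
}

impl Framebuffer {
    pub fn new(width: usize, height: usize, background: u32) -> Self {
        Framebuffer { width, height, pixels: vec![background; width * height] }
    }

    /// Fill the rectangle at (x, y) of size (w, h) with `color`.
    /// Anything outside the buffer is clipped away instead of panicking.
    pub fn fill_rect(&mut self, x: i32, y: i32, w: i32, h: i32, color: u32) {
        // Clamp a signed coordinate into the range 0..=max.
        let clamp = |v: i32, max: usize| v.clamp(0, max as i32) as usize;
        let (x0, x1) = (clamp(x, self.width), clamp(x.saturating_add(w), self.width));
        let (y0, y1) = (clamp(y, self.height), clamp(y.saturating_add(h), self.height));
        if x0 >= x1 {
            return; // fully clipped horizontally, or zero/negative width
        }
        for row in y0..y1 {
            let start = row * self.width;
            self.pixels[start + x0..start + x1].fill(color);
        }
    }
}
```

A box background or a border edge is then a single `fill_rect` call. Text is the hard part, which is why the choice of font and rasterization library matters much more than the choice of rectangle code.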
For network access I'm using the reqwest crate, which can do HTTPS requests. While I could have done HTTP parsing from scratch (it's not a complex protocol), I didn't want to try to implement SSL by hand. Reqwest is a popular crate used in lots of big projects, so it should be fast and stable.
Where does it end?
I've gotten farther than I thought in a short time. That is a testament to Rust. I absolutely could not have done what I've done so far with C++. I would still be debugging core dumps. The Rust compiler ensures that my code is memory safe, which rules out whole classes of crashes. That said, I still fight a lot with the borrow checker. I suspect the problem isn't the language but that I'm using it wrong. After too many years of using garbage collected languages I don't pay enough attention to where and when I allocate resources.
Here's what I've got so far (again, Firefox on the left).
I've got some more work to do.