New Firefox and Toolkit module peers

Please join me in welcoming another set of brave souls willing to help shepherd new code into Firefox and Toolkit:

  • Luke Chang
  • Ricky Chien
  • Luca Greco
  • Kate Hudson
  • Tomislav Jovanovic
  • Ray Lin
  • Fischer Liu

While going through this round of peer updates I’ve realised that it isn’t terribly clear how people become peers. I intend to rectify that in a coming blog post.

New Firefox and Toolkit module peers in Taipei!

Please join me in welcoming three new peers to the Firefox and Toolkit modules. All of them are based in Taipei and I believe they are our first peers there, which is very exciting as it means we now have more global coverage.

  • Tim Guan-tin Chien
  • KM Lee Rex
  • Fred Lin

I’ve blogged before about the things I expect from peers, and while I try to keep the lists up to date myself, please feel free to point out folks you think may have been passed over.

Thoughts on the module system

For a long long time Mozilla has governed its code (and a few other things) as a series of modules. Each module covers an area of code in the source and has an owner and a list of peers, folks that are knowledgeable about that module. The full list of modules is public. In the early days the module system was everything. Module owners had almost complete freedom to evolve their module as they saw fit, including choosing what features to implement and what bugs to fix. The folks who served as owners and peers came from diverse areas too. They weren’t just Mozilla employees; many were outside contributors.

Over time things have changed somewhat. Most of the decisions about what features to implement and many of the decisions about the priority of bugs to be fixed are now decided by the product management and user experience teams within Mozilla. Almost all of the module owners and most of the peers are now Mozilla employees. It might be time to think about whether the module system still works for Mozilla and if we should make any changes.

In my view the current module system provides two things worth talking about: a list of folks that are suitable reviewers for code, and an escalation path for when disagreements arise over how something should be implemented. Currently both are handled on a per-module basis: the module owner is the escalation path, and the module peers are the reviewers for the module’s code.

The escalation path is probably something that should remain as-is. We’re always going to need experts to defer decisions to, and those experts often need to be domain-specific as they will need to understand the architecture of the code in question. Maintaining a set of modules with owners makes sense. But what about the peers for reviewing code?

A few years ago both the Toolkit and Firefox modules were split into sub-modules with peers for each sub-module. We were saying that we trusted folks to review some code but didn’t trust them to review other code. This frequently became a problem when code changes touched more than one sub-module: a developer would have to get review from multiple peers. That made reviews go slower than we liked, so we dropped the sub-modules. Instead every peer was trusted to review any code in the Firefox or Toolkit module. The one stipulation we gave the peers was that they should turn away reviews if they didn’t feel they knew the code well enough to review it themselves. This change was a success. Of course for complicated work certain reviewers will always be more knowledgeable about a given area of code, but for simpler fixes the number of available reviewers is now much larger than it was before.

I believe that we should make the same change for the entire code-base. Instead of having per-module peers, simply designate every existing peer as a “Mozilla code reviewer” able to review code anywhere in the tree, so long as they feel they understand the change well enough. They should escalate architectural concerns to the module’s owner.

This does bring up the question of how patch authors find reviewers for their code if there is just one massive list of reviewers. That is where Bugzilla’s suggested reviewers feature comes in. We have per-component lists of reviewers in Bugzilla who are the appropriate choices for that component. Patches that cover more than one component can choose whoever they like to do the review.


More new Firefox/Toolkit peers

A few things are happening which mean there are a bunch of new Firefox/Toolkit peers to announce.

First, since I’m now owner of both Firefox and Toolkit, I’ve decided it doesn’t make much sense to have separate lists of peers. Either I trust people to review or I don’t, so I’m merging the lists of peers for these modules. Practically, since Toolkit already included all of the Firefox peers, this just means that the following folks are now Firefox peers in addition to already being Toolkit peers:

  • Nathan Froyd
  • Axel Hecht
  • Mark Mentovai
  • Ted Mielczarek
  • Brian Nicholson
  • Neil Rashbrook
  • Gregory Szorc
  • David Teller

Second, we’re updating the list of suggested reviewers in Bugzilla for Toolkit. In doing so I found that we have a number of reviewers who weren’t already listed as peers, so these folks are now considered full Firefox/Toolkit peers:

  • Kit Cambridge
  • Tantek Çelik
  • Mike Hommey
  • Matt Howell
  • Mike Kaply
  • François Marier
  • Nicholas Nethercote
  • Gian-Carlo Pascutto
  • Olli Pettay
  • J Ryan Stinnett
  • Andrew Sutherland
  • Gabriele Svelto
  • Jan Varga
  • Jonathan Watt

All of these folks have been doing reviews for some time now so this is largely just book-keeping but welcome to the fold anyway!

Welcome to a new Firefox/Toolkit peer, Shane Caraveo

It’s that time again when I get to announce a new Firefox/Toolkit peer. Shane has been involved with Mozilla for longer than I can remember, and recently he has been doing fine work on WebExtensions, including the new sidebar API. As usual we probably should have made Shane a peer sooner, so this is a case of better late than never.

I took a moment to tell Shane what I expect of all my peers:

  • Respond to review requests promptly. If you can’t then work with the patch author or me to find an alternate reviewer.
  • Only review things you’re comfortable with. Firefox and Toolkit together are a massive chunk of code and I don’t expect any of the peers to know it all. If a patch is outside your area then, again, work with the author or me to find an alternate reviewer.

Please congratulate Shane in the usual way, by sending him lots of patches to review!

hgchanges is down, probably for good

My little tool to help folks track when changes are made to files or directories in Mozilla’s mercurial repositories has gone down again. This time an influx of some 8000 changesets from the Servo project is causing the script that does the updating to fail, so I’ve turned off updating. I no longer have any time to work on this tool so I’ve also taken it offline and don’t really have any intention to bring it back up again. Sorry to the few people that this inconveniences. Please go lobby the engineering productivity folks if you still need a tool like this.

On Firefox module ownership

It has been over eleven years since I first wrote a patch for Firefox. It was reviewed by the then-Firefox module owner, Mike Connor. If you had told me then that at some point in the future I was going to be the module owner I probably would have laughed at you. I didn’t know at the time how much Mozilla would shape my life. Yet yesterday Dave Camp handed over the reins to me and here we are.

When Dave proposed me as the new module owner he talked about how he saw Firefox as a code module, responsible for the code that ships with the app rather than the decisions about what the app needs to do. Those decisions are delegated to other teams with a better grasp of the situation, like Product and UX. I agree with this wholeheartedly. It isn’t my role as an engineer to make those kinds of decisions. The Firefox module owner is focused on the implementation of the code in the browser and pretty much nothing else.

But in the Firefox module even the implementation decisions have always been heavily delegated to the peers of the module. I don’t intend to change that. I’m here to help guide those working on the code when they need a broader view, not to put my foot down and insist on how things should happen. I’m here to direct you to the peer most able to help you when you need a reviewer or run into problems. On some occasions I will also be here to listen and advise when you think the peers are wrong. In fact I see this role as less about being a module owner and more about being a module steward, staying out of the way mostly but applying a gentle hand to the tiller when needed.

Those of you keeping an eye on things will note that I’m also the Toolkit module owner. That hasn’t changed and while I have always approached that module in much the same way as I plan to approach Firefox there are a few differences. I’ll talk about those another day.

A new owner for the add-ons manager

I’ve been acting as the owner of the add-ons manager for the past little while, and while I have always cared a lot about the add-ons space, it is time to formally pass the torch. So I was pleased that Rob Helmer was willing to take it over from me.

Rob has been doing some exceptional work on making system add-ons (used as part of the Go Faster project) more robust and easier for Mozilla to use. He’s also been thinking a lot about improvements we can make to the add-ons manager code to make it more approachable.

As my last act, I’m updating the suggested reviewers in Bugzilla to be him, Andrew Swan (who in his own right has been doing exceptional work on the add-ons manager) and me as a last resort. Please congratulate them and direct any questions you may have about the add-ons manager towards Rob.

Improving the performance of the add-ons manager with asynchronous file I/O

The add-ons manager has a dirty secret. It uses an awful lot of synchronous file I/O. This is the kind of I/O that blocks the main thread and can cause Firefox to be janky. I’m told that that is a technical term. Asynchronous file I/O is much nicer: it means you can let the rest of the app continue to function while you wait for the I/O operation to complete. I rewrote much of the current code from scratch for Firefox 4.0, and even back then we were trying to switch to asynchronous file I/O wherever possible. But still I used mostly synchronous file I/O.

Here is the problem. For many moons we have allowed other applications to install add-ons into Firefox by dropping them into the filesystem or registry somewhere. We also have to do things like updating and installing non-restartless add-ons during startup when their files aren’t in use. And we have to know the full set of non-restartless add-ons that we are going to activate quite early in startup, so the startup function for the add-ons manager has to do all those installs and a scan of the extension folders before returning to the code starting up the browser, and that means being synchronous.

The other problem is that the things which could conceivably use async I/O, like installs and updates of restartless add-ons during runtime, share the same code for loading and parsing manifests, extracting zip files and so on that has to be synchronous during startup. So we could either write a second, asynchronous version and get nice performance at runtime, or stick with the synchronous version so we only have one version to test and maintain. Keeping things synchronous was where we fell in the end.

That’s always bugged me though. Runtime is the most important time to use asynchronous I/O. We shouldn’t be janking the browser when installing a large add-on, particularly on mobile, and so we have taken some steps since Firefox 4 to make parts of the code asynchronous. But there is still a bunch of synchronous I/O left.

The plan

The thing is that there isn’t actually a reason we can’t use the asynchronous I/O functions. All we have to do is make sure that startup is blocked until everything we need is complete. It’s pretty easy to spin an event loop from JavaScript to wait for asynchronous operations to complete, so why not do that in the startup function and then start making things asynchronous?
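
To make that concrete, here is a minimal sketch of the event-loop-spinning pattern in privileged Mozilla JavaScript, assuming Services.jsm is available; doAsyncStartupWork() is a hypothetical stand-in for the real asynchronous startup work:

    Components.utils.import("resource://gre/modules/Services.jsm");

    function startup() {
      let done = false;
      // doAsyncStartupWork() is hypothetical; it returns a promise.
      doAsyncStartupWork().then(() => { done = true; },
                                e => { done = true; Components.utils.reportError(e); });

      // Block startup by processing main-thread events until the
      // asynchronous work has settled.
      let thread = Services.tm.currentThread;
      while (!done) {
        thread.processNextEvent(true);
      }
    }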

Performance is pretty important for the add-ons manager startup code: the longer we spend in startup the more it hurts us. Would this switch slow things down? I assumed that there would be some losses due to other things happening during an event loop tick that otherwise wouldn’t have run, but that the file I/O operations themselves should take around the same time. And here is the clever bit. Because it is asynchronous I could fire off operations to run in parallel. Why check the modification time of every file in a directory one file at a time when you can just request the times for every file and wait until they all complete?
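
As a sketch of that parallel idea using OS.File (assuming an array of file paths is already in hand; this is not the actual prototype code):

    Components.utils.import("resource://gre/modules/osfile.jsm");
    Components.utils.import("resource://gre/modules/Task.jsm");

    // Ask for the modification time of every file up front and wait for
    // all of the answers together, instead of statting one file at a time.
    function getModificationTimes(paths) {
      return Task.spawn(function*() {
        let infos = yield Promise.all(paths.map(path => OS.File.stat(path)));
        return infos.map(info => info.lastModificationDate);
      });
    }

As it turned out (see below), OS.File serialises these requests on a single worker thread, so the parallelism here is more apparent than real.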

There are really a tonne of things that could affect whether this would be faster or slower, and no amount of theorising was convincing me either way. Last night this had finally been bugging me for long enough that I grabbed a bottle of wine, fired up the music and threw together a prototype.

The results

It took me a few hours to switch most of the main methods to use Task.jsm, switch much of the likely hot code to use OS.File and run in parallel where possible, and generally cover all the main parts that run on every startup and when add-ons have changed.
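
To give a flavour of the conversion, here is a simplified, hypothetical before-and-after; it is not the real add-ons manager code:

    // Before: synchronous I/O on the main thread, using an nsIFile.
    function getLastModified(file) {
      return file.exists() ? file.lastModifiedTime : 0;
    }

    // After: Task.jsm plus OS.File, taking a path string. The I/O runs
    // on a background worker while the generator yields until it's done.
    function getLastModifiedAsync(path) {
      return Task.spawn(function*() {
        try {
          let info = yield OS.File.stat(path);
          return info.lastModificationDate.getTime();
        } catch (e) {
          return 0; // the file doesn't exist or isn't readable
        }
      });
    }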

The challenge was testing. Default talos runs don’t include any add-ons (or maybe one or two) and I needed a few different profiles to see how things behaved in different situations. It was possible that startups with no add-ons would be affected quite differently to startups with many add-ons. So I had to figure out how to add extensions to the default talos profiles for my try runs, and fired off runs for the cases with no add-ons, with 200 unpacked add-ons containing a bunch of files, and with 200 packed add-ons. I then ran all of those a second time, deleting extensions.json between each run to force the database to be loaded and rebuilt. That made six different talos runs for the code without my changes and another six with my changes; I triggered ten runs per test and went to bed.

The first thing I did this morning was check the performance results. The first result was with 200 packed add-ons in the profile, which should be a good check of the file scanning. How did it do? Amazing! Incredible! A greater than 50% performance improvement across the board! That’s astonishing! No really, that’s pretty astonishing. It would have to mean the add-ons manager takes up at least 50% of the browser startup time, and I’m pretty sure it doesn’t. Oh right, I was accidentally comparing the baseline against the run of my async code with 200 packed add-ons and a database reset. Well, I’d expect that to be slower.

Ok, let’s get it right. How did it really do? Abysmally! Like incredibly badly. Across the board in every test run startup is significantly slower with the asynchronous I/O than without. With no add-ons in the profile the new code incurs a 20% performance hit. In the case with 200 unpacked add-ons? An almost 1000% hit!

What happened?

Ok so that wasn’t the best result, but at least it will stop bugging me now. I figure there are two things going on here. The first is that OS.File might look like it lets you fire off I/O operations in parallel, but in fact it doesn’t. Every call you make goes into a queue and the background worker thread doesn’t start on one operation until the previous one has completed. So while the I/O operations themselves might take about the same time, you have the added overhead of passing messages back and forth to the background thread. I probably should have checked that before I started! Oh, and promises. Task.jsm and OS.File make heavy use of promises and I have to say I’m sold on using them for async code. But. Every time you wait for a promise you have to wait at least one tick of the event loop longer than you would with a simple callback. That’s great if you want responsive UI, but during startup every event loop tick costs time, since other code you don’t care about might be running.
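
That extra tick is easy to see in isolation: a plain callback can run synchronously, but even an already-resolved promise defers its handler to a later turn of the event loop. A minimal illustration:

    function withCallback(callback) {
      callback(42); // runs immediately, on this tick
    }

    withCallback(value => console.log("callback: " + value));
    Promise.resolve(42).then(value => console.log("promise: " + value));
    console.log("end of tick");
    // Logs "callback: 42", then "end of tick", and only then "promise: 42".

During startup those deferrals add up across every promise the code waits on.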

I still wonder whether giving OS.File more threads would speed things up, but that’s beyond where I want to take things for now, so I guess this is where this fun experiment ends. Although now that I have a bunch of code converted, I wonder if I can create some replacements for OS.File and Task.jsm that behave synchronously during startup and asynchronously at runtime, so we get the best of both worlds … where did that bottle of wine go?

Linting for Mozilla JavaScript code

One of the projects I’ve been really excited about recently is getting ESLint working for a lot of our JavaScript code. If you haven’t come across ESLint or linters in general before, they are automated tools that scan your code and warn you about syntax errors. They can usually also be set up with a set of rules to enforce code styles and warn about potential bad practices. The devtools and Hello folks have been using ESLint for a while already, and Gijs asked why we weren’t doing this more generally. This struck a chord with me and a few others, and so we’ve been spending some time over the past few weeks getting our in-tree support for ESLint to work more generally and fixing issues with browser and toolkit JavaScript code in particular to make them lintable.

One of the hardest challenges with this is that we have a lot of non-standard JavaScript in our tree. This includes things like preprocessing as well as JS features that either have since been standardized with a different syntax (generator functions, for example) or have been dropped from the standardization path (array comprehensions). This is bad for developers, as editors can’t make sense of our code and provide good syntax highlighting and refactoring support when it doesn’t match standard JavaScript. There are also plans to remove the non-standard JS features, so we have to remove that stuff anyway.

So a lot of the work done so far has been removing all this non-standard stuff so that ESLint can pass with only a very small set of style rules defined. Soon we’ll start increasing the rules we check in browser and toolkit.
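
ESLint configurations simply map rule names to severity levels, so “a very small set of style rules” might look something like this sketch of a .eslintrc (the rule choices here are illustrative, not the actual in-tree configuration; ESLint’s JSON config files do allow comments):

    {
      "rules": {
        "no-undef": 2,       // error: using a variable that was never defined
        "no-unused-vars": 1, // warning: variables that are defined but never used
        "semi": 2            // error: missing semicolons
      }
    }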

How do I lint?

From the command line this is simple. Make sure to run ./mach eslint --setup to install ESLint and some related packages, then just run ./mach eslint <directory or file> to lint a specific area. You can also lint the entire tree. For now you may need to periodically run the setup again as we add new dependencies; at some point we may make mach automatically detect when you need to re-run it.

You can also add ESLint support to many code editors and as of today you can add ESLint support into hg!

Why lint?

Aside from just ensuring we have standard JavaScript in the tree, linting offers a whole bunch of benefits.

  • Linting spots JavaScript syntax errors before you have to run your code. You can configure editors to run ESLint to tell you when something is broken so you can fix it even before you save.
  • Linting can be used to enforce the sorts of style rules that keep our code consistent. Imagine no more nit comments in code review forcing you to update your patch. You can fix all those before submitting and reviewers don’t have to waste time pointing them out.
  • Linting can catch real bugs. When we turned on one of the basic rules we found a problem in shipping code.
  • With only standard JS code to deal with we open up the possibility of using advanced tooling like AST transforms for refactoring (e.g. recast). This could be very useful for switching from Cu.import to ES6 modules.
  • ESLint in particular allows us to write custom rules for doing clever things like handling head.js imports for tests (see the sketch after this list).
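
To illustrate that last point, a custom ESLint rule is just a module that returns visitors for AST node types. Here is a minimal hypothetical rule that flags calls to dump(); the real Mozilla rules, such as the head.js import handling, are more involved:

    // Hypothetical rule: warn about calls to dump().
    module.exports = function(context) {
      return {
        "CallExpression": function(node) {
          if (node.callee.type === "Identifier" &&
              node.callee.name === "dump") {
            context.report(node, "Unexpected dump() call; use a logger instead.");
          }
        }
      };
    };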

Where are we?

There’s a full list of everything that has been done so far in the dependencies of the ESLint metabug, but here are some highlights:

  • Removed #include preprocessing from browser.js, moving all included scripts to global-scripts.inc
  • Added an ESLint plugin to allow linting the JS parts of XBL bindings
  • Fixed basic JS parsing issues in lots of b2g, browser and toolkit code
  • Created a hg extension that will warn you when committing code that fails ESLint
  • Turned on some basic linting rules
  • MozReview is close to being able to lint your code and give review comments where things fail
  • Work is almost complete on a linting test that will turn orange on the tree when code fails to lint

What’s next?

I’m grateful to all those who have helped get things moving here, but there is still more work to do. If you’re interested there are really two ways you can help. We need to lint more files and we need to turn on more lint rules.

The .eslintignore file shows which files are currently ignored by the lint checks. Removing files and directories from that list involves fixing JavaScript standards issues generally, but every extra file we lint is a win for all the above reasons, so it is valuable work. It’s also mostly straightforward once you get the hang of it; there are just a lot of files.

We also need to turn on more rules. We’ve got a rough list of the rules we want to turn on in browser and toolkit, but as you might guess they aren’t on yet because the code currently fails them. Fixing up our JS to work with them is simple work but much appreciated. In some cases ESLint can even do the work for you (many rules support automatic fixing via ESLint’s --fix option)!