An update for hgchanges

Nearly a year ago I showed off the first version of my webapp for displaying recent changes in mercurial repositories. I’ve heard directly from a number of people that use it but it’s had a few problems that I’ve spent some of my spare time trying to solve. I’m now pleased to say that a new version is up and running. I’m not sure it’s accurate to call this a rewrite since it was entirely incremental but when I look back over of the code changes there really isn’t much that wasn’t touched in some way or other. So what’s new? For you, a person using the site absolutely nothing! So what on earth have I been doing rewriting the code?

Hopefully I’ve been making the thing a whole lot more stable. Eagle eyed users may have noticed that periodically the site would stop updating, sometimes for days at a time. The site is split into two pieces. One is a django website to display the changes. The other is the important bit that reads the change data out of mercurial and puts it into a database that the website can use. I do this caching because it would be too expensive to work it out on the fly. The problem is that the process of reading the data out of mercurial meant keeping a local version of every repository on the server (some 5GB for the four currently tracked) and reading the changes using mercurial’s python API was slow and used a lot of memory. The shared hosting environment that I run this out of kept killing the process when it used too many resources. Normally it could recover but occasionally it would need a manual kick. Worse there is some mercurial bug where sometimes pulling a repository will end up with an inconsistent database. It’s rare so many people don’t see it but when you’re pulling repositories every ten minutes it starts happing every couple of weeks, again requiring manual work to fix the problem.

I did dabble with getting this into shape so I could run it on a mozilla server. PAAS seemed like the obvious option but after a lot of work it turned out that the database size restrictions there made this impossible. Since then I’ve seem so many horror stories about PAAS falling over that I think I’m glad I never got there. I also tried getting a dedicated server from Mozilla to run this on but they were quite rightly wary when I mentioned that the process used a bunch of memory and wanted harder figures before committing, unfortunately I never found out a way to give those figures.

The final solution has been ripping out the mercurial API dependency. Now rather than needing local mercurial repositories the import process instead talks to them over the web. The default mercurial web APIs aren’t great but luckily all of the repositories I care about have pushlog installed which has a good JSON API for retrieving recently pushed changesets. So now the process is to pull the recent pushlog, get the patches for every changeset and parse them to work out which files have changed. By not having to load the mercurial structures there is far less memory used and since the process now has to delay as it downloads each patch file it keeps the CPU usage low.

I’ve also made the database itself a lot more efficient. Previously I recorded every changeset for every repository, but many repositories have the same changeset in them. So by noting that the database gets a little more complicated but uses much less space and since you only have to parse the changeset once the import process becomes faster. It took me a while to get the website performing as fast as before with these changes but I think it’s there now (barring some strange slowness in Django that I can’t identify). I’ve also turned on some quite aggressive caching both on the client and server side, if someone has visited the same page as you’re trying to access recently then it should load super fast.

So, the new version is up and running at the same location as the old. Chances are you’re using it already so if you’re noticing any unexpected slowness or something not looking right please let me know.

Developer Tools meet-up in Portland

Two weeks ago the developer tools teams and a few others met in the Portland office for a very successful week of discussions and hacking. The first day was about setting the stage for the week and working out what everyone was going to work on. Dave Camp kicked us off with a review of the last six months in developer tools and talked about what is going to be important for us to focus on in 2014. We then had a little more in-depth information from each of the teams. After lunch a set of lightning talks went over some projects and ideas that people had been working on recently.

After that everyone got started prototyping new ideas, hacking on features and fixing bugs. The amount of work that happens at these meet-ups is always mind-blowing and this week was no exception, even one of our contributors got in on the action. Here is a list of the things that the team demoed on Friday:

This only covers the work demoed on Friday, a whole lot more went on during the week as a big reason for doing these meet-ups is so that groups can split off to have important discussions. We had Darrin Henein on hand to help out with UX designs for some of the tools and Kyle Huey joined us for a couple of days to help work out the final kinks in the plan for debugging workers. Lot’s of work went on to iron out some of the kinks in the new add-on SDK widgets for Australis, there were discussions about memory and performance tools as well as some talk about how to simplify child processes for Firefox OS and electrolysis.

Of course there was also ample time in the evenings for the teams to socialise. One of the downsides of being a globally distributed team is that getting to know one another and building close working relationships can be difficult over electronic forms of communication so we find that it’s very important to all come together in one place to meet face to face. We’re all looking forward to doing it again in about six months time.