Announcing LocalNS

An auto-configuring nameserver for the services you run on your local network.

I mess around with a bunch of different projects in my spare time but it’s been a long time since I’ve thought one was worth tidying up into an actual release. Maybe it will be useful to you?

The problem I faced was that I had a whole bunch of services installed on my network, some in Docker, some as standalone servers, some behind a Traefik proxy. In the Docker case I was using macvlan networking so each service had its own IP address accessible to the entire network. Once I learned how to do all that it became trivial to spin up a new local service, say InfluxDB or Grafana, with a couple of lines in a docker-compose file.

The challenge was remembering the IP address for each service. That is of course the job of a DNS server, and while I played with various standard DNS server options, they always involved more manual work than I wanted. I also wanted features they weren’t really designed for, like being able to override publicly published DNS names with internal IP addresses when on the local network.

Traefik’s method of auto-discovering services really inspired me. It connects to the Docker daemon, listens for containers starting and stopping, and routes requests to them based on labels on the containers. So I built something similar that I’m calling LocalNS. It connects to Docker and, for containers with a specific label, detects the IP address and answers DNS queries for that name.

So I can trivially bring up a container with something like:

~$ docker run -d \
  --network internal \
  --label 'localns.hostname=grafana.mydomain' \
  grafana/grafana

As soon as the container starts LocalNS detects it and starts responding to queries for the name grafana.mydomain with whatever IP address docker assigned it on the internal network.

Once I did that I realised I wanted the same for the services that Traefik was proxying. Traefik has an API, so LocalNS will also request the routing rules from Traefik, and for the simple cases it can work out which hostnames Traefik will respond to and expose those names over DNS.

There are a couple of other sources supported too, and it shouldn’t be difficult to add more in the future. When asked for DNS names it doesn’t know about, LocalNS will forward the request to another DNS server.

It takes a small amount of configuration to set up LocalNS but once done it mostly configures itself as you change the services that are running elsewhere. I wouldn’t say the code is perfect right now but I’ve been using it as the nameserver for everything on my home network for two months now with no obvious issues.

You can check out the code on GitHub, or there is documentation for how to actually use it.

Creating HTML content with a fixed aspect ratio without the padding trick

It seems to be a common problem: you want to display some content on the web at a certain aspect ratio but you don’t know the size you will be displaying it at. How do you do this? CSS doesn’t really have the tools to do the job well currently (there are proposals). In my case I want to display a video and its associated controls as large as possible inside a space whose size I don’t know. The size of the video also varies depending on the one being displayed.

Padding height

The answer to this, according to almost all the searching I’ve done, is the padding-top/bottom trick. For reasons that I don’t understand, when using relative lengths (percentages) with the CSS padding-top and padding-bottom properties, the values are calculated based on the width of the element. So padding-top: 100% gives you padding equal to the width of the element. Weird. So you can fairly easily create a box with a height calculated from its width and from there display content at whatever aspect ratio you choose. But there’s an inherent problem here: you need to know the width of the box in the first place, or at least be able to constrain it based on something. In my case both the aspect ratio of the video and the size of the container are unknown. In some cases I need to constrain the width and calculate the height, but in others I need to constrain the height and calculate the width, which is where this trick fails.

object-fit

There is one straightforward solution. The CSS object-fit property allows you to scale up content to the largest size possible for the space allocated. This is perfect for my needs, except that it only works for replaced content like videos and images. In my case I also need to overlay some controls on top and I won’t know where to position them unless they are inside a box the size of the video.

The solution?

So what I need is something where I can create a box with set sizes and then scale both width and height to the largest that fit entirely in the container. What do we have on the web that can do that … oh yes, SVG. In SVG you can define the viewport for your content and any shapes you like inside with SVG coordinates and then scale the entire SVG viewport using CSS properties. I want HTML content to scale here and luckily SVG provides the foreignObject element which lets you define a rectangle in SVG coordinates that contains non-SVG content, such as HTML! So here is what I came up with:

<!DOCTYPE html>

<html>
<head>
<style type="text/css">
html,
body,
svg,
div {
  height: 100%;
  width: 100%;
  margin: 0;
  padding: 0;
}

div {
  background: red;
}
</style>
</head>
<body>
  <svg viewBox="0 0 4 3">
    <foreignObject x="0" y="0" width="100%" height="100%">
      <div></div>
    </foreignObject>
  </svg>
</body>
</html>

This is pretty straightforward. It creates an SVG document with a viewport with a 4:3 aspect ratio, a foreignObject container that fills the viewport and then a div that fills that. What you end up with is a div with a 4:3 aspect ratio. While this shows it working against the full page, it seems to work anywhere with constraints on either height, width or both, such as in a flex or grid layout. Obviously changing the viewBox allows you to get any aspect ratio you like; just setting it to the size of the video gives me exactly what I want.

You can see it working over on CodePen.

Please watch your character encodings

I started writing this as a newsgroup post for one of Mozilla’s mailing lists, but it turned out to be too long and since this part was mainly aimed at folks who either didn’t know about or wanted a quick refresher on character encodings I decided to blog it instead. Please let me know if there are errors in here, I am by no means an expert on this stuff either and I do get caught out sometimes!

Text is tricky. Unicode supports the notion of 1,114,112 distinct characters, far more than a single byte (or even two bytes) of memory can hold. So to store a character we have to use a way of encoding its value into bytes in memory. A straightforward encoding would just use three bytes per character. But (roughly) the larger the character value the less often it is used, and memory is precious, so variable length encodings are often used. These use fewer bytes in memory for characters earlier in the range at the cost of using a little more memory for the rarer characters. Common encodings include UTF-8 (one byte for ASCII characters, up to four bytes for other characters) and UTF-16 (two bytes for most characters, four bytes for less used ones).
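
As a quick illustration, here is a sketch using the web’s TextEncoder API (which always encodes to UTF-8):

const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
encoder.encode("a").length;  // 1 byte, an ASCII character
encoder.encode("ñ").length;  // 2 bytes
encoder.encode("€").length;  // 3 bytes
encoder.encode("🐮").length; // 4 bytes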

What does this mean?

It may not be possible to know the number of characters in a string purely by looking at the number of bytes of memory used.

When a string is encoded with a variable length encoding the number of bytes used by a character will vary. If the string is held in a byte buffer, just dividing its length by some number will not always give you the number of characters in the string. Confusingly, many string implementations expose a length property that only tells you the number of code units used, not the number of characters in the string. I bet most JavaScript developers don’t know that JavaScript suffers from this:

let test = "\u{1F42E}"; // This is the Unicode cow 🐮 (https://emojipedia.org/cow-face/)
test.length; // This returns 2!
test.charAt(0); // This returns "\ud83d"
test.charAt(1); // This returns "\udc2e"
test.substring(0, 1); // This returns "\ud83d"

Fun!

More modern versions of JavaScript do give better options, though they are probably slower than the length property (because they must decode the characters to count them):

Array.from(test).length; // This returns 1
test.codePointAt(0).toString(16); // This returns "1f42e"

When you encode a character into memory and pass it to some other code, that code needs to know the encoding so it can decode it correctly. Using the wrong encoder/decoder will lead to incorrect data.

Using the wrong decoder to convert a buffer of memory into characters will often fail. Take the character “ñ”. In UTF-8 this is encoded as C3 B1. Decoding that as UTF-16 will result in “쎱”. In UTF-16 however “ñ” is encoded as 00 F1. Trying to decode that as UTF-8 will fail, as that is an invalid UTF-8 sequence.
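
Here’s a rough sketch of the same mismatch using the web’s TextDecoder API (byte order chosen as big-endian to match the values above):

const utf8Bytes = new Uint8Array([0xC3, 0xB1]);  // "ñ" encoded as UTF-8
new TextDecoder("utf-8").decode(utf8Bytes);      // "ñ"
new TextDecoder("utf-16be").decode(utf8Bytes);   // "쎱", the wrong decoder gives the wrong character

const utf16Bytes = new Uint8Array([0x00, 0xF1]); // "ñ" encoded as UTF-16 (big-endian)
new TextDecoder("utf-8", { fatal: true }).decode(utf16Bytes); // throws, not a valid UTF-8 sequence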

Many languages thankfully use string types that have fixed encodings; in Rust, for example, the str primitive is UTF-8 encoded. In these languages, as long as you stick to the normal string types, everything should just work. It isn’t uncommon though to do manipulations based on the byte representation of the characters, %-encoding a string for a URL for example, so knowing the character encoding is still important.
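
For example, JavaScript’s encodeURIComponent works on the UTF-8 bytes of a character, not the character itself:

encodeURIComponent("a"); // "a", ASCII passes straight through
encodeURIComponent("ñ"); // "%C3%B1", one character becomes two escaped bytes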

Some languages though have string types where the encoding may not be clear. In Gecko C++ code for example a very common string type in use is the nsCString. It is really just a set of bytes and has no defined encoding and no way of specifying one at the code level. The only way to know for sure what the string is encoded as is to track back to where it was created. If you’re unlucky it gets created in multiple places using different encodings!

Funny story. This blog post contains a couple of the larger Unicode characters. While working on the post I kept coming back to find that the character had been lost somewhere along the way and replaced with a “?”. It seems likely that there is a bug in WordPress that isn’t correctly handling character encodings. I’m not sure yet whether those characters will survive publishing this post!

These problems disproportionately affect non-English speakers.

Pretty much all of the characters that English speakers use (mostly the Latin alphabet) live in the ASCII character set which covers just 128 characters (some of these are control characters). The ASCII characters are very popular and though I can’t find references right now it is likely that the majority of strings used in digital communication are made up of only ASCII characters, particularly when you consider strings that humans don’t generally see. HTTP request and response headers generally only use ASCII characters for example.

Because of this popularity, when the Unicode character set was first defined it mapped the 128 ASCII characters to the first 128 Unicode characters. UTF-8 also encodes those 128 characters as a single byte; any other character gets encoded as two bytes or more.

The upshot is that if you only ever work with ASCII characters, encoding or decoding as UTF-8 or ASCII yields identical results. Each character will only ever take up one byte in memory so the length of a string will just be the number of bytes used. An English speaking developer, and indeed many other developers may only ever develop and test with ASCII characters and so potentially become blind to the problems above and not notice that they aren’t handling non-ASCII characters correctly.

At Mozilla where we try hard to make Firefox work in all locales we still routinely come across bugs where non-ASCII characters haven’t been handled correctly. Quite often issues stem from a user having non-ASCII characters in their username or filesystem causing breakage if we end up decoding the path incorrectly.

This issue may start getting rarer. With the rise in emoji popularity developers are starting to see and test with more and more characters that encode as more than one byte. Even in UTF-16 many emoji encode to four bytes.

Summary

If you don’t care about non-ASCII characters then you can ignore all this. But if you care about supporting the 80% of the world that uses non-ASCII characters then take care when you are doing something with strings. Make sure you are measuring string lengths correctly when needed. If you are working with data structures that don’t have an explicit character encoding then make sure you know what encoding your data is in before doing anything with it other than passing it around.

hgchanges is down, probably for good

My little tool to help folks track when changes are made to files or directories in Mozilla’s Mercurial repositories has gone down again. This time an influx of some 8000 changesets from the Servo project is causing the script that does the updating to fail, so I’ve turned off updating. I no longer have any time to work on this tool so I’ve also taken it offline and don’t really have any intention to bring it back up again. Sorry to the few people that this inconveniences. Please go lobby the engineering productivity folks if you still need a tool like this.

On Firefox module ownership

It has been over eleven years since I first wrote a patch for Firefox. It was reviewed by the then-Firefox module owner, Mike Connor. If you had told me then that at some point in the future I would be the module owner I probably would have laughed at you. I didn’t know at the time how much Mozilla would shape my life. Yet yesterday Dave Camp handed over the reins to me and here we are.

When Dave proposed me as the new module owner he talked about how he saw Firefox as a code module, responsible for the code that ships with the app rather than the decisions about what the app needs to do. Those are delegated to other teams with a better grasp of the situation, like Product and UX. I agree with this wholeheartedly. It isn’t my role as an engineer to make those kinds of decisions. The Firefox module owner is focused on the implementation of the code in the browser and pretty much nothing else.

But in the Firefox module even the implementation decisions have always been heavily delegated to the peers of the module. I don’t intend to change that. I’m here to help guide those working on the code when they need a broader view, not to put my foot down and insist on how things should happen. I’m here to direct you to the peer most able to help you when you need a reviewer or run into problems. On some occasions I will also be here to listen and advise when you think the peers are wrong. In fact I see this role as less about being a module owner and more about being a module steward, staying out of the way mostly but applying a gentle hand to the tiller when needed.

Those of you keeping an eye on things will note that I’m also the Toolkit module owner. That hasn’t changed and while I have always approached that module in much the same way as I plan to approach Firefox there are a few differences. I’ll talk about those another day.

New Firefox reviewers

I’m delighted to announce that I’ve just made Andrew Swan (aswan on IRC) a reviewer for Firefox code. That also reminds me that I failed to announce when I did the same for Rob Helmer (rhelmer). Please inundate them with patches.

There are a few key things I look for when promoting folks to reviewers; surprisingly, none of them is an understanding of the full breadth of code in Firefox or Toolkit:

  • Get to reviews quickly. You don’t necessarily have to complete the review; do a first pass, or if you really have no time, hand it off to someone else promptly.
  • Be courteous; you are a personal connection to Mozilla and you should be as welcoming as Mozilla is supposed to be.
  • Know your limits. Don’t review anything if you don’t understand the code it deals with; instead, help the reviewee find an alternate reviewer.
  • You don’t get to demand that the reviewee rewrite the patch in your style. If their code meets the style guidelines, fixes the bug and is efficient enough, don’t waste people’s time by re-architecting it.

Pretty much all of this is demonstrated by seeing what code the potential reviewer is writing and letting them review smaller patches by already well-versed contributors.

Improving the performance of the add-ons manager with asynchronous file I/O

The add-ons manager has a dirty secret. It uses an awful lot of synchronous file I/O. This is the kind of I/O that blocks the main thread and can cause Firefox to be janky. I’m told that that is a technical term. Asynchronous file I/O is much nicer: it means you can let the rest of the app continue to function while you wait for the I/O operation to complete. I rewrote much of the current code from scratch for Firefox 4.0, and even back then we were trying to switch to asynchronous file I/O wherever possible. But still I used mostly synchronous file I/O.

Here is the problem. For many moons we have allowed other applications to install add-ons into Firefox by dropping them into the filesystem or registry somewhere. We also have to do things like updating and installing non-restartless add-ons during startup, when their files aren’t in use. And we have to know the full set of non-restartless add-ons that we are going to activate quite early in startup, so the startup function for the add-ons manager has to do all those installs and a scan of the extension folders before returning to the code starting up the browser, and that means being synchronous.

The other problem is that the things that could conceivably use async I/O, like installs and updates of restartless add-ons at runtime, need to use the same code for loading and parsing manifests, extracting zip files and so on that has to be synchronous during startup. So we could either write a second, asynchronous version and get nice performance at runtime, or use the synchronous version everywhere so there is only one version to test and maintain. Keeping things synchronous was where things fell in the end.

That’s always bugged me though. Runtime is the most important time to use asynchronous I/O. We shouldn’t be janking the browser when installing a large add-on, particularly on mobile, and so we have taken some steps since Firefox 4 to make parts of the code asynchronous. But there is still a bunch of synchronous I/O there.

The plan

The thing is that there isn’t actually a reason we can’t use the asynchronous I/O functions. All we have to do is make sure that startup is blocked until everything we need is complete. It’s pretty easy to spin an event loop from JavaScript to wait for asynchronous operations to complete so why not do that in the startup function and then start making things asynchronous?
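
Roughly, the idea looks something like this (just a sketch; awaitPromise is a name I’ve invented here and the real code may well differ):

Components.utils.import("resource://gre/modules/Services.jsm");

// Spin the main thread's event loop until the given promise settles, then
// return its value (or rethrow its rejection). Illustrative only.
function awaitPromise(promise) {
  let result, error, settled = false, failed = false;
  promise.then(value => { result = value; settled = true; },
               reason => { error = reason; failed = true; settled = true; });

  let thread = Services.tm.currentThread;
  while (!settled) {
    thread.processNextEvent(true); // process events until the promise settles
  }

  if (failed) {
    throw error;
  }
  return result;
}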

Performance is pretty important for the add-ons manager startup code; the longer we spend in startup the more it hurts us. Would this switch slow things down? I assumed that there would be some losses due to other things happening during an event loop tick that otherwise wouldn’t have, but that the file I/O operations should take around the same time. And here is the clever bit. Because it is asynchronous I could fire off operations to run in parallel. Why check the modification time of every file in a directory one file at a time when you can just request the times for every file and wait until they all complete?
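
As a sketch of what I mean (getModificationTimes is an invented name, not real add-ons manager code):

Components.utils.import("resource://gre/modules/osfile.jsm");

// Fire off a stat request for every path at once and wait for them all,
// rather than statting one file at a time.
function getModificationTimes(paths) {
  return Promise.all(paths.map(path => OS.File.stat(path)))
                .then(infos => infos.map(info => info.lastModificationDate));
}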

There are really a tonne of things that could affect whether this would be faster or slower, and no amount of theorising was convincing me either way. Last night this had finally been bugging me for long enough that I grabbed a bottle of wine, fired up the music and threw together a prototype.

The results

It took me a few hours to switch most of the main methods to use Task.jsm, switch much of the likely hot code to use OS.File and run in parallel where possible, and generally cover all the main parts that run on every startup and when add-ons have changed.
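
To give a flavour of the conversion (a rough sketch with an invented function name, not the actual add-ons manager code):

Components.utils.import("resource://gre/modules/Task.jsm");
Components.utils.import("resource://gre/modules/osfile.jsm");

// Illustrative only: list the sub-directories of an extensions folder without
// blocking the main thread on each individual I/O call.
function listAddonDirectories(extensionsDir) {
  return Task.spawn(function*() {
    let iterator = new OS.File.DirectoryIterator(extensionsDir);
    try {
      // nextBatch() with no argument reads the remaining entries in one go.
      let entries = yield iterator.nextBatch();
      return entries.filter(entry => entry.isDir).map(entry => entry.path);
    } finally {
      iterator.close();
    }
  });
}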

The challenge was testing. Default talos runs don’t include any add-ons (or maybe one or two) and I needed a few different profiles to see how things behaved in different situations. It was possible that startups with no add-ons would be affected quite differently from startups with many add-ons. So I had to figure out how to add extensions to the default talos profiles for my try runs, then fired off try runs for the cases where there were no add-ons, 200 unpacked add-ons with a bunch of files, and 200 packed add-ons. I then ran all those a second time, deleting extensions.json between each run to force the database to be loaded and rebuilt. So that is six different talos runs for the code without my changes and another six with my changes; I triggered ten runs per test and went to bed.

The first thing I did this morning was check the performance results. The first result I looked at was with 200 packed add-ons in the profile, which should be a good check of the file scanning. How did it do? Amazing! Incredible! A greater than 50% performance improvement across the board! That’s astonishing! No really, that’s pretty astonishing. It would have to mean the add-ons manager takes up at least 50% of the browser startup time and I’m pretty sure it doesn’t. Oh right, I was accidentally comparing the baseline to the test run with 200 packed add-ons and a database reset with my async code. Well, I’d expect that to be slower.

Ok, let’s get it right. How did it really do? Abysmally! Like incredibly badly. Across the board in every test run startup is significantly slower with the asynchronous I/O than without. With no add-ons in the profile the new code incurs a 20% performance hit. In the case with 200 unpacked add-ons? An almost 1000% hit!

What happened?

Ok so that wasn’t the best result but at least it will stop bugging me now. I figure there are two things going on here. The first is that OS.File might look like it lets you fire off I/O operations in parallel but in fact it doesn’t. Every call you make goes into a queue and the background worker thread doesn’t start on one operation until the previous one has completed. So while the I/O operations themselves might take about the same time, you have the added overhead of passing messages to and from the background thread. I probably should have checked that before I started! Oh, and promises, which are the second thing. Task.jsm and OS.File make heavy use of promises and I have to say I’m sold on using them for async code. But. Every time you wait for a promise you have to wait at least one tick of the event loop longer than you would with a simple callback. That’s great if you want responsive UI, but during startup every event loop tick costs time since other code might be running that you don’t care about.

I still wonder whether giving OS.File more worker threads would speed things up, but that’s beyond where I want to take this for now, so I guess this is where this fun experiment ends. Although now that I have a bunch of code converted I wonder if I can create some replacements for OS.File and Task.jsm that behave synchronously during startup and asynchronously at runtime, then we get the best of both worlds … where did that bottle of wine go?

Linting for Mozilla JavaScript code

One of the projects I’ve been really excited about recently is getting ESLint working for a lot of our JavaScript code. If you haven’t come across ESLint or linters in general before, they are automated tools that scan your code and warn you about syntax errors. They can usually also be set up with a set of rules to enforce code styles and warn about potential bad practices. The devtools and Hello folks have been using ESLint for a while already, and Gijs asked why we weren’t doing this more generally. This struck a chord with me and a few others, and so we’ve been spending some time over the past few weeks getting our in-tree support for ESLint to work more generally and fixing issues with browser and toolkit JavaScript code in particular to make them lintable.

One of the hardest challenges with this is that we have a lot of non-standard JavaScript in our tree. This includes things like preprocessing as well as JS features that either have since been standardized with a different syntax (generator functions for example) or have been dropped from the standardization path (array comprehensions). This is bad for developers as editors can’t make sense of our code and provide good syntax highlighting and refactoring support when it doesn’t match standard JavaScript. There are also plans to remove the non-standard JS features so we have to remove that stuff anyway.

So a lot of the work done so far has been removing all this non-standard stuff so that ESLint can pass with only a very small set of style rules defined. Soon we’ll start increasing the rules we check in browser and toolkit.

How do I lint?

From the command line this is simple. First run ./mach eslint --setup to install ESLint and some related packages, then just run ./mach eslint <directory or file> to lint a specific area. You can also lint the entire tree. For now you may need to periodically run the setup step again as we add new dependencies; at some point we may make mach automatically detect when you need to re-run it.

You can also add ESLint support to many code editors and as of today you can add ESLint support into hg!

Why lint?

Aside from just ensuring we have standard JavaScript in the tree linting offers a whole bunch of benefits.

  • Linting spots JavaScript syntax errors before you have to run your code. You can configure editors to run ESLint to tell you when something is broken so you can fix it even before you save.
  • Linting can be used to enforce the sorts of style rules that keep our code consistent. Imagine no more nit comments in code review forcing you to update your patch. You can fix all those before submitting and reviewers don’t have to waste time pointing them out.
  • Linting can catch real bugs (see the sketch after this list). When we turned on one of the basic rules we found a problem in shipping code.
  • With only standard JS code to deal with we open up the possibility of using advanced tools like AST transforms for refactoring (e.g. recast). This could be very useful for switching from Cu.import to ES6 modules.
  • ESLint in particular allows us to write custom rules for doing clever things like handling head.js imports for tests.
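
To illustrate the kind of bug a basic rule can catch (a made-up example, not the actual shipping bug), a rule like no-undef flags a misspelled identifier before the code ever runs:

// Hypothetical snippet: "perfs" is a typo for the "prefs" parameter.
// ESLint's no-undef rule reports "perfs is not defined" at lint time.
function saveSetting(prefs) {
  perfs.setBoolPref("browser.example.enabled", true);
}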

Where are we?

There’s a full list of everything that has been done so far in the dependencies of the ESLint metabug but some highlights:

  • Removed #include preprocessing from browser.js moving all included scripts to global-scripts.inc
  • Added an ESLint plugin to allow linting the JS parts of XBL bindings
  • Fixed basic JS parsing issues in lots of b2g, browser and toolkit code
  • Created a hg extension that will warn you when committing code that fails ESLint
  • Turned on some basic linting rules
  • MozReview is close to being able to lint your code and give review comments where things fail
  • Work is almost complete on a linting test that will turn orange on the tree when code fails to lint

What’s next?

I’m grateful to all those who have helped get things moving here, but there is still more work to do. If you’re interested there are really two ways you can help. We need to lint more files and we need to turn on more lint rules.

The .eslintignore file shows which files are currently ignored in the lint checks. Removing files and directories from that generally involves fixing JavaScript standards issues, but every extra file we lint is a win for all the above reasons so it is valuable work. It is also mostly straightforward once you get the hang of it; there are just a lot of files.

We also need to turn on more rules. We’ve got a rough list of the rules we want to turn on in browser and toolkit, but as you might guess they aren’t on yet because the code currently fails them. Fixing up our JS to work with them is simple work and much appreciated. In some cases ESLint can even do the work for you!

Running ESLint on commit

ESLint becomes most useful when you get warnings before even trying to land or get your code reviewed. You can add support to your code editor but not all editors support this, so I’ve written a Mercurial extension which gives you warnings any time you commit code that fails lint checks. It uses the same rules we run elsewhere. It doesn’t abort the commit (that would be annoying if you’re working on a feature branch) but it gives you a heads up about what needs to be fixed and where.

To install the extension, add this to an hgrc file. I put it in the .hg/hgrc file of my mozilla-central clone rather than the global config.

[extensions]
mozeslint = <path to clone>/tools/mercurial/eslintvalidate.py

After that, anything that creates a commit (including mq patches) will run any changed JS files through ESLint and show the results. If the file was already failing checks in a few places then you’ll still see those too; maybe you should fix them up before sending your patch for review? 😉

Making communicating with chrome from in-content pages easy

As Firefox increasingly switches to support running in multiple processes we’ve been finding common problems. Where we can, we are designing nice APIs to make solving them easy. One problem is that we often want to run in-content pages like about:newtab and about:home in the child process without privileges, making them safer and less likely to bring down Firefox in the event of a crash. These pages still need to get information from and pass information to the main process though, so we have had to come up with ways to handle that. Often we use custom code in a frame script acting as a middle-man, using things like DOM events to listen for requests from the in-content page and then messaging to the main process.

We recently added a new API to make this problem easier to solve. Instead of needing code in a frame script the RemotePageManager module allows special pages direct access to a message manager to communicate with the main process. This can be useful for any page running in the content area, regardless of whether it needs to be run at low privileges or in the content process since it takes care of listening for documents and hooking up the message listeners for you.

There is a low-level API available but the higher-level API is probably more useful in most cases. If your code wants to interact with a page like about:myaddon just do this from the main process:

Components.utils.import("resource://gre/modules/RemotePageManager.jsm");
let manager = new RemotePages("about:myaddon");

The manager object is now something resembling a regular process message manager. It has sendAsyncMessage and addMessageListener methods but unlike the regular e10s message managers it only communicates with about:myaddon pages. Unlike the regular message managers there is no option to send synchronous messages or pass cross-process wrapped objects.

When about:myaddon is loaded it has sendAsyncMessage and addMessageListener functions defined in its global scope for regular JavaScript to call. Anything that can be structured-cloned can be passed between the processes.
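
For example, something like this should work (the message names and payload are invented, and I’m assuming the received message exposes the sending page as its target):

// In the main process, reusing the manager created above:
manager.addMessageListener("MyAddon:RequestState", message => {
  // Reply only to the page that asked.
  message.target.sendAsyncMessage("MyAddon:State", { enabled: true });
});

// In the about:myaddon page itself, the functions are already in the global scope:
addMessageListener("MyAddon:State", message => {
  console.log("enabled:", message.data.enabled);
});
sendAsyncMessage("MyAddon:RequestState");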

The module documentation has more in-depth examples showing message passing between the page and the main process.

The RemotePageManager module is available in nightlies now and you can see it in action with the simple change I landed to switch about:plugins to run in the content process. For the moment the APIs only support exact URL matching but it would be possible to add support for regular expressions in the future if that turns out to be useful.