Mozilla’s New JavaScript Value Representation

Here at Mozilla, we have many monkeys.

One such effort, JaegerMonkey, is focused on revamping the baseline performance of our JS Engine. That effort is going really well. On the SunSpider benchmark, JaegerMonkey is starting to pull away from the Mozilla trunk’s JS Engine. Both are faster than the engine that ships in Firefox 3.6. JaegerMonkey is not a total rewrite, but it does change some fundamental parts of the engine. If you’re an extension author who uses the JSAPI directly, or you embed Mozilla’s JS Engine in other software, there are some changes you’ll need to know about. Our new representation of JavaScript Values (aka jsvals) is the first big change. It has just landed on mozilla-central. The patch was a ton of work, and most of the credit goes to Mozilla engineer Luke Wagner.

So, what is a jsval? It’s the C/C++ type that corresponds to a value in a JavaScript program. Here’s a snippet of JavaScript that assigns some values to four variables:

var foo = {dana: "zuul"};
var bar = "hi";
var baz = 37;
var qux = 3.1415;

Developers can use the JSAPI to manipulate these values from C and C++. Since JavaScript is dynamically typed, the types of those values can change at runtime. For example, the type of the value of ‘bar’ in the code above could change from a string to a number. C++ code sometimes needs to be able to tell which type a value has at a given moment, so there needs to be a clever way to pack that information into the jsval type.

Below, I’ll explain how the new value representation works on 32-bit systems, using information cribbed from a presentation by Mozilla engineer David Anderson. We adapted this layout from WebKit, with some modifications. On 64-bit systems, our design is different from theirs.

The Old Way

The old jsval representation fit in a 32-bit value, using the 3 lowest bits to tag the value as a particular type. These were called type tags.

Objects

A jsval with an object value would look like this:

var foo = {dana: "zuul"} @ 0x86753090

C++ code can inspect the three tag bits at the end. By observing that all three are 0 valued, it can determine that the value is an object, and should be interpreted as a pointer to a JSObject at the address 0x86753090.

Strings

Strings worked similarly:

This time, one of the tag bits was set to 1, so C++ code knew that the value was a pointer to a string. Once the type tag was determined, the implementation would mask off the tag bits to recover the true value of the pointer's low bits.

var bar = "hi" @ 0x20506638

The masked value contains the correct pointer to a JS String at 0x20506638.
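To make the tagging and masking concrete, here is a minimal C++ sketch of the idea. The type name OldValue, the helper functions, and the specific tag constants are assumptions made for illustration; they are not the engine's actual definitions. The sketch relies on pointers being 8-byte aligned, which is what leaves the low three bits free.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit tagged word in the style of the old jsval.
using OldValue = std::uint32_t;

// Illustrative constants, assumed for this sketch.
constexpr OldValue kTagMask   = 0x7;  // the three lowest bits
constexpr OldValue kTagObject = 0x0;  // all three tag bits clear
constexpr OldValue kTagString = 0x4;  // exactly one tag bit set

// Combine an 8-byte-aligned address with a type tag.
inline OldValue tagPointer(std::uint32_t address, OldValue tag) {
    assert((address & kTagMask) == 0);  // alignment keeps the low bits free
    return address | tag;
}

inline bool isObject(OldValue v) { return (v & kTagMask) == kTagObject; }
inline bool isString(OldValue v) { return (v & kTagMask) == kTagString; }

// Clear the tag bits to recover the original address.
inline std::uint32_t untagPointer(OldValue v) { return v & ~kTagMask; }
```

With these assumed constants, tagging the string address 0x20506638 gives 0x2050663C, and masking it gives back 0x20506638.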

Numbers

Integers were stored in the value itself:

The implementation only examined the least significant tag bit for a 1 value to determine whether it was an integer. If it was, it would perform a right shift on the value to get the actual integer value.

var baz = 37; // 0x25 in hex
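The integer case can be sketched the same way. The names below are hypothetical, and the code assumes the usual two's complement behavior where a right shift of a negative signed integer is arithmetic (guaranteed since C++20, and what mainstream compilers did before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit tagged word in the style of the old jsval.
using OldValue = std::uint32_t;

// One bit goes to the tag, so only 31 bits are left for the integer.
constexpr std::int32_t kMaxTaggedInt = (1 << 30) - 1;  //  2^30 - 1
constexpr std::int32_t kMinTaggedInt = -(1 << 30);     // -2^30

inline bool fitsInTaggedInt(std::int32_t i) {
    return i >= kMinTaggedInt && i <= kMaxTaggedInt;
}

// Shift the integer left by one and set the lowest bit as the tag.
inline OldValue tagInt(std::int32_t i) {
    assert(fitsInTaggedInt(i));
    return (static_cast<std::uint32_t>(i) << 1) | 1u;
}

inline bool isInt(OldValue v) { return (v & 1u) != 0; }

// Shift right by one to drop the tag; the signed shift restores the sign.
inline std::int32_t untagInt(OldValue v) {
    return static_cast<std::int32_t>(v) >> 1;
}
```

In this sketch 37 (0x25) is stored as 0x4B, and any integer outside the 31-bit range fails fitsInTaggedInt and would have to be stored some other way.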

The mechanics of this scheme mean that integers only have 31 bits of space to work with, rather than the usual 32. Floating point numbers don’t fit at all. For example, the float 3.1415 looks like this in memory: 400921cac083126f. Too many bits.
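If you want to see that bit pattern for yourself, a small C++ sketch like the following will do it. The function names are made up for this example, and it assumes the platform uses 64-bit IEEE-754 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy a double's bytes into a 64-bit integer so the raw pattern can be inspected.
inline std::uint64_t doubleBits(double d) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// True when a bit pattern cannot be stored in a 32-bit word.
inline bool needsMoreThan32Bits(std::uint64_t bits) {
    return (bits >> 32) != 0;
}

// Checks the pattern quoted in the text above.
inline void checkExample() {
    assert(doubleBits(3.1415) == 0x400921cac083126fULL);
    assert(needsMoreThan32Bits(doubleBits(3.1415)));
}
```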

When an integer got too big to fit in 31 bits, or a floating point number was encountered, the value would get converted to a double, and stored on the heap.

Once again, some bit masking was performed to determine the actual address of the memory:

var qux = 3.1415 @ 0xA0B0CCD0 => 400921cac083126f

We have a 32-bit value that contains a pointer to the real value of the float, which is 64 bits wide. This arrangement is bad for at least three reasons. Firstly, we have to allocate to create a number. Secondly, we have to clean up that number later (during GC). Thirdly, it hurts locality, because we have to fetch float values from arbitrary heap locations to do even simple calculations.

The New Way

We call them Fat Values. They’re 64 bits wide.

For Objects, Strings, and Integers, we use the first 32 bits as a type tag. The second 32 bits contain the payload.

var foo = {dana: "zuul"} @ 0x86753090

var bar = "hi" @ 0x20506638

var baz = 37
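Here is a rough C++ sketch of the tag-plus-payload layout. It treats the "first" 32 bits as the most significant half of a 64-bit integer. The type name FatValue is borrowed from the nickname above, and the helper functions and tag constants are hypothetical values chosen for illustration, not the engine's real ones.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit value: high 32 bits are the tag, low 32 bits the payload.
using FatValue = std::uint64_t;

// Illustrative tag constants, assumed for this sketch.
constexpr std::uint32_t kTagInt32  = 0xFFFF0001u;
constexpr std::uint32_t kTagString = 0xFFFF0002u;
constexpr std::uint32_t kTagObject = 0xFFFF0003u;

inline FatValue makeValue(std::uint32_t tag, std::uint32_t payload) {
    return (static_cast<std::uint64_t>(tag) << 32) | payload;
}

inline std::uint32_t tagOf(FatValue v) { return static_cast<std::uint32_t>(v >> 32); }

// The cast keeps only the low 32 bits.
inline std::uint32_t payloadOf(FatValue v) { return static_cast<std::uint32_t>(v); }

// The payload is a full 32 bits wide, so every int32 fits without shifting.
inline FatValue fromInt32(std::int32_t i) {
    std::uint32_t payload;
    std::memcpy(&payload, &i, sizeof payload);
    return makeValue(kTagInt32, payload);
}

inline std::int32_t toInt32(FatValue v) {
    assert(tagOf(v) == kTagInt32);
    std::uint32_t payload = payloadOf(v);
    std::int32_t i;
    std::memcpy(&i, &payload, sizeof i);
    return i;
}
```

Checking a type is now a comparison on the upper half, and reading the payload needs no masking or shifting beyond splitting the 64-bit word.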

The payoff is that we can fit the full range of 32-bit integers in integer-tagged values, and floating point numbers fit right in the jsval:

var qux = 3.1415; // (400921cac083126f)

We can distinguish type tags from floating point numbers by using a quirk of IEEE-754 double precision numbers.

The first bit is the sign bit, the next eleven are exponent bits, and the remaining 52 are the mantissa. If all of the exponent bits are 1s, then the number is a NaN, unless all of the mantissa bits are 0s, in which case the value is either positive or negative infinity. We distinguish the values we’re using for type tags from other NaNs by marking the first 16 bits as 1s. In practice, all hardware and standard libraries produce a single canonical NaN value, so we’re free to use all of the other values for our own purposes. This technique is called NaN boxing.
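The classification rules can be written down in a few lines of C++. This is a sketch under stated assumptions rather than the engine's code: the names are hypothetical, the canonical NaN pattern 0x7FF8000000000000 is an assumed choice, and doubles are taken to be 64-bit IEEE-754. Every NaN is collapsed to the canonical one on the way in, so no real double ever has its first 16 bits all set.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using FatValue = std::uint64_t;

constexpr std::uint64_t kExponentMask = 0x7FF0000000000000ULL;  // 11 exponent bits
constexpr std::uint64_t kMantissaMask = 0x000FFFFFFFFFFFFFULL;  // 52 mantissa bits
constexpr std::uint64_t kCanonicalNaN = 0x7FF8000000000000ULL;  // assumed canonical NaN
constexpr std::uint64_t kBoxPrefix    = 0xFFFF;                 // first 16 bits all 1s

// Exponent all 1s and a non-zero mantissa: some kind of NaN.
inline bool isNaNPattern(std::uint64_t bits) {
    return (bits & kExponentMask) == kExponentMask && (bits & kMantissaMask) != 0;
}

// Exponent all 1s and a zero mantissa: positive or negative infinity.
inline bool isInfinityPattern(std::uint64_t bits) {
    return (bits & kExponentMask) == kExponentMask && (bits & kMantissaMask) == 0;
}

// Store a double directly, collapsing every NaN to the canonical pattern.
inline FatValue fromDouble(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    if (isNaNPattern(bits)) bits = kCanonicalNaN;
    assert((bits >> 48) != kBoxPrefix);  // a stored double never looks like a tag
    return bits;
}

// Anything whose first 16 bits are not all 1s is a plain double.
inline bool isDouble(FatValue v) { return (v >> 48) != kBoxPrefix; }
inline bool isBoxed(FatValue v) { return !isDouble(v); }
```

Under these assumptions the double 3.1415 is stored as 400921cac083126f unchanged, while a pattern such as ffff000100000025 is recognized as a boxed value and never mistaken for a number.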

Changes for JSAPI users

Here is the short version, courtesy of Luke Wagner.

  • jsval is no longer word-sized
  • jsval can hold a full int32
  • doubles are stored in the jsval; JSVAL_TO_DOUBLE returns double
  • jsval and jsid no longer share the same representation
  • JSClass method signatures have been modified to take jsids for id
    arguments and pass jsval arguments by const jsval*.

You can read up in more detail, and provide feedback, by checking out Luke’s mozilla.dev.tech.js-engine post on the matter.