While looking for good reasons to use Javascript for image processing, I thought of distributed computing: it would be extremely easy to set up. Users have nothing to install, they just have to visit a webpage. And since we want many users to participate, we could embed the script into a popular webpage.

There are many issues using browsers as distributed computing nodes:

  • User approval: will they allow you to run an arbitrary script on their machine when all they wanted was to visit a site?
  • Reliability of the data: since the processing is done by untrusted users, we must find ways to verify the results.
  • User disconnection: the computation runs while the user is browsing; if they navigate to another URL, reload the page or close their browser, we won't get any result.

User Disconnection Over Time

In order to test the last point, I added a small Javascript program to the popular website MMO-Champion.com. Every minute, it sends the time spent on the page to my server. I ran it for about 2 hours (then it DDOS'ed my server :(). I aggregated the results in the following chart.
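The probe itself is tiny. Here is a minimal sketch of what it can look like (the endpoint URL and the use of an image beacon are assumptions for the example, not the exact script that ran):

// Sketch of the probe. The reporting endpoint is hypothetical.
var start = Date.now();
setInterval(function () {
  // An image beacon sidesteps cross-domain XMLHttpRequest restrictions.
  var beacon = new Image();
  beacon.src = 'http://example.com/ping?minutes=' +
    Math.round((Date.now() - start) / 60000);
}, 60000);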

We can extract 3 phases from this graph.

  • Under 5 minutes, the user is very likely to disconnect.
  • Between 5 and 15 minutes, the chance of disconnection decreases.
  • After 15 minutes, very few users disconnect.

Put another way:

  • 50% of the users that have stayed 10 minutes will stay 1 hour.
  • 10% of the users that have stayed 1 minute will stay 1 hour.

Chance of Script Completion

What we really want to know is whether or not our script will complete. In order to test that, I took the data we gathered and computed the percentage of users that would still be there X minutes later.
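The computation itself is straightforward. A sketch, assuming sessionMinutes holds the observed session duration of each user, in minutes:

// Sketch: among users that already stayed `elapsed` minutes, the
// percentage still present `duration` minutes later.
function completionRate(sessionMinutes, elapsed, duration) {
  var eligible = 0, completed = 0;
  for (var i = 0; i < sessionMinutes.length; ++i) {
    if (sessionMinutes[i] >= elapsed) {
      eligible += 1;
      if (sessionMinutes[i] >= elapsed + duration) {
        completed += 1;
      }
    }
  }
  return 100 * completed / eligible;
}

// completionRate(data, 15, 10) gives the figure quoted below.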

After 15 minutes, a script that takes 1-10 minutes to complete has 95% chance of finishing without being interrupted.

Conclusion

User disconnection is not really an issue. If we run the computation only on users that have already stayed more than 15 minutes, we get a 95% completion rate for a 10-minute script.

Image processing is an active research area and is mostly done in C and C++ for performance reasons. Browser makers are at war to make Javascript fast. I want to know if it is viable to do image processing in Javascript.

At the moment, very few people are doing image processing in Javascript.

Image processing is computationally heavy: you have to perform the same operation on millions of elements. In order to get acceptable performance, we should look into all the possible ways to implement it and micro-benchmark them. Today we start with the different types of arrays.

Different types of Arrays

In Javascript there are 3 ways to get an array of data.

  • Classic Arrays. The Javascript Arrays that you get using [1, 2, 3] or new Array(size)
    var classic = new Array(1048576);

  • Canvas Data. You can access the raw pixels of a canvas as an array of 4 * width * height unsigned 8-bit integers (0 - 255).
    var canvas = $('#canvas')[0].getContext('2d').getImageData(0, 0, 512, 512).data;

  • Typed Arrays. They are being introduced in Javascript for WebGL (spec).
    var typed8 = new Uint8Array(1048576);
    var typed32 = new Uint32Array(1048576);

Benchmarks

The objective of this benchmark is to test both the read and write abilities of the arrays for use in image processing. We used an array of 1 million elements (1024 * 1024) with random values ranging from 0 to 255. We benchmark a loop that increments all the values by one.

for (var i = 0; i < 1048576; ++i) {
  array[i] += 1;
}
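For reference, a harness along these lines can produce such timings (this is a sketch of the methodology, not the exact benchmark code that was used):

// Sketch of the benchmark setup.
function bench(array, passes) {
  var start = Date.now();
  for (var run = 0; run < passes; ++run) {
    for (var i = 0; i < 1048576; ++i) {
      array[i] += 1;
    }
  }
  return (Date.now() - start) / passes; // milliseconds per pass
}

// Fill with random 0-255 values, then measure.
var classic = new Array(1048576);
for (var i = 0; i < 1048576; ++i) {
  classic[i] = Math.floor(Math.random() * 256);
}
bench(classic, 10);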
Note: The following results were made in December 2010. The implementation of Typed Arrays have improved a lot. The benchmark and conclusions are no longer accurate.

(Chrome 10.0.611.0 Canary, Firefox Minefield 4.0b9pre, Opera 11.00 Beta 1111, Safari 5.0.3 (7533.19.4))

Conclusion

Typed Arrays are not viable

Typed arrays are a new addition to Javascript and their specification is not frozen yet. One would hope that they perform better since they are a direct machine representation, however the benchmarks show the opposite. There is probably a lot of boxing / unboxing going on behind the scenes between the raw representation and the Number type.

One advantage of typed arrays is the ability to work directly on a raw binary image (like p*m). All you have to do is map your typed array onto the data part of the file.
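For instance, a binary pgm (P5) grayscale file is just a short ASCII header followed by raw pixel bytes. A sketch, assuming the file is already loaded into an ArrayBuffer, with single whitespace separators and no comments in the header:

// Sketch: view the pixel bytes of a binary pgm (P5) file, no copy.
function pgmPixels(buffer) {
  var bytes = new Uint8Array(buffer);
  var offset = 0, fields = 0;
  // Skip the 4 header fields ("P5", width, height, maxval) by
  // counting their whitespace separators.
  while (fields < 4) {
    if (/\s/.test(String.fromCharCode(bytes[offset++]))) {
      fields += 1;
    }
  }
  // This view aliases the raw pixel data directly.
  return new Uint8Array(buffer, offset);
}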

Since Typed Arrays are not implemented in all browsers and are slower than both classic arrays and canvas data, they should not be used right now.

Classical Arrays

In both Chrome and Firefox, classical arrays are faster than canvas data (by 25% and 45% respectively). They contain numbers that are stored on either 32 or 64 bits (Mozilla implementation). This is more than the 8 bits of the canvas element.

Images are usually stored in binary files, readable either through the canvas element or through Typed Arrays. Using classical arrays therefore requires two type conversions, one when loading and one when saving. This is a big overhead for small processing tasks.
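Concretely, the round trip looks like this (a sketch):

// Sketch of the two conversions required with classical arrays.
var ctx = document.getElementById('canvas').getContext('2d');
var pixels = ctx.getImageData(0, 0, 512, 512);

// Conversion 1: canvas data -> classical array.
var classic = new Array(pixels.data.length);
for (var i = 0; i < classic.length; ++i) {
  classic[i] = pixels.data[i];
}

// ... heavy processing on `classic` ...

// Conversion 2: classical array -> canvas data.
for (var i = 0; i < classic.length; ++i) {
  pixels.data[i] = classic[i];
}
ctx.putImageData(pixels, 0, 0);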

If you want to do heavy processing and are targeting either Firefox or Chrome, Classical Arrays may be a good choice. For more than 8 bits values it is the preferred method.

Canvas Arrays

Canvas arrays are slower than classical arrays in Chrome and Firefox, however they are really fast in Opera. Their one huge advantage is that they can be read and written directly through the canvas element. But they are limited to unsigned 8-bit values.

If you want to do normal processing on traditional rgb values, Canvas Array is the best option. Especially on Opera where it is blazing fast.
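As an example of this kind of processing, here is a simple in-place invert using the standard canvas API (a sketch):

// Sketch: invert an image in place through the canvas element.
var ctx = document.getElementById('canvas').getContext('2d');
var image = ctx.getImageData(0, 0, 512, 512);
var data = image.data; // rgba, 4 unsigned 8-bit values per pixel
for (var i = 0; i < data.length; i += 4) {
  data[i]     = 255 - data[i];     // red
  data[i + 1] = 255 - data[i + 1]; // green
  data[i + 2] = 255 - data[i + 2]; // blue
  // data[i + 3] is alpha, left untouched
}
ctx.putImageData(image, 0, 0);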

After reading an article about solving the 8-queens problem by storing the board in a 64-bit integer (French), I wanted to try it in Javascript. I knew that numbers were not stored as int64, but who knows, maybe it would have worked!

As you may have expected, it failed, giving completely wrong results. The reason is a lack of precision: the least significant bits are being ignored.

(0x8040201008040201).toString(2)
//  10000000 01000000 00100000 00010000 00001000 00000100 00000010 00000001
>> "10000000 01000000 00100000 00010000 00001000 00000100 00000000 00000000"
                                                                ^         ^

Floating Points

The reason behind this behavior is the fact that numbers are stored as floating point values! The best way to know the exact storage format is to look at the specification:

The type Number is a set of values representing numbers. In ECMAScript, the set of values represents the double-precision 64-bit format IEEE 754 values including the special "Not-a-Number" (NaN) values, positive infinity, and negative infinity.

Source: ECMA 262 - 4.3.20 Number Type

Looking at Wikipedia we've got this table:

Total bits   Sign   Exponent   Significand
64           1      11         52

If you don't know how floating point values work, let me quickly explain. It is actually really simple. You start with a normal unsigned integer, in this case 52 bits wide. You shift it left or right by the value of the exponent, here stored on 11 bits. And if the sign bit is set, you mark the number as negative. (This is a really simplistic view, but it serves the purpose here!)

Example:

Significand: 1000 1100 1001, Exponent: 4, Sign: 1
 
- 1000 1100 1001 0000
                < <<<

As you can see, when the exponent is set, zeros are appended at the end of the number. This is the source of our precision loss.
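You can actually inspect this layout from Javascript by aliasing a Float64Array with a Uint8Array (a sketch; assumes Typed Array support and a little-endian machine):

// Sketch: split a Number into its sign, exponent and significand bits.
function bitsOf(x) {
  var buffer = new ArrayBuffer(8);
  new Float64Array(buffer)[0] = x;
  var bytes = new Uint8Array(buffer);
  var bits = '';
  for (var i = 7; i >= 0; --i) { // most significant byte first
    for (var b = 7; b >= 0; --b) {
      bits += (bytes[i] >> b) & 1;
    }
  }
  return {
    sign: bits.charAt(0),
    exponent: bits.substr(1, 11),
    significand: bits.substr(12, 52)
  };
}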

Maximum Integer

With this in mind we can guess the biggest number that does not suffer from this. We have to fill the significand with 1s and set the exponent and the sign to 0. It is the number 2 ^ 53 - 1.

Significand: 111..111 ~ 52, Exponent: 0, Sign: 0
 
  1111 1111 .... 1111 = 2 ^ 53 - 1

Now let's see what happens if we want to write 2 ^ 53.

Significand: 100..000 ~ 52, Exponent: 1, Sign: 0
 
1 0000 0000 .... 0000 = 2 ^ 53
                    <

It still works, because the least significant bit is set to 0 during the shift. However it is impossible to write 2 ^ 53 + 1; the next number we can represent is 2 ^ 53 + 2!

Significand: 100..001 ~ 52, Exponent: 1, Sign: 0
 
1 0000 0000 .... 0010 = 2 ^ 53 + 2
                    <
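We can check this directly in a Javascript console:

Math.pow(2, 53) + 1
>> 9007199254740992   // rounded back to 2 ^ 53
 
Math.pow(2, 53) + 2
>> 9007199254740994   // exactly representable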

When Trouble Arises!

Finally, we can exhibit a number n such that n == n + 1!

Math.pow(2, 53) == Math.pow(2, 53) + 1
>> true

You should be careful when using traditional loops with big numbers: past this maximum integer value, they become infinite.

// WARNING: DO NOT TRY THIS AT HOME!
 
var MAX_INT = Math.pow(2, 53); // 9 007 199 254 740 992
for (var i = MAX_INT; i < MAX_INT + 2; ++i) {
  // infinite loop
}

Conclusion

The integer part of the Number type in Javascript is safe in [-2 ^ 53 .. 2 ^ 53] (2 ^ 53 = 9 007 199 254 740 992). Beyond this range there will be precision loss on the least significant bits.

If you want to use bigger integers there are several options:

  • String: if you don't want to perform arithmetic operations on them.
  • Combine Multiple Numbers: one number gives you 53 bits of integer precision. It's possible to use an array of them to achieve the precision you need; a minimal sketch follows this list. See Closure Long (Int64) and Leemon Big Integer for real implementations.
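Here is that sketch, storing a 64-bit unsigned value as two 32-bit halves (addition only, names are made up):

// Sketch: a 64-bit unsigned integer as two 32-bit halves.
function add64(a, b) {
  var lo = a.lo + b.lo;
  var carry = lo > 0xFFFFFFFF ? 1 : 0;
  return {
    hi: (a.hi + b.hi + carry) % 0x100000000,
    lo: lo % 0x100000000
  };
}
 
add64({hi: 0, lo: 0xFFFFFFFF}, {hi: 0, lo: 1});
>> {hi: 1, lo: 0}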

A friend of mine gave me a great challenge: find out how jQuery manages to return objects that behave like arrays but are not arrays! The aim of this article is to find out how Firebug and Web Inspector can display an object with the bracket notation.

$('a')
>> [<a href="http://vjeux.com/">vjeux</a>, <a href="http://google.com/">google</a>]

Not an Array!

First, let's make sure that this is really not an Array. If you wonder why jQuery does not simply subclass Array, there are many reasons, explained in this great article. Here is a quick illustration of the core problem (a sketch using the usual prototype trick):
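// Sketch: why naive "subclassing" fails. The magic length property
// does not work on a plain object that merely inherits from an array.
function MyArray() {}
MyArray.prototype = new Array();
 
var arr = new MyArray();
arr[0] = 42;
arr.length
>> 0  // a real Array would report 1

With that out of the way, let's verify that the jQuery object is not an Array: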

// Easy way
typeof $('a')
>> "object"
 
// Normal way
$('a') instanceof Array
>> false
 
// Duck typing ...
$('a').indexOf
>> undefined
 
// Object toString
// http://whattheheadsaid.com/2010/10/cross-context-isarray-and-internet-explorer
Object.prototype.toString.call($('a'))
>> "[object Object]"
 
// http://ajaxian.com/archives/isarray-why-is-it-so-bloody-hard-to-get-right
$('a').constructor
>> function Object() { [native code] }
 
// ES5 Way
// https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/isArray
Array.isArray($('a'))
>> false

Testing ...

Obviously Firebug and Web Inspector must use another technique. All they should need in order to display an array is the length property and the keys [0 .. length-1]. So let's try!

({0: 42, 1: 666, length: 2})
>> Object { 0=42, 1=666, length=2 }

This is obviously not working 🙁

In Source the Truth Lies!

Fortunately, both Webkit and Firebug are open source projects, which means that anyone can browse the source code and tell exactly what happens behind the scenes.

// Web Inspector Source Code
// http://trac.webkit.org/browser/trunk/WebCore/inspector/front-end/InjectedScript.js#L409
 
// FireBug's array detection.
if (isFinite(obj.length) && typeof obj.splice === "function")
  return "array";

We can see that duck typing is being used. It is now really easy to trick Firebug and Web Inspector into believing that your objects are arrays! Only 3 small rules are required:

  • [0 .. length - 1]: Array elements
  • length: Set to a positive integer
  • splice: Any function (even empty)

And here is the demo 🙂

({0: 42, 1: 666, length: 2, splice: function() {}})
>> [42, 666]

Check out the updated World of Warcraft talent calculator I've been working on 🙂