A slug is a way to represent a title with a limited charset (only lowercase letter and dash) to be inserted in the url. Even if it is a common function there is no good enough implentation when you Google for it.

Here are the features I needed:

  • No multiple dashes. ---- is converted to -
  • No wrapping dashes. -title- is converted to title
  • Basic support for internationalization. Coût d'éclat is converted to cout-d-eclat Don't look at the spelling mistake!
  • Basic support for common signs. 13$ & 12€ is converted to 13-dollar-and-12-euro

Demo

You can try the demo and see if it fits your needs.

Code

var keys = keys || function (o) { var a = []; for (var k in o) a.push(k); return a; };
 
var slug = function (string) {
//  var accents = "àáäâèéëêìíïîòóöôùúüûñç";
  var accents = "\u00e0\u00e1\u00e4\u00e2\u00e8"
    + "\u00e9\u00eb\u00ea\u00ec\u00ed\u00ef"
    + "\u00ee\u00f2\u00f3\u00f6\u00f4\u00f9"
    + "\u00fa\u00fc\u00fb\u00f1\u00e7";
 
  var without = "aaaaeeeeiiiioooouuuunc";
 
  var map = {'@': ' at ', '\u20ac': ' euro ', 
    '$': ' dollar ', '\u00a5': ' yen ',
    '\u0026': ' and ', '\u00e6': 'ae', '\u0153': 'oe'};
 
  return string
    // Handle uppercase characters
    .toLowerCase()
 
    // Handle accentuated characters
    .replace(
      new RegExp('[' + accents + ']', 'g'),
      function (c) { return without.charAt(accents.indexOf(c)); })
 
    // Handle special characters
    .replace(
      new RegExp('[' + keys(map).join('') + ']', 'g'),
      function (c) { return map[c]; })
 
    // Dash special characters
    .replace(/[^a-z0-9]/g, '-')
 
    // Compress multiple dash
    .replace(/-+/g, '-')
 
    // Trim dashes
    .replace(/^-|-$/g, '');
};

Bonus

If you are bored, here is a little exercise for you. Can you find a and b such as

a >= b && a < = b
// true
 
a == b
// false
BinaryReader is buggy and no longer maintained. Check out jDataView for an up to date version.

With WebGL coming in, it is important to be able to deal with binary data files like the models. Since there is no such thing on the internet right now I decided to make my own. The javascript BinaryReader library tries to mimic the C# BinaryReader.

The code is mostly based on the binary-parser class from Jonas Raoni Soares Silva. I've added the position management and re-factored the code to remove the with syntax.

In order to load a file into a string, you have to add req.overrideMimeType('text/plain; charset=x-user-defined'); in the Ajax XMLHttpRequest. To read more about this technique, see the Mozilla Developer Center. Here is an overview of the compatible browsers.

  • Chrome 4.0.295.0: Works
  • Firefox 3.5.7: Works
  • Safari 4.0.4: Works
  • Internet Explorer 8: Does not work. Doesn't have the overrideMimeType method.
  • Opera 10.10: Does not work. Have the overrideMimeType method but doesn't take it in account.

I hope this will help you to parse binary files!

BinaryReader is buggy and no longer maintained. Check out jDataView for an up to date version.

Download

API

Constructor

  • new BinaryReader(data: String) - Create a BinaryReader with the specified file.

Read Methods

  • BinaryReader.readChar()
  • BinaryReader.readString(length: Number)
  • BinaryReader.readInt8()
  • BinaryReader.readUInt8()
  • BinaryReader.readInt16()
  • BinaryReader.readUInt16()
  • BinaryReader.readInt32()
  • BinaryReader.readUInt32()
  • BinaryReader.readUInt32()
  • BinaryReader.readFloat()
  • BinaryReader.readDouble()

Position Methods

  • BinaryReader.seek(pos: Number) - Go to a specific position (in Byte) in the file
  • BinaryReader.getPosition() - Returns the actual position (in Byte) in the file
  • BinaryReader.getSize() - Returns the size (in Byte) of the file

Exception

  • Error("Index out of bound") - When you try to read something that is beyond the end of the file.
BinaryReader is buggy and no longer maintained. Check out jDataView for an up to date version.

Example - Demo

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// Code from https://developer.mozilla.org/En/Using_XMLHttpRequest#Receiving_binary_data
function load_binary_resource(url) {
	var req = new XMLHttpRequest();
	req.open('GET', url, false);
	// The following line says we want to receive data as Binary and not as Unicode
	req.overrideMimeType('text/plain; charset=x-user-defined');
	req.send(null);
	if (req.status != 200) return '';
	return req.responseText;
}
 
// Load the file
var file = load_binary_resource('author.gif');
 
// Create the Binary Reader
var reader = new BinaryReader(file);
 
// Read some informations
var tag		= reader.readString(6);
var width	= reader.readUInt16();
var height	= reader.readUInt16();
 
// Move around the file and try to read what is there
reader.seek(parseInt('FA', 16));
var random	= reader.readFloat();
 
console.log(
	tag,	// GIF89a
	width,	// 16
	height,	// 16
	random	// 66.01171875
);

Existing Dispatch Methods

Dispatch is the process of mapping a function call to a specific function based on its arguments. Most of the time this is done at runtime through the binding phase, however it usually lacks of modularity.

Dispatch by Type

The first way dispatch has been implemented is through function overloading. Two functions share the same name and depending on the type of the argument, a choice will be made to call the right one.

int   sqr(int i)   { return i * i; } // int version
float sqr(float i) { return i * i; } // float version

When entering the Oriented Object world, there is a more powerful way to dispatch that is not only based on the type but also the type hierarchy. You can allow an object of type A to use the method implementation of one of its children.

class A { public: virtual void print() = 0; };
 
class B : public A { public: void print() { std::cout < < "B" << std::endl; }; };
class C : public A { public: void print() { std::cout < < "C" << std::endl; }; };
 
void doPrint(A& a) {
  // A doesn't have a definition of the print method
  // There is an implicit dispatching to determine which of the B::print or C::print 
  // will be called. This is done at runtime
  a.print();
}

This is a single dispatch since we could see the print function this way: void print(B this) { ... }. In C++ we are limited to only one dispatch, even if Stroustrup proposed a way to support multimethods in C++. The main problem is that it's being limited to types.

Dispatch by Constant Value

There are ways to use other criteria in order to dispatch, for example the use of numerical constants. In C++ the way to do it is to use templates.

// General case
template < int N >
struct fact {
  enum { value = N * fact< N - 1 >::value };
};
 
// Specialization for n = 0
template <>
struct fact< 0 > {
  enum { value = 1 };
};

You have a code that will run differently depending on the parameter. We can now write functions that work for specialized types and that takes in account their type hierarchy and for specialized cases represented with numerical constants.

If you want to do fancier things, you are stuck! Your only way to achieve the result you expect is to write custom branching yourself. This is where Full dispatch comes in action.

Full Dispatch

Full Dispatch instead of redirecting to the right function based on hard-coded parameters such as the type or the equality of the argument with some constant, allows you to provide a function for each argument.

This way, you will be able to dispatch the function based on the criterion you want. You are no more stuck with a special subset of the language.

You would like to write something like this:

int fact(int n : n == 0 || n == 1) { return 1; }
int fact(int n) { return n * fact(n - 1); }

Implementation

To dispatch, we have to know two informations for each overriding method: the actual function and a list of boolean function to test against the parameters.

I've chosen to store it as an array of overriding method, where the overriding is stored as an array containing in first element the actual function followed by the argument conditions.

The dispatching function is trivial to write, you just have to walk through that table and if all the arguments match the conditions then call it.

function fullDispatch(arg, list) {
  outer: for (var i = 0; i < list.length; ++i) {
    if (list[i].length - 1 !== arg.length) {
      // Arguments length is different than number of conditions, no need to continue
      continue;
    }
 
    for (var j = 0; j < arg.length; ++j) {
      if (typeof list[i][j + 1] === 'function' && !list[i][j + 1](arg[j])) {
        // Condition fail, try next overriding function
        continue outer;
      }
    }
    // Execute the right function
    return list[i][0].apply(this, arg);
  }
  output.error('no dispatch found');
}

And here is the code to use it for the factorial function:

var fact = function () {
  return fullDispatch(arguments, fact.list);
}
 
fact.list = [
  [function (n) { return 1; },
    function (n) { return n === 0 || n === 1; }],
 
  [function (n) { return n * fact(n - 1); },
    null]
];

Helper functions

Most of the time, the condition functions will be trivial to write and will make the code unreadable because of the verbose way to declare an anonymous function.

The first step to make this cleaner was to allow null to be used when you don't want to impose any condition on the argument. It means "always return yes" and avoid a useless function call.

Then, we want to avoid to write basic conditions such as "is zero" or "is negative". Using the power of lambda function in association with closure, it's possible to write generics helpers to make these with no effort.

var helper = {};
 
// Combination
helper.or = function (f, g) { return function (x) { return f(x) || g(x); }; };
helper.and = function (f, g) { return function (x) { return f(x) && g(x); }; };
helper.not = function (f) { return function (x) { return !f(x); }; };
 
// Primitives
helper.Value = function (n) { return function (x) { return x === n; }; };
helper.StrictNegative = function (x) { return x < 0; };
 
// Primitive Combinations
helper.Zero = helper.Value(0);
helper.Negative = helper.or(helper.StrictNegative, helper.Zero);
helper.StrictPositive = helper.not(helper.or(helper.Zero, helper.StrictNegative));
helper.Positive = helper.not(helper.StrictNegative);

Of course, the shown primitive combinations could be written directly but this shows you how to construct and use complex functions based on few bricks.

Final Result

This is an example of the ackermann function written with the Full Dispatch method and the little helper functions.

var ackermann = function () {
  return fullDispatch(arguments, ackermann.list);
}
 
ackermann.list = [
  [function (m, n) { return n + 1; },
    helper.Zero,
    null],
 
  [function (m, n) { return ackermann(m - 1, 1); },
    helper.StrictPositive,
    helper.Zero],
 
  [function (m, n) { return ackermann(m - 1, ackermann(m, n - 1)); },
    helper.StrictPositive,
    helper.StrictPositive]
];

Write Less, Do More

The current version works well but is the syntax is too heavy. In order to make this work we have to copy each time the fullDispatch call which isn't good. We can use a function generator to do the work.

We want to put the conditions first, since they will tend to be smaller, they will appear nicely in one line just before the function definition. And having a setter could be interesting, this would allow to check the input and hide the structure from the user.

Applying these modifications, we would have a code that look like this:

var ack = FullDispatch();
 
ack.add([Zero, null], function (m, n) { return n + 1; });
ack.add([StrictPositive, Zero], function (m, n) { return ack(m - 1, 1); });
ack.add([StrictPositive, StrictPositive], function (m, n) { return ack(m - 1, ack(m, n - 1)); });
ack.add([null, null], function (m, n) { return 0; });

Which is a pretty nice result. This is not as short as the haskell code but this doesn't use any syntaxic sugar from the language and can be used with any kind of argument conditions.

ackermann 0 n = n+1 
ackermann (m+1) 0 = ackermann m 1
ackermann (m+1) (n+1) = ackermann m (ackermann (m+1) n)
ackermann _ _ = 0

To make this work we create a function that is going to to the real dispatch and augment it with the setter and the function list.

var FullDispatch = function () {
  // Create the function that does the real dispatch
  var object = function () {
    return fullDispatch(arguments, object.list);
  };
 
  // Give the setter to the function
  object.add = function (conds, func) {
    object.list.push([func].concat(conds))
  }
 
  // Initialize the function list
  object.list = [];
 
  return object;
};

Demo

The code demonstrated here can been seen working in a demo.

Follow-Up

This little library is working well as it is. There is still some work to do in this field. First, we have to find uses for this pattern. I haven't thought yet of any uses of it and the example provided do not show the real benefit of such a tool. I am pretty sure now this is known and implemented some of you will find how to use it!

The second point is performance issues. Having such a modular dispatch system is great for expressiveness but come at a price. The overhead is pretty huge. One dispatch of a call costs at most n * m where n is the number of overriding functions and m is the number of arguments.

Functions can be added a way that only one (or two) conditions are tested each time. By mimicking the way you would write a branching using if-then-else, the overhead is one function call per condition (and per argument).

It's certainly enough to avoid when performance is required, however I am confident this is acceptable in most conditions compared to the flexibility it gives. Benchmarks would be more than welcome.

For my new project on World of Raids I have to implement a table sorting. The browser not stable sorting and the faster sorting trick add difficulty to the task.

String Comparison

As mentionned in the Speed Up Javascript Sort() article, using a string as a key to represent each element is faster than using a custom sort function in most browsers.

Tables are commonly made of two data types: Numbers and Strings. Strings are the final representation so no work is involved there so our work will be focused on Numbers. What we want is to fit a number into a string. A string is a succession of characters, each one able to hold 256 values. So, we can see the problem as encoding the number into a base 256 which is trivial.

Padding

String comparison does not work exactly like we want it. It reads all the characters one by one and if they mismatch then it returns who is the highest. So for example "10" < "5" because '1' < '5'. We have to pad every numbers with 0 at the beginning and become "10" > "05".

In this case we are adding one 0 because 10 is 2 digits and 5 is only one. We won't know the values of other elements when making a comparison, so two solutions apply: either we iterate over all the elements and retrieve the highest, or we set an arbitrary maximum.

Characters Maximum
1 256
2 65 536
3 16 777 216
4 4 294 967 296

With that table in mind, it is obvious that it would be a waste of time to inspect every element just to find the highest value. 1 or 2 characters will probably fit most uses but if you want to be safe, you will always be able to store your number into 4 characters which is acceptable.

Here is the code to do the conversion:

function compareAsc(num, digits) {
  var s = "";
  // Encode num in base 256
  while (num !== 0) {
    s = String.fromCharCode((num % 256) | 0) + s;
    num = (num / 256) | 0;
    digits -= 1;
  }
  // Fill with 0
  while (digits > 0) {
    s = String.fromCharCode(0) + s;
    digits -= 1;
  }
  return s;
}

For the sake of simplicity, this code is using string concatenation which requires to build up new strings and make copy several times. Since this is for small strings (up to 4 characters!) this is probably alright but one looking for performances could bench this with the use of an Array with a .concat() at the end to build the string.

Also note that numbers are stored as float in Javascript, so we are rounding (flooring to be exact) all the values using the |0 trick.

Descending Order

We have made the code for an ascending sort but we have to handle the other case. We cannot just revert the order of the sorting or adding a minus sign before the number. What we have to do is to do a 256-complement of the final result, in other terms: digit = 255 - digit. Let see an example for a base 10.

00 -> 99
01 -> 98
...
54 -> 45
55 -> 44
56 -> 43
...
98 -> 01
99 -> 00

You can see that all the numbers are now sorted in the opposite order.

Here is the associated code :

function compareDesc(num, digits) {
  var s = "";
  // Encode num in base 256 and do the 256-complement
  while (num !== 0) {
    s = String.fromCharCode(255 - (num % 256) | 0) + s;
    num = (num / 256) | 0;
    digits -= 1;
  }
  while (digits >= 0) {
    s = String.fromCharCode(255) + s;
    digits -= 1;
  }
  return s;
}

Floats and Negatives

We have handled all the positive integers but they are not alone. In order to deal with the negative numbers you can apply the 2-complement to the number. This is how it is done in computer arithmetic.

To handle floats, it is possible to treat them as integers by multiplying by 10^(number of decimal displayed). So 12.34 is translated to 1234 and 11 to 1100 and it's working just fine. You will probably need more than 4 characters if you are using huge numbers though.

Stable Sorting

Implementing a table sorting requires the sorting algorithm to be stable (values that have the same key keep their original order). This property allows people to sort by multiple columns, the last one having the highest weight.

However, browser implementations of the Array.sort() are not stable. From there we have two solutions, implementing a stable sort in javascript or find a way to emulate that behaviour. Since Javascript is slow in many browsers, implementing such a computation heavy algorithm in Javascript is going to slow down things and requires more efforts than I am willing to spend on this project.

The solution instead is to understand what the stable property means and find a way to code it.

Stable sorting algorithms maintain the relative order of records with equal keys. [...] Whenever there are two records (let's say R and S) with the same key, and R appears before S in the original list, then R will always appear before S in the sorted list.

We can easily transpose that sentence into the sorting function:

if (R.key == S.key)
  return compare(R.position, S.position)

We need to know one more thing to do the computation: the position of each element in the list before sorting. This is straightforward to do, you have to iterate over all the elements and update their position field. This requires n steps which is acceptable.

Here is the full Javascript code

var sort = function (a, b) {
  if (a.key === b.key)
    return a.position - b.position;
  if (a.key < b.key)
    return -1;
  return 1;
};

String comparison

To apply this concept to the string comparison, you can append the position at the end of the string. "Key" now being "Key1", "Key2" and so on. When the keys are equal, the position is used for the comparison instead. We can reuse the compareAsc() function to encode the position using the minimal size.

This works well for fixed-size strings like the representation of numbers. However common strings do not share that property and a problem arise when two strings share a common base.

Take "Method" and "Methodology". It is obvious that "Method" < "Methodology". Now append the position: "Method" + chr(250) > "Methodology" + chr(251) the comparison changed because chr(250) > 'o'. (Where chr = String.fromCharCode').

The way to fix this problem is to add a chr(0) before the position. Considering that a normal string never contains a chr(0), the chr(0) will always be lesser than any character of the string and therefore stop the comparison. The longer string always wins!

,

In World of Raids Guild Recruitment Tool, we wanted the search to be instant! We achieved this goal by not having a single page refresh or ajax call while updating the settings. Since it runs in the browser, performance was a real concern. We will see in this article what were the tricks used to make this real.

Sorting

In order to sort the guilds, each guild is given a matching score (between 0 and 65535) based on the current filters. At start, each guild has the maximum points. Then, each filter is giving a malus. For example when a specialization doesn't match, the guild loses 10%.

This means that everytime you change a filter, you have to recompute that matching score for all the guilds. Instead of recomputing the whole matching score, the guild holds the result of each filter. Everytime a filter is updated, the value of this filter is upated on every guild. The global matching score is now recalculated by summing all the partial precomputed scores.

Every bit of code in the matching function has been optimized, no more jquery code, loops are unrolled and great use of lazy boolean expression evaluation. The sort function itself also got boosted thanks to a trick I commented on my previous article Speed Up Javascript Sort().

Guild Recruitment Search Filters

Display

Once we have the data sorted, it has to be displayed! And this is another big time eater. When it takes 50ms to sort the data, it takes about 150ms to display 10 guilds!

Since we wanted performances, DOM elements are created at page loading, they are then only being updated as the guilds change. And still, only the required changes are made, for example tooltips are created as the user trigger the mouseover event.

The main problem with this is that we are not able to make bulk changes. Everytime you make the slightest change, the whole UI has to be redrawn which is a waste of time. I hope there will be such API in the future.

Guild Display

Since it's a web application, we had to maintain the url up to date, this is done using SmallHash, a small library of mine that compresses filter data into the smallest possible hash.

Scaling

Having all the data being sent at page loading is a pain, it makes it load slowly. One way to avoid this problem would be to use the localstorage to keep these data accross sessions and only send the updated guilds. However at the time I started developping the application there was no released browser supporting HTML5.

However, the application is focused on World of Warcraft guilds. This is a limited audience and there is no need on heavy scaling there. So we decided that it was a good option to go.

Up to 3000 guilds, this is pretty much instant in every browser in my MacBook Pro. When increasing that number to 10 000 it is taking more than a second to update on IE8. It requires 50 000 guilds for Chrome 3 to get slow, which is a pretty good score. It means that it is possible to do heavy processing, the bandwith now being the limiting factor.

,