SmallHash encodes any range of integers into the smallest possible string. This way, you can use the hash part of your url with efficiency.

You can view the source at the SmallHash Github Repository.

My problem is having these options stored in the minimum characters as possible.

  • Faction: Alliance, Horde
  • Region: US, Europe
  • Type: PvE, PvP, RP
  • Lang: EN, FR, ES, DE, RU

The two faction and region could be stored in base 2 with no problem. However, if we wanted to store the others in base 2, there would have been space left. So i started digging up into the base conversion.

Here is the code to do a base2 to base10 conversion.

base10 = 0
foreach (bit in base2) {
  base10 *= 2
  base10 += bit
}

As you can see, we multiply the final number by 2, which is the number of possibilities. So, instead of multiplying by 2, we multiply by the number of possible options and it works! The decoding process is using the same technique by changing the divisor.

To get back to our example. [Alliance, US, PvP, DE] can be expressed as [0,0,1,3] over [2,2,3,5]. It will be encoded and decoded easily with the SmallHash library:

var input = [0,0,1,3];
var encoded = SmallHash.encode(input, [2,2,3,5], 'abcdefghijklmnopqrstuvwxyz');
var decoded = SmallHash.decode(encoded, [2,2,3,5], 'abcdefghijklmnopqrstuvwxyz');
console.log(input, encoded, decoded);
// Result: [0, 0, 1, 3], "bo", [0, 0, 1, 3]

As you can see, it fits into 2 characters instead of 4 with the easy way. The gain increases with the number of data you have to encode. This can also be improved by enlarging the base characters (uppercase letter, digits and special characters).

The algorithm is fairly easy, it is the same one explain before but using the range instead of 2 (when converting in base 2). This is the pseudo-code version.

 
SmallHash = {
  // encode( [2, 4], [10, 15], '0123456789' ) : '42'
  encode: function (input, ranges, base) {
    var result = 0
    for offset = ranges.length - 1 downto 0
      result = result * ranges[offset]
      result = result + input[offset]
 
    return int2str(result, base)
  },
 
  // decode( '42', [10, 15], '0123456789' ) : [2, 4]
  decode: function (input, ranges, base) {
    input = str2int(input, base)
    var result = []
 
    for offset = 0 to ranges - 1
      result[offset] = inputs % ranges[offset]
      inputs = inputs / ranges[offset]
 
    return result;
  }
};

Here is the full source code. This is the same code but being less readable due to the use of BigInt and the need of managing the allocation size.

// Requires BigInt.js ( http://blog.vjeux.com/wp-content/uploads/2009/08/BigInt.js )
SmallHash = {
  encode: function (input, ranges, base) {
    // Rough majoration of the final result size
    // It makes the sum of all the minimum of bits required for each range
    var size = 0;
    for (var i = 0, len = ranges.length; i < len; i = i + 1) {
      size += Math.ceil(Math.log(ranges[i]) / Math.LN2);
    }
    var result = bigInt.int2bigInt(0, size);
    for (var bit = ranges.length - 1, pos = 0; bit >= 0; bit = bit - 1, pos = pos + 1) {
      // If the value is higher than the expected range, the value is maximized
      // Therefore the result is always valid, even if the input is not
      var parsed_bit = Math.min(Math.abs(Math.floor(input[bit])), ranges[bit] - 1);
      bigInt.mult_(result, bigInt.int2bigInt(ranges[bit], 32));
      bigInt.add_(result, bigInt.int2bigInt(parsed_bit, 32));
    }
    return bigInt.bigInt2str(result, base.length, base);
  },
  decode: function (input, ranges, base) {
    input = bigInt.str2bigInt(input, base.length, base);
    var remainder = bigInt.dup(input); // Allocates enough room for the remainder
    var result = [];
    for (var pos = 0, len = ranges.length; pos < len; pos = pos + 1) {
      bigInt.divide_(input, bigInt.int2bigInt(ranges[pos], 32), input, remainder);
      result[pos] = Number(bigInt.bigInt2str(remainder, 10, '0123456789'));
    }
    return result;
  }
};

This script is using the BigInt library from Leemon Baird. I made some changes in order not to pollute the global namespace and added the possibility to modify the base string.

Update January 2010 - SmallHash is now being used on production at wowtal.com and you can download the source at http://static.mmo-champion.com/db/js/smallhash.js.

If you liked this article, you might be interested in my Twitter feed as well.
 
 

Related Posts

  • September 22, 2011 URLON: URL Object Notation (42)
    #json, #urlon, #rison { width: 100%; font-size: 12px; padding: 5px; height: 18px; color: #560061; } I am in the process of rewriting MMO-Champion Tables and I want a generic way to manage the hash part of the URL (#table__search_results_item=4%3A-slot). I no longer want to wr...
  • January 3, 2010 Bistromathique – Optimized Arbitrary Precision Calculator (0)
    The Bistromathique is an Arbitrary Precision Calculator EPITA project where the main focus is optimization. The input can be in any base (up to 250) and the following operations have to be performed: Addition, Subtraction, Multiplication, Division, Modulo. Base Representation Going back and f...
  • August 4, 2009 Project – Shortest Path (0)
    A school project was to find the shortest path in a dungeon graph. You start with an amount of hit points, and each edge gives or removes hit points. You have to find the path from two points going through the minimum of edges (no matter their value) alive (hp > 0 all along the path). The difficu...
  • January 8, 2010 Javascript – Sorting Table (6)
    For my new project on World of Raids I have to implement a table sorting. The browser not stable sorting and the faster sorting trick add difficulty to the task. String Comparison As mentionned in the Speed Up Javascript Sort() article, using a string as a key to represent each element is fas...
  • December 24, 2009 Guild Recruitement – Search Optimizations (0)
    In World of Raids Guild Recruitment Tool, we wanted the search to be instant! We achieved this goal by not having a single page refresh or ajax call while updating the settings. Since it runs in the browser, performance was a real concern. We will see in this article what were the tricks used to ...