Andres Suarez pointed me to some interesting code in the Hack codebase:

let slash_escaped_string_of_path path =
  let buf = Buffer.create (String.length path) in
  String.iter (fun ch ->
    match ch with
    | '\\' -> Buffer.add_string buf "zB"
    | ':' -> Buffer.add_string buf "zC"
    | '/' -> Buffer.add_string buf "zS"
    | '\x00' -> Buffer.add_string buf "z0"
    | 'z' -> Buffer.add_string buf "zZ"
    | _ -> Buffer.add_char buf ch
  ) path;
  Buffer.contents buf

It replaces every occurrence of \, :, /, \0 and z with zB, zC, zS, z0 and zZ respectively. This way, the escaped string contains none of the characters \, :, / and \0, which are probably invalid in the context where that string is transported. You can still recover the original by transforming all the z-sequences back to their original form.
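To make the "get them back" part concrete, here is a minimal sketch of the reverse transformation. The names char_of_escape_code and path_of_slash_escaped_string are my own (hypothetical, not from the Hack codebase), and rejecting malformed input with Invalid_argument is an assumption; the escaping function is repeated from above so the block runs on its own.

```ocaml
(* The escaping function from above, repeated so the example is self-contained. *)
let slash_escaped_string_of_path path =
  let buf = Buffer.create (String.length path) in
  String.iter (fun ch ->
    match ch with
    | '\\' -> Buffer.add_string buf "zB"
    | ':' -> Buffer.add_string buf "zC"
    | '/' -> Buffer.add_string buf "zS"
    | '\x00' -> Buffer.add_string buf "z0"
    | 'z' -> Buffer.add_string buf "zZ"
    | _ -> Buffer.add_char buf ch
  ) path;
  Buffer.contents buf

(* Hypothetical helper: maps the character following a 'z' back to the
   character it stands for. *)
let char_of_escape_code = function
  | 'B' -> '\\'
  | 'C' -> ':'
  | 'S' -> '/'
  | '0' -> '\x00'
  | 'Z' -> 'z'
  | other -> invalid_arg (Printf.sprintf "unknown escape sequence z%c" other)

(* Hypothetical inverse: every 'z' in an escaped string starts a
   two-character sequence, so we decode it and skip both characters. *)
let path_of_slash_escaped_string escaped =
  let len = String.length escaped in
  let buf = Buffer.create len in
  let rec loop i =
    if i >= len then ()
    else if escaped.[i] <> 'z' then begin
      Buffer.add_char buf escaped.[i];
      loop (i + 1)
    end else if i + 1 < len then begin
      Buffer.add_char buf (char_of_escape_code escaped.[i + 1]);
      loop (i + 2)
    end else
      invalid_arg "dangling z at end of input"
  in
  loop 0;
  Buffer.contents buf
```

Decoding works in a single left-to-right pass because a z in the escaped string is always the start of an escape sequence, never plain data.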

Why is it useful?

The first interesting aspect is that it uses z as the escape character instead of the usual \. In practice, a string is less likely to contain a z than a \, so we have to escape less often.

But the big win comes when escaping multiple times. With the \ escape sequence, repeated escaping looks something like this:

  • \ -> \\ -> \\\\ -> \\\\\\\\ -> \\\\\\\\\\\\\\\\

whereas with the z escape sequence:

  • z -> zZ -> zZZ -> zZZZ -> zZZZZ

The fact that every additional round of escaping doubles the number of escape characters is problematic in practice, whereas the z scheme only adds one character per round. I once worked on a project where we found out that the \ character represented 70% of the payload!
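Here is a small sketch that measures the growth for both schemes. The function names are mine (hypothetical), and each escaper is reduced to its escape character only, which is enough to reproduce the two sequences above.

```ocaml
(* Classic scheme: every backslash is escaped by doubling it. *)
let backslash_escape s =
  let buf = Buffer.create (String.length s) in
  String.iter (fun ch ->
    if ch = '\\' then Buffer.add_string buf "\\\\"
    else Buffer.add_char buf ch
  ) s;
  Buffer.contents buf

(* z scheme: every z becomes zZ. The added Z is an ordinary character,
   so the next round of escaping leaves it alone. *)
let z_escape s =
  let buf = Buffer.create (String.length s) in
  String.iter (fun ch ->
    if ch = 'z' then Buffer.add_string buf "zZ"
    else Buffer.add_char buf ch
  ) s;
  Buffer.contents buf

(* Apply [escape] to [s] [n] times in a row. *)
let rec escape_n_times escape n s =
  if n <= 0 then s else escape_n_times escape (n - 1) (escape s)

let () =
  for n = 0 to 4 do
    Printf.printf "%d rounds: backslash scheme %2d chars, z scheme %d chars\n"
      n
      (String.length (escape_n_times backslash_escape n "\\"))
      (String.length (escape_n_times z_escape n "z"))
  done
```

After four rounds, the single backslash has become 16 characters while the single z has become 5.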


It's way too late to change all the existing programming languages to escape characters differently, but if you have the opportunity to design an escape sequence, know that the \ escape sequence is not always the best 🙂

I work a lot with URLs that contain ids, and very often I get one digit of a long id wrong and end up with a completely different element. If I don't pay attention, I end up looking at two elements thinking they are the same, and I stay puzzled until I find the mistake.

In order to avoid that, I wanted to know if I could build a sequence of ids where making one mistake never gives another valid id. For example, if you have the id 473, then you have to blacklist all the ids where the first digit is wrong (073, 173, 273, 373, 573, 673, 773, 873, 973), where the second digit is wrong (403, 413, 423, 433, 443, 453, 463, 483, 493) and where the last digit is wrong (470, 471, 472, 474, 475, 476, 477, 478, 479).
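The blacklist for a given id can be enumerated mechanically. This is a minimal sketch; the function name one_digit_typos and its width parameter (the number of digits, leading zeros included) are hypothetical names of mine.

```ocaml
(* All ids that differ from [id] in exactly one of its [width] digits.
   Ids are plain integers, so 073 is represented as 73. *)
let one_digit_typos ~width id =
  let rec pow10 k = if k = 0 then 1 else 10 * pow10 (k - 1) in
  List.concat_map (fun pos ->
    let place = pow10 pos in
    let digit = id / place mod 10 in
    (* Replace the digit at this position by each of the 9 other digits. *)
    List.filter_map (fun d ->
      if d = digit then None else Some (id + (d - digit) * place))
      (List.init 10 Fun.id))
    (List.init width Fun.id)

let () =
  one_digit_typos ~width:3 473
  |> List.iter (fun typo -> Printf.printf "%03d " typo);
  print_newline ()
```

For 473 this yields the 27 ids listed above: 9 per digit position.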

Here is the list of the first pseudo-numbers:

  1. 011
  2. 022
  3. 033
  4. 044
  5. 055
  6. 066
  7. 077
  8. 088
  9. 099
  10. 101
  11. 110
  12. 123
  13. 132
  14. 145
  15. 154
  16. 167
  17. 176
  18. 189
  19. 198
  20. 202
  21. 213
  22. 220
  23. 231
  24. 246
  25. 257
  26. 264
  27. 275
  28. 303
  29. 312
  30. 321
  31. 330
  32. 347
  33. 356
  34. 365
  35. 374
  36. 404
  37. 415
  38. 426
  39. 437
  40. 440
  41. 451
  42. 462
  43. 473
  44. 505
  45. 514
  46. 527
  47. 536
  48. 541
  49. 550
  50. 563
  51. 572
  52. 606
  53. 617
  54. 624
  55. 635
  56. 642
  57. 653
  58. 660
  59. 671
  60. 707
  61. 716
  62. 725
  63. 734
  64. 743
  65. 752
  66. 761
  67. 770
  68. 808
  69. 819
  70. 880
  71. 891
  72. 909
  73. 918
  74. 981
  75. 990
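A list like this one can be produced by a greedy search: walk through the integers in order and keep a candidate only if it differs from every id kept so far in at least two digits. The sketch below is my reconstruction under that assumption, with hypothetical names. It starts at 000, which it accepts first; that is consistent with the list above starting at 011.

```ocaml
(* Number of digit positions in which [a] and [b] differ. The shorter
   number is implicitly padded with leading zeros. *)
let rec digit_distance a b =
  if a = 0 && b = 0 then 0
  else
    (if a mod 10 <> b mod 10 then 1 else 0)
    + digit_distance (a / 10) (b / 10)

(* Greedy search over 0 .. limit - 1: a candidate is kept only if it is
   at distance 2 or more from every id accepted so far. *)
let pseudo_numbers limit =
  let rec go candidate accepted =
    if candidate >= limit then List.rev accepted
    else if List.for_all (fun id -> digit_distance candidate id >= 2) accepted
    then go (candidate + 1) (candidate :: accepted)
    else go (candidate + 1) accepted
  in
  go 0 []

let () =
  pseudo_numbers 124
  |> List.iter (fun id -> Printf.printf "%03d " id);
  print_newline ()
```

Each candidate is compared against all previously accepted ids, so this is quadratic and only meant for small experiments.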

And here is a visual representation of those numbers:

For each digit in the number, we blacklist 9 other numbers. For a 5-digit number (e.g. 12345), that means blacklisting 45 numbers. For a 10-digit number (e.g. 1234567890), that means blacklisting 90 numbers. The number of blacklisted numbers grows linearly with the number of digits, and therefore only logarithmically with the size of the id.

In order to see how many numbers we lose, I plotted the ratio of the pseudo-number count to the count of real numbers. We can keep roughly one number in fifteen. The good news is that the ratio doesn't fall off the chart as the numbers grow.

At first glance, the numbers seemed to go up in steps of 10, with some huge jumps and some pairs that are closer together. So I plotted the difference between consecutive numbers, and the difference does look centered around 10. But the variance gets higher and higher as you move further.

I'm not really sure this sequence can be useful in practice, but it was a fun weekend experiment. I hope it'll give you some, hopefully useful, ideas 🙂

Lately, I've been advocating to all my student friends to start a blog. Here's an article with the most common questions answered 🙂

What are the benefits?

Being known as an expert. The majority of my blog posts are about advanced Javascript topics. As a result, people who know me (in the broad sense) have tagged me as the "Javascript" guy. Every time they have a question about Javascript or Web Development, they come to me.

Expanding your relations. Through my activities with this blog, I have had conversations with celebs such as Bjarne Stroustrup (C++), Brendan Eich (Javascript), Jeremy Ashkenas (CoffeeScript) and many other people that I didn't know before. No one in my close relations is deeply interested in what I am writing. But thanks to the internet, I can meet other people who share the same interests.

Getting recruited. As you become an expert in a field, people start to notice and will want you. My blog is barely known and still, I have received several job offers. However, this happens at random: take it as a gift but don't expect it to happen, or it won't 🙂

Being better. I've become so much better at Javascript since I started writing about it. Maintaining this blog has also helped me improve my writing skills as well as my English.

Benefits are long term. You should not start a blog and expect it to be rewarding the next week. A blog is a presence on the internet. During the first 6 months, I barely had more than 5 visitors per day, but this number slowly started growing as I wrote more articles and time passed. Now, I'm at 100 visitors a day. I'm still amazed to see that so many people come!

Who cares about my blog?

Your friends. The first thing I do when I am done writing an article is to paste the link to some of my MSN contacts who might be vaguely interested in it. It is a powerful way to start a conversation with people you have not talked with for a long time. I often rewrite parts of the article based on the feedback I receive, and it leads to great conversations.

People on the Internet. My girlfriend always says "If you thought about it, someone did it". Along the same lines, I believe that the following statement is correct: "If you find something interesting, someone else does too". Here is a fact: there are over 100 visitors on my blog every day. Google connects people with the same interests!

People who want to know about you. The extreme example is your recruiter, but it could be a co-worker or even just a friend. Your blog is the place where you can show who you are and what you are worth. It is much better than a resume as it isn't constrained by a rigid format and a ridiculous 1-page limit.

I don't know what to write

Projects you've done. The easiest way to start a blog is to write one article per school or personal project you have done. Put in a screenshot and a description of what it does and you are set. This way, if I want to know about you, I can quickly scroll through your blog and see what you have done. If something interests me, I'm going to read the article more closely.

Techniques you have used. It is interesting to go over your projects and dedicate one blog article to each technical problem you solved. For example, the project Guild Recruitment led me to write several articles such as Dynamic Query Throttling, Mysqli Wrapper and Search Optimizations. The last one eventually made me dig deeper and write Sorting Table.

Follow up on a conversation. I often debate with my friends on topics related to computer science. Those discussions are potential targets for a blog post! It is really interesting to go deeper and start writing about it. You'll get more arguments to crush your friends' theories 🙂

Start writing and ideas will flow. Once you have an article written, it often generates new ideas. Now, every time I write some code, I ask myself if it would be worth blogging about. If yes, I create a draft on WordPress for later. This way, when I'm in the mood to write, I have many things to write about.

There are already many articles on the subject, why should I write mine?

Influence people around you. I always love reading my friends' blogs because they talk about subjects that I'm interested in but that I would not have researched on my own. It helps expand my Javascript-centric horizon.

If you write about it, it means you know it. For example, I wrote an article about how to use makedepend in makefiles. It tells a recruiter multiple things: I have experienced the problem of dependency management, and I am comfortable with Makefiles as I used them in a fancy way. This is invaluable compared to a Makefile entry in a resume.

Go deeper in the subject. I often find myself writing code pretty easily without too much thinking. Writing about a particular sexy technique on my blog is more challenging. It requires me to state the exact problem I am trying to solve, think about alternatives, look for existing solutions ...

After the post has been written, I receive a lot of feedback. My friends almost always talk about how it would have been solved in their own language of predilection. They also provide more use cases where the technique would and wouldn't work, suggest leads to improve it ...

How long does it take?

Count one evening per post. I usually spend one full evening writing down an involved article like this one. However, I think a lot about the subject beforehand in a passive way, by reading related articles, thinking about it in bed ... Articles that show off a project are much quicker to write, I'd say about 30 minutes.

Post as often as you want. When I have something I deeply want to share, I write an article about it. Once I've written one, I often find myself writing a few more from drafts I had. Then there is a period of silence. Overall, I'd say I write every two months, often during holidays.

How many articles. A new article does not overshadow the previous ones, it adds to them. When I check my analytics, each article written slightly increases my daily viewer count. But even with one article, a blog is worth it.

How do I start?

Install WordPress. Even if you are a geek, I strongly recommend against coding your own blog or using some nerdy blog software. I went down that road and found myself coding stuff every time I wanted to write an article. WordPress gives you all the tools you need to blog, and if you want to do something a bit special, chances are there's already an extension that does it.

Quickly write your first article. Don't spend so much time finding the perfect theme or installing many extensions you don't even need yet. The first article is the hardest! Write it down and send it to your friends. Listen to their feedback and instead of editing the article, write another article trying to fix those issues. Writing is hard so it'll take more than one shot to feel comfortable with it. The more you practice, the easier it will be.


If I managed to convince you to start a blog, please give me the link! I'd love to read about your stuff and may help you get started if you want.

Popularity System

In Starcraft 2, the Custom Maps are being listed by popularity. The popularity is the number of times the map has been played for more than 5 minutes during the last 12 hours.

As I am running the website, I would like to have this listing. This would let the players know which maps are popular in the other regions (US, EU, KR ...) but also let them quickly find the maps they have played in order to leave comments!

Getting the list!

Bad news, the listing is only available inside the game. And with the anti-hack measures Blizzard is taking, extracting it from the client is not a viable solution. So we have to find another one!

Fortunately, there is one. Blizzard maintains a website with the profile of every player. Each player profile lists the latest 24 games played.

Since we can't parse all the characters (there are more than 1.6 million as I write this), we parse as many as we can, picking them randomly. For every game found, we add +1 to the pair [Map, Date]. As a result, for each map we get an approximation of the number of times it has been played.
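The counting step can be sketched as follows. It assumes the match histories of the sampled players have already been fetched and parsed; the game record, its field names and the sample data are all made up for illustration.

```ocaml
(* One entry of a player's match history. Field names are hypothetical. *)
type game = { map : string; date : string }

(* Add +1 to the (map, date) counter for every game found in every
   sampled player's history. *)
let tally_games (histories : game list list) : (string * string, int) Hashtbl.t =
  let counts = Hashtbl.create 1024 in
  List.iter (fun history ->
    List.iter (fun g ->
      let key = (g.map, g.date) in
      let previous = Option.value (Hashtbl.find_opt counts key) ~default:0 in
      Hashtbl.replace counts key (previous + 1))
      history)
    histories;
  counts

(* Made-up sample: two players, three games in total. *)
let sample_histories = [
  [ { map = "Nexus Wars"; date = "Sep 15" };
    { map = "Some Other Map"; date = "Sep 15" } ];
  [ { map = "Nexus Wars"; date = "Sep 15" } ];
]

let () =
  tally_games sample_histories
  |> Hashtbl.iter (fun (map, date) plays ->
       Printf.printf "%s on %s: %d\n" map date plays)
```

Because only a random sample of players is parsed, the counts are proportional to the real play counts rather than equal to them.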

As you can see, there's a hole around September 9. This is a side effect of the limit of 24 latest games. The data was gathered over 2 days, the 15th and the 16th. Before the 9th, you can see all the casual players who don't play too often and whose limit of 24 games has not been reached. Between the 11th and the 13th are the frequent players, and after that the hardcore ones.

However those artifacts are likely to disappear with a constant spidering.

Does it match?

Now that we've got a lot of data, the question that everyone's waiting for ... Does it match the popularity listing?

Calculated Popularity | Position Difference | Real Popularity

As you can see, the top 12 maps are the same, with some small ordering differences. However, since the order is constantly changing, that's not a big issue.

In order to get those results, about 30 000 players have been parsed, and the most popular map, Nexus Wars, has 2200 points (its real popularity is about 6000). Instead of using the data from the last 12 hours, we used the data from the current day and the day before (the granularity of the listing is 1 day).
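Turning the per-day counts into a ranking could look like this. It is a sketch under my own assumptions: the counts arrive as a list of ((map, date), plays) entries, the dates to keep are passed in explicitly, and every name and number in the sample is made up.

```ocaml
(* Sum the plays of each map over the given [days], then sort the maps
   by decreasing total (ties broken by map name). *)
let rank_maps ~days counts =
  let totals = Hashtbl.create 64 in
  List.iter (fun ((map, date), plays) ->
    if List.mem date days then begin
      let previous = Option.value (Hashtbl.find_opt totals map) ~default:0 in
      Hashtbl.replace totals map (previous + plays)
    end)
    counts;
  Hashtbl.fold (fun map plays acc -> (map, plays) :: acc) totals []
  |> List.sort (fun (map_a, a) (map_b, b) ->
       match compare b a with
       | 0 -> compare map_a map_b
       | c -> c)

(* Made-up sample counts; "Sep 14" falls outside the two-day window. *)
let sample_counts = [
  (("Map A", "Sep 15"), 3);
  (("Map B", "Sep 15"), 5);
  (("Map A", "Sep 16"), 4);
  (("Map B", "Sep 14"), 10);
]

let () =
  rank_maps ~days:[ "Sep 16"; "Sep 15" ] sample_counts
  |> List.iteri (fun i (map, plays) ->
       Printf.printf "%d. %s (%d points)\n" (i + 1) map plays)
```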


It was really hard to know in advance whether this method would give results similar to the in-game popularity. The only way to make sure was to test! I've set up the website (recycled domain name) to show the values. However, due to the number of requests needed, I am probably not going to keep the spider running for a long time.

I just showed you that it was possible to datamine websites in order to obtain useful statistics 🙂 I hope you like it!

Prime number recognition is a very hard problem, and no good enough solution has been found yet using classical algorithms. There are two ways to get around those limitations: find an algorithm with a better complexity, or find a way to compute faster. The first one has already been researched by many people, so I decided to give the second one a shot.

Computers are limited by the speed of electricity, so my idea is to use the speed of light to compute much faster. I've come up with a simple example of how to use it to compute prime numbers.

We set up 2 mirrors with light sources that are directed.


We first set the sensor to the lower position 2.

2 had no light, so we create a new light that is going to light up all the multiples of 2.


3 had no light, so we create a new light that is going to light up all the multiples of 3.

4 is already in the light, we skip it

Here is the pseudo code of the prime calculation with Light & Mirror.

// Sensor placed at lower n that tells whether that position is lit or not
hasLight(n)

// Creates a light that starts from lower n, targeted at the upper n + n/2,
// that first reflects to lower 2n and then to all the multiples of n.
createLight(n)

for i = 2 to sqrt(n)
  if not hasLight(i)
    createLight(i)
return not hasLight(n)
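To check the logic without buying lasers, the setup can be simulated in software. In this sketch (hypothetical names, my own simplification) a boolean array stands in for the lower mirror: lit.(k) is what the sensor at position k would report, and creating a light at i marks 2i, 3i, ... as lit. This is essentially the sieve of Eratosthenes, so it says nothing about the speed of the optical version.

```ocaml
let is_prime_with_lights n =
  if n < 2 then false
  else begin
    (* lit.(k) plays the role of the sensor at lower position k. *)
    let lit = Array.make (n + 1) false in
    (* A light created at i hits lower 2i, 3i, ... but not i itself. *)
    let create_light i =
      let k = ref (2 * i) in
      while !k <= n do
        lit.(!k) <- true;
        k := !k + i
      done
    in
    (* Same loop as the pseudo code: i goes from 2 to sqrt(n). *)
    let i = ref 2 in
    while !i * !i <= n do
      if not lit.(!i) then create_light !i;
      incr i
    done;
    (* n is prime exactly when no light reaches it. *)
    not lit.(n)
  end

let () =
  List.init 30 Fun.id
  |> List.filter is_prime_with_lights
  |> List.iter (Printf.printf "%d ");
  print_newline ()
```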

This method allows us to recognize prime numbers with light sources, mirrors, sensors and a bit of programming.

Recognizing whether a number is prime this way requires a lot of lasers and sensors. I don't think that is a viable process, but the same idea could be interesting for multiplications: you fire 2 lights and then see at which point they cross. However, there are 2 main problems:

  • It requires 2 really long mirrors, possibly infinite, which is not possible in practice. It may be possible to add a vertical mirror in order to work in a bounded area, but then we need a way to know how many times the light hit that mirror.
  • It requires an infinite number of sensors. We could have a single sensor that checks all positions sequentially, but then we lose the speed of light! What we want instead are mirrors at each position that redirect the light to a single sensor capable of knowing where it came from.

This is an experiment, and at the moment it is nowhere near faster than current methods, but it is a new way to think about programming. I don't know if anything useful is going to be derived from it, but it shows how light can be used to compute things.