I am in the process of rewriting MMO-Champion Tables and I want a generic way to manage the hash part of the URL (#table__search_results_item=4%3A-slot). I no longer want to write a new parser every time I want to add a new widget.

Why JSON is not suited for URLs

JSON uses characters that are uncommon in URLs. For example, having the double quote character in the URL sucks for HTML embedding. Also, many automatic URL linkifiers fail to highlight the whole URL.

MSN Messenger trying to automatically linkify the URL: note that the last three } are left out of the link.

http://website.com/#{"table":{"achievement":{"column":"instance","ascending":true}}}

As a user, I am also really disturbed when I see this URL. This does not feel like a normal URL. I am a bit suspicious.

A strong requirement for a URL is its small size. The JSON representation can be further compressed by removing extra quotes and ending characters such as " and }.
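
To get a feel for the size cost, here is a small sketch using the standard encodeURI function on the JSON from the example above. The variable names are only for illustration; the point is that every quote and brace turns into three characters once the URL is percent-encoded.

```javascript
// The state object from the example URL above.
const state = { table: { achievement: { column: 'instance', ascending: true } } };

const json = JSON.stringify(state);
// encodeURI leaves ":" and "," alone but must escape { } and ".
const encoded = encodeURI(json);

console.log(encoded);
// %7B%22table%22:%7B%22achievement%22:%7B%22column%22:%22instance%22,%22ascending%22:true%7D%7D%7D
console.log(json.length, encoded.length); // 64 96
```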

URLON

This is why I made URLON, a new Object Notation intended to be used in URLs. It makes use of URL friendly characters such as = : @ _ &. It is shorter than JSON and should no longer scare your users :)

A reference implementation is available:

It is also available on NPM:

 npm install URLON
var URLON = require('URLON');
console.log(URLON.stringify({URLON: 'works'}));
// _URLON=works

JSON / URLON Comparison

Object

An object starts with an underscore _. All the fields are separated by ampersand & and terminated by a semicolon ;. It makes it feel really url'ish. Note that trailing semicolons can safely be removed.

  • JSON: {"first": "value", "second": "value"}
  • URLON: _first=value&second=value

String

A string simply starts with an equal sign =. A string stops as soon as it reaches a reserved character (= : @ _ &). You can escape a reserved character by putting a slash / before it.

  • JSON: "string"
  • URLON: =string

Number, Boolean and null

Sadly, we have to distinguish between Strings and Numbers/Booleans/null. Therefore I chose to use a colon : for those instead.

  • JSON: 123.42
  • URLON: :123.42
  • JSON: true
  • URLON: :true
  • JSON: null
  • URLON: :null

Array

An array starts with an at sign @ and its elements are separated by @ as well. The prefix typing does not really fit well with the Array syntax. If you can find something better, I'm open to your suggestions :)

  • JSON: [1, "vjeux", 3]
  • URLON: @:1@=vjeux@:3
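
The four rules above can be put together in a short sketch. This is not the reference implementation: sketchStringify is a hypothetical name, it only handles JSON-compatible values, and using encodeURI for the remaining characters is an assumption on my side.

```javascript
// Hypothetical sketch of the rules above, NOT the reference implementation.
function sketchStringify(value) {
  const isContainer = v => v !== null && typeof v === 'object';

  // Put a slash before reserved characters, then percent-encode the rest
  // (encodeURI is an assumption, chosen so that spaces become %20).
  function escape(text) {
    return encodeURI(text.replace(/([=:@_&;\/])/g, '/$1'));
  }

  function encode(v) {
    if (Array.isArray(v)) {
      return '@' + v.map(encode).join('@') + ';';
    }
    if (isContainer(v)) {
      const fields = Object.keys(v).map(key => escape(key) + encode(v[key]));
      return '_' + fields.join('&') + ';';
    }
    if (typeof v === 'string') {
      return '=' + escape(v);
    }
    return ':' + v; // number, boolean or null
  }

  // Count the terminators that close containers at the very end of the output.
  function trailing(v) {
    if (!isContainer(v)) return 0;
    const children = Object.values(v);
    return 1 + (children.length ? trailing(children[children.length - 1]) : 0);
  }

  // Trailing terminators can safely be removed.
  const out = encode(value);
  return out.slice(0, out.length - trailing(value));
}

console.log(sketchStringify({ first: 'value', second: 'value' })); // _first=value&second=value
console.log(sketchStringify([1, 'vjeux', 3]));                     // @:1@=vjeux@:3
```

Counting the trailing terminators instead of stripping every ; at the end keeps a string that really ends with an escaped semicolon intact.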

URLON / Query String

Query String is a standard for encoding structured data in a URL context. It is implemented natively in PHP and Ruby. However, I strongly believe that URLON is better for several reasons:

  • Query Strings are not typed: every atomic value is a string. This makes them less expressive than JSON and harder to work with, as numbers and booleans are common in URLs.
  • Query Strings are big: for every leaf node, the whole identifier is written out, which is a waste of space.
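
The typing point is easy to see with the standard URLSearchParams API, available in modern browsers and in Node.js. The parameter names below are made up for the illustration.

```javascript
// Parse a query string with the standard URLSearchParams API.
const params = new URLSearchParams('user[age]=47&user[active]=true');

const age = params.get('user[age]');
const active = params.get('user[active]');

console.log(typeof age, typeof active); // string string
console.log(age + 1);                   // 471 (string concatenation, not 48)
```

Every caller has to remember which fields were numbers or booleans and convert them back by hand.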

The following example from the PHP Documentation compares JSON, Query String, URLON and Rison.

JSON

{
  "user": {
    "name": "Bob Smith",
    "age": 47,
    "sex": "M",
    "dob": "5-12-1956"
  },
  "pastimes": ["golf", "opera", "poker", "rap"],
  "children": {
    "bobby": {
      "age": 12,
      "sex": "M"
    },
    "sally": {
      "age": 8,
      "sex": "F"
    }
  }
}

Minified JSON

{"user":{"name":"Bob Smith","age":47,"sex":"M","dob":"5-12-1956"},"pastimes":
["golf","opera","poker","rap"],"children":{"bobby":{"age":12,"sex":"M"},
"sally":{"age":8,"sex":"F"}}}

Query String

user[name]=Bob+Smith&user[age]=47&user[sex]=M&user[dob]=5-12-1956&pastimes[0]=golf
&pastimes[1]=opera&pastimes[2]=poker&pastimes[3]=rap&children[bobby][age]=12
&children[bobby][sex]=M&children[sally][age]=8&children[sally][sex]=F

URLON

_user_name=Bob%20Smith&age:47&sex=M&dob=5-12-1956;&pastimes@=golf@=opera
@=poker@=rap;&children_bobby_age:12&sex=M;&sally_age:8&sex=F
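
To check that the URLON string above decodes back to the same data, here is a hypothetical parser sketch. It is not the reference implementation: sketchParse is a made-up name, error handling is minimal, and calling decodeURI first is an assumption about how %20 should be treated.

```javascript
// Hypothetical parser sketch, NOT the reference implementation.
function sketchParse(input) {
  const text = decodeURI(input); // undo percent-encoding such as %20
  let pos = 0;

  // Read up to the next unescaped reserved character; "/" escapes one character.
  function readToken() {
    let token = '';
    while (pos < text.length && !'=:@_&;'.includes(text[pos])) {
      if (text[pos] === '/' && pos + 1 < text.length) pos++;
      token += text[pos++];
    }
    return token;
  }

  // Skip the optional ";" that closes an object or an array.
  function skipTerminator() {
    if (text[pos] === ';') pos++;
  }

  function parseValue() {
    const prefix = text[pos++];
    if (prefix === '=') return readToken();
    if (prefix === ':') {
      const token = readToken();
      if (token === 'true') return true;
      if (token === 'false') return false;
      if (token === 'null') return null;
      return Number(token);
    }
    if (prefix === '@') {
      const items = [];
      while (pos < text.length && text[pos] !== ';') {
        items.push(parseValue());
        if (text[pos] !== '@') break;
        pos++; // skip the "@" separator
      }
      skipTerminator();
      return items;
    }
    if (prefix === '_') {
      const fields = {};
      while (pos < text.length && text[pos] !== ';') {
        const key = readToken();
        fields[key] = parseValue();
        if (text[pos] !== '&') break;
        pos++; // skip the "&" separator
      }
      skipTerminator();
      return fields;
    }
    throw new SyntaxError('Unexpected character at position ' + (pos - 1));
  }

  return parseValue();
}

const decoded = sketchParse(
  '_user_name=Bob%20Smith&age:47&sex=M&dob=5-12-1956;&pastimes@=golf@=opera' +
  '@=poker@=rap;&children_bobby_age:12&sex=M;&sally_age:8&sex=F'
);
console.log(decoded.user.age, decoded.pastimes.length, decoded.children.sally.sex); // 47 4 F
```

The ; terminators are what tell the parser that pastimes and children belong to the top-level object rather than to user.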

Rison

(user:(name:'Bob Smith',age:47,sex:M,dob:'5-12-1956'),pastimes:!(golf,opera
,poker,rap),children:(bobby:(age:12,sex:M),sally:(age:8,sex:F)))

Rison is a tiny bit longer than URLON but is easier to read. However, it does not feel like part of a URL to me. It is also common for automatic linking to break on parentheses ( or ), meaning that we are back to the same issues as with JSON.

Please, feel free to give your ideas in the comments :)

  • Jean-Sébastien Ney

    isn't it a problem that an array starts with a sharp ? it is considered as an anchor, so it won't be passed to the server. right ?

  • http://blog.vjeux.com/ Vjeux

    I was working in the context of a hash tag from the start. But you are right, I am going to change it to @.

  • Thom

    In your example, how do you tell that 'pastimes' isn't an attribute of 'user'?

  • http://blog.vjeux.com/ Vjeux

    You are right, I cannot get away without an end separator for structured data (object & arrays). I'm making a fix. I hope it will not increase the length too much.

  • Anonymous

    Personally, I love it - going to try it out in a few things and see how it gets on :o)

    One addition to your examples - I imagine the 'Query String' example probably wouldn't be seen like that - it'd probably be encoded, like:

    "user%5Bname%5D=Bob+Smith&user%5Bage%5D=47&user%5Bsex%5D=M&user%5Bdob%5D=5/12/1956&pastimes%5B0%5D=golf&pastimes%5B1%5D=opera&pastimes%5B2%5D=poker&pastimes%5B3%5D=rap&children%5Bbobby%5D%5Bage%5D=12&children%5Bbobby%5D%5Bsex%5D=M&children%5Bsally%5D%5Bage%5D=8&children%5Bsally%5D%5Bsex%5D=F"

    Have you found any objects where it breaks on stringify/parse?

  • Okke

    How do you handle nested Arrays?

    [1, [2, 4], 3] would turn into @:1,@:2,:4,:3
    Which could be parsed as both
    [1, [2, 4], 3] and [1, [2, 4, 3]].

  • Tom

    In the text, you define "," as the array element separator, but in the URLON example you use "&". Actually, "&" makes more sense, why would you use 2 different notations for what is essentially one and the same concept?

  • http://blog.vjeux.com/ Vjeux

    Sorry, given the feedback I have received here and on the Hacker News thread ( http://news.ycombinator.com/item?id=3025505 ), I rapidly updated the specification.

    At first it was #item1&item2&item3
    Then @item1&item2&item3
    It went through @item1,item2,item3
    And it's now @item1@item2@item3

    Sorry for the confusion :( The doc should be up to date, I hope I haven't forgotten anything.

  • http://blog.vjeux.com/ Vjeux

    I've added ; as a terminator in order to deal with this specific issue.

    [1, [2, 4], 3] -> @:1@@:2@:4;@:3
    [1, [2, 4, 3]] -> @:1@@:2@:4@:3

    Note that there should be a ";" at the end of the second example. However trailing terminators are removed for concision.

  • http://blog.vjeux.com/ Vjeux

    Oh yeah, you are right, square brackets [] are not excluded by encodeURI. However, it looks like in the wild, they are often treated as part of urls without problem.

    There were issues parsing nested structures but it should now be resolved. Please tell me if you can find any bug in the implementation :)

  • David

    I don't think there's a need to create another format. You can use URL encoding with json and the ready made parsers to use it. I think this is re-inventing the wheel and shouldn't be done unless there's a clear need.

  • http://blog.vjeux.com/ Vjeux

    Would you rather see

    http://db.mmo-cham pion.c om/items/#_table_achievement_column=instance&ascending:true

    or

    ht tp://db.mmo-cham pion.c om/items/#%7B%22table%22:%7B%22achievement%22:%7B%22column%22:%22instance%22,%22ascending%22:true%7D%7D%7D

  • Anonymous

    You say "semi-colon" but you type ":" which is a colon, not a semi-colon ";".

  • http://blog.vjeux.com/ Vjeux

    Fixed sorry. English is not my native tongue and I've got a hard time remembering the name of those two symbols. In French ":" is "two dots" and ";" is "dot comma" which seems more logical :p

  • Anonymous

    Objects starting with underscores means you can't have underscores in attribute names, which hurts readability when you get multi-word attribute names.

  • http://blog.vjeux.com/ Vjeux

    You can but they will be escaped: {"compound_word":10} -> _compound/_word:10

    "-" is probably a better symbol to use with this notation for multi-word names.

  • dubious

    Not 100% sure, but isn't "@" reserved for authentication ?

  • Ihateaccounts

    you're an asshole for making this. we don't need any more object notations.

  • Anonymous

    So true. English is a language made up of illogical rules with more exceptions to the rule than adhering. :)

  • http://twitter.com/py Paul Young

    I like the idea, but from reading the comments it appears to be rapidly changing.

    For that reason could you do the following?

    #1 use semantic versioning
    #2 mark this as alpha according to #1

    http://semver.org

  • Anonymous

    Escaping works, but is ugly. "-" is not a valid identifier character in javascript, so you'd have to do obj["compound-word"] instead of obj.compound_word.

    I'd rather pick a non-identifier character to denote objects.

  • Polpetta

    you are reinventing the wheel, use Rison

    http://mjtemplate.org/examples/rison.html

  • Dfslkfadskjldsa

    There is already a good and well established solution for this with native support in almost every thinkable serverside framework, querystrings: http://www.example.com/?param1=abc&param2=123
    There is no need to express more advanced structures in the url than that, for that you use a form that is POSTed.
    URLs should be friendly and easy to remember, not for inserting new data or exposing super advanced queries going almost right into the database.

  • http://blog.vjeux.com/ Vjeux

    I'm not talking about server-side here but client-side. URLON is designed to be used after the hash #. The goal is to provide a quick loader for structured data in order to store the page state.

  • Anonymous

    interesting.

    i do feel like this might break back-end parsing of urls though, and why have one format for the front end, and one for the backend? in PHP and plenty of frameworks they have ways of dealing with multidimensional data in the url.

    http://www.zulius.com/how-to/send-multidimensional-arrays-php-with-jquery-ajax/ {method 1}
    http://php.net/manual/en/function.parse-str.php

    This is how paypal sends objects back and forth in the query string. In fact most languages just accept multidimensional query params.

    so,
    http://website.com/#{"table":{"achievement":{"column":"instance","ascending":true}}}

    would be
    http://website.com/#tableachievementcolumn=instance&tableachievementascending=true

  • David

    The second one. Just because when you try to include your URIs in XML (or XHTML) you're going to need to replace the ampersands anyway (it's an entity marker) if you want it to be valid. What I think you're doing is inventing a new encoding without thinking through all the implications, as your recent modifications show. While I think it's a worthwhile experiment, when translating encodings, aesthetics has to be a secondary concern behind validity and correctness.
    Anyway, trying to encode json into a URI is, almost every time, a bad idea. URIs are limited in length (not by RFC but by implementations) to about 2000 characters in several popular browsers (like internet explorer) and GET is not intended for POST-ing information. So, while I think it is good for you to think about the encodings and the shortcomings of the current way of sending data, I don't think your efforts concerning this particular issue are useful for the world at large.

  • David

    So, 13 characters become 15 and your size advantage disappears. On top of that, you no longer have visual information that can be understood while reading the URL.

  • Dustin Whitney

    That's neat. I hope it takes off

  • Fabozz

    This is an interesting idea; I think it could be very useful.

    One thing you should consider is specifying a canonical order for the object fields: State that all objects within a single URLON string should be listed alphabetically, and all keys within one object should be listed alphabetically. If you do this, then if two Javascript objects are identical, their URLON strings are guaranteed to be identical, and vice versa.

    This complicates your stringify() method, which is unfortunate. But browsers will see http://db.mmo-champion.com/items/#_table_achievement_ascending:true&column=instance and http://db.mmo-cham pion.com/items/#_table_achievement_column=instance&ascending:true as two different pages when in fact they're the same.

  • Dennis Hamilton

    I agree with the sense of your note, but I do question your use of %-encoding so heavily.

    A fragment can contain any sequence of pchars, including "/" and "?". That includes ":", "@", "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "=", "-", ".", "_", and "~" beside the alphas and digits having ASCII codes. (There is a weird situation with "[" and "]" and " (QUOTE) would have to be %-encoded, but we don't have to get into that here.) [I am looking at RFC3986 as I type this.]

    So, as far as the URI is concerned, there is no need for %-encoding of any of those (and, in fact, their %-encoding is unwarranted when it comes to making IRIs having expanded Unicode support).

    Now, embedding of the URI in something will require whatever escaping that the embedding requires to end up delivering the URI string correctly. But don't mix the two by requiring it to be done as %-encoding.

    So the first example that Vjeux gives is perfectly valid as a URI, and if it is embedded in some language string (say an XML attribute value, it would be nothing more complicated than

    myattr="http://db.mmo-champion.com/items/#_table_achievement_column=instance&ascending:true"

    (I assume the stray spaces in the original were not intended.)

  • Dennis Hamilton

    Oh, I see. The spaces were added to prevent URL abridgement by the blog. Funny.

    OK, the only thing I would encode in the first vjeux example is the "&" and it would be encoded by using & in its place. (It will be interesting to see how the blog deals with that sentence.)

  • Dennis Hamilton

    OK, one more time. The only thing I would encode in the first vjeux example is the "&" and I would simply replace that one character by "&amp;" in its place.

    That should do it.

  • http://blog.vjeux.com/ Vjeux

    As I published it on NPM, there are now automatic version numbers. The current one is at 1.0.1: http://search.npmjs.org/#/URLON

  • http://blog.vjeux.com/ Vjeux

    Just so you know, you can edit your posts with Disqus instead of replying multiple times :)

  • http://blog.vjeux.com/ Vjeux

    I've added a test suite using Jasmine and found some bugs in the implementation. It should now be a lot more robust!

    http://fooo.fr/~vjeux/github/URLON/tests/SpecRunner.html
