jDataView provides a standard way to read binary files in every browser. It follows the DataView specification and extends it for more practical use.

Explanation

There are three ways to read a binary file from the browser.

  • The first is to download the file through XHR with charset=x-user-defined. You get the file as a String and have to rewrite all the decoding functions yourself (getUint16, getFloat32, ...). All browsers support this.
  • Browsers that implemented WebGL also added ArrayBuffers. An ArrayBuffer is a plain buffer that can be read through views called TypedArrays (Int32Array, Float64Array, ...). You can use them to decode the file, but this is not very handy: they have one big drawback, which is that they can't read non-aligned data. They are supported by Firefox 4 and Chrome 7.
  • A new revision of the specification added DataViews. A DataView is a view around your buffer that can read arbitrary data types directly through functions such as getUint32 and getFloat64. Only Chrome 9 supports it.
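The alignment drawback is easy to reproduce in a browser that has both TypedArrays and DataView. This is a minimal sketch using only the native objects; the byte values are made up for the demonstration.

```javascript
// Five bytes: one padding byte, then a little-endian Uint32 holding 272.
var buffer = new ArrayBuffer(5);
new Uint8Array(buffer).set([0xff, 0x10, 0x01, 0x00, 0x00]);

// A typed array must start at a multiple of its element size,
// so a Uint32Array at byte offset 1 is rejected.
var typedArrayError = null;
try {
	new Uint32Array(buffer, 1, 1);
} catch (e) {
	typedArrayError = e; // RangeError
}

// A DataView has no alignment requirement.
var unaligned = new DataView(buffer).getUint32(1, true); // 272
```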

jDataView provides the DataView API for all the browsers using the best available option between Strings, TypedArrays and DataViews.
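To give an idea of what the String fallback involves, here is a minimal sketch of byte and 16-bit decoding on a binary string. The helper names readUint8 and readUint16 are hypothetical and written for this illustration; they are not taken from jDataView's source.

```javascript
// With charset=x-user-defined, each character carries one byte of the file
// in its low 8 bits (bytes above 0x7f arrive as 0xf780 to 0xf7ff).
function readUint8(str, offset) {
	return str.charCodeAt(offset) & 0xff;
}

function readUint16(str, offset, littleEndian) {
	var b0 = readUint8(str, offset);
	var b1 = readUint8(str, offset + 1);
	return littleEndian ? (b1 << 8) | b0 : (b0 << 8) | b1;
}

// Three "bytes": 0x10, 0x01 and 0xff (the last one as the browser delivers it).
var data = String.fromCharCode(0x10, 0x01, 0xf7ff);

var little = readUint16(data, 0, true);  // 0x0110 = 272
var big = readUint16(data, 0, false);    // 0x1001 = 4097
var highByte = readUint8(data, 2);       // 255
```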

API

See the specification for the detailed API: http://www.khronos.org/registry/webgl/doc/spec/TypedArray-spec.html#6. Any code written for DataView will work with jDataView, unless it writes data.

Constructor

  • new jDataView(buffer, offset, length). buffer can be either a String or an ArrayBuffer
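The offset and length arguments mirror the native DataView constructor. The sketch below uses the native DataView to show what they mean; it assumes jDataView behaves the same way, which the compatibility claim above suggests but this example does not verify.

```javascript
// Eight bytes: four bytes of padding followed by the ASCII tag "MD20".
var buffer = new ArrayBuffer(8);
new Uint8Array(buffer).set([0, 0, 0, 0, 0x4d, 0x44, 0x32, 0x30]);

// View only the last four bytes. Offsets passed to the getters
// are then relative to the start of the view, not of the buffer.
var tagView = new DataView(buffer, 4, 4);
var firstTagByte = tagView.getUint8(0); // 0x4d, the letter M

// Reading past the view's length throws.
var outOfBounds = null;
try {
	tagView.getUint8(4);
} catch (e) {
	outOfBounds = e; // RangeError
}
```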

Specification API

The wrapper satisfies all the specification getters.

  • getInt8(byteOffset)
  • getUint8(byteOffset)
  • getInt16(byteOffset, littleEndian)
  • getUint16(byteOffset, littleEndian)
  • getInt32(byteOffset, littleEndian)
  • getUint32(byteOffset, littleEndian)
  • getFloat32(byteOffset, littleEndian)
  • getFloat64(byteOffset, littleEndian)
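The littleEndian flag is the part that is easiest to get wrong: in the specification, multi-byte getters read big-endian unless the flag is true. A short sketch with the native DataView:

```javascript
var buffer = new ArrayBuffer(4);
new Uint8Array(buffer).set([0x10, 0x01, 0x00, 0x00]);
var view = new DataView(buffer);

var bigEndian = view.getInt32(0);          // 0x10010000 = 268500992
var littleEndian = view.getInt32(0, true); // 0x00000110 = 272
var firstByte = view.getUint8(0);          // 16, single bytes have no byte order
```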

Extended Specification

The byteOffset parameter is now optional. If you omit it, the read starts right after the previous one. You can interact with the internal pointer through these two functions.

  • seek(byteOffset): Moves the internal pointer to the position
  • tell(): Returns the current position
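The internal pointer can be pictured as a thin layer over a DataView. CursorView below is a hypothetical wrapper written for this illustration, not jDataView's source. It wraps a single getter and assumes that seeking exactly to the end of the buffer is allowed.

```javascript
function CursorView(buffer) {
	this.view = new DataView(buffer);
	this.position = 0;
}

CursorView.prototype.seek = function (byteOffset) {
	// Landing exactly on the end of the buffer is accepted (assumption).
	if (byteOffset < 0 || byteOffset > this.view.byteLength) {
		throw new RangeError('Offset is outside the bounds of the buffer');
	}
	this.position = byteOffset;
};

CursorView.prototype.tell = function () {
	return this.position;
};

CursorView.prototype.getUint16 = function (byteOffset, littleEndian) {
	// No offset given: continue from where the previous read stopped.
	if (byteOffset === undefined) {
		byteOffset = this.position;
	}
	var value = this.view.getUint16(byteOffset, littleEndian);
	this.position = byteOffset + 2;
	return value;
};

var buffer = new ArrayBuffer(6);
new Uint8Array(buffer).set([0x00, 0x01, 0x00, 0x02, 0x00, 0x03]);

var cursor = new CursorView(buffer);
var first = cursor.getUint16();  // 1
var second = cursor.getUint16(); // 2
cursor.seek(cursor.tell() + 2);  // skip the last two bytes
var end = cursor.tell();         // 6
```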

Addition of getChar and getString utilities.

  • getChar(byteOffset)
  • getString(length, byteOffset)
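A string read boils down to one byte per character. The helper readString below is hypothetical and only mirrors the argument order of getString; it is a sketch on top of the native DataView, not the library's implementation.

```javascript
// Read `length` bytes starting at `byteOffset` and map each one to a character.
function readString(view, length, byteOffset) {
	var chars = [];
	for (var i = 0; i < length; i++) {
		chars.push(String.fromCharCode(view.getUint8(byteOffset + i)));
	}
	return chars.join('');
}

var buffer = new ArrayBuffer(5);
new Uint8Array(buffer).set([0x4d, 0x44, 0x32, 0x30, 0x61]);
var view = new DataView(buffer);

var tag = readString(view, 4, 0);    // 'MD20'
var letter = readString(view, 1, 4); // 'a'
```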

Addition of createBuffer, a utility to easily create buffers with the best available storage type (String or ArrayBuffer).

  • createBuffer(byte1, byte2, ...)
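Picking the storage type comes down to a feature test. createBufferSketch is a hypothetical re-implementation for illustration, assuming the choice is made by checking whether ArrayBuffer exists; the real createBuffer may decide differently.

```javascript
function createBufferSketch() {
	var i;
	if (typeof ArrayBuffer !== 'undefined') {
		// Native buffers are available: copy the bytes into an ArrayBuffer.
		var buffer = new ArrayBuffer(arguments.length);
		var bytes = new Uint8Array(buffer);
		for (i = 0; i < arguments.length; i++) {
			bytes[i] = arguments[i];
		}
		return buffer;
	}
	// Fallback: a binary string with one character per byte.
	var chars = [];
	for (i = 0; i < arguments.length; i++) {
		chars.push(String.fromCharCode(arguments[i] & 0xff));
	}
	return chars.join('');
}

var file = createBufferSketch(0x4d, 0x44, 0x32, 0x30);
```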

Shortcomings

  • Only the read API is wrapped; jDataView does not provide any set method.
  • The Float64 implementation on strings does not have full precision.
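When the buffer is a String there is no native float decoder to lean on, so the bits have to be unpacked arithmetically. The sketch below does this for Float32, using the float bytes from the example further down. decodeFloat32 is a hypothetical helper, not jDataView's code. Float64 uses the same layout with 11 exponent bits and 52 mantissa bits, which is wider than the 32-bit integers that JavaScript's bitwise operators work on. Whether that is where the precision is lost in jDataView is an assumption.

```javascript
// Decode an IEEE 754 single-precision float from four bytes,
// given most significant byte first.
function decodeFloat32(b0, b1, b2, b3) {
	var sign = (b0 & 0x80) ? -1 : 1;
	var exponent = ((b0 & 0x7f) << 1) | (b1 >> 7);       // 8 bits
	var mantissa = ((b1 & 0x7f) << 16) | (b2 << 8) | b3; // 23 bits

	if (exponent === 0xff) {
		return mantissa ? NaN : sign * Infinity;
	}
	if (exponent === 0) {
		// Denormalized number: no implicit leading 1.
		return sign * mantissa * Math.pow(2, -149);
	}
	return sign * (1 + mantissa * Math.pow(2, -23)) * Math.pow(2, exponent - 127);
}

// The example file stores 0x90, 0xcf, 0x1b, 0x47 in little-endian order,
// so the most significant byte is the last one.
var decoded = decodeFloat32(0x47, 0x1b, 0xcf, 0x90); // 39887.5625
```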

Example

First we need a file. Either you get it through XHR or use the createBuffer utility.

var file = jDataView.createBuffer(
	0x10, 0x01, 0x00, 0x00, // Int32 - 272 (little-endian)
	0x90, 0xcf, 0x1b, 0x47, // Float32 - 39887.5625 (little-endian)
	0, 0, 0, 0, 0, 0, 0, 0, // 8 blank bytes
	0x4d, 0x44, 0x32, 0x30, // String - MD20
	0x61                    // Char - a
);

Now we use the DataView as defined in the specification. The only thing that changes is the j before DataView.

var view = new jDataView(file);
// The sample bytes are stored little-endian, so littleEndian is set to true
var version = view.getInt32(0, true); // 272
var float = view.getFloat32(4, true); // 39887.5625

The wrapper extends the specification to make the DataView easier to use.

var view = new jDataView(file);
// A position counter is managed. Remove the argument to read right after the last read.
version = view.getInt32(); // 272
float = view.getFloat32(); // 39887.5625
 
// You can move around with tell() and seek()
view.seek(view.tell() + 8);
 
// Two helpers: getChar and getString will make your life easier
var tag = view.getString(4); // MD20
var char = view.getChar(); // a

Demos

I'm working on a World of Warcraft Model Viewer. It uses jDataView to read the binary file and then WebGL to display it. Stay tuned for more info about it :)

  • Joe Hocking

    (I'm trying in IE because submitting my comment wasn't working in Firefox.)

    Thanks for this very useful code. There may be something wrong with getUint32() because it's giving me a negative number and I can't tell if I'm doing something wrong or if there's a bug in your code. getUint16() works correctly.

  • Joe Hocking

    And I just figured out how to fix that bug: I added >>> 0 to the return statement of _getUint32. The second answer on this page explains why: http://stackoverflow.com/questions/1240408/reading-bytes-from-javascript-string

  • http://vjeux.com vjeux

    I cannot seem to reproduce the issue. Do you happen to have a test case?

    Thanks!

  • http://vjeux.com vjeux

    Thanks,

    The issue is that b << 24, when b > 127, sets the highest bit to 1, so the result is treated as a negative 32-bit signed int.

    Instead I do b * Math.pow(2, 24), so the value is treated as a double and there is no ambiguity.

    I am not so sure about the >>> 0 trick; it might not work the same in all browsers.
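The sign problem discussed in this thread can be reproduced with a single byte. This is a minimal sketch of both fixes mentioned above; the byte value is arbitrary. The >>> operator converts its left operand to an unsigned 32-bit integer as part of the ECMAScript definition, so conforming engines should agree on the result.

```javascript
var b = 0x90; // any byte above 127 has its top bit set

var shifted = b << 24;                // -1879048192, read as a signed 32-bit int
var multiplied = b * Math.pow(2, 24); // 2415919104, plain double arithmetic
var coerced = (b << 24) >>> 0;        // 2415919104, reinterpreted as unsigned
```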

  • http://www.newarteest.com Joe Hocking

    Well I just tested and got the same result on IE 8, Firefox 4, Chrome 13, Safari 5, and Opera 11. I guess it's possible other test data will give incorrect results though.

    (incidentally, reading binary data on IE doesn't work like other browsers and I had to write a little VBScript, typical)

  • ethan

    I wanted to skip a block of data of the stream.
    So I used
    view.seek(view.tell() + blockLen);

    But if the data to skip is at the end of file,
    seek() throws the error INDEX_SIZE_ERR
    In that case I am forced to do a check on view.length and use getString(blockLen) instead (and waste some runtime memory and time for the temporary string).

    Does it make sense to allow seek() to land on the end-of-file position?

  • http://vjeux.com vjeux

    I did not think about this case. But you are right, it should not throw an error when trying to seek at the end of file.

    I made a patch to fix the issue, thanks :)
    https://github.com/vjeux/jsDataView/commit/01de9764184bdd628f840b5ca8c35afdcd024513

  • http://yuguangzhang.com Yuguang Zhang

    LZMA Decompression using jDataView

    See my post on decoding a compressed file.

  • ferenc

    This is great stuff. Thanks for sharing. I am not sure if it's intentional, but getInt16 fails for me. From the code, it seems the function doesn't take the endianness argument, so _getUint16 receives undefined for the littleEndian parameter and reads as big-endian. Adding the littleEndian argument and passing it on to this._getUint16 fixed it for me.

  • http://blog.vjeux.com/ Vjeux

    You are totally right, I forgot to add this one. I've updated the github repo ( https://github.com/vjeux/jsDataView/commit/47b885a03609db3c57b96b87d9b13217697aee8f ). Thanks :)

  • Donahcoo

    Newbie here... I'd like to use this code to inspect a file and get its magic number. I've tried several things, but I'm stuck. For example, files with ID3 tags have a magic number of ID3 in ASCII, or 49 44 33 in hex. I can't seem to convert the output of any of the functions to the string ID3 or to the hex numbers. I would actually prefer hex, I think.

  • http://blog.vjeux.com/ Vjeux

    Here's a little demo: http://fooo.fr/~vjeux/github/jsDataView/demo/id3/id3.html

    Use .charCodeAt and .toString to convert between string and hex value:

    	var tag = view.getString(3);
    	console.log('TAG:', tag);
    	console.log('TAG:',
    		'0x' + tag.charCodeAt(0).toString(16),
    		'0x' + tag.charCodeAt(1).toString(16),
    		'0x' + tag.charCodeAt(2).toString(16));

  • sathish

    Does this code support IE9?

  • http://blog.vjeux.com/ Vjeux

    Yes

  • Michael Stieler

    Hi, great approach! Could it be that you fixed the endianness in the code and the example in this article became wrong as a result?

  • marc

    It's very interesting, but I still have a problem reading a binary Ajax response when I use Internet Explorer... Could you help me?

 
