Six months ago, I wrote the blog post "JSPP - Morph C++ into Javascript". My supervisor at the LRDE (R&D Lab of EPITA), Didier Verna, found it interesting and told me that it could be worth a publication. With his great help, I wrote my first article. We submitted it to two conferences (one after the other) and received two rejections.

Overall, the reviewers said the article was well written and the implementation interesting but since it has no practical use, it does not deserve a publication.

Even if it didn't make it, I'm really proud of the article. If you are interested in the implementation of JSPP, it can be an enlightening read 🙂

Download PDF

(For copyright reasons, this is not the submitted version of the file. There are still some minor wording issues.)

I gave the Javascript presentation. It went alright and I hope that I taught people a few things. I'm sorry for my blog readers, but it's in French 🙂

A few remarks:

  • The code on the slides was not big enough, so people in the back couldn't read it. It also looks crappy in the video.
  • I allocated 2 hours but it only lasted 1 hour and 10 minutes.
  • I should have invited people to ask questions right from the beginning.
  • The room was full 🙂

Last year, I gave a one-hour presentation on Mathematical Morphology in front of other students. Here are the slides:

For a project, I want to transparently be able to intercept all the included javascript files, edit the AST and eval it. This way I can manipulate all the code of an application just by inserting a custom script.

  1. Hook the <script> tag insertion.
  2. Download the Javascript file using XHR.
  3. Parse, transform and eval the AST.

Hook the <script> tag insertion

There is a DOM event called DOMNodeInserted. However it has two major drawbacks:

  1. It does not work with the nodes written directly in HTML.
  2. It does not work with <script> tags.

Given those constraints, we cannot intercept <script> tags written in HTML. However, hope is not lost for dynamically included scripts.

Dynamic Script Include

There are libs like headjs that let you include your javascript files within Javascript. They all work the same way: create a <script> tag using document.createElement, set the type, src and onload and finally insert it in the DOM. The following snippet is a basic implementation:

function include(url, callback) {
  var script = document.createElement('script');
  script.type = 'text/javascript';
  script.src = url;
  script.onload = callback;
  document.getElementsByTagName('head')[0].appendChild(script);
}

Hook script creation

Using this include technique, we get around the first drawback, since the nodes are inserted dynamically instead of being written in HTML. In order to work around the second one, we need to rename the <script> tag into something else, for example <custom:script>. This is done by hooking document.createElement.

var createElement = document.createElement;
document.createElement = function (tag) {
  if (tag === 'script') {
    tag = 'custom:script';
  }
  return createElement.call(document, tag);
};
 
document.addEventListener('DOMNodeInserted', function (event) {
  var el = event.target;
  if (el.nodeName !== 'CUSTOM:SCRIPT') {
    return;
  }

Do our stuff

The hard, messy part is done. Now we can get to work on our implementation:

Download the file

Let's download the file using a synchronous XHR:

  var request = new XMLHttpRequest();
  request.open('GET', el.src, false);
  request.send(null);

This implementation is really basic: it blocks the page while the file downloads and does not check for errors. A real implementation should do something more sophisticated.
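
As a rough idea of what a less basic version could look like, here is a minimal sketch of an asynchronous download that also checks the HTTP status. The download function, its callbacks and the optional XHRCtor parameter are names I made up for this sketch; XHRCtor only exists so that the function can be tried outside of a browser.

```javascript
// Hypothetical helper: download `url` asynchronously and report the outcome.
// `XHRCtor` is an optional parameter invented for this sketch; it defaults to
// the browser's global XMLHttpRequest.
function download(url, onSuccess, onError, XHRCtor) {
  var Ctor = XHRCtor || XMLHttpRequest;
  var request = new Ctor();
  request.open('GET', url, true); // true = asynchronous
  request.onreadystatechange = function () {
    if (request.readyState !== 4) { // 4 = DONE
      return;
    }
    if (request.status >= 200 && request.status < 300) {
      onSuccess(request.responseText);
    } else {
      onError(new Error('Could not load ' + url + ' (status ' + request.status + ')'));
    }
  };
  request.send(null);
}
```

Two things to keep in mind with this approach: going asynchronous means the scripts may finish downloading in a different order than they were inserted, and XHR is subject to the same-origin policy, while a real <script> tag is not.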

Transform the AST

In order to parse and transform the AST, I'm using the Reflect.js library. It has the advantage of shipping as a standalone Javascript file that doesn't require NPM dependencies.

Note: I'm using window.eval in order to eval from the global scope.

  // transform() is your own function: it takes the AST and returns the modified AST
  var ast = Reflect.parse(request.responseText);
  ast = transform(ast);
  window.eval(Reflect.stringify(ast));
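
The transform function is left to you. To give an idea, here is a minimal sketch of one that adds a console.log call at the top of every function declaration. It assumes the AST follows the Mozilla Parser API node shapes (Program, FunctionDeclaration, BlockStatement, ...) and has no cyclic references; walk and logStatement are helper names made up for this example.

```javascript
// Visit every AST node depth-first. A node is any object with a string `type`.
function walk(node, visit) {
  if (!node || typeof node !== 'object') {
    return;
  }
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
    return;
  }
  if (typeof node.type === 'string') {
    visit(node);
  }
  Object.keys(node).forEach(function (key) { walk(node[key], visit); });
}

// Build the AST of the statement: console.log(<message>);
function logStatement(message) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'console' },
        property: { type: 'Identifier', name: 'log' }
      },
      arguments: [{ type: 'Literal', value: message }]
    }
  };
}

// Insert a log statement at the beginning of each function declaration body.
function transform(ast) {
  walk(ast, function (node) {
    if (node.type === 'FunctionDeclaration' && node.body.type === 'BlockStatement') {
      node.body.body.unshift(logStatement('enter ' + node.id.name));
    }
  });
  return ast;
}
```

The AST is modified in place and returned, so it can be used as is in the snippet above.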

Fire DOM Load event

If you want your dynamic include library to work, you need to fire the load event and set the appropriate readyState value.

  // Fire DOM Loaded event (only if the include library registered a callback)
  el.readyState = 'loaded';
  if (typeof el.onload === 'function') {
    el.onload();
  }
});

Conclusion

This technique does not work with HTML <script> tags (both with src and with inline code). It is really sad since the majority of scripts are included this way. However, we found a way to make it work on dynamic script injection. It is better than nothing.

The technique also has a big drawback: it makes debugging a lot harder, as both the filename and the line numbers are lost with the eval call.
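
The filename part can possibly be mitigated. Developer tools generally recognize a //# sourceURL=... comment at the end of eval'd code and display the code under that name. Here is a minimal sketch; withSourceURL is a name made up for this example, and how well it works depends on the browser.

```javascript
// Append a sourceURL comment so that developer tools can show the eval'd
// code under a readable name instead of an anonymous script.
function withSourceURL(code, url) {
  return code + '\n//# sourceURL=' + url;
}

// Intended use inside the listener (Reflect, ast and el come from the post):
//   window.eval(withSourceURL(Reflect.stringify(ast), el.src));
```

The line numbers would still refer to the regenerated code and not to the original file.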

Use Cases

I think that the technique is not completely useless. Here are some thoughts:

  • Better GreaseMonkey: Alter the code of a website you don't control.
    • Remove Closure. People often wrap their code in a closure, which prevents anyone from accessing it from the console. We could remove it.
    • Code injection: You may want to add some code like a console.log, making an additional check or setting up a hook on code you don't own.
  • Code Analysis: Instrument the code to extract data.
    • Code Coverage. It is useful to see if your tests pass through all the branches.
    • Dead-Code Removal. If you run the application, you can see what parts of the code you used and remove those that were not used. It will help an additional pass with an optimizer such as Google Closure Compiler.
    • Profile-Guided Optimization. It is often used to do branch prediction or type analysis.
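
To make the "Remove Closure" idea more concrete, here is a minimal sketch that unwraps a file made of a single (function () { ... })() call, so that its declarations end up in the global scope. It assumes Mozilla Parser API node shapes; isPlainIIFE and removeClosure are names made up for this example.

```javascript
// True when `node` is a statement of the form (function () { ... })();
// with no parameters and no arguments.
function isPlainIIFE(node) {
  return node.type === 'ExpressionStatement' &&
    node.expression.type === 'CallExpression' &&
    node.expression.arguments.length === 0 &&
    node.expression.callee.type === 'FunctionExpression' &&
    node.expression.callee.params.length === 0;
}

// If the whole program is one plain IIFE, replace it with the function body.
function removeClosure(program) {
  if (program.body.length === 1 && isPlainIIFE(program.body[0])) {
    program.body = program.body[0].expression.callee.body.body;
  }
  return program;
}
```

This is deliberately naive: it ignores closures called with arguments or through .call(this), and it would break code that uses return or arguments at the top level of the closure.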

For a project I will talk about later on, I need to hook the function document.createElement. The code I wanted to write was:

var original = document.createElement;
document.createElement = function (tag) {
  // Do something
  return original(tag);
};

Problem

However, a silly Javascript exception is triggered if you take a reference to the function and call it:

var createElement = document.createElement;
createElement('div');
// TypeError: Illegal Invocation

Naive Solution

Since it looks like we cannot use anything else but document.createElement to execute the function, I decided to restore the original document.createElement within the hook function. It is verbose but works.

var original = document.createElement;
var hook = function (tag) {
  document.createElement = original;
  // Do something
  var el = document.createElement(tag);
  document.createElement = hook;
  return el;
};
document.createElement = hook;

Why?

But then I asked myself: how did they implement a function that can only be called in the document.createElement(...) form? Then I remembered that this calling convention sets this to document. So they must be doing a check like this:

document.createElement = function () {
  if (this !== document) {
    throw new TypeError('Illegal Invocation');
  }
  // ...
}
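
You can play with this mental model without a browser. Here is a small sketch where fakeDocument is a stand-in object invented for the example (not a real DOM object); the real check happens in native code and is presumably about this being a document rather than one exact object.

```javascript
// Stand-in for document, with the same kind of `this` check.
var fakeDocument = {
  createElement: function (tag) {
    if (this !== fakeDocument) {
      throw new TypeError('Illegal invocation');
    }
    return { nodeName: String(tag).toUpperCase() };
  }
};

// Taking a reference loses the receiver: `this` is no longer fakeDocument.
var detached = fakeDocument.createElement;
var failed = false;
try {
  detached('div');
} catch (e) {
  failed = e instanceof TypeError;
}

// Passing the receiver explicitly makes the check pass again.
var viaCall = detached.call(fakeDocument, 'div');
```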

Solution

Now that we know that they check for this === document, we can use .call to force it 🙂

var original = document.createElement;
document.createElement = function (tag) {
  // Do something
  return original.call(document, tag);
};
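
An alternative, assuming Function.prototype.bind is available (ES5), is to bind the original function to its receiver once. The hook can then call original(tag) directly, which is the code I wanted to write in the first place. The sketch below uses a fakeDocument stand-in invented for the example so that it runs outside of a browser; with the real document you would write document.createElement.bind(document).

```javascript
// Stand-in for document, with the same kind of `this` check.
var fakeDocument = {
  createElement: function (tag) {
    if (this !== fakeDocument) {
      throw new TypeError('Illegal invocation');
    }
    return { nodeName: String(tag).toUpperCase() };
  }
};

// Bind once: every later call to original() has `this` set to fakeDocument.
var original = fakeDocument.createElement.bind(fakeDocument);
var seen = [];
fakeDocument.createElement = function (tag) {
  seen.push(tag); // Do something
  return original(tag);
};
```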