Last year, I gave a one-hour presentation on Mathematical Morphology in front of other students. Here are the slides:

For a project, I want to transparently intercept all the included JavaScript files, edit their AST and eval the result. This way I can manipulate all the code of an application just by inserting a custom script.

  1. Hook the <script> tag insertion.
  2. Download the JavaScript file using XHR.
  3. Parse, transform and eval the AST.

Hook the <script> tag insertion

There is a DOM event called DOMNodeInserted. However it has two major drawbacks:

  1. It does not work with the nodes written directly in HTML.
  2. It does not work with <script> tags.

Given those constraints, we cannot intercept <script> tags written in HTML. However, hope is not lost for dynamically included scripts.

Dynamic Script Include

There are libraries like headjs that let you include your JavaScript files from JavaScript. They all work the same way: create a <script> tag using document.createElement, set the type, src and onload, and finally insert it in the DOM. The following snippet is a basic implementation:

function include(url, callback) {
  var script = document.createElement('script');
  script.type = 'text/javascript';
  script.src = url;
  script.onload = callback;
  document.getElementsByTagName('head')[0].appendChild(script);
}

Hook script creation

Using this include technique, we get around the first drawback: the scripts are inserted dynamically instead of being written in HTML. To work around the second drawback, we need to rename the <script> tag into something else, for example <custom:script>. This is done by hooking document.createElement.

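// Keep a reference to the native function so the hook can delegate to it.
var createElement = document.createElement;
document.createElement = function (tag) {
  // Tag names are case-insensitive in HTML documents.
  if (String(tag).toLowerCase() === 'script') {
    tag = 'custom:script';
  }
  return createElement.call(document, tag);
};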
 
document.addEventListener('DOMNodeInserted', function (event) {
  var el = event.target;
  if (el.nodeName !== 'CUSTOM:SCRIPT') {
    return;
  }

Do our stuff

The hardest, messiest part is done. Now we can get to work on our implementation:

Download the file

Let's download the file using a synchronous XHR:

  var request = new XMLHttpRequest();
  request.open('GET', el.src, false);
  request.send(null);

This implementation is really basic; a real one should do something more sophisticated.
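One way to go beyond the synchronous request is an asynchronous download with basic error handling. The following is a minimal sketch, not the code used in this post: `download` is a hypothetical helper, and the XHR constructor is passed in as a parameter (in a browser you would pass `XMLHttpRequest`) so the function can also be exercised outside a browser. Note that asynchronous downloads may finish out of order, so a real implementation would also have to preserve the execution order of the scripts.

```javascript
// Hypothetical helper: download `url` asynchronously and report the result
// through a Node-style callback(error, text).
function download(url, XHR, callback) {
  var request = new XHR();
  request.open('GET', url, true); // true = asynchronous
  request.onreadystatechange = function () {
    if (request.readyState !== 4) {
      return; // request not finished yet
    }
    if (request.status >= 200 && request.status < 300) {
      callback(null, request.responseText);
    } else {
      callback(new Error('Failed to load ' + url + ' (status ' + request.status + ')'));
    }
  };
  request.send(null);
}
```

Inside the handler it could be called as `download(el.src, XMLHttpRequest, function (err, source) { ... })`, with the parse, transform and eval steps moved into the callback.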

Transform the AST

In order to parse and transform the AST, I'm using the Reflect.js library. It has the advantage of shipping as a standalone JavaScript file that doesn't require npm dependencies.

Note: I'm using window.eval in order to eval from the global scope.

  var ast = Reflect.parse(request.responseText);
  ast = transform(ast);
  window.eval(Reflect.stringify(ast));

Fire DOM Load event

If you want your dynamic include library to work, you need to fire the load event and set the appropriate readyState value.

  // Fire DOM Loaded event
  el.readyState = 'loaded';
  if (typeof el.onload === 'function') {
    el.onload();
  }
});
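The transform function called above is left open. As an illustration, here is a minimal sketch of one possible transform that prepends a console.log call to every function declaration. It assumes the AST uses Mozilla Parser API node shapes (Program, FunctionDeclaration, BlockStatement, and so on); `logCall` is a hypothetical helper, and the tree is modified in place.

```javascript
// Hypothetical helper: build the AST for `console.log('<name>');`.
function logCall(name) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'console' },
        property: { type: 'Identifier', name: 'log' }
      },
      arguments: [{ type: 'Literal', value: name }]
    }
  };
}

// Walk every node of the tree and instrument function declarations.
function transform(node) {
  if (Array.isArray(node)) {
    node.forEach(transform);
  } else if (node && typeof node === 'object') {
    // Visit children first so nested functions are handled too.
    Object.keys(node).forEach(function (key) {
      transform(node[key]);
    });
    if (node.type === 'FunctionDeclaration' &&
        node.body && Array.isArray(node.body.body)) {
      node.body.body.unshift(logCall(node.id.name));
    }
  }
  return node;
}
```

With this transform, every function declared in an intercepted file logs its own name each time it is called.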

Conclusion

This technique does not work with HTML <script> tags (both with src and with inline code). It is really sad since the majority of scripts are included this way. However, we found a way to make it work on dynamic script injection. It is better than nothing.

The technique also has a big drawback: it makes debugging a lot harder, as both the filename and the line numbers are lost with the eval call.

Use Cases

I think that the technique is not completely useless. Here are some thoughts:

  • Better GreaseMonkey: Alter the code of a website you don't control.
    • Remove Closure. People often wrap their code in a closure to prevent others from accessing it from the console. We could remove it.
    • Code injection. You may want to add some code like a console.log, make an additional check or set up a hook on code you don't own.
  • Code Analysis: Instrument the code to extract data.
    • Code Coverage. It is useful to see if your tests pass through all the branches.
    • Dead-Code Removal. If you run the application, you can see which parts of the code were used and remove those that were not. This helps an additional pass with an optimizer such as Google Closure Compiler.
    • Profile-Guided Optimization. It is often used to do branch prediction or type analysis.
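To make the "Remove Closure" idea above more concrete, here is a deliberately naive sketch. It assumes Mozilla Parser API node shapes and only handles top-level wrappers of the exact form `(function () { ... })();` with no parameters and no arguments; `isBareIife` and `removeClosures` are hypothetical names. Unwrapping changes semantics (a `return` inside the wrapper would become invalid, and local variables become globals), which is the whole point here but needs care on real code.

```javascript
// True for a statement of the form `(function () { ... })();`.
function isBareIife(stmt) {
  var expr = stmt.type === 'ExpressionStatement' && stmt.expression;
  return !!expr &&
    expr.type === 'CallExpression' &&
    expr.arguments.length === 0 &&
    expr.callee.type === 'FunctionExpression' &&
    expr.callee.params.length === 0;
}

// Replace each top-level wrapper by the statements it contains.
function removeClosures(program) {
  var body = [];
  program.body.forEach(function (stmt) {
    if (isBareIife(stmt)) {
      // Hoist the wrapped statements to the top level.
      body = body.concat(stmt.expression.callee.body.body);
    } else {
      body.push(stmt);
    }
  });
  program.body = body;
  return program;
}
```

After this pass, variables that were private to the wrapper are declared at the top level and become reachable from the console.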

For a project I will talk about later, I need to hook the function document.createElement. The code I wanted to write was:

var original = document.createElement;
document.createElement = function (tag) {
  // Do something
  return original(tag);
};

Problem

However, a silly JavaScript exception is triggered if you try to call the function through a reference:

var createElement = document.createElement;
createElement('div');
// TypeError: Illegal Invocation

Naive Solution

Since it looks like we cannot use anything else but document.createElement to execute the function, I decided to restore the original document.createElement within the hook function. It is verbose but works.

var original = document.createElement;
var hook = function (tag) {
  document.createElement = original;
  // Do something
  var el = document.createElement(tag);
  document.createElement = hook;
  return el;
};
document.createElement = hook;

Why?

But then I asked myself: how did they implement a function that can only be called in the document.createElement form? Then I remembered that this calling convention sets this to document. So they must be doing a check like this:

document.createElement = function () {
  if (this !== document) {
    throw new TypeError('Illegal Invocation');
  }
  // ...
}

Solution

Now that we know that they check for this === document, we can use .call to force it 🙂

var original = document.createElement;
document.createElement = function (tag) {
  // Do something
  return original.call(document, tag);
};
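An equivalent variant, assuming Function.prototype.bind is available (ES5), is to bind the original function to its object once instead of using .call on every invocation. The sketch below uses a stand-in object named `fakeDocument`, which is an assumption made so that the example runs outside a browser; its createElement mimics the this check guessed above. In a page you would use the real document.

```javascript
// Stand-in for the browser's `document`, with the hypothesized `this` check.
var fakeDocument = {
  createElement: function (tag) {
    if (this !== fakeDocument) {
      throw new TypeError('Illegal invocation');
    }
    return { nodeName: tag.toUpperCase() };
  }
};

var created = [];

// Bind once: `this` is now permanently fakeDocument for `original`.
var original = fakeDocument.createElement.bind(fakeDocument);
fakeDocument.createElement = function (tag) {
  created.push(tag); // Do something
  return original(tag);
};
```

Because the binding happens once at hook time, the hook body can call `original(tag)` exactly like in the code I wanted to write in the first place.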

I've always been annoyed when I want to use a character such as » in HTML, as I struggle to find the corresponding HTML entity. This is why I made this small utility. Just paste the sexy UTF-8 character you found and it will give you the associated HTML-ready code 🙂
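The code of the utility is not shown here. A minimal sketch of the kind of conversion it performs could look like the following, assuming numeric character references (such as `&#187;`) are acceptable instead of named entities (such as `&raquo;`); `toNumericEntities` is a hypothetical name.

```javascript
// Replace every non-ASCII character by its numeric character reference.
function toNumericEntities(text) {
  var out = '';
  // for...of iterates by code point, so characters outside the
  // Basic Multilingual Plane are not split into surrogate halves.
  for (var ch of text) {
    var code = ch.codePointAt(0);
    out += code > 127 ? '&#' + code + ';' : ch;
  }
  return out;
}

console.log(toNumericEntities('a » b')); // prints "a &#187; b"
```

Numeric references work for any character, which avoids having to maintain a table of entity names.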


I've come across an SQL issue. I need to make a fake spell for the WoW database. However, creating one from scratch is too annoying: there are like 30 non-NULL values to fill. Instead, what I want is to copy an existing spell with a new id. It turned out to be less trivial than expected.

Working Solution

Let's say you want to copy the spell #10000 into the spell #12345. Here is the SQL command to do it:

INSERT INTO spell
  SELECT 12345 AS id, name, icon, /* ... fill all the column names */
  FROM spell
  WHERE id = 10000;
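Since the column list has to be spelled out anyway, the statement can be generated rather than typed by hand. Here is a minimal sketch in JavaScript, assuming the column names are already known; `buildCopySql` is a hypothetical helper, the column list in the usage line is a shortened placeholder, and identifiers are not escaped, so it is only suitable for trusted input.

```javascript
// Hypothetical helper: build an INSERT ... SELECT that copies one row
// of `table`, replacing the value of `idColumn` by `newId`.
function buildCopySql(table, columns, idColumn, sourceId, newId) {
  var selectList = columns.map(function (col) {
    return col === idColumn ? Number(newId) + ' AS ' + col : col;
  });
  return 'INSERT INTO ' + table + ' (' + columns.join(', ') + ')\n' +
    '  SELECT ' + selectList.join(', ') + '\n' +
    '  FROM ' + table + '\n' +
    '  WHERE ' + idColumn + ' = ' + Number(sourceId) + ';';
}

// Placeholder column list; the real table has many more columns.
console.log(buildCopySql('spell', ['id', 'name', 'icon'], 'id', 10000, 12345));
```

Naming the target columns explicitly in the INSERT keeps the statement valid even if the order of the generated list differs from the order of the table columns.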

Trying to improve ...

It is really annoying to get the list of column names (you can get it using \d spell in psql). What we would like to do is use the wildcard * like this:

-- This is not working!
INSERT INTO spell
  SELECT *, 12345 AS id
  FROM spell
  WHERE id = 10000;

The problem is that * also adds a column named id. Therefore there are two id columns, and the result no longer fits the spell structure.

What I need to make it work is to be able to remove a column from *. Here is something I'd like to write:

-- This is not working!
INSERT INTO spell
  SELECT * \ id, 12345 AS id
  FROM spell
  WHERE id = 10000;

I tried to simulate \ with JOIN and UNION, but I couldn't get anything to work. If you have any ideas, I'm really interested!