Last year, I gave a one-hour presentation on Mathematical Morphology in front of other students. Here are the slides:
For a project, I want to transparently intercept all the included JavaScript files, edit their AST and eval the result. This way I can manipulate all the code of an application just by inserting a custom script.
- Hook the <script> tag insertion.
- Download the JavaScript file using XHR.
- Parse, transform and eval the AST.
Hook the <script> tag insertion
There is a DOM event called DOMNodeInserted. However, it has two major drawbacks:
- (A) It does not work with nodes written directly in HTML.
- (B) It does not work with <script> tags.
Given those constraints, we cannot intercept <script> tags written in HTML. However, hope is not lost for dynamically included scripts.
Dynamic Script Include
There are libs like headjs that let you include your JavaScript files from within JavaScript. They all work the same way: create a <script> tag using document.createElement, set the type, src and onload, and finally insert it in the DOM. The following snippet is a basic implementation:
    function include(url, callback) {
      var script = document.createElement('script');
      script.type = 'text/javascript';
      script.src = url;
      script.onload = callback;
      document.getElementsByTagName('head')[0].appendChild(script);
    }
Hook script creation
Using this include technique, we satisfy the dynamic property (A). To work around (B), we need to rename the <script> tag into something else, for example <custom:script>. This is done by hooking document.createElement.
    var createElement = document.createElement;
    document.createElement = function (tag) {
      if (tag === 'script') {
        tag = 'custom:script';
      }
      return createElement.call(document, tag);
    };

    document.addEventListener('DOMNodeInserted', function (event) {
      var el = event.target;
      if (el.nodeName !== 'CUSTOM:SCRIPT') {
        return;
      }
Do our stuff
The hard, messy part is done. Now we can get to work on our implementation:
Download the file
Let's download the file using a synchronous XHR request:
      var request = new XMLHttpRequest();
      request.open('GET', el.src, false);
      request.send(null);
The implementation is really basic. A real implementation should do something more sophisticated:
- If async is set, use an asynchronous XHR.
- Provide error handling.
- Provide a proxy to circumvent the Same-Origin Policy.
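The first two points could look something like the sketch below. The `download` helper, its callback signature and the optional `XHR` parameter are assumptions made for illustration and are not part of the original code. The `XHR` parameter exists only so the function can be exercised outside a browser. The proxy point is not covered here.

```javascript
// Hypothetical helper: fetch `url` and pass the source text to `onSuccess`,
// or an Error to `onError`. `XHR` defaults to the browser's XMLHttpRequest.
function download(url, async, onSuccess, onError, XHR) {
  XHR = XHR || XMLHttpRequest;
  var request = new XHR();
  request.open('GET', url, async);
  request.onload = function () {
    if (request.status >= 200 && request.status < 300) {
      onSuccess(request.responseText);
    } else {
      onError(new Error('HTTP ' + request.status + ' for ' + url));
    }
  };
  request.onerror = function () {
    onError(new Error('Network error for ' + url));
  };
  try {
    request.send(null);
  } catch (e) {
    // A synchronous request reports network failures by throwing.
    onError(e);
  }
}
```

Inside the event handler, `el.src` would be passed as `url`, and the parse, transform and eval steps would move into the `onSuccess` callback.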
Transform the AST
In order to parse and transform the AST, I'm using the Reflect.js library. It has the advantage of providing a standalone JavaScript file that doesn't require NPM dependencies.
Note: I'm using window.eval in order to eval from the global scope.
      var ast = Reflect.parse(request.responseText);
      ast = transform(ast);
      window.eval(Reflect.stringify(ast));
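The `transform` function is not defined in the snippet above. As an illustration only, here is a minimal sketch of what one could look like, assuming the AST follows the Mozilla Parser API shape (plain objects with a `type` field, children stored in properties or arrays). The `walk` helper and the choice of renaming an identifier are hypothetical, and the hand-written AST stands in for what a parser would return.

```javascript
// Visit every node of a Parser-API-style AST, depth first.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
    return;
  }
  if (!node || typeof node !== 'object') {
    return;
  }
  if (typeof node.type === 'string') {
    visit(node);
  }
  Object.keys(node).forEach(function (key) {
    walk(node[key], visit);
  });
}

// Hypothetical transform: rename every identifier `foo` to `bar`, in place.
// It is naive: property names such as `obj.foo` would be renamed too.
function transform(ast) {
  walk(ast, function (node) {
    if (node.type === 'Identifier' && node.name === 'foo') {
      node.name = 'bar';
    }
  });
  return ast;
}

// Hand-written AST for the statement `foo(1);`
var ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'foo' },
      arguments: [{ type: 'Literal', value: 1 }]
    }
  }]
};
transform(ast);
```

After the call, the callee of the call expression is named `bar` and the literal argument is untouched.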
Fire DOM Load event
If you want your dynamic include library to work, you need to fire the load event and set the appropriate readyState value.
      // Fire DOM Loaded event
      el.readyState = 'loaded';
      el.onload();
    });
Conclusion
This technique does not work with HTML <script> tags (both with src and with inline code). It is really sad since the majority of scripts are included this way. However, we found a way to make it work on dynamic script injection. It is better than nothing.
The technique also has a big drawback: it makes debugging a lot harder, as both the filename and the line number are lost with the eval call.
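A partial mitigation, which is an addition here and not something the code above does: Chrome and Firefox developer tools honour a `//# sourceURL=` comment at the end of evaluated code, so the script shows up under a name. Line numbers then refer to the regenerated code and not to the original file. The `withSourceURL` helper name and the example URL are hypothetical.

```javascript
// Append a sourceURL comment so dev tools can label the eval'd code.
function withSourceURL(code, url) {
  return code + '\n//# sourceURL=' + url;
}

var labelled = withSourceURL('var answer = 6 * 7;', 'http://example.com/app.js');
```

The labelled string would then be passed to `window.eval` in place of the bare output of `Reflect.stringify`.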
Use Cases
I think that the technique is not completely useless. Here are some thoughts:
- Better GreaseMonkey: Alter the code of a website you don't control.
- Remove Closure. People often wrap their code in a closure to prevent others from accessing it from the console. We could remove it.
- Code injection: You may want to add some code like a console.log, make an additional check or set up a hook on code you don't own.
- Code Analysis: Instrument the code to extract data.
- Code Coverage. It is useful to see if your tests pass through all the branches.
- Dead-Code Removal. If you run the application, you can see what parts of the code you used and remove those that were not used. It will help an additional pass with an optimizer such as Google Closure Compiler.
- Profile-Guided Optimization. It is often used to do branch prediction or type analysis.
For a project I will talk about later, I need to hook the function document.createElement. The code I wanted to write was:
    var original = document.createElement;
    document.createElement = function (tag) {
      // Do something
      return original(tag);
    };
Problem
However, there's a silly JavaScript exception triggered if you take a reference to the function and call it:
    var createElement = document.createElement;
    createElement('div'); // TypeError: Illegal Invocation
Naive Solution
Since it looks like we cannot use anything else but document.createElement to execute the function, I decided to restore the original document.createElement within the hook function. It is verbose but works.
    var original = document.createElement;
    var hook = function (tag) {
      document.createElement = original;
      // Do something
      var el = document.createElement(tag);
      document.createElement = hook;
      return el;
    };
    document.createElement = hook;
Why?
But then I asked myself: how did they implement a function that can only be called in the document.createElement form? Then I remembered that this calling convention sets this to be document. So they must be doing a check like this:
    document.createElement = function () {
      if (this !== document) {
        throw new TypeError('Illegal Invocation');
      }
      // ...
    }
Solution
Now that we know that they check for this === document, we can use .call to force it :)
    var original = document.createElement;
    document.createElement = function (tag) {
      // Do something
      return original.call(document, tag);
    };
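The same behaviour can be reproduced without a browser. The sketch below uses a plain object, `fakeDocument`, as a stand-in for `document`. It is an assumption for illustration, not how browsers actually implement the check. It shows that a detached reference fails while `.call` succeeds.

```javascript
// Stand-in for `document`: its method insists on being called on itself.
var fakeDocument = {
  createElement: function (tag) {
    if (this !== fakeDocument) {
      throw new TypeError('Illegal invocation');
    }
    return { nodeName: tag.toUpperCase() };
  }
};

var calls = [];
var original = fakeDocument.createElement;
fakeDocument.createElement = function (tag) {
  calls.push(tag);                          // "Do something"
  return original.call(fakeDocument, tag);  // force `this`
};

var detachedFailed = false;
try {
  original('div');                          // `this` is not fakeDocument
} catch (e) {
  detachedFailed = e instanceof TypeError;
}

var el = fakeDocument.createElement('div'); // goes through the hook
```

`Function.prototype.bind` would work as well: binding the original to its owner once avoids repeating `.call` at every call site.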
I've always been annoyed when I want to use a character such as » in HTML, as I struggle to find the corresponding HTML entity. This is why I made this small utility. Just paste the sexy UTF-8 character you found and it will give you the associated HTML-ready code :)
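The conversion itself is short. As a rough sketch (the `toHtmlEntities` name is hypothetical and this is not the utility's actual source), each character can be turned into a numeric HTML entity from its code point:

```javascript
// Convert every character of `text` into a numeric HTML entity.
// Array.from iterates by code point, so characters outside the
// Basic Multilingual Plane are handled as a single character.
function toHtmlEntities(text) {
  return Array.from(text, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  }).join('');
}

var entity = toHtmlEntities('\u00BB'); // the » character
```

Named entities such as `&raquo;` would need a lookup table on top of this.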
I've come across an SQL issue. I need to make a fake spell for the WoW database. However, creating one from scratch is too annoying: there are like 30 non-NULL values to fill. Instead, what I want is to copy an existing spell with a new id. It turned out to be less trivial than expected.
Working Solution
Let's say you want to copy the spell #10000 into the spell #12345. Here is the SQL command to do it:
    INSERT INTO spell
    SELECT 12345 AS id, name, icon, /* ... fill all the column names */
    FROM spell
    WHERE id = 10000;
Trying to improve ...
It is really annoying to get the list of column names (you can get it using \d spell). What we would like to do is use the wildcard * like this:
    -- This is not working!
    INSERT INTO spell
    SELECT *, 12345 AS id
    FROM spell
    WHERE id = 10000;
The problem is that * also adds a column named id. Therefore there are two id columns, and the result no longer fits the spell structure.
What I need to make it work is to be able to remove a column from *. Here is something I'd like to write:
    -- This is not working!
    INSERT INTO spell
    SELECT * \ id, 12345 AS id
    FROM spell
    WHERE id = 10000;
I tried to simulate \ with JOIN and UNION, but I couldn't get anything that works. If you have any idea, I'm really interested!