For a project, I want to transparently intercept all the included JavaScript files, edit their AST and eval the result. This way I can manipulate all the code of an application just by inserting a custom script. The plan has three steps:
- Hook the <script> tag insertion.
- Download the JavaScript file using XHR.
- Parse, transform and eval the AST.
Hook the <script> tag insertion
There is a DOM event called DOMNodeInserted. However it has two major drawbacks:
- (A) It does not work with nodes written directly in HTML.
- (B) It does not work with <script> tags.
Given those constraints, we cannot intercept <script> tags written in HTML. However, hope is not lost for dynamically included scripts.
Dynamic Script Include
There are libraries like headjs that let you include your JavaScript files from within JavaScript. They all work the same way: create a <script> tag using document.createElement, set its type, src and onload, and finally insert it in the DOM. The following snippet is a basic implementation:
    function include(url, callback) {
        var script = document.createElement('script');
        script.type = 'text/javascript';
        script.src = url;
        script.onload = callback;
        document.getElementsByTagName('head')[0].appendChild(script);
    }
Hook script creation
Using this include technique, the script is inserted dynamically, which takes care of the first drawback (A). To work around the second one (B), we need to rename the <script> tag into something else, for example <custom:script>. This is done by hooking document.createElement.
    // Rename <script> to <custom:script> so that DOMNodeInserted fires for it.
    var createElement = document.createElement;
    document.createElement = function (tag) {
        if (tag === 'script') {
            tag = 'custom:script';
        }
        return createElement.call(document, tag);
    };

    // The body of this listener continues in the snippets below.
    document.addEventListener('DOMNodeInserted', function (event) {
        var el = event.target;
        if (el.nodeName !== 'CUSTOM:SCRIPT') {
            return;
        }
Do our stuff
The hard, messy part is done. Now we can get to work on our implementation:
Download the file
Let's download the file using a synchronous XHR:
        // Third argument `false`: the call blocks until the response arrives.
        var request = new XMLHttpRequest();
        request.open('GET', el.src, false);
        request.send(null);
This implementation is really basic. A real implementation should do something more sophisticated:
- If async is set, use an asynchronous XHR.
- Provide error handling.
- Provide a proxy to circumvent the Same-Origin Policy.
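The first two points can be sketched roughly as follows. This is only an illustration, not the code used in this post: downloadSource is a hypothetical helper, and its third parameter exists only so the function can be exercised with a stub instead of the browser's XMLHttpRequest. The proxy for cross-origin files is not covered.

```javascript
// Hypothetical helper (not part of the original snippet): downloads `url`
// asynchronously and reports the outcome through callback(error, text).
// `XHR` defaults to the browser's XMLHttpRequest; it is a parameter only
// so that the sketch can be tried with a stub.
function downloadSource(url, callback, XHR) {
  XHR = XHR || XMLHttpRequest;
  var request = new XHR();
  request.open('GET', url, true); // true = asynchronous
  request.onload = function () {
    if (request.status >= 200 && request.status < 300) {
      callback(null, request.responseText);
    } else {
      callback(new Error('HTTP ' + request.status + ' for ' + url));
    }
  };
  request.onerror = function () {
    // Network failure, or a cross-origin request refused by the browser.
    callback(new Error('Network error for ' + url));
  };
  request.send(null);
}
```

Inside the listener, el.src would be passed as the url, and the parse, transform and eval steps would move into the callback.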
Transform the AST
In order to parse and transform the AST, I'm using the Reflect.js library. It has the advantage of shipping as a standalone JavaScript file that doesn't require NPM dependencies.
Note: I'm using window.eval in order to eval from the global scope.
        var ast = Reflect.parse(request.responseText);
        ast = transform(ast);
        window.eval(Reflect.stringify(ast));
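The transform function is left undefined in this post. As an illustration only, here is a minimal sketch that inserts a console.log call at the top of every function declaration. It assumes the AST uses the node shapes of the Mozilla Parser API (Program, FunctionDeclaration, BlockStatement and so on) and contains no cyclic references; makeLogStatement is a hypothetical helper name.

```javascript
// Builds the AST for the statement: console.log(message);
// (hypothetical helper, node shapes assumed from the Mozilla Parser API)
function makeLogStatement(message) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'console' },
        property: { type: 'Identifier', name: 'log' }
      },
      arguments: [{ type: 'Literal', value: message }]
    }
  };
}

// Walks the tree in place and instruments every function declaration.
function transform(node) {
  if (Array.isArray(node)) {
    node.forEach(transform);
  } else if (node && typeof node === 'object') {
    // Visit the children first so nested functions are instrumented too.
    Object.keys(node).forEach(function (key) {
      transform(node[key]);
    });
    if (node.type === 'FunctionDeclaration' &&
        node.body && node.body.type === 'BlockStatement') {
      node.body.body.unshift(makeLogStatement('enter ' + node.id.name));
    }
  }
  return node;
}
```

The tree is mutated in place and returned, so it can be handed straight to the stringify step.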
Fire DOM Load event
If you want your dynamic include library to work, you need to fire the load event and set the appropriate readyState value.
        // Fire DOM Loaded event
        el.readyState = 'loaded';
        if (typeof el.onload === 'function') {
            el.onload();
        }
    });
Conclusion
This technique does not work with HTML <script> tags (both with src and with inline code). It is really sad since the majority of scripts are included this way. However, we found a way to make it work on dynamic script injection. It is better than nothing.
The technique also has a big drawback: it makes debugging a lot harder, as both the filename and the line numbers are lost with the eval call.
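A partial mitigation, which the code in this post does not use, is to append a `//# sourceURL=` comment to the evaluated code. Chrome and Firefox developer tools use it to give the eval'd script a name. Line numbers then refer to the regenerated source, not to the original file. withSourceURL is a hypothetical helper name.

```javascript
// Hypothetical helper: tags generated code with a sourceURL comment so the
// developer tools can list the eval'd script under `url`.
function withSourceURL(code, url) {
  return code + '\n//# sourceURL=' + url;
}
```

The eval step would then become window.eval(withSourceURL(Reflect.stringify(ast), el.src)).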
Use Cases
I think that the technique is not completely useless. Here are some thoughts:
- Better GreaseMonkey: Alter the code of a website you don't control.
- Remove Closure: People often wrap their code in a closure to prevent others from accessing it from the console. We could remove it.
- Code Injection: You may want to add some code, such as a console.log, an additional check or a hook, to code you don't own.
- Code Analysis: Instrument the code to extract data.
- Code Coverage: It is useful to see if your tests pass through all the branches.
- Dead-Code Removal: If you run the application, you can see which parts of the code were used and remove those that were not. This helps an additional pass with an optimizer such as Google Closure Compiler.
- Profile-Guided Optimization: It is often used to do branch prediction or type analysis.
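As an illustration of the "Remove Closure" use case above, here is a rough sketch of a transform that unwraps top-level `(function () { ... })();` wrappers so that their declarations end up in the global scope. It assumes Mozilla Parser API node shapes; isPlainIIFE and removeClosures are hypothetical names. It only handles wrappers without parameters or arguments, and it does not deal with a top-level return or with uses of this and arguments inside the wrapper.

```javascript
// True for a statement of the form (function () { ... })();
// with no parameters and no arguments.
function isPlainIIFE(statement) {
  var expr = statement.type === 'ExpressionStatement' && statement.expression;
  return Boolean(expr) &&
    expr.type === 'CallExpression' &&
    expr.arguments.length === 0 &&
    expr.callee.type === 'FunctionExpression' &&
    expr.callee.params.length === 0;
}

// Replaces each top-level wrapper with the statements it contains.
function removeClosures(program) {
  var body = [];
  program.body.forEach(function (statement) {
    if (isPlainIIFE(statement)) {
      // Splice the wrapper's statements in place of the call.
      body = body.concat(statement.expression.callee.body.body);
    } else {
      body.push(statement);
    }
  });
  program.body = body;
  return program;
}
```

It could be used as the transform step, or chained with other transforms before the stringify call.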