Vjeux

Nov

Dirty-up and execute top-down

vjeux Javascript 12 comments

One common pattern when implementing user interface optimizations is to compute some value for a node where the computation involves looking at neighbor nodes and want to keep this value updated when the tree is mutated.

On this article, I'm going to explain the pattern we implement to solve this use case on various places in React Native.

Example: background color propagation

On React Native, we implement an optimization where the background color is propagated from the parent instead of being transparent. This provides a guarantee to the GPU that it won't need to paint pixels that are underneath. You can read this release notes for a more complete explanation.

In any case, the algorithm is pretty simple, if the background color is not set on an element, we take the one from the nearest parent that has one set. Here's an example:

Now, let say that the red node background color is being unset, the example would look like:

In order to implement this behavior, we first need to traverse up the hierarchy and find the background color and then down to propagate it, but stop at nodes that have set background colors. This means that we have to implement two different algorithms: one for the initial rendering and one for the update.

The complexity of this example as explained is small enough that it is easy to maintain the same invariants. But in practice, this algorithm is a bit more complex: we don't forward the color for transparent nodes nor for the image component in certain cases... We also experimented with more conditions that we didn't end up using.

Dirty-up and execute top-down

What would be great is to just write the top-down recursive algorithm you learn in school to apply colors however you want and whenever there's a mutation just re-run it on the entire tree. The problem with that approach is that you're going to spend a lot of CPU time running this algorithm on the entire tree when you only need to update a few nodes.

Instead, you can dirty the node you are mutating and all the nodes up to the root.

Then, run your algorithm top-down starting at the root.

You need to implement a way to figure out if a non-dirty node needs to be recomputed. The strategy we use is to cache all the arguments of getColor, for example (parentColor, nodeColor, isNodeText, ...) along with the result. If we're being called with the same arguments and the node is not dirty, then we don't need to go further and can just bail. The pseudo code looks like this:

function getColor(isDirty, prevArgs, prevValue, nextArgs) {
  if (isDirty || prevArgs !== nextArgs) {
    return realGetColorRecursive(...nextArgs);
  }
  return prevValue;  
}

Conclusion

The benefits of this technique is that you can write your business logic once in a traditional top-down recursive way. You don't need to care about the much harder update case.

The downside is that you are doing more work: instead of touching only the elements that have changed, you need to traverse everything up from the root.

Another downside is that it requires more infrastructure as you need to add a caching layer and refactor your code to add a pass that starts at the top.

We're making use of this technique for a few aspects of React Native:

Background color is propagation as explained in this article.
Computing the layout (top, left, width, height) values based on the css attributes (eg marginTop, flex: 1...).
Flag if a text node is the root of the tree of text nodes to be able to flatten it into a single native view.
Flag if a node only affect layout and doesn't paint anything so that it can safely be removed from the UI tree.

Sep

Fixing a 10 years old performance issue

vjeux PHP 9 comments

When I was 14, Philippe Pelletier reached out to me with an interesting project. He is passionate about cinema and maintains a database (lots of physical sheets of paper and a growing number of Word documents) of every movie an actor played in, the cover of the movie, a biography of that person. He had the idea of turning this into a website to share this work.

I built the website called CinéArtistes over the summer and forgot about it. Turns out, 10 years later with no code change, it is still running and growing!

Finally something is broken!

Philippe contacted me this week because the website was getting unstable and those errors started appearing more and more regularly over the last few months. It has gotten to a point where it was unusable at peak hours.

error-cineartistes

As a first glance, it looked like the mysql database was saturated. When I went to cineartistes.com, the page would take something like 10 seconds to load. My intuition was that some queries were not scaling properly with the growing number of entries in the database.

When I browsed around the website and went to an actor page, it was actually really fast. This confirmed my suspicion that this was not a site-wide issue but just a problematic query on the home page.

Going down memory lane

Now that I had an idea of what was going on, I asked Philippe how do I connect to the server and he gave me a FTP url, username and password. It's been a long time since I haven't used FTP for anything!

The entire website was a single file 2800-lines file called index.php! Mind you, there isn't any version control and the staging environment is a file called index_dev.php.

Even though today it seems crazy, it's a good reminder that tools are there to help you achieve your goal. This codebase has worked flawlessly to power the website for more than 10 years with 0 intervention, and now we can even give maintenance remotely since many software companies work like this, and have the workers in their homes, by monitor employee activity using special software for this.

It is also surprisingly well structured. There's a router that calls a specific function for every page, good thing that at the time having pretty URL were not a thing 🙂

switch ($_POST['page']) {
  case "admin": OnAdmin(); break;
  case "import": OnImport(); break;
  case "modif": OnModif(); break;
  case "modiffestival": OnModifFestival(); break;
  case "ajout": OnAjout(); break;
  case "ajoutfestival": OnAjoutFestival(); break;
  ...

I already structured my code as components:

PrintHeader();
Liste($rech_nom, $rech_naiss, $rech_deces, $rech_activite);
PrintFooter();

Of course, the implementation isn't fancy: echoing concatenated strings for most of the code and ending php interpolation ?><html...><?php for static things.

Finding the performance bottleneck

Time to go back to the task at hand, we need to figure out why the front-page is so slow. My intuition is that there is a query that runs slowly, so I want to instrument all the mysql_query calls and log the time each one takes.

Thankfully, there is a database abstraction in place $db_mysql->query(...)! This kind of blows my mind! I probably got super annoyed that the mysql API was so hard to use. The abstraction code doesn't look like it was coded by me, I probably copy and pasted it from somewhere else 🙂

Since all the calls are centralized there, I just need to do

$t = microtime(true /* Seriously php ... */);
// do the query
print($query, (microtime(true) - $t) * 1000);

But there's one little problem, how do I print? If I'm using echo, it's going to show it in the middle of the content and everything will look broken. I could buffer it to a global variable and print it at the end but there's a better solution: I can use JavaScript console.log!

function print() {
  echo '<script>console.log(';
  foreach (func_get_args() as $elem) {
    echo '"'.$elem.'", ';
  }
  echo '"");</script>'; // notice the "" in order to avoid dealing with trailing comma
}

At first I tried to use json_encode to be safe if there was a " in the query but guess what, this 10 years old version of PHP doesn't implement it!

I was expecting to see one query takes many seconds but I was a bit disconcerted when the most costly one would only take ~100ms. It turns out that this one was being executed 80 times in a row!

SELECT COUNT( * ) FROM `images` WHERE `from`='4020' AND `type`='2' 104ms
SELECT COUNT( * ) FROM `images` WHERE `from`='4019' AND `type`='2' 117ms
SELECT COUNT( * ) FROM `images` WHERE `from`='4019' AND `type`='2' 101ms
SELECT COUNT( * ) FROM `images` WHERE `from`='4018' AND `type`='2' 125ms
SELECT COUNT( * ) FROM `images` WHERE `from`='4018' AND `type`='2' 104ms
... 80 times ...

This is a classical N+1 problem where we first query for the the elements in the list and then send one query for each one. The proper way to fix this is to refactor the code to merge the inner query in the outer one, but this is super annoying to do in practice.

At Facebook, we use GraphQL and Relay which solves this problem elegantly: it lets you write the queries in a nested fashion as it is in this example, but has a pre-process step that merges all those queries into one.

Make it fast

Anyway, I first wanted to figure out if I could optimize the query instead. It is just being used to check if an actor has at least one image of type 2 (a cover photo), it really shouldn't take 100ms.

I saw three possible improvements:

1) COUNT(*) is wasteful because we don't care about the total count, we just want to know if there's at least one. I learned the hard way that at Facebook, count is extremely expensive because you need to privacy check all the elements in order to know the real count. This means that count is as expensive as fetching all the elements. But in this case, it probably isn't doing such dramatic things.

2) While searching, I found someone writing that if you search for a string on an int field, mysql would be much much slower. That seemed to make sense so I removed ' around the value and unfortunately it made the query go 3 times slower!??!? This wasn't the quick win I was hoping to see.

3) I've read many times that adding an index is super important but had never done it before, so I google'd for it and found this command:

ALTER TABLE `images` ADD INDEX (`from`);

After running this single line of code, the time it took went from 100ms to 0.5ms! So, instead of spending 8 seconds to run those 80 queries it only took 40ms. Problem solved.

Conclusion

Rewrite

Philippe asked several people to try and fix this problem before reaching out to me and their response was along the lines of: this website is built using ancient technology, you need to port it to <insert framework/language name>.

Rewriting is usually the first response but I've learned that it is usually not the best answer. In this case, it took one hour end to end to fix the issue. It would have taken a month+, a lot of money and time and disrupted Philippe flow to bring a new website for no obvious gains.

Precision

Looking back at my code, I can't help seeing many good patterns such as use of components, wrapping mysql calls into a helper, centralizing routing code... But, they were not being used consistently and there was also a lot of terrible patterns. So the fact that they were there is probably a mix of luck, intuition and part of trial and error.

After 10 years, I am now able to pick out every single pattern in this code and talk about the pros and cons and decide whether it would be a good or bad idea to use there. Experience made me a lot more confident and intentional when I code.

Practice makes perfect

If you were to give a similar task to my 14-years old self, I would probably have been able to figure out that I needed to instrument the mysql_query wrapper and add an index. But I bet it would have taken me multiple days to do that.

The reason is because it required me to execute a lot of auxiliary tasks such as

Setup the dev environment to update files on a remote FTP on save.
Figure out how to find out the code responsible for outputting an element on the screen.
Write PHP code that can be run on a 10 years old version (we are spoiled with Hack at Facebook!).
Implement a print function with variadic arguments that outputs console.log using a script tag and know that I didn't need to implement escaping.
Connect to a mysql instance and select the right database in order to be able to run the query.
...

I know that I struggled with each of those for many days/weeks/months in the past, but at this point, I've done variations of those so many times that I have muscle memory for all those tasks. I also used Google extensively during this hour, not to learn, but to remind me how to do those tasks.

Final thoughts

It will be interesting to read this article in 10 years and figure out what I was not seeing while writing this article. It'll certainly be more meta 🙂

Sep

React Rally: Animated — React Performance Toolbox

vjeux Facebook, Javascript 9 comments

I talk about various strategies to improve the performance of your React app: shouldComponentUpdate, Static Container, Element Caching, Raw DOM Mutations and finally, Data Binding, which is powering the Animated API.

I go over their respective trade-offs as sadly there is no perfect solution. Hopefully you can use the ones that make sense in your context.

Sep

React Europe Keynote

vjeux Facebook, Javascript 9 comments

I can't believe that I had the opportunity to do the opening session of React Europe, in front of 700 people! This is total madness how big this React thing has become and I'm so lucky to be able to play a small part of it.

In this talk, I tried to give an overview of the front-end ecosystem that formed around React and try to tell where I want us, as a community, to go towards!

Aug

140byt.es curried add function

vjeux Javascript 10 comments

How it all started

I was talking to frantic about some trick JavaScript question where we needed to use bind and apply in non obvious way and that reminded him of a similar problem where you needed to implement an add function that both works with arbitrary number of elements and can either be used as a number or another function.

test(add(1, 2), 3);
test(add(3)(4)(), 7);
test(add(3)(4)(5), 12);

It sounded fun so I asked him to provide me with some tests so that I could try to implement it. He did it https://github.com/frantic/currying and posted on the internal JavaScript group.

Then, Tadeu Zagallo, another co-worker working on React Native, saw it and while we were sharing our solutions mentioned that 140byt.es just closed down. I got emotional as I loved solving those kind of problems and for fun decided to shrink my implementation to see in how many characters it could fit. And this is how my Friday and Saturday vanished :p

In order to spread the fun, I posted it on Twitter and a lot of people helped out, including RReverser and sainaen who improved the existing solution!

Getting Started

In order to get started you need a valid implementation. Here is the code I wrote in ES6:

var add = (...args) => {
  var sum = args.reduce((a, b) => a + b);
  var fn = (...args) => add(sum, ...args);
  fn.valueOf = () => sum;
  return fn;
};

and "minified" it to be

91:var add=(...a)=>{var s=a.reduce((a,b)=>a+b),f=(...a)=>add(s,...a);f.valueOf=()=>s;return f}

But, unfortunately (or fortunately), ES6 has a lot of shorthand helpers that let you express this in a very concise way without much room for crazy hacks. So we decided to go the ES5 way instead.

The first version we started from was a direct translation of the ES6 version:

189:function add(){
  var s=[].reduce.call(arguments,function(a,b){return a+b});
  var f=function(){return add.apply(0,[s].concat([].slice.call(arguments)))};
  f.valueOf=function(){return s};
  return f;
}

There are three parts in this function: the sum, the returned function and valueOf. We'll see all the experiments we've done with the three.

Sum

Sadly, arguments is not a real array object so we need to do [].reduce.call(arguments instead of the normal version with less characters arguments.reduce(.

57:var s=[].reduce.call(arguments,function(a,b){return a+b})

One of the realization of this game is that creating a function is very expensive in characters. You've got to pay for function(){return} just to play. Let's see with a normal loop how we can write it shorter.

56:for(var s=0,i=0;i<arguments.length;++i){s+=arguments[i]}

We already won one character, but we can do better. arguments is very long, we can make a variable out of it.

52:for(var s=0,i=0,a=arguments;i<a.length;++i){s+=a[i]}

Turns out that the braces are not needed if there's only one statement in the body. 2 free characters.

50:for(var s=0,i=0,a=arguments;i<a.length;++i)s+=a[i]

At this point we're a bit stuck with the normal for loop. Thankfully, JavaScript has another way of looping through an array: for(var key in obj). It is technically supposed to be for objects but also works for arrays :p

40:var i,s=0,a=arguments;for(i in a)s+=a[i]

This is the shortest we could find using a normal loop. But JavaScript has many builtin functions that can be exploited in a few characters. In this case eval and join. Yes, eval! We care about code size, not best practices 🙂

39:var s=eval([].join.call(arguments,'+'))

The caveat with using this version is that join doesn't respect valueOf but instead calls toString. So we need to lose one character there.

Curried function

The objective for this one is to create a function that calls add with an initialized value of the current sum. The naive version is the following:

74:var f=function(){return add.apply(0,[s].concat([].slice.call(arguments)))}

Again, we don't want an entire function for doing that, the trick we're going to use time and time again is bind to generate a function in much less characters.

33:var s=this.s||0,f=add.bind({s:s})

Turns out that this only work in non strict mode. But if you enable strict mode it doesn't work anymore because undefined.s will throw an exception. Also, if you have a global variable s it will break. We found a way to make it a bit smaller using arrays instead of objects.

32:var s=this[0]||0,f=add.bind([s])

What you would really like to do is to just bind(s), turns out that you can do that if you are in strict mode. Because if you call add(), then this will be undefined and ||0 will work fine.

27:var s=this||0,f=add.bind(s)

What if we just did this|0, it will convert the element into a signed 32bit integer and has the nice property that window|0 === 0 and undefined|0 === 0 so it works in both strict and non strict mode! The caveat is that the function no longer works with floating point numbers as 3.6|0 === 3.

26:var s=this|0,f=add.bind(s)

But then we realized that we didn't need to pass that initial value as this, we could just pass it as an additional argument. Not only making it shorter but another few characters, but also working with strict/non-strict and floats 🙂

23:var s=0,f=add.bind(0,s)

Constant function

The last piece of the puzzle is to write a function that always return a number. This is something simple yet there are many ways to write it. Let's start with the naive version:

20:function(){return s}

The first attempt was to look at array functions for one that returns the first element and pass an array with just one. Turns out that join fits the bill. The only trick is that it converts the number into a string, so we need to cast it back to a number when we do the sum s+=+a[i] losing a character.

17:[].join.bind([s])

Then we figured out that we could use Number as a function to generate a number

16:Number.bind(0,s)

But, we're going to be evil again and use eval to shorten it even more!

14:eval.bind(0,s)

Putting it all together

We start with the following

108:function add(){var f,i,s=0,a=arguments;for(i in a)s+=a[i];
f=add.bind(0,s);f.valueOf=eval.bind(0,s);return f}

We have four variables f, i, s and a that we have to declare, but we don't really need all of them at the same time. We use isa together and then fs together. So we don't need to allocate a variable for f, we can reuse a or i.

106:function add(){var i,s=0,a=arguments;for(i in a)s+=a[i];
a=add.bind(0,s);a.valueOf=eval.bind(0,s);return a}

var i,s,a is a lot of characters, would be nice to be able to move it inside of the function declaration function add(i,s,a). Unfortunately, this only work in strict mode because in non-strict, if you override the variables it's also going to override the variables in arguments in the same position.

105:function add(i,s,a){s=0,a=arguments;for(i in a)s+=a[i];
a=add.bind(0,s);a.valueOf=eval.bind(0,s);return a}

Thankfully, when we switch to the eval version of the sum, JavaScript reads from arguments before setting the variable so it works even in non-strict mode 🙂

105:function add(s,a){s=eval([].join.call(arguments,'+'));
a=add.bind(0,s);a.toString=eval.bind(0,s);return a}

Now we're getting in the final phase where we're trying to inline as many expressions as possible. When you have s=...;bind(0,s), you can turn it into bind(0,s=...) which saves two characters: a ; and a repetition of s.

103:function add(s,a){a=add.bind(0,s=eval([].join.call(arguments,'+')));
a.toString=eval.bind(0,s);return a}

In the same vein, we can transform a=...;a.toString=...;return a into return(a=...).toString=...,a. It transforms 3a and 2; into 2a and () which is one less character!

102:function add(s,a){return(a=add.bind(0,s=eval([].join.call(arguments,'+'))))
.toString=eval.bind(0,s),a}

Here, we use eval (4 characters) twice: once to compute the sum s=eval(...) and once to have a function that returns a constant eval.bind(0,s). It turns out that we can build a function that returns the sum by doing s=eval.bind(0,...) and in order to get the sum we use s(). Doing this we save another two characters to reach 100!

100:function add(s,a){s=eval.bind(0,[].join.call(arguments,'+'));
return(a=add.bind(0,s())).toString=s,a}

It turns out that you don't even need to eval in the sum itself, you can pass the string '1+2+3' as first argument to the add function and when it is finally time to toString, then you can evaluate this entire expression.

96:function add(s,a){return(a=add.bind(0,s=[].join.call(arguments,'+')))
.toString=eval.bind(0,s),a}

Conclusion

This was super fun 🙂
I would have never imagined that there were so many avenues of optimization for this seemingly simple problem.
New JavaScript standards like strict mode and ES6 make writing more concise code easier.
In order to spawn collaboration, having a runnable specification is the critical.
No one person came up with all the optimizations, it was truly a collaborative effort.

All Posts

« Previous Page

Vjeux

Frenchy Front-end Engineer

Dirty-up and execute top-down

Example: background color propagation

Dirty-up and execute top-down

Conclusion

Fixing a 10 years old performance issue

Finally something is broken!

Going down memory lane

Finding the performance bottleneck

Make it fast

Conclusion

Rewrite

Precision

Practice makes perfect

Final thoughts

React Rally: Animated — React Performance Toolbox

React Europe Keynote

140byt.es curried add function

How it all started

Getting Started

Sum

Curried function

Constant function

Putting it all together

Conclusion

About Me

Recent Posts

@vjeux on Twitter

Categories

Blogroll