Originally posted on Perf Planet.
React is a JavaScript library for building user interfaces developed by Facebook. It has been designed from the ground up with performance in mind. In this article I will present how the diff algorithm and rendering work in React so you can optimize your own apps.
Diff Algorithm
Before we go into the implementation details it is important to get an overview of how React works.
var MyComponent = React.createClass({ render: function() { if (this.props.first) { return <div className="first"><span>A Span</span></div>; } else { return <div className="second"><p>A Paragraph</p></div>; } } }); |
At any point in time, you describe how you want your UI to look like. It is important to understand that the result of render is not an actual DOM node. Those are just lightweight JavaScript objects. We call them the virtual DOM.
React is going to use this representation to try to find the minimum number of steps to go from the previous render to the next. For example, if we mount <MyComponent first={true} />
, replace it with <MyComponent first={false} />
, then unmount it, here are the DOM instructions that result:
None to first
- Create node:
<div className="first"><span>A Span</span></div>
First to second
- Replace attribute:
className="first"
byclassName="second"
- Replace node:
<span>A Span</span>
by<p>A Paragraph</p>
Second to none
- Remove node:
<div className="second"><p>A Paragraph</p></div>
Level by Level
Finding the minimal number of modifications between two arbitrary trees is a O(n4) problem. As you can imagine, this isn't tractable for our use case. React uses simple and yet powerful heuristics to find a very good approximation in O(n).
React only tries to reconcile trees level by level. This drastically reduces the complexity and isn't a big loss as it is very rare in web applications to have a component being moved to a different level in the tree. They usually only move laterally among children.
List
Let say that we have a component that on one iteration renders 5 components and the next inserts a new component in the middle of the list. This would be really hard with just this information to know how to do the mapping between the two lists of components.
By default, React associates the first component of the previous list with the first component of the next list, etc. You can provide a key
attribute in order to help React figure out the mapping. In practice, this is usually easy to find out a unique key among the children.
Components
A React app is usually composed of many user defined components that eventually turns into a tree composed mainly of div
s. This additional information is being taken into account by the diff algorithm as React will match only components with the same class.
For example if a <Header>
is replaced by an <ExampleBlock>
, React will remove the header and create an example block. We don't need to spend precious time trying to match two components that are unlikely to have any resemblance.
Event Delegation
Attaching event listeners to DOM nodes is painfully slow and memory-consuming. Instead, React implements a popular technique called "event delegation". React goes even further and re-implements a W3C compliant event system. This means that Internet Explorer 8 event-handling bugs are a thing of the past and all the event names are consistent across browsers.
Let me explain how it's implemented. A single event listener is attached to the root of the document. When an event is fired, the browser gives us the target DOM node. In order to propagate the event through the DOM hierarchy, React doesn't iterate on the virtual DOM hierarchy.
Instead we use the fact that every React component has a unique id that encodes the hierarchy. We can use simple string manipulation to get the id of all the parents. By storing the events in a hash map, we found that it performed better than attaching them to the virtual DOM. Here is an example of what happens when an event is dispatched through the virtual DOM.
// dispatchEvent('click', 'a.b.c', event) clickCaptureListeners['a'](event); clickCaptureListeners['a.b'](event); clickCaptureListeners['a.b.c'](event); clickBubbleListeners['a.b.c'](event); clickBubbleListeners['a.b'](event); clickBubbleListeners['a'](event); |
The browser creates a new event object for each event and each listener. This has the nice property that you can keep a reference of the event object or even modify it. However, this means doing a high number of memory allocations. React at startup allocates a pool of those objects. Whenever an event object is needed, it is reused from that pool. This dramatically reduces garbage collection.
Rendering
Batching
Whenever you call setState
on a component, React will mark it as dirty. At the end of the event loop, React looks at all the dirty components and re-renders them.
This batching means that during an event loop, there is exactly one time when the DOM is being updated. This property is key to building a performant app and yet is extremely difficult to obtain using commonly written JavaScript. In a React application, you get it by default.
Sub-tree Rendering
When setState
is called, the component rebuilds the virtual DOM for its children. If you call setState
on the root element, then the entire React app is re-rendered. All the components, even if they didn't change, will have their render
method called. This may sound scary and inefficient but in practice, this works fine because we're not touching the actual DOM.
First of all, we are talking about displaying the user interface. Because screen space is limited, you're usually displaying on the orders of hundreds to thousands of elements at a time. JavaScript has gotten fast enough business logic for the whole interface is manageable.
Another important point is that when writing React code, you usually don't call setState on the root node every time something changes. You call it on the component that received the change event or couple of components above. You very rarely go all the way to the top. This means that changes are localized to where the user interacts.
Selective Sub-tree Rendering
Finally, you have the possibility to prevent some sub-trees to re-render. If you implement the following method on a component:
boolean shouldComponentUpdate(object nextProps, object nextState) |
based on the previous and next props/state of the component, you can tell React that this component did not change and it is not necessary to re-render it. When properly implemented, this can give you huge performance improvements.
In order to be able to use it, you have to have to be able to compare JavaScript objects. There are many issues that raises such as should the comparison be shallow or deep; if it's deep should we use immutable data structures or do deep copies.
And you want to keep in mind that this function is going to be called all the time, so you want to make sure that it takes less time to compute that heuristic than the time it would have taken to render the component, even if re-rendering was not strictly needed.
Conclusion
The techniques that make React fast are not new. We've known for a long time that touching the DOM is expensive, you should batch write and read operations, event delegation is faster ...
People still talk about them because in practice, they are very hard to implement in regular JavaScript code. What makes React stand out is that all those optimizations happen by default. This makes it hard to shoot yourself in the foot and make your app slow.
The performance cost model of React is also very simple to understand: every setState re-renders the whole sub-tree. If you want to squeeze out performance, call setState as low as possible and use shouldComponentUpdate to prevent re-rendering an large sub-tree.