Bond - A Spy-Based Testing and Mocking Library

Testing plays a key role in the development of any software package, especially for large or complex systems with many contributors. Unit tests let you perform sanity checks, verify the expected behavior of your program, quickly detect errors, and change one area of a program with confidence that you haven't broken another; the benefits are enormous. But writing tests can be difficult and tedious: you have to create the proper test environment, exercise the relevant components, and then validate that they ran correctly. The typical pattern of unit testing involves calling some function or module, obtaining results, and then using a series of assertEquals-style comparisons to check that the result is as expected. For some simple functions a single assertion may be sufficient, but the situation is often more complex, requiring numerous assertions to validate thoroughly and gain confidence that the behavior is correct. Even worse, if any of the expected behavior changes, you have to adjust each assertion individually, which forces a tradeoff between thoroughness of the test validation and ease of maintenance as the test changes.

Graph showing hypothetical developer effort for a set of complex unit tests as a project progresses. The tests are initially developed in January; Bond requires slightly less effort due to a reduction in the number of assertions that need to be created manually. In April, a significant change to the behavior of the system under test is made, causing a large amount of developer effort to update the traditional unit tests. The ease of reconciling differences between old and new test behavior means that updating the tests with Bond will require very little developer effort. Again in August, a more minor change is made to the tests, and Bond requires less developer effort to update. Over the course of a few behavioral changes, the benefits of Bond become much more apparent than the initial savings. Note that this is a hypothetical situation intended only to demonstrate where Bond can be useful, but we plan to perform a more detailed study of this phenomenon using a real project in the near future.

We are introducing a new testing library, Bond, which aims to make this process more efficient and developer-friendly by transitioning away from traditional assertions and instead using what we refer to as spy-based testing. Rather than explicitly specifying the expected value, you simply "spy" the object(s) you wish to make assertions about - similar to using a printf to see what a value is, but done in a more careful way. The first time this happens, Bond will display the spied objects to you in a serialized form called an "observation" and ask if the output is what you expect; if it is, Bond will save this value as a reference. For future test executions, spy output will be compared against this reference value to ensure that the behavior hasn't changed. This provides the same assertion power as a unit test, but without having to type out large blocks of assertions. The real power comes into play when the expected behavior changes; rather than manually changing test values, you are presented with an interactive merge tool which displays the differences between expected and current output. You can view the changes, see if they match the new expected behavior, and easily accept some or all of them as the new reference.
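The core mechanism can be sketched in a few lines of Python. This is a toy illustration, not Bond's actual implementation: real Bond persists accepted observations to files alongside the tests and launches an interactive merge tool on differences, while this sketch uses an in-memory store and accepts the first observation automatically.

```python
import json

# Toy in-memory reference store; real Bond persists accepted
# observations to files alongside the tests.
_references = {}

def spy(point_name, **observation):
    """Serialize the spied values and compare against the reference."""
    current = json.dumps(observation, sort_keys=True, indent=4)
    reference = _references.get(point_name)
    if reference is None:
        # First run: show the observation and ask the developer to
        # accept it (here we just accept it automatically).
        _references[point_name] = current
        return "accepted"
    if current != reference:
        # Behavior changed: real Bond shows a diff and offers a merge.
        return "differences"
    return "match"

result = spy("create_location_dictionary",
             output={"lat": 27.65, "lon": 90.45, "accuracy": 10.5})
```

On the first run the observation becomes the reference; on every later run, any behavioral drift shows up as a difference instead of an assertion failure.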

In addition to using spying as a way to replace assertions, you can also spy intermediate values and outside interactions within production code: writes to a database, HTTP requests, command-line tool executions, etc. By placing calls to spy within your code, any time you run a test with Bond, these calls will appear in the observation log, allowing you to trace execution and easily monitor how your code is interacting with outside services. We have used this in the past to monitor complex interactions which result in hundreds of observations - too much information to make individual assertions about, but at a level that can easily be verified by reading through the list of observations and checking that the system is behaving as expected. In subsequent test runs, you have the power of detecting changes to any of those numerous observations, as opposed to the few assertions that would likely be written in a traditional unit test.

Finally, Bond has a fully integrated mocking library, allowing you to simultaneously mock out and spy function calls at arbitrary points within your program. If a function is marked as a "spy point", any call to that function will be spied, recording its call arguments and (optionally) its return value. If you then deploy an "agent" to monitor that spy point, the agent can take various actions when the function is called: mock out a return value, execute arbitrary auxiliary functions, allow the function call to continue as normal after spying its call arguments, etc. You can also mark such spy points as requiring mocking, which can be useful on functions which are dangerous to execute during testing, and execution will automatically halt if the function is ever called from a test without having an agent prepared to perform mocking.

Bond was originally developed by George Necula for use in projects at Conviva and has been used there for a number of years. Recently, we rewrote the project from the ground up, taking into account many lessons learned from the first version, and Bond is now available for free on GitHub in Python, Ruby, and Java.

Usage and Examples

We show only some simple examples of how Bond is used in Python; for more detailed examples in all three languages, as well as tutorials which walk you through how to add Bond to a project, we encourage you to look at the Bond documentation. You will also find some more detailed explanations of the inner workings of Bond.

Basic Unit Test Spying

Imagine that you are testing some function which constructs a data dictionary to be sent off as an HTTP request. Let's say the content is your current location and looks like this:

  {
      lat: float,
      lon: float,
      accuracy: float,
      user_id: string
  }

Using Bond, we can easily check the accuracy of the output as such:

def test_location_dictionary(self):
    data = create_location_dictionary("some input data")
    bond.spy('create_location_dictionary', output=data)

When we first run the test, we will receive a message like:

There were differences in observations for MyTest.test_location_dictionary: 
--- reference
+++ current
@@ -0,0 +1,11 @@
+{
+    "__spy_point__": "create_location_dictionary", 
+    "output": {
+        "accuracy": 10.5000, 
+        "lat": 27.6500, 
+        "lon": 90.4500, 
+        "user_id": "user1"
+    }
+}

Do you want to accept the changes (MyTest.test_location_dictionary)? ([k]diff3 | [y]es | [n]o):

With a single spy statement, we are able to completely view the state of the object, and determine whether or not it is correct. You can see that the changes are displayed as a diff (against nothing, in this case). This same type of message will highlight exactly what parts of the observations have changed if the output ever falls out of sync with the accepted reference value.

Spy Points and Mocking

Now let's imagine that create_location_dictionary actually makes a call to an external GPS service to retrieve your location data. We don't want this to happen during testing, so we instrument the function with a spy point:

@bond.spy_point(spy_point_name='location_dictionary', require_agent_result=True)
def create_location_dictionary(data):
    # ...

Now in our test code, we deploy an agent, which will intercept calls to create_location_dictionary and instead return a prepared mock result:

def test_func_which_uses_create_location_dictionary(self):
    mock_location = {'lat': 27.65, 'lon': 90.45,
                     'accuracy': 10.5, 'user_id': 'user1'}
    bond.deploy_agent('location_dictionary', result=mock_location)
    output = func_which_uses_create_location_dictionary()
    # ...

Now, when the function being tested tries to call create_location_dictionary, the mock location will be returned. Additionally, calls to create_location_dictionary will appear in the observation log, so you can easily track when and how calls to the external GPS service are made.

Where We Go From Here

While Bond can be very useful as-is, we think there is much more to explore in this new style of testing and hope to continue to expand Bond. One area we are specifically interested in is entering into a "replay-record" mode where Bond can assist you in easily generating the response data for your mocks. This can be especially useful when mocking HTTP requests, whose responses may be large and complex. We envision the ability to slowly run through the execution of your code against a live service, denote that the response received is a correct mock response, and have this be automatically saved to be used as the mock response in future test executions. While some mocking libraries provide similar functionality (e.g. VCR, Betamax), they require you to copy-paste mock data, which may be a large block of text, into your test. We believe Bond is in a unique position to provide this functionality, as it already provides a clean separation of test data vs. test code, as well as easy ways to create and change test data.

Bond is still under active development and we would love to hear your feedback; please leave us a comment!

Canary Deployment with NGINX

You have just deployed the first version of your web service, your users like it and can't stop using it, and now you contemplate deploying an upgraded version of the service such that:
  • There should be no interruption of the service while you deploy and configure the new version, even if this involves stopping and restarting several processes that make up your service.
  • You can run both versions in parallel for a while, each version on a controlled portion of the traffic, to validate that the new version of the service behaves as expected. This is known as canary-deployment, or blue-green deployment, and is also an instance of A/B testing.
  • The traffic control mechanism should be configurable without interruption of the service. For example, should the new version appear to misbehave, the traffic can be quickly directed to revert to the old version.
  • The traffic control mechanisms should allow both percentage-based control (e.g., 1% of the public traffic should be directed to the new version) and client-dependent control, based on IP address or on HTTP headers such as User-Agent and Referer (e.g., employees or known testers should be directed to the new version), or a combination thereof.
  • The traffic control mechanisms should ensure "stickiness": all HTTP requests from a client should end up being serviced by the same version, for a given traffic-control configuration.
  • The separate versions are all using the same database. This is a tall order, because it requires you to design the web app with forward and backward compatibility. This is not as hard as it seems, and is essential for achieving incremental deployment and rollback. We'll talk about how to achieve this in a future post.
Canary deployment is a very useful technique and there are multiple mechanisms that one can use to implement it. In this post, I describe how to do it if you are using NGINX as your front web server, with the examples specifically for a Django uWSGI-based web service, although a similar configuration can be used for other kinds of web services. In a future post, I will describe a slightly more complex way to achieve the same using HAProxy.
I have learned these techniques while developing the Touchstone web service for Conviva Inc.  Conviva is a leader in video-quality analytics and optimization, and Touchstone is a proprietary web service that Conviva customers use to test and fine-tune their integration of the Conviva libraries in video players. Since Conviva has customers in many time zones, and there are a number of automated continuous-testing tools that use the Touchstone web service, there is no convenient time to interrupt the service to deploy upgrades.

Let us now describe an example traffic-control configuration that we want to achieve (as shown in the figure below):
  • We want to deploy three separate instances of the web service, each with its own set of static asset files, and its own uWSGI server running possibly different versions of the Django application. We call these versions "alpha" (early-adopter, least-tested version), "beta" (believed to be ready for general availability), and "ga" (hardened, general availability version). Note that all these instances of the Django application use the same database, and all the forward and backward compatibility support for the persisted data is built into the application itself.
  • We identify clients coming from the "employees" and "tester" groups, based on their public IP addresses. We want to send 100% of traffic from "employees", and 30% of traffic from "tester" group to the "alpha" instance. The "beta" instance will get the rest of the "tester" traffic and also 1% of the public traffic. Finally, 99% of the public traffic should go to the "ga" instance.

NGINX is a high-performance web server and reverse-proxy server and it has a flexible configuration language to control how it handles the incoming requests. You already know that a good deployment design for a web service should use a real web server for the static assets, and for HTTPS termination, with a reverse-proxy setup to your actual application (called an "upstream" application in this context). NGINX is a very good choice as a web server, and as we show here it turns out that it can do quite a bit more for your deployment than serving static files and terminating HTTPS connections.
We will take advantage of the following features of the NGINX configuration language:
  • NGINX configurations can use variables, which are set for each request based on the contents of the incoming HTTP request. The variables can then be used to compute other variables, and ultimately to control how each request is handled, such as which upstream application it is proxied to or from which directory the static files are served.
  • The configuration directive geo sets the value of a variable based on the IP address from which a request is coming:
    # The "$ip_range" variable is set to either "employees", or "testers",
    # or "public", based on the client's IP address
    geo $ip_range {
        ...            employees;      # IP addresses of our office
        ...            testers;        # IP addresses of our testers
        default        public;         # Everybody else
    }
  • The directive split_clients sets the value of a variable based on a randomized percentage split with customized stickiness. In the example below, the value of the variable $remote_addr (the client IP address) is concatenated with the string "AAA" and the result is hashed into a 32-bit number. The value of the defined variable is set based on where in the 32-bit range the hash value falls:
    # The "$split" variable is set to different percentage ranges
    # sticky by remote_addr (client IP address)
    split_clients "${remote_addr}AAA" $split {
        1%    fraction1;   # 1%  of remote addresses assigned to "fraction1"
        30%   fraction2;   # 30% of remote addresses assigned to "fraction2"
        *     other;       # rest to "other"
    }

    This scheme guarantees that two requests from the same IP address will result in the same value for the "$split" variable. The scheme can be adapted by using other variables in place of, or in addition to, "$remote_addr" in the split_clients directive.
    Note that the notation "${remote_addr}AAA" performs string interpolation, computing a string based on the value of the $remote_addr variable concatenated with "AAA".
  • The map directive can be used to compute the value of a variable conditionally based on other variables. In the example below, the "$instance" variable is set based on a concatenation of the $ip_range and $split variables computed above, using regular expression matches.
    # The "$instance" variable is set based on $ip_range and $split. 
    map "${ip_range}_${split}" $instance {
       "~^employees_.*"        alpha;  # everything from "employees" to "alpha"
       "~^testers_fraction2$"  alpha;  # 30% from "testers" to "alpha"
       "~^testers_.*"          beta;   # the rest from "testers" to "beta"
       "~^public_fraction1$"   beta;   # 1% of the public to "beta"
       default                 ga;     # everything else to "ga"
    }

    The "~" prefix tells "map" to use regular expression matching, and the different clauses of "map" are evaluated in order until one matches.
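To see the combined effect of the three directives in one place, the routing logic can be emulated in a few lines of Python. This is an illustration only: the IP sets are stand-ins, and CRC32 is used as a stand-in for NGINX's internal hash (the exact bucket boundaries differ, but the stickiness property is the same).

```python
import zlib

EMPLOYEE_IPS = {"10.0.0.5"}      # stand-in for the office IP ranges
TESTER_IPS = {"198.51.100.7"}    # stand-in for the tester IP ranges

def ip_range(addr):
    """Emulates the geo directive: classify the client address."""
    if addr in EMPLOYEE_IPS:
        return "employees"
    if addr in TESTER_IPS:
        return "testers"
    return "public"

def split(addr):
    """Emulates split_clients: hash "<addr>AAA" into a 32-bit number
    and assign percentage ranges of that space to bucket names. The
    same address always hashes the same way, giving stickiness."""
    h = zlib.crc32((addr + "AAA").encode()) & 0xffffffff
    fraction = h / 2.0 ** 32
    if fraction < 0.01:
        return "fraction1"   # 1% of remote addresses
    if fraction < 0.31:
        return "fraction2"   # next 30%
    return "other"           # the rest

def instance(addr):
    """Emulates the map directive: combine $ip_range and $split."""
    key = "%s_%s" % (ip_range(addr), split(addr))
    if key.startswith("employees_"):
        return "alpha"                 # everything from employees
    if key == "testers_fraction2":
        return "alpha"                 # 30% of testers
    if key.startswith("testers_"):
        return "beta"                  # the rest of the testers
    if key == "public_fraction1":
        return "beta"                  # 1% of the public
    return "ga"                        # everyone else
```

Two requests from the same address always take the same path; changing the percentages only reshuffles the clients near the bucket boundaries.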

All we have to do now is to use the value of the $instance variable to decide which instance of the web application to proxy requests to, as shown below:

# uWSGI upstream handlers, separate UNIX sockets for each instance
upstream app_alpha_django {
    server unix:///opt/app_alpha/uwsgi.sock;
}

upstream app_beta_django {
    server unix:///opt/app_beta/uwsgi.sock;
}

upstream app_ga_django {
    server unix:///opt/app_ga/uwsgi.sock;
}

server {
    listen       80;

    location /static {
        # Each instance has its own static files
        root  /opt/app_${instance}/static;
    }

    # Define the common parameters for all uwsgi requests
    location / {
        uwsgi_param  QUERY_STRING       $query_string;
        uwsgi_param  REQUEST_METHOD     $request_method;
        # ... more standard uwsgi parameters ...
        # Each instance has its own upstream server
        uwsgi_pass  app_${instance}_django;
    }
}
That's it! Well, almost. I also use a script to generate the above configuration based on a specification of the different instances, the IP ranges, and the percentages. For example, the script can quickly generate a configuration that forces all traffic to a single instance. Finally, the script tests the NGINX configuration, and only then tells NGINX to load the new configuration (something that NGINX can do without dropping requests):

# Make sure you tell nginx to test the configuration before reloading
sudo /usr/sbin/nginx -t
sudo kill -HUP `cat /var/run/`
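The generation script itself is not shown in this post, but its core idea can be sketched as follows. The make_map_block helper is hypothetical, written for illustration; the real script also emits the geo and split_clients blocks, writes the configuration file, runs nginx -t, and only then reloads.

```python
def make_map_block(rules, default="ga"):
    """Render the NGINX map block from a list of (pattern, instance)
    pairs (toy sketch of the real generation script)."""
    lines = ['map "${ip_range}_${split}" $instance {']
    for pattern, target in rules:
        lines.append('    "%s"  %s;' % (pattern, target))
    lines.append('    default  %s;' % default)
    lines.append('}')
    return "\n".join(lines)

# The spec for the example configuration in this post:
config = make_map_block([
    ("~^employees_.*", "alpha"),
    ("~^testers_fraction2$", "alpha"),
    ("~^testers_.*", "beta"),
    ("~^public_fraction1$", "beta"),
])
print(config)
```

Forcing all traffic to a single instance is then just a matter of generating with an empty rule list and a different default, e.g. make_map_block([], default="alpha").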

I hope this post will help you make the most out of NGINX, and perhaps motivate you to dig into the manual for even more goodies.

InductJS: INcremental DOM Updates in Client-side Templating for JavaScript


Client-side JavaScript templating is widely used in modern web sites. The performance of template instantiation and rendering has a direct impact on the usability of a web site because of JavaScript's single-threaded execution model. In many cases this performance is dominated not by JavaScript instantiating the template, but by the browser rendering the instantiated HTML string into the DOM. Even though there are numerous libraries that implement client-side JavaScript templating, we believe that there is significant untapped potential in exploiting the common use case where a template is re-rendered into the same place in the web page, with some partial changes to the underlying data to be rendered. This can happen, for example, when the client has received new data from the server and must update part of the web page. Developers often attempt to manually identify the parts of the web page that must be updated, by keeping track of what has changed in the underlying data and using fine-grained templates that can be rendered independently. We believe that this bookkeeping job can be done better by the templating library.

In this project we are proposing an optimization to client-side JavaScript rendering whereby upon re-rendering the template, the resulting instantiation is compared efficiently with the result of the previous rendering, and only the changed DOM elements are updated. To this end we have developed a very simple compact web templating system, called InductJS, designed to update the DOM incrementally upon re-rendering of templates. In its current form, InductJS certainly works and is definitely fast, but supports only a minimal set of templating features, enough to allow us to evaluate the performance impact of the incrementality optimizations we are proposing in comparison with two other modern templating systems: React and AngularJS. We believe that these results suggest that it may be profitable to implement these optimizations in other fully-featured templating systems.

We describe first the results of the performance comparisons with React and AngularJS, then we describe the usage and implementation details of InductJS.

Results / Comparisons

In order to test the effect of incrementality optimizations, we wrote a template for an HTML table, backed by a 2-dimensional JavaScript array, which in its base form contains 300 rows, each row consisting of an array of 15 string elements. This benchmark is similar to other performance comparisons we have seen using long lists (Improving AngularJS long list rendering performance using ReactJS, Faster AngularJS Rendering (AngularJS and ReactJS)). We implemented a simple comparison web page (see links in the table below) that renders this template with InductJS, with AngularJS 1.4.0-rc.2, and with React 0.13.3. We used the production builds of AngularJS and React. The comparison web page allows the user to select which templating system to use, and presents several buttons for re-rendering the template under several scenarios of changes to the underlying data, as discussed below. For every rendering, the page shows the time taken to re-render, measured by taking the difference in system time before and after requesting a re-render through the current framework. Unfortunately, this does not measure the entirety of the update, since the browser performs some extra style calculation and painting after the rendering call. Initially we attempted to force the browser to finish re-rendering completely by accessing properties (such as width) of a displayed DOM node, but found that the browser was still able to defer some steps until afterwards. However, by viewing the Chrome timeline profiler we found that the amount of time spent after returning from the JavaScript function was negligible in comparison to the time spent during the JavaScript execution, and was relatively consistent across the different frameworks / methods of updating.

We summarize in the table below some of the performance results. Best result for each test is shown in green, and the worst result in red. The leftmost column contains the name and a description of the change-scenario considered. (The name of the scenario is shown in bold, and corresponds to the title of the corresponding control on the comparison web page.) We have found that the performance can depend considerably on whether we update a templating site on the page with text only (no HTML markup, as is the default for expressions in React and AngularJS, and InductJS's "text" templating sites), or if we allow the templating site to change the markup (React's dangerouslySetInnerHTML property, AngularJS's "ng-bind-html" directive, and InductJS's "html" templating sites). The table shows rendering times in milliseconds as measured with Chrome 42.0.2311.39 using Linux Mint 17 on an Intel i7 4750HQ processor at 2 GHz. In order to ensure higher consistency of results across different runs, we used a separate profile in Chrome, we rebooted the computer and ensured that only Chrome is running, and we opened only the browser tab with the performance tests.

Rendering time (in milliseconds) for a nested array of 300 rows, each containing 15 elements
Scenario                                                      InductJS        AngularJS 1.4.0-rc.2   React 0.13.3
                                                              Text    HTML    Text    HTML           Text    HTML
Initial Rendering                                             82      81      *       *              92      100
Rerender With No Changes                                      6.5     6.5     2       2              18      27
Change All Rows (all row arrays are replaced)                 14      55      243     293            41      83
Change All Elements (all elements are replaced)               14      54      285     339            37      82
Change 75% of Rows (random 75% of rows replaced)              12      44      190     230            34      70
Change 75% of Elements (random 75% of elements replaced)      12      43      235     279            33      69
Change 50% of Rows                                            11      32      130     156            30      56
Change 50% of Elements                                        11      32      188     224            29      58
Change 25% of Rows                                            8       20      69      84             27      42
Change 25% of Elements                                        8       21      133     154            24      45
Insert 30 Rows at Start                                       8       17      33      39             26      40
Delete 30 Rows from Start                                     7       9       2       2              20      30
Insert 30 Rows at End                                         8       13      33      40             24      36
Delete 30 Rows from End                                       6       6       2       2              20      29
Insert Row in Middle                                          7       8       6       6              19      29
Change Row in Middle                                          7       6       6       7              19      28

A few interesting things to notice:

  • We have not found a good way to obtain measurements for AngularJS's initial rendering time, so those entries are marked with an asterisk.
  • For large changes (i.e. any of the comparisons that change a percent of the elements) using text-only sites, InductJS is far faster than any of the frameworks. This is especially interesting since in InductJS using a text-only site is an optimization, whereas in React and AngularJS it is the default.
  • AngularJS has blazing fast times for re-rendering without changes. InductJS almost manages to keep up, and React lags behind a bit.
  • For the operations which do not affect large amounts of the data, the performance is mostly limited by the amount of time it takes to do a full re-check (i.e. a re-render with no changes). For all three frameworks, deleting rows from either end takes a trivial amount of additional time, and InductJS/ReactJS both take a trivial amount of extra time to insert or change a row in the middle.
  • Using text-only template sites versus HTML-capable ones has very different effects on the different frameworks. AngularJS shows the smallest difference, with around a 25% slowdown when forced to display arbitrary HTML strings. React takes around twice as long on some large operations, and InductJS has the largest difference, with HTML sites taking nearly 4 times as long in some cases.
  • InductJS and React both far outperform AngularJS for large changes, and InductJS's text optimizations are able to give it a significant edge over React on text-only sites, though this gap closes somewhat when both frameworks are asked to display arbitrary HTML strings. This may be somewhat of an unfair comparison, as in React (and AngularJS) displaying an arbitrary HTML string is nonstandard and likely not optimized.

InductJS Usage

A template is an HTML string that contains within it templating sites that are replaced with HTML content when the template is rendered. Each templating site contains one JavaScript expression that is evaluated in a context passed to the template rendering function. We implemented four types of template sites: "html", "text", "if", and "for", which are represented and rendered as follows:

"html": {{html-content-expression}}

The expression is evaluated in the current context and the result, which may contain HTML markup, is rendered in place of the templating site.

"text": {{text literal-content-expression}}

The expression is evaluated in the current context and the resulting string, which may not contain HTML markup, is rendered in place of the templating site. This is an optimization inspired by AngularJS's expressions, which by default only display text content. Using text only allows for more efficient updating in the browser because we can directly modify the nodeValue of the Text node which is used to display the content, rather than modifying the innerHTML of the host, hinting to the browser that no structural changes have occurred and that no HTML parsing is necessary.

"if": {{if boolean-expression}} inner-template {{/if}}

The boolean expression is evaluated in the current context and if the result is "true" the inner content template is rendered using the current context; if the result is "false" no content is rendered.

"for": {{for(for-variable) list-expression}} inner-template-to-repeat {{/for}}

The list expression is evaluated in the current context and the result should be a list/array value. The inner content template is rendered repeatedly once for every value in the list, with a context obtained by extending the current context with the for-variable set to the current list element.

In order to support incremental updating of the DOM, each templating site, when rendered into the DOM, is wrapped inside a host HTML element, by default with the tag "span". To specify a different tag for the host element, include it between the opening braces. In the case of a "for" templating site, each copy of the rendered inner template is wrapped in a separate host element. To access variables contained within the context which is passed in to render, refer to them as context.var_name. (The required use of context rather than a bare variable name is a limitation of the way expressions are treated; a full expression parser would make this unnecessary, but that is beyond the scope of the project at this time.)

For example, to display a table, we might use a markup string like:

  {tr{for(row) context.rows}}
    {td{text context.row.content}}
    {td{if context.row.footer}} {{'<b>' + context.row.footer + '</b>'}} {{/if}}
  {{/for}}

Then to display this table inside of a div with an ID of "insertPoint", we would do:

rend_template = render(template,                                # the template string
                       {rows: myArray},                         # the context
                       document.getElementById("insertPoint")); # the host DOM element

where myArray is an array of objects such as {content: "Message", footer: "ending"}.

And, if myArray changed, to update the displayed content (note that the template you pass in must be the template returned from the initial render, not the template string):

rend_template = render(rend_template, {rows: myArray});

InductJS Implementation Details

We discuss the implementation details in the context of the same example we used in the Usage section above.

When a template is rendered for the first time, the following actions are performed:

  • the template is compiled and cached: the templating sites are identified, the corresponding expressions are wrapped and cached as JavaScript functions.
  • each templating site is instantiated by evaluating the expressions contained in the template, as explained in the Usage section. Each instantiated site is then rendered into the DOM, wrapped into a host element.
  • when rendering a template, we construct a snapshot tree data structure with nodes corresponding to the instantiated template sites. For each instantiated template site we record the previous value of the evaluated expression (labeled "prev"), a reference to the DOM host element (labeled "host"), and references to the snapshot sub-trees recording the information for the inner templates (labeled "inner").
For our example template, if we render it in the context:
  { rows: [
       {content: "Row 1", footer: "Foot 1"},
       {content: "Row 2"}
  ]}

the figure below shows the generated HTML at the right, and the snapshot tree data structure at the left:

Diagram of the example above

The inner "html" template within the second iteration of the "for" template is not currently displayed, so it has host and prev values which are null. The "for" template stores a previous value which is equal to the length of the currently displayed array rather than the array itself, since the relevant content is already stored as the previous value within the inner iterations. The list-start and list-end comment nodes are used to keep track of the bounds of the "for" template's iterations; this becomes especially important when the displayed array contains zero elements.

When a template is re-rendered into the same top-level host DOM element, we traverse the snapshot tree data structure, and we reevaluate the expressions in the new context. If the expression evaluates to something different than previously stored in the snapshot, we update the DOM, and adjust the snapshot tree accordingly. However, in the common case when the expression evaluates to the same value as in the snapshot, we do not update the DOM. Even in this case, for "if" and "for" template sites, we continue evaluating the inner template sites (inner and inner_iterations) and compare with the snapshot to find more places that need to be updated.
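The traversal can be sketched as follows. This is a toy model of the snapshot comparison for flat leaf sites only (no "if"/"for" nesting), written in Python for brevity, with a plain attribute standing in for the DOM host element.

```python
class Site:
    """One instantiated templating site: a compiled expression plus
    its snapshot state (toy model; InductJS keeps a tree of these)."""
    def __init__(self, expr):
        self.expr = expr    # compiled expression: context -> value
        self.prev = None    # previously rendered value ("prev")
        self.host = None    # stand-in for the DOM host element

def rerender(sites, context):
    """Re-evaluate every site and touch the host only on changes."""
    updated = []
    for site in sites:
        value = site.expr(context)
        if value != site.prev:   # value differs from the snapshot
            site.host = value    # would update innerHTML / nodeValue
            site.prev = value
            updated.append(site)
    return updated               # sites whose DOM was actually touched

sites = [Site(lambda c: c["title"]), Site(lambda c: c["body"])]
rerender(sites, {"title": "Hello", "body": "World"})    # initial render
changed = rerender(sites, {"title": "Hello", "body": "World!"})
```

The second re-render touches only the site whose value changed; the unchanged site's host is left alone, which is where the DOM savings come from.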

When an "if" template that was previously false evaluates to true, its inner templates are not currently displayed (as above for the second iteration of the "for" template), so we can't perform an incremental render. In this situation, or for the similar situation where a "for" array grows in size and a new inner iteration must be added, we perform a full HTML string render and place this into the DOM, exactly as we do when we insert a template for the first time. When a "for" array has grown shorter or an "if" template goes from true to false, we simply clear out the matching content. More details about how "for" templates are handled incrementally can be found in the next section.

"For" Template Optimizations

"For" template optimizations are disabled by default, but can be enabled on a per-template basis by using "for*" instead of "for" as the template type (e.g., {{for*(row) context.rows}}). Currently the optimizations employed on "for" templates are somewhat limited, and are tuned for insertion and deletion at either end of the array. We traverse both the new and previous arrays from each end, attempting to find a "match" between the two (defined as 5 or more consecutive elements that are equal between the new and previous arrays). If a match is found from both ends, items outside the match areas are deleted and added as necessary, and the area between the two matches is updated incrementally. This lets us reuse as much of the existing content as possible. If no match is found, we still reuse existing content by incrementally re-rendering the existing items.
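The end-matching idea can be sketched like this: a simplified version, written in Python, that matches common prefixes and suffixes rather than the 5-or-more-element runs described above (diff_regions is a hypothetical helper for illustration, not InductJS's actual code).

```python
def diff_regions(prev, new):
    """Split a re-rendered array into (kept-prefix length, middle
    slice to re-render, kept-suffix length)."""
    # Walk from the front while elements match.
    lo = 0
    while lo < len(prev) and lo < len(new) and prev[lo] == new[lo]:
        lo += 1
    # Walk from the back while elements match, without crossing lo.
    hi = 0
    while (hi < len(prev) - lo and hi < len(new) - lo
           and prev[-1 - hi] == new[-1 - hi]):
        hi += 1
    # Elements before lo and inside the matched suffix keep their DOM
    # nodes; only the middle slice needs re-rendering or insertion.
    return lo, new[lo:len(new) - hi], hi

# Inserting one row in the middle leaves both ends untouched:
regions = diff_regions(["r1", "r2", "r3", "r4"],
                       ["r1", "r2", "rX", "r3", "r4"])
# regions == (2, ["rX"], 2)
```

Insertions and deletions at either end reduce to an empty middle slice, which is exactly the case the optimization is tuned for.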

Note that each element of the array used in an optimized "for" template must be unique. You cannot have duplicate strings / numeric values and you cannot use the same object reference twice. This is a limitation due to the way the optimizations are implemented - the current location of an item within the list is tracked using a hash table mapping items to their indices, similar to what is used by React and AngularJS. Allowing a separate array of keys to be used to track elements, similar to what is required by React and available in AngularJS, may be implemented in the future.

Edit (6/26/15): In a previous version of this writeup, we accidentally showed performance measurements using the development build of React rather than the production build. During the re-testing phase, we discovered some discrepancies in our data, and have regenerated all of the performance data with a higher level of confidence that they are consistent. Switching from the development to the production build of React also removed the performance degradation from React version 0.12.2 to 0.13.3, so we have removed the older version of React from our comparisons. The discussion beneath the performance data has been adjusted somewhat to reflect the new data.