Richard Hipp on improving SQLite performance, 50% faster than 3.7.17:
This is 50% faster at the low-level grunt work of moving bits on and off disk and search b-trees. We have achieved this by incorporating hundreds of micro-optimizations. Each micro-optimization might improve the performance by as little as 0.05%. If we get one that improves performance by 0.25%, that is considered a huge win. Each of these optimizations is unmeasurable on a real-world system (we have to use cachegrind to get repeatable run-times) but if you do enough of them, they add up.
More often than not, getting a big improvement in performance is about doing lots of little things.
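To get a feel for how those tiny wins add up, here is a rough back-of-the-envelope sketch. It assumes every micro-optimization trims run time by the same fraction, that the gains compound multiplicatively, and that "50% faster" means 1.5x the throughput. Those are my assumptions, not anything the SQLite team has stated, and `optimizationsNeeded` is a name I made up for illustration.

```typescript
// Assumption: each micro-optimization multiplies run time by (1 - gain),
// and the gains compound independently of each other.
function optimizationsNeeded(gain: number, speedup: number): number {
  // Smallest integer n such that (1 - gain)^n <= 1 / speedup.
  return Math.ceil(Math.log(speedup) / -Math.log(1 - gain));
}

const typical = optimizationsNeeded(0.0005, 1.5); // 0.05% per optimization
const hugeWin = optimizationsNeeded(0.0025, 1.5); // 0.25% per optimization

console.log(`0.05% optimizations needed: ${typical}`);
console.log(`0.25% optimizations needed: ${hugeWin}`);
```

Under those assumptions it takes on the order of 800 of the 0.05% wins to reach 1.5x, which lines up with "hundreds of micro-optimizations."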
Xiao Yu mentioned perfmap at work last week:
A bookmarklet to create a front-end performance heatmap of resources loaded in the browser using the Resource Timing API. A browser with support for the Resource Timing API is required.
It gives you a quick way to see how far into the page load images on the page finished loading.
If you are going to use this regularly I’d recommend editing the bookmarklet to load perfmap.js from a trusted source that you control.
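For a sense of the kind of data the Resource Timing API exposes, here is a minimal sketch. It is not perfmap's actual code. `ResourceEntry` is a cut-down stand-in for the browser's `PerformanceResourceTiming` entries, and `imageFinishTimes`, `pageLoadMs`, and the sample URLs are all hypothetical.

```typescript
// Minimal subset of the fields found on a PerformanceResourceTiming entry.
interface ResourceEntry {
  name: string;          // resource URL
  initiatorType: string; // "img" for resources requested by <img> elements
  responseEnd: number;   // ms since the start of navigation
}

interface ImageFinish {
  name: string;
  finishedMs: number;
  fractionOfLoad: number; // 0 = start of navigation, 1 = page load
}

// pageLoadMs is whatever moment you choose to treat as "page loaded".
function imageFinishTimes(entries: ResourceEntry[], pageLoadMs: number): ImageFinish[] {
  return entries
    .filter((e) => e.initiatorType === "img")
    .map((e) => ({
      name: e.name,
      finishedMs: e.responseEnd,
      fractionOfLoad: e.responseEnd / pageLoadMs,
    }))
    .sort((a, b) => a.finishedMs - b.finishedMs);
}

// In a browser the entries would come from:
//   performance.getEntriesByType("resource")
// Hypothetical sample data so the sketch runs anywhere:
const sampleEntries: ResourceEntry[] = [
  { name: "https://example.com/hero.jpg", initiatorType: "img", responseEnd: 900 },
  { name: "https://example.com/app.js", initiatorType: "script", responseEnd: 400 },
  { name: "https://example.com/logo.png", initiatorType: "img", responseEnd: 300 },
];

const finishes = imageFinishTimes(sampleEntries, 1200);
console.log(finishes);
```

A heatmap could then color each image by its `fractionOfLoad` value.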
I ran across a jsperf.com test of a for loop vs foreach and I was surprised at the difference when running the test in Chrome 38:
The for loop consistently came out over 40 times faster than the foreach.
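If you want to try a comparison like this yourself, here is a crude sketch. It is not the jsperf test, the function names are mine, and `Date.now()` is a coarse timer, so treat whatever numbers you get as a rough indication only. Results vary a lot between JavaScript engines and versions.

```typescript
function sumWithFor(values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

function sumWithForEach(values: number[]): number {
  let total = 0;
  values.forEach((v) => {
    total += v;
  });
  return total;
}

// Crude timer: calls fn `reps` times and returns the elapsed wall-clock ms.
function timeIt(fn: () => number, reps: number): number {
  const start = Date.now();
  for (let r = 0; r < reps; r++) {
    fn();
  }
  return Date.now() - start;
}

// Build a test array of 100,000 numbers.
const data: number[] = [];
for (let i = 0; i < 100000; i++) {
  data.push(i);
}

const forMs = timeIt(() => sumWithFor(data), 50);
const forEachMs = timeIt(() => sumWithForEach(data), 50);
console.log(`for: ${forMs} ms, forEach: ${forEachMs} ms`);
```

Both functions compute the same sum, so any difference in the timings comes from the loop construct and how well the engine optimizes it.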
“Project Express” from Hibernia Networks is a new trans-Atlantic fiber connection from New York to London. It was started in 2011 but isn’t expected to be completed until later in 2014. From a 2012 Bloomberg news article about the project:
Project Express will be the fastest cable across the Atlantic, reducing the time it takes data to travel round-trip between New York and London to 59.6 milliseconds from the current top speed of 64.8 milliseconds
The cost for this new, shorter, fiber line is reportedly over $300M. A big price tag for a 5.2 millisecond reduction in round-trip time.
It should come as no surprise then that the main customer base for this new connection is high speed trading companies.
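To put the price tag in perspective, here is the arithmetic using only the figures quoted above. Since the cost is reported as "over $300M," the per-millisecond number is a lower bound.

```typescript
const currentRttMs = 64.8; // current best round trip, from the Bloomberg quote
const expressRttMs = 59.6; // Project Express round trip, from the Bloomberg quote
const costUsd = 300e6;     // "over $300M", so treat this as a lower bound

const savedMs = currentRttMs - expressRttMs;
const percentSaved = (savedMs / currentRttMs) * 100;
const usdPerMs = costUsd / savedMs;

console.log(`saved: ${savedMs.toFixed(1)} ms (${percentSaved.toFixed(1)}% of the round trip)`);
console.log(`cost per ms saved: at least $${(usdPerMs / 1e6).toFixed(1)}M`);
```

That works out to roughly an 8% shorter round trip at a cost of nearly $58M per millisecond saved.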
Steve Souders’ takeaway #1 after trying to figure out unexpected caching behavior in Chrome ( emphasis is mine ):
Remember that Chrome may do DNS prefetch, TCP pre-connect, and even prerender the entire page based on the confidences in chrome://predictors.
Apparently this isn’t new information; it was mentioned by Ilya Grigorik in High Performance Networking in Google Chrome. But if Steve Souders didn’t know about it already then I expect that it isn’t widely known.
Looking over the chrome://predictors/ results in my browser, it is about what I’d expect. One thing that would be helpful is the ability to sort by individual columns. I’m most interested in which pages Chrome is most likely to attempt to prerender.
How we code interactions on the web has changed significantly with mobile touch devices. It isn’t just about hover, it is also about timing:
By default, if you tap on a touchscreen it takes about 300ms before a click event fires. It’s possible to remove this delay, but it’s complicated.
– via Suppressing the 300ms click delay – QuirksBlog.
Some browsers allow pages to turn off this delay when you have width=device-width set. Unfortunately mobile Safari isn’t one of those.
Zack Tollman suggested I try out SPDY with my updated Nginx install. While I’m sad at the idea of giving up a plain text HTTP API, I was curious to see what SPDY looked like on this site.
I was disappointed with the results. The fastest page load time out of 5 runs without SPDY was 1.039 s. With SPDY the fastest result was 1.273 s. I then did several more runs of the same test with SPDY enabled to see if any of them could get close to the 1.0 s baseline. None of them did; most came in close to 2 seconds. I had honestly expected to see SPDY perform better. That said, this type of testing is not particularly rigorous, so take these numbers with a sufficiently large grain of salt.
Given the initial poor showing of SPDY in these tests I’m going to leave it turned off for now.
A cross section of web performance over the last two years:
The median top 500 ecommerce home page takes 10 seconds to load. In spring 2012, the median page loaded in 6.8 seconds. This represents a 47% slowdown in just two years.
According to “Retail sites that use a CDN are slower than sites that do not*” on Web Performance Today.
I downloaded the PDF of the report to find out how these measurements were done:
Radware tested the home page of every site in the Alexa Retail 500 nine consecutive times. The system automatically clears the cache between tests. The median test result for each home page was recorded and used in our calculations.
The tests were conducted on March 24, 2014, via the WebPagetest.org server in Dulles, VA, using Chrome 33 on a DSL connection.
In the comments section I asked which test settings were used for the 2012 measurements.
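My reading of that methodology is a median of nine runs per home page, followed by a median across pages. The report does not spell out the second step, so that part is my assumption. Here is a minimal sketch of the calculation; the site names and load times are made up for illustration and are not Radware's data.

```typescript
// Median of a list of numbers; averages the two middle values for even lengths.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical load times in seconds: nine runs for each of three made-up sites.
const runsBySite: Record<string, number[]> = {
  "site-a.example": [9.8, 10.4, 11.0, 9.5, 10.1, 12.3, 9.9, 10.0, 10.6],
  "site-b.example": [6.2, 7.0, 6.8, 6.5, 7.4, 6.9, 6.6, 7.1, 6.7],
  "site-c.example": [14.2, 13.1, 15.0, 13.8, 14.6, 13.5, 14.0, 14.9, 13.9],
};

// Step 1: one median per site. Step 2 (assumed): the median of those medians.
const perSiteMedians = Object.keys(runsBySite).map((site) => median(runsBySite[site]));
const overallMedian = median(perSiteMedians);

console.log(perSiteMedians, overallMedian);
```

Taking the median of nine runs keeps a single unusually slow or fast run from skewing the result for a page.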
I decided to try out this suggestion from Optimizing NGINX TLS Time To First Byte (TTFB) ( which I mentioned at the end of 2013 ):
After digging through the nginx source code, one stumbles onto this gem. Turns out, any nginx version prior to 1.5.6 has this issue: certificates over 4KB in size incur an extra roundtrip, turning a two roundtrip handshake into a three roundtrip affair – yikes. Worse, in this particular case we trigger another unfortunate edge case in Windows TCP stack: the client ACKs the first few packets from the server, but then waits ~200ms before it triggers a delayed ACK for the last segment. In total, that results in extra 580ms of latency that we did not expect.
I’ve been using Nginx 1.4.x from the Ubuntu package collection on this site. A few webpagetest.org runs showed that HTTPS negotiation was taking more than 300ms on the initial request. After updating to Nginx 1.5.13, more tests showed HTTPS negotiation was down to around 250ms.
The 50ms savings isn’t nearly as dramatic as the worst case scenario described in the quote above, but I’ll take it.
Performance engineering is its own discipline. The problem is, not many people have realized that yet.
From Steve Souders’ post on web performance for the future.