Measure twice, cut once – Performance Testing Tools and Tips
Many organizations struggle with performance testing, such as measuring site load times, and have yet to adopt the right measurement tools or processes to improve it. As the product manager for Optimizely Web, I’m naturally passionate about web page performance because it’s a major influence on both the user experience and your business goals. To help our customers deliver snappy experiences for their end users, we recently launched Performance Edge. It enables performant experimentation at scale by reducing the impact on site speed. If you are ready to supercharge experiments on your most performance-sensitive pages with Optimizely, or you are starting to think about setting site speed goals for the year, following the best practices below will help your team improve performance scientifically.
Prioritize performance. Latency negatively impacts the user experience. Your site’s KPIs are a function of the user experience. It’s as simple as that.
Measure the right numbers. Many teams struggle to measure performance accurately because they lack focus on the metrics that matter: the ones that indicate how your performance is shaping the user experience. For example, First Paint (FP) and First Contentful Paint (FCP) let you measure the time it takes for a user to see something on the page. FCP marks the moment the browser first renders any content, such as text or an image. FCP is a pretty standard metric, and many tools like WebPageTest support it out of the box.
Better yet, measure Time to First Meaningful Paint (FMP). FMP captures how long it takes a meaningful element to load. It’s up to you to decide which element makes the paint meaningful for your users. At Optimizely, I work on A/B testing products that modify web elements as the browser loads them. The element tested in the variations of a given experiment is the one that we consider meaningful. Time to Interactive (TTI) marks when a visitor can click or tap, and it is important for overall page performance health. Still, because it is the outcome of even more resources loading in the browser than the other metrics mentioned, it is less useful for identifying specific actions you can take to improve.
Use the right tools. Synthetic testing tools, some of which offer network throttling to mimic mobile traffic, help give you an initial read, but they usually suffer from a limited sample size, and there is no substitute for real-world traffic. Measuring real traffic is called Real User Monitoring (RUM). Make sure your RUM collects info like the visitor’s browser, device, and location so you can slice your data later (more on that below).
Use the right analysis technique. Performance data is complicated. There can be lots of variance and outliers. Visitors’ devices and locations are literally all over the map. Performance timings tend to be unstable over time. Most sites are built on or with dozens of third-party technologies like CDNs, frontend frameworks, A/B testing tools, databases, and APIs, to name a few. Your data is unlikely to reflect a perfect bell curve. That is, it’s probably not normally distributed and will have a long tail due to outliers.
The best way to analyze website performance in the face of noise is to segment your visitors’ requests, measure them over time to account for seasonality, examine a large sample size, and use percentiles. Using averages instead will cloud your understanding because a small number of hanging requests (due to things like a CDN cache miss or a spotty connection) will pull the average toward the outliers.
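To make the point about averages concrete, here is a minimal Python sketch. The load times are made up for illustration (nine typical requests and one hanging request), and the `percentile` helper is a simple nearest-rank implementation written for this example, not something taken from a particular RUM tool.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical page load times in milliseconds: nine typical requests, one hanging request.
load_times_ms = [820, 850, 870, 900, 910, 940, 960, 990, 1020, 9500]

mean_ms = statistics.mean(load_times_ms)
p50_ms = percentile(load_times_ms, 50)
p90_ms = percentile(load_times_ms, 90)

print(f"mean: {mean_ms} ms")  # dragged toward the single outlier
print(f"p50:  {p50_ms} ms")   # what a typical visitor experienced
print(f"p90:  {p90_ms} ms")   # what most visitors stayed under
```

In this sample, the one hanging request pushes the mean to 1776 ms, even though nine out of ten visitors loaded the page in about a second or less. The p50 (910 ms) and p90 (1020 ms) describe that majority far better.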
Segment your visitors. Imagine a file loading in a browser: the main HTML document, a JavaScript bundle, or even an image. In this scenario, you measure the time to download (this is often a contributing factor to the metrics above). What influences how long that file takes to load? A lot of things, but the most important ones are connection speed and file size. We’ll talk about file size later on and focus on connection speed here. Connection speed depends on the network (Wi-Fi, 4G, 3G, etc.), as well as the bandwidth of the device and the location of the visitor. Together, these factors partially explain why browsing on a mobile device is slower than on your MacBook Pro at home. If you happen to work on a site with global visitor traffic, you’re likely to have visitors in parts of the world with slower connectivity, such as India, Africa, and Southeast Asia. What’s more, given how cheap data plans are nowadays, an internet user is increasingly likely to be on a mobile device. Finally, mobile devices usually have less CPU power, so executing JavaScript takes longer as well.
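As a rough illustration of segmenting, the sketch below groups a handful of RUM records by device type and country and reports the median for each segment. The records, field layout, and the `median_by_segment` helper are all hypothetical; a real RUM export will look different and contain far more samples.

```python
import statistics
from collections import defaultdict

# Hypothetical RUM records: (device type, country code, FCP in milliseconds).
rum_records = [
    ("desktop", "US", 900),
    ("desktop", "US", 1100),
    ("desktop", "IN", 1500),
    ("mobile", "US", 1800),
    ("mobile", "US", 2200),
    ("mobile", "IN", 3900),
    ("mobile", "IN", 4300),
]


def median_by_segment(records):
    """Group timings by (device, country) and return the median timing of each segment."""
    segments = defaultdict(list)
    for device, country, fcp_ms in records:
        segments[(device, country)].append(fcp_ms)
    return {segment: statistics.median(values) for segment, values in segments.items()}


medians = median_by_segment(rum_records)
for (device, country), median_ms in sorted(medians.items()):
    print(f"{device:8s} {country}  median FCP: {median_ms} ms")
```

A single site-wide number would hide the gap between these segments. Looking at them separately shows which group of visitors has the most to gain from an optimization.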
Use less code and split it up. Aside from connection speed, the strongest factor in a file’s load time is its size. Big files take longer to load than small files, and slow networks and limited bandwidth exacerbate this. Some files are JavaScript code that the browser also needs to execute, and more code takes longer to run. When it comes to A/B testing, we recommend reducing the amount of code with Performance Edge and Custom Snippets. It also helps to split your code up into smaller chunks so that only what’s necessary loads initially and everything else loads when you need it.
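A back-of-the-envelope calculation shows how file size and connection speed interact. The model below is a deliberate simplification (one round trip plus raw transfer time; it ignores things like TCP slow start, compression, and caching), and the file sizes, bandwidths, and round-trip times are assumptions chosen for illustration.

```python
def estimated_download_ms(size_kb, bandwidth_mbps, round_trip_ms):
    """Rough estimate: one network round trip plus the time to transfer the bytes."""
    # size_kb * 8 gives kilobits; dividing by megabits per second yields milliseconds.
    transfer_ms = size_kb * 8 / bandwidth_mbps
    return round_trip_ms + transfer_ms


# Hypothetical connections: (label, bandwidth in Mbps, round-trip time in ms).
connections = [("slow mobile", 1.6, 300), ("home broadband", 40, 20)]

# Hypothetical files: one large bundle versus a smaller initial chunk after splitting.
files_kb = {"full bundle": 400, "initial chunk": 100}

for label, bandwidth_mbps, round_trip_ms in connections:
    for name, size_kb in files_kb.items():
        ms = estimated_download_ms(size_kb, bandwidth_mbps, round_trip_ms)
        print(f"{label:15s} {name:14s} ~{ms:.0f} ms")
```

With these made-up numbers, trimming the initial download from 400 KB to 100 KB saves roughly 1,500 ms on the slow mobile connection but only about 60 ms on home broadband, which is why code size matters most for the visitors who are already having the slowest experience.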
Run proper tests. When you want to improve your site’s performance, keep two things in mind: a) not every change is a silver bullet, and b) you should measure the impact of the changes you’re making. Testing helps quantify any tradeoffs and enables you to communicate the impact of your work.
A performance test is a server-side A/B test whose hypothesis is that changing your site will reduce latency. The experiment design is simple: split your visitors in half, showing 50% the current version of the site and the other 50% a modified version. The modification could be anything from splitting up your vendor bundle to using a different frontend framework, hosting third-party assets yourself, or even just removing page elements. Make sure you’ve got your RUM instrumentation in place and the right analytics setup. Optimizely offers A/B testing SDKs in most backend languages, including Java, JavaScript (Node), and Python. And we’ve got a free feature flagging tool called Rollouts so that you can control your features and lay the foundation for experimentation.
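To illustrate the 50/50 split, here is a minimal sketch of deterministic bucketing on the server. It is not the Optimizely SDK or its bucketing algorithm; the `assign_variant` function, the experiment key, and the visitor IDs are hypothetical. The idea is simply that hashing a stable visitor ID sends the same visitor to the same version on every request.

```python
import hashlib
from collections import Counter


def assign_variant(visitor_id, experiment_key, variants=("control", "treatment")):
    """Deterministically map a visitor to a variant by hashing their ID with the experiment key."""
    digest = hashlib.sha256(f"{experiment_key}:{visitor_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# Hypothetical experiment: does splitting the vendor bundle reduce latency?
experiment_key = "split_vendor_bundle"
visitor_ids = [f"visitor-{n}" for n in range(10_000)]

counts = Counter(assign_variant(visitor_id, experiment_key) for visitor_id in visitor_ids)
print(counts)  # each variant should receive close to half of the visitors
```

Once each RUM record is tagged with the variant the visitor saw, you can compare the percentiles of your chosen metric between the two groups, segment by segment.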
This Performance White Paper covers many of the performance testing topics in this post. It expands on how traditional client-side A/B testing tools impact your site performance, and how Optimizely can help ensure that the value of your experiments is higher than the performance cost of delivering them.