How small amounts of randomness across multiple components affect end-user latency

Have you ever seen unexplained latency spikes on your monitoring graphs, especially for tail percentiles? I have seen them more often than not. Unless a team or organization deliberately focuses on latency predictability, such spikes are the norm.

For example, let’s say a web service needs to render a personalized web page. As the consumer of this information is a human being, providing consistently low latency in 90%–95% of cases is usually perceived as a good user experience. To collect all the necessary information for that page, it may take tens or even hundreds of underlying sub-requests. …
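To make the fan-out effect concrete, here is a minimal sketch in Rust. It rests on assumptions that are not stated in the post: sub-request latencies are treated as independent, and every sub-request has the same chance of landing in its own tail. The function name `prob_any_slow` is hypothetical and used only for illustration.

```rust
/// Probability that at least one of `n` independent sub-requests is slower
/// than its own `quantile` latency (e.g. `quantile = 0.99` for p99).
/// Assumption: sub-requests are independent and identically distributed.
fn prob_any_slow(quantile: f64, n: u32) -> f64 {
    // All n sub-requests stay under the quantile with probability quantile^n,
    // so at least one exceeds it with the complementary probability.
    1.0 - quantile.powi(n as i32)
}

fn main() {
    for n in [1u32, 10, 100] {
        println!(
            "{:>3} sub-requests: {:.1}% of page loads wait on a p99-slow call",
            n,
            100.0 * prob_any_slow(0.99, n)
        );
    }
}
```

Under these assumptions, a page that needs 100 sub-requests waits on at least one p99-slow call in roughly 63% of loads, which is why the tail latency of a dependency can shape the typical latency of the page.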

How predictable services can be built on top of not-so-predictable dependencies

This post is related to Managing tail latency, which explores variance and tail latency of network services.

Many random factors affect a service’s tail latency (i.e., latency in the worst 1%–5% of cases). Even for in-memory execution, we may see delays when the operating system prioritizes a different thread or the hypervisor pauses the virtual machine. On over-subscribed hardware (i.e., with shared cores), waiting for available CPU cycles may take tens of microseconds or even milliseconds. This may get even worse if an app is running inside a container or a sandbox. Solutions relying on caching or…

Network service performance. Part II.

Comparing how different languages handle network I/O and checking if Rust maintains its high-performance promise

This post is a continuation of Measuring network service performance.

When my computer doesn’t have an Internet connection, I find that there is not much I can do with it. Indeed, we mostly use our laptops and smartphones to access information stored or generated somewhere else. It’s even hard to imagine the utility of non-user-facing apps without network communication. While the proportion of I/O operations vs. data processing may vary, the contribution of such operations to the service’s latency might well be tangible.

There are many programming languages used for implementing backend services. Because of this, people have a natural interest…

Network service performance. Part I.

A brief overview of why it’s important but often neglected, how to improve it, and an example of how to measure it.

I enjoy improving application performance. After all, that’s why we need computers in the first place: to do stuff fast. If you think about computers for a moment, it feels like magic. On the lowest level, it is basic arithmetical and logical operations, such as adding and comparing binary numbers. However, performing myriads of these operations quickly not only allows us to play video games, watch endless videos, and navigate the entire array of human knowledge and culture, but also aids us in discovering secrets of the Universe and life. That’s why speeding up applications is a big…

How different forms of handling I/O affect the performance, availability, and fault-tolerance of network applications.

Dealing with distributed systems is quite a difficult job. There could be numerous components developed by different teams over long periods. Human mistakes, such as shipping bugs or incorrectly configured parameters, happen routinely in large systems, no matter how excellent the engineering practices are. Hardware faults also regularly occur in large fleets. On top of that, unpredictable workloads and permutations of all possible states and conditions make it virtually impossible to foresee what might go wrong.

That’s why it’s essential to limit the blast radius to avoid cascading failures and amplified outages. …

A step-by-step guide on how to create an async I/O app in Rust.


This post is for anyone interested in writing performant and safe applications in Rust quickly. It walks the reader through designing and implementing an HTTP Tunnel and through basic, language-agnostic principles of creating robust, scalable, observable, and evolvable network applications.

Rust: performance, reliability, productivity. Pick three.

About a year ago, I started to learn Rust. The first two weeks were quite painful. Nothing compiled, I didn’t know how to do basic operations, and I couldn’t make a simple program run. But step by step, I started to understand what the compiler wanted. Even more, I realized that it forces the right thinking and correct behaviour.

Yes, sometimes, you…

Eugene Retunsky

I enjoy learning new technologies and working on ambiguous problems. My main focus is the security, reliability and performance of large distributed systems.
