Scaling into infinity… part II
In the first part of this series, I started looking at the key components of performance when building and scaling large cloud estates. Now, I’d like to delve a bit further into the associated costs…
The Cost of Performance Testing in a Digital Transformation Environment – Human vs. Machine:
Optimising cost raises some interesting and long-standing debates, chief amongst them the cost of human resources versus the cost of computing resources. Why spend time optimising a service when the person (or people) needed to do that optimisation costs more than simply buying additional compute? However:
- Without well-designed software, there will be hard limits to your scalability
- More resources will not, on their own, get you to enterprise-grade non-functional requirements
The above highlights the cost structure for one of Resillion’s internal Azure environments, but the same logic applies equally to GCP, AWS, or any container-based solution. Clear traceability of the cost structure makes it significantly easier to identify where performance improvement effort should be focused. This, in turn, accelerates performance capability, reduces cost and supports the business case to management for introducing new products and bolt-ons that further improve system and operational performance.
Everyone’s favourite IT companion: maths!
It’s important to understand diminishing returns and that every computer resource you add to a solution comes with an overhead cost.
This principle matters because of diminishing returns on system resources: performance will increase in line with greater resource allocation (more CPUs), but only to a point. The more processors you have, the harder it becomes to keep them all fully utilised; beyond that point you waste CPU capacity, and system response time increases significantly due to I/O (input/output) bottlenecking. The CPUs’ operational capacity outstrips the I/O rates, i.e. the rates at which read and write operations can take place.
Two examples of this are shown below:
The X-axis represents transaction response time and the Y-axis CPU utilisation. As shown, once roughly 95% utilisation is reached, response time increases exponentially, and adding further CPUs only exacerbates the performance issue.
This is Amdahl’s law, and it can be expressed in multiple ways depending on the objective. If executing at a fixed workload, the expression is as follows:
- Slatency is the theoretical speedup of the end-to-end task execution
- S is the speedup of the part of the task that benefits from the additional system resources
- P is the proportion of the original task duration taken up by the part that benefits from the improvement
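Putting those definitions together, the fixed-workload form of Amdahl’s law can be written as:

```latex
S_{\text{latency}}(S) = \frac{1}{(1 - P) + \dfrac{P}{S}}
```

The serial portion (1 − P) is untouched by extra resources, which is what produces the diminishing returns discussed above.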
The purpose of this formula is to identify which part of the system would benefit most from processor parallelism. When we execute a scenario, we capture the transaction time and the number of cores used to achieve it. We then increase the number of cores and measure the change.
We see that an increase in CPU cores does not dramatically reduce the response time beyond a certain point. With this in mind, we can determine the optimal number of cores. We see a big gain between 1, 2 and 3 cores, after which the gain from each additional core is smaller and less significant.
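As a sketch of that shrinking gain, the snippet below applies Amdahl’s law to a workload assumed to be 90% parallelisable (an illustrative figure, not a measured one):

```python
def amdahl_speedup(p: float, s: int) -> float:
    """Theoretical end-to-end speedup for a workload where a proportion p
    of the original run time is sped up by a factor s (here, s is taken
    as the number of cores working on the parallel part)."""
    return 1.0 / ((1.0 - p) + p / s)

if __name__ == "__main__":
    p = 0.9  # assumed parallel fraction, for illustration only
    for cores in (1, 2, 3, 4, 8, 16):
        print(f"{cores:2d} cores -> {amdahl_speedup(p, cores):.2f}x speedup")
```

With these assumed numbers, doubling from 1 to 2 cores nearly doubles throughput, while doubling from 8 to 16 cores adds comparatively little, which is the pattern described above.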
We can also determine that the parallel part of the operation can be improved by increasing the number of CPUs, whereas the non-parallel part can only be improved through code optimisation.
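That ceiling falls out of the same formula: as the core count grows without bound, the speedup converges to 1 / (1 − P), so with an assumed 90% parallel fraction no amount of extra CPUs can deliver more than a 10x improvement:

```python
def max_speedup(p: float) -> float:
    """Asymptotic speedup limit of Amdahl's law as the core count
    tends to infinity: the serial fraction (1 - p) dominates."""
    return 1.0 / (1.0 - p)

if __name__ == "__main__":
    for p in (0.50, 0.90, 0.95, 0.99):
        print(f"parallel fraction {p:.0%} -> ceiling {max_speedup(p):.0f}x")
```

This is why code optimisation of the serial portion is the only route past the ceiling: raising P from 0.90 to 0.95 doubles the theoretical limit, something no quantity of added hardware can achieve.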
… Performance is an exercise in long-term planning: it takes care and time to implement, which may not always be economic.