Background

A multi-national insurance company required a single insurance application to be deployed in each office. There were two candidate systems. The “Best Fit” application had never been deployed with such a large user community. The client needed evidence that the application would scale up, to support the decision-making process.

The stakes in a project like this are huge: a mistake can cost hundreds of millions, and there is significant potential for office politics from supporters of the competing systems.

Three key factors made this project a success.

  1. Appointment of an independent overseer from a global consultancy to provide oversight of the testing process, data, results, and conclusions;
  2. An experienced and motivated team, each of whom knew what was expected of them, and who collaborated and shared ideas;
  3. Sign-off by the key stakeholders at each stage; each stakeholder had 4 working days to provide feedback.

The terms of reference called for thousands of users conducting a real-world simulation. What constitutes a real-world simulation? It is not just thousands of users performing simultaneous transactions; it needs to be as close to the real world as possible: different offices logging on in different timezones, users conducting transactions as a real person would, and going for lunch. The success criteria were that the required response time is not exceeded, that transactions succeed, and that the cost per user and per transaction is established.
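As a rough illustration (in Python, with invented office names, hours, and user counts), the sketch below models the shape of that load: how many virtual users are active at each UTC hour, given offices in different timezones, standard working hours, and a lunch break.

```python
# Hypothetical offices: UTC offset in hours and number of users.
# These figures are illustrative, not the project's real numbers.
OFFICES = {
    "London":    {"utc_offset": 0,  "users": 1200},
    "New York":  {"utc_offset": -5, "users": 900},
    "Singapore": {"utc_offset": 8,  "users": 600},
}

WORK_START, LUNCH_START, LUNCH_END, WORK_END = 9, 12, 13, 17  # local hours

def active_users(utc_hour: int) -> int:
    """Virtual users expected to be generating load at a given UTC hour."""
    total = 0
    for office in OFFICES.values():
        local = (utc_hour + office["utc_offset"]) % 24
        working = WORK_START <= local < WORK_END
        at_lunch = LUNCH_START <= local < LUNCH_END
        if working and not at_lunch:
            total += office["users"]
    return total

if __name__ == "__main__":
    for hour in range(24):
        print(f"{hour:02d}:00 UTC -> {active_users(hour):5d} active users")
```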

The hardware specification is limited only by what can be procured for the required project time period with the available budget. The terms of reference did not require modelling the latency (delay) caused by the distance of the user from the processing server. That latency could have been reproduced by locating the remote terminal emulators (RTEs) in data centres local to the offices, and the system under test (SUT) in the location that best balances latency across all offices. The constraint is the need to comply with local laws on data privacy and geopolitical risk.

Agree the terms of reference

All the stakeholders (including the project manager) should be involved in agreeing the terms of reference: the business representative (the client!), operations, IT, the software supplier, and an independent overseer. The independent overseer’s role is to stand behind the final report, lend it gravitas, and liaise with the client during the project lifecycle.

Ensure that the goals and success criteria are clear, that scenarios are well defined, and that contingency scenarios are included.
In terms of the goal, what exactly is it? Here: the application scales to n users running a real-world transaction mix, with those users spread across offices as specified, for a period of 24 hours; 90% of responses are received within 3 seconds and no response takes longer than 5 seconds; 99.9% of transactions complete successfully; and a co-located database remains within 1 transaction of the primary.
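A minimal sketch of how a run can be checked against those criteria; the thresholds match the goal above, but the function and field names are my own.

```python
def meets_goal(response_times_s, successes,
               p90_limit=3.0, max_limit=5.0, success_rate_limit=0.999):
    """Check a run against the agreed goal: 90% of responses within 3 s,
    none above 5 s, and at least 99.9% of transactions successful."""
    ordered = sorted(response_times_s)
    p90 = ordered[int(0.9 * len(ordered)) - 1]          # 90th percentile
    success_rate = sum(successes) / len(successes)
    return (p90 <= p90_limit
            and ordered[-1] <= max_limit
            and success_rate >= success_rate_limit)

# Example: 1,000 transactions, one slow response and one failure.
times = [1.2] * 998 + [4.8, 2.9]
ok = [True] * 999 + [False]
print(meets_goal(times, ok))   # True: all three limits are met
```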

It is imperative that terms are well defined and everyone is clear as to their exact meaning.

The Delphi technique is a very successful tool in achieving consensus.

All stakeholders must agree to, and sign off, the terms of reference.

Document every decision, configuration change, and test result as it happens, and get sign-off from the appropriate parties for any deviation from the terms of reference.

The goal is to produce results which are not open to interpretation or question.

Programme of Works

The project manager was independent of the overseer.

It’s imperative that test scenarios are repeatable, so there needs to be a task to create a baseline environment; that task might need to be repeated several times. The time taken to reset back to the baseline environment also needs to be considered.

It took time to perfect the environment and configuration. Configuration tuning was allowed for within the terms of reference. The tuning task was critical to scalability and was the second most time-consuming task, after data generation and validation.

The programme needs to be agreed to and signed off by all parties, including the client.

Scenario & Transaction Collection

The simulation needs to stand up to scrutiny, so it needs to be as realistic as possible: the number and types of transactions performed per user per day, the times of breaks and lunch, and the times that users leave for the day. Much of this information was collated from logs in existing applications, along with analysis of the life cycle of a transaction: the user logs in, waits, performs part A of the transaction, waits, performs part B, and so on.
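One way to encode such a life cycle for the load generator is sketched below; the step names and think times are illustrative, not the project's actual figures.

```python
import random
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    mean_think_s: float   # average "user thinking" pause before this step

# Illustrative life cycle of a single quotation transaction,
# derived (in the real project) from existing application logs.
QUOTE_LIFECYCLE = [
    Step("log_in",            0.0),
    Step("open_policy",       8.0),
    Step("enter_details_a",  25.0),
    Step("enter_details_b",  40.0),
    Step("submit_quote",     10.0),
]

def think_time(step: Step) -> float:
    """Randomise the pause so every virtual user is not in lock-step."""
    return random.expovariate(1.0 / step.mean_think_s) if step.mean_think_s else 0.0

if __name__ == "__main__":
    for step in QUOTE_LIFECYCLE:
        print(f"{step.name:17s} pause {think_time(step):6.1f} s, then execute")
```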

Scenarios were then signed off by the client, the supplier of the application, and the independent overseer.

It’s highly unlikely that clients have provided their data for this purpose, so the data was anonymised or generated. Again, sign-off is required from the stakeholders that the data is representative. The data population and scenario step also provided a good idea of the capacity that would be required of the hardware. Anonymising data, and generating realistic data, is extremely challenging.
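A minimal sketch of one anonymisation approach, deterministic hashing, so that the same real value always maps to the same token and joins between records still work; the field names and salt are invented.

```python
import hashlib

def pseudonymise(value: str, salt: str = "per-project-secret") -> str:
    """Replace a personal value with a stable, irreversible token.
    The same input always yields the same token, so joins between
    tables on (for example) customer number still line up."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]

record = {"customer_no": "C-104233", "surname": "Smith", "postcode": "SW1A 1AA"}
anonymised = {field: pseudonymise(value) for field, value in record.items()}
print(anonymised)
```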

Write and test the scenarios

The scenarios are defined in the terms of reference. Sometimes applications need to scale down as well as up, and it may be that multiple data storage locations will be required in the future, so contingency scenarios were included in the terms of reference. Some countries have legislated against the export of personal data; similar legislation may become a requirement in other countries or regions.

Testing the scenarios is extremely time-consuming, made more difficult by the need to use anonymised data, which can produce unexpected results. An inordinate amount of time was spent fixing data.

Each scenario needs to be tested. Scenario scripts are usually based on send/expect, but the unexpected needs to be accounted for too.
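A bare-bones illustration of the send/expect idea, with the unexpected classified and logged rather than ignored; the transport and expected strings are placeholders.

```python
def run_step(send, expect, transport, timeout_s=5.0):
    """Send a request, and classify the outcome as pass, fail, or unexpected."""
    try:
        response = transport(send, timeout=timeout_s)   # placeholder transport callable
    except TimeoutError:
        return "fail: timeout"
    if expect in response:
        return "pass"
    # The unexpected: keep the raw response for later analysis rather than guessing.
    return f"unexpected: {response[:80]!r}"

# Example with a fake transport that always returns the same page.
fake_transport = lambda msg, timeout: "HTTP/1.1 200 OK ... Quote reference QX-000123"
print(run_step("POST /quote", "Quote reference", fake_transport))   # pass
print(run_step("POST /quote", "Renewal date", fake_transport))      # unexpected
```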

Design the required infrastructure and computing environment

The design process was methodical; a combination of historical analysis and forecasting.

Network traffic was modelled and used to size the firewalls, VPN concentrators, routers, and load balancers. The database was sized and roughly tuned. The servers were sized based on historical transaction volumes. You might not have this luxury.

There are plenty of capacity planning tools on the market, but no single tool covers networks, servers, and databases.

If the application is new and out of the box, run it with a small number of users to gain some idea of the resource consumption, and use that data for forecasting.
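A sketch of that forecasting step: fit a straight line to CPU usage measured at a few small user counts and extrapolate. Real consumption is rarely perfectly linear, so treat the output as a first estimate only; all figures here are invented.

```python
# Measured (user count, CPU %) pairs from small trial runs -- invented figures.
samples = [(50, 9.0), (100, 16.5), (200, 31.0)]

# Least-squares fit of cpu = a * users + b, done by hand to stay dependency-free.
n = len(samples)
sx = sum(u for u, _ in samples)
sy = sum(c for _, c in samples)
sxx = sum(u * u for u, _ in samples)
sxy = sum(u * c for u, c in samples)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

# Values above 100% suggest more (or bigger) servers will be needed.
for users in (1_000, 2_000, 5_000):
    print(f"{users:5d} users -> roughly {a * users + b:6.1f}% of one server's CPU")
```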

Source & configure the hardware

With the cloud it’s relatively simple and cost-effective to use load balancers and elastic computing, but getting it wrong will be time-consuming, and time is money, especially given the specialists you will need here (project manager, overseer, and server, network, database, and application specialists). I’d suggest using cloud providers for the test even if that’s not going to be the final environment.

Running the Tests

It’s tempting to leave the tests to run overnight in unattended mode and analyse the results in the morning. First, though, ensure the environment and configuration are stable.

Don’t forget to keep the raw logs when you reset to the baseline. It’s important data.

Profile application performance, but only if its use is included in the terms of reference (and it should be). The profiling tool is typically language-dependent.

Analysing the results

Analysing the results can be complex. If the tests don’t meet the goals, why not? Is it a fault of the configuration, hardware, network, database, or disk? Did the load balancer spread the load evenly? Would it help to add a server, offload the database to an external database provider, change the load balancing, or retune the environment?
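One of the simpler checks is sketched below: did the load balancer spread requests evenly across the application servers? The server names and counts are invented.

```python
# Requests served per application server over the run -- invented counts.
per_server = {"app1": 412_337, "app2": 405_910, "app3": 298_004, "app4": 409_552}

mean = sum(per_server.values()) / len(per_server)
worst = max(abs(count - mean) / mean for count in per_server.values())

print(f"mean requests per server: {mean:,.0f}")
for name, count in sorted(per_server.items()):
    print(f"  {name}: {count:,} ({(count - mean) / mean:+.1%} from mean)")
if worst > 0.10:   # more than 10% from the mean suggests uneven balancing
    print("WARNING: load balancing looks uneven; investigate before retuning elsewhere")
```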

Ensure all stakeholders have the opportunity to review and comment on the results.

The final report

Write a draft report detailing the process, the decisions made during the project, and how the raw results were analysed and interpreted. In writing the report, have a copy of the terms of reference (ToR) to hand, and ensure the ToR have been strictly adhered to. Include areas for improvement, decisions made, configuration changes, scenarios, and raw results in the appendices.

An important point to include in the final report is the cost per user and per transaction. This calculation isn’t always trivial: either it assumes the configuration is running at its maximum performance capacity, or the calculation becomes subjective.
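The arithmetic itself is simple once you have decided what the tested configuration can sustain; the subjective part is choosing those inputs. A sketch with invented figures:

```python
# Invented inputs: monthly cost of the environment and its sustained capacity.
monthly_infrastructure_cost = 180_000.00   # hardware, licences, support
sustained_users = 12_000                   # users the tested configuration supports
transactions_per_month = 9_500_000         # at the tested transaction mix

cost_per_user = monthly_infrastructure_cost / sustained_users
cost_per_transaction = monthly_infrastructure_cost / transactions_per_month

print(f"cost per user per month: {cost_per_user:8.2f}")
print(f"cost per transaction:    {cost_per_transaction:8.4f}")
```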

Give everyone the opportunity to comment on, and sign, the draft final report. Provided you have involved all the stakeholders throughout the project, this should be a relatively straightforward exercise. Then give the draft report and the feedback to the independent overseer for comment, distribution, and presentation.

In our case it wasn’t all plain sailing. The majority of our challenges were in getting the scenarios to run with the data we had generated. After tuning, the application scaled, and could have supported even more users than the test scenarios required.