Which witch is which — Availability, Reliability, Durability, Resilience, Responsiveness?
How do you tell how good your software is?
How do you tell how good your car is? Is an F1 racing car better than your modest, homely one? That depends on the purpose it’s being used for. Is a sedan better than an SUV? Is a digital dashboard better than an analog one? Again, that depends on the user’s taste and purpose. However, if you do want to compare cars, there are any number of attributes to consider: build quality, power, efficiency, looks, price, handling, after-sales service, and so on. Car makers are constantly improving these traits. Are there similar attributes that you must consciously improve when building software? Sure, read on…
Well architected, well implemented, and well run software
Software is everywhere. How good a software app is can be gauged through some quantifiable and some abstract parameters. Just as there are many aspects that define how good a car is, there are aspects that define the quality of software too. I briefly cover some of the important ones here: Availability, Reliability, Durability, Resilience, Responsiveness, Performance, Usability and Accessibility.
In many a technical discussion, I have seen people use these terms interchangeably, but they are quite distinct and must be treated as such when talking about software applications or data. This brief article describes the distinctions between them at a very high level.
Again, drawing an analogy between software and the more familiar car:
Your car is easily ‘Available’ if it is in a ready-to-go condition, located close enough that you can get to it quickly when needed. You would never have to wait to get going on it. A chauffeur who is ever ready to drive you around and keeps the car squeaky clean makes it a bit more Available :)
In software terms, availability is the proportion of time the application is up, that is, uptime divided by total time (uptime plus downtime). For example, if your website is down for 1 minute in an hour, it is unavailable for about 1.7% of the time (1/60). In other words, it has an availability of about 98.3%. H.A (High Availability) is a general term used for applications with extremely high availability values. On a cloud platform such as AWS or Azure, the expected availability for HA applications/services is at least 99.99% (“four nines”), which translates to a downtime of no more than about 52.6 minutes per year.
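The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this article, and a year is assumed to have 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(uptime: float, downtime: float) -> float:
    """Fraction of total time the system was up (both inputs in the same unit)."""
    return uptime / (uptime + downtime)


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime allowed per year at the given availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 1 minute down in an hour means 59 minutes up and 1 minute down.
    print(f"Availability: {availability(59, 1):.2%}")
    for target in (99.0, 99.9, 99.99, 99.999):
        print(f"{target}% allows {downtime_budget_minutes(target):.1f} min/year of downtime")
```

Each additional nine shrinks the yearly downtime budget by a factor of ten, which is why every extra nine is so much harder to achieve than the last.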
For your car to qualify as H.A (highly available), it would need to have multiple engines (maybe a gas and an electric hybrid?), additional batteries and tyres on each axle, to compensate for failure of anything critical! Alternatively, you would need to have multiple cars, possibly in multiple garages, to compensate for geographical disasters, and chauffeurs working in shifts :) You would also need a second car to drive alongside the one you’re in, so if something happens to the one you’re driving, you can hop into the second car while you’re still rolling without a stop. Big deal huh? But this is how most software is made H.A!
How is this achieved? Typically by following the ‘Design for Failure’ principle: do not strive to make your application infallible; instead, accept that failure is normal and design around it. The most common way to increase availability is to keep redundant copies of your application in geographically distributed locations. These copies may be offline (cold), ready to spin up at short notice (warm), or actively online (hot).
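As a rough sketch of designing around failure, the snippet below tries a list of redundant copies in order and returns the first successful response. The replica functions and the choice of `ConnectionError` as the "replica is down" signal are assumptions made for illustration; real deployments usually delegate this to a load balancer or DNS failover.

```python
from typing import Callable, Sequence


class AllReplicasDown(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:  # assumption: this error means the replica is down
            failures += 1
    raise AllReplicasDown(f"{failures} replica(s) failed")


# Hypothetical replicas in two regions; the primary is simulated as unreachable.
def primary_region(request: str) -> str:
    raise ConnectionError("primary region unreachable")


def standby_region(request: str) -> str:
    return f"standby handled {request!r}"


if __name__ == "__main__":
    print(call_with_failover([primary_region, standby_region], "GET /"))
```

The caller never sees the primary's failure, which is the software equivalent of hopping into the second car without stopping.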
Reliability is a quality of being predictable and consistent. If your car needs to be refueled after say 4 hours of driving, you can rely on it if you need to drive for 3 hours, but not 6 hours. In this case the Reliability of your car is 4 hours. If your car feels moody, that is, it handles well on some days and not so well on others, or maybe on some days it gives more mileage than normal, then it can be said to have low reliability.
A software application cannot afford to be inconsistent. You can measure a system’s reliability by how long you can depend on it to serve its purpose correctly without failing. The system is expected not to fail and not to serve faulty or incorrect data: either it works correctly, or it does not respond at all.
Reliability is measured in terms of how long a system can continue to be available for use and behave as expected (producing accurate data). One way to express this is in terms of MTBF (Mean Time Between Failures). For example, if your website is down for some time every Sunday at 11pm for weekly deployment or optimization activity, then its reliability is about a week (it can go a week without failure).
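A minimal way to compute MTBF is to divide the time the system was actually up by the number of failures in the observation window. The sketch below applies this to the weekly-deployment example; the 30-minute outage length is an assumed figure, since the example above only says "some time".

```python
def mtbf_hours(total_hours: float, downtime_hours: float, failures: int) -> float:
    """Mean Time Between Failures: operating (up) time divided by failure count."""
    if failures == 0:
        raise ValueError("MTBF is undefined when no failures were observed")
    return (total_hours - downtime_hours) / failures


if __name__ == "__main__":
    # Four weeks of observation with one assumed 30-minute outage every Sunday.
    weeks = 4
    total = weeks * 7 * 24        # 672 hours in the window
    downtime = weeks * 0.5        # 2 hours of outage in total
    print(f"MTBF: {mtbf_hours(total, downtime, weeks):.1f} hours")
```

The result is just under 168 hours, which matches the intuition that the site can go about a week between failures.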
Reliability is partially related to Availability. H.A systems typically can go several months between failures.
Durability of a car is well understood — how long you can continue to use it before the engine coughs and wheezes. While reliability is more concerned with temporary recoverable unavailability such as refuelling or changing a flat tyre, durability is more concerned with long term non-recoverable depreciation. Most modern cars have a durability well beyond 100,000 kms or miles, after which their resale value decreases due to expected mechanical troubles.
A car with easily replaceable spare parts is more durable than one whose parts are hard to replace. Regularly replacing the worn-out parts of a car makes it last longer. Similarly, in software, Durability describes how amenable an application is to being serviced so that it continues to meet requirements and stay relevant (Serviceability). Durable software remains trustworthy, accurate, and usable for a long time. Durability of software is an abstract concept and is not usually expressed numerically. However, Durability of data is precisely defined and measured: it is the probability that a file or DB record survives, without being lost or corrupted, over a given period. For example, if one out of a thousand files on a server gets corrupted in a year (despite backups), then the probability of losing a given file in that year is 0.1% (1/1000), and consequently its Durability is 99.9%.
Replacing one part in a thousand per year may be acceptable degradation for a car, but not for software. Typical cloud-based platforms are expected to offer a durability upwards of 99.9999999% (“nine nines”). For example, AWS S3 offers a Durability of 99.999999999% (“eleven nines”). Taken as an annual figure, this means that with a million stored files you would expect to lose or corrupt one file roughly once every 100,000 years!
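To see how a durability figure translates into expected losses, here is a small sketch. It assumes the durability is an annual figure and that each file is lost independently of the others; both are simplifications, and the function names are invented for this article.

```python
def annual_loss_probability(nines: int) -> float:
    """Chance of losing one given file in a year, e.g. 3 nines (99.9%) gives 0.001."""
    return 10.0 ** -nines


def expected_losses_per_year(nines: int, stored_files: int) -> float:
    """Average number of files lost per year across the whole store."""
    return annual_loss_probability(nines) * stored_files


def years_per_lost_file(nines: int, stored_files: int) -> float:
    """Average number of years between losing one file."""
    return 1.0 / expected_losses_per_year(nines, stored_files)


if __name__ == "__main__":
    # Three nines with a thousand files: about one lost file per year.
    print(expected_losses_per_year(3, 1_000))
    # Eleven nines with a million files: about one lost file per 100,000 years.
    print(round(years_per_lost_file(11, 1_000_000)))
```

Note how the number of stored files matters as much as the number of nines: ten times more files means a loss is expected ten times sooner.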
Resiliency is most commonly confused with reliability. Resiliency does contribute to a system being reliable, but it is still an independent measure. Your car is ‘Resilient’ if it can work well in varied conditions: summer, winter, rain, smooth roads, muddy tracks, high or low altitude, ethanol or gasoline fuel, a worn-out spark plug, a blown headlamp, and so on. Similarly, a software application is resilient if it can work around limitations, cope with adverse situations, and still deliver the expected functionality.
A typical example is when a mobile app allows you to enter information even when your phone is offline, and quietly syncs up with the server later when the network is available. Another example is when a system finds its database is down and is unable to fetch data requested by a user, but queues up the requests and responds later through email. When your phone doesn’t have the latest weather info, instead of showing nothing it still shows the last updated weather along with a timestamp. A resilient system can function with the same or reduced functionality when things go wrong, instead of hanging or showing a blank screen.
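The weather example can be sketched as a small wrapper that remembers the last successful result and serves it, marked as stale, when a fresh fetch fails. The class and field names are hypothetical, and treating `ConnectionError` as "the network is unavailable" is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reading:
    value: str
    fetched_at: float  # Unix timestamp of when the value was actually fetched
    stale: bool = False


class CachedFetcher:
    """Serve fresh data when possible, otherwise fall back to the last known value."""

    def __init__(self, fetch: Callable[[], str]):
        self._fetch = fetch
        self._last: Optional[Reading] = None

    def get(self) -> Optional[Reading]:
        try:
            self._last = Reading(self._fetch(), time.time())
            return self._last
        except ConnectionError:  # assumption: raised when the network is down
            if self._last is None:
                return None  # nothing cached yet; the caller can show a placeholder
            # Degrade gracefully: same value, original timestamp, flagged as stale.
            return Reading(self._last.value, self._last.fetched_at, stale=True)
```

A UI built on this can display the stale value together with its `fetched_at` time, so the user gets reduced functionality rather than a blank screen.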
If your car swerved a few seconds after you turned the steering wheel, or responded to the accelerator and brake pedals only after a few seconds, would that be acceptable? The responsiveness expected from a car’s controls is on the order of milliseconds, not seconds. Its performance, on the other hand, is measured by many parameters, such as how much load it can carry, how fast it can go from 0 to 60, how fuel-efficient it is, and so on.
In software terms, responsiveness is the ability of the system to complete the requested task in an acceptable time. For example, the expected responsiveness of a webpage is typically under 3 seconds for lean pages and 5 seconds for rich ones, before the user starts to lose patience and navigates away. It is important to distinguish two similar-sounding terms here. Responsiveness is a measure of how quickly your app responds to user actions, while Responsive Design is a completely different concept: the ability of a website to present itself optimally on devices of differing form factors, such as desktops, tablets and mobiles.
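One simple way to keep an eye on responsiveness is to time a task and compare the result with a budget. The sketch below uses the 3-second figure mentioned above as the default budget; the helper names are made up, and a real setup would measure from the user's side (for example in the browser) rather than inside the server code.

```python
import time
from typing import Callable, Tuple


def timed(task: Callable[[], object]) -> Tuple[object, float]:
    """Run the task and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = task()
    return result, time.perf_counter() - start


def within_budget(elapsed_s: float, budget_s: float = 3.0) -> bool:
    """True if the task finished within the responsiveness budget."""
    return elapsed_s <= budget_s


if __name__ == "__main__":
    result, elapsed = timed(lambda: sum(range(1_000_000)))
    print(f"took {elapsed:.3f}s, within budget: {within_budget(elapsed)}")
```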
You connect with your car through the dashboard. Just as a good doctor looks into a patient’s eyes to gauge their health, you look at the gauges and feel confident about your car. Do you prefer a minimalist dashboard in your car, or an advanced one that makes you feel like a pilot? Do you prefer the old-school analog dials or the digital ones? It’s a personal choice, but it builds a level of comfort between you and your car. A cluttered dashboard that shows irrelevant info or is difficult to understand is not useful and takes the pleasure out of the drive.
Likewise with software: Usability is the ease with which a piece of software can be used by its users. It is an unquantifiable measure of the happiness or satisfaction your users experience when interacting with your application. Usability is best evaluated by putting yourself in the place of the end users of your system and critically reviewing it. The UI is expected to be uncluttered, easy to navigate, intuitively organized, visually appealing, usable with both mouse and touch screen devices, and consistent in its branding; most importantly, it should help users accomplish useful work comfortably. Responsiveness helps achieve better usability.
Accessibility is a specific form of usability aimed at improving the user experience of differently abled persons interacting with your software. This includes considerations such as using clearly legible fonts, sizes and colors, providing text alternatives to media, making all features available through the keyboard, ensuring compatibility with assistive technologies, and many more. The W3C Web Accessibility Initiative (WAI) defines standards for accessibility conformance levels as well as guidelines for implementing them, and many websites built nowadays conform to some level of these standards. If you use a standard UI library such as Material UI to build your UI components, you may already be partially compliant with accessibility standards.
~~~~~~ _()_ ~~~~~~
That’s it for now! Hope this drive through some related but distinct lingo was useful, and helps you build more awesome software!
Originally published by Parag Desai at https://randomcoding.blogspot.com.
Also published by Parag Desai at https://content.techgig.com/how-do-you-tell-how-good-your-software-is/articleshow/84749023.cms