3 Steps to Cloud SLA: #1 Availability
One of the primary reasons why businesses migrate from traditional IT infrastructure to cloud infrastructure is availability. In some cases, such as Office 365 and Google Apps, this availability is perceived as “continuous availability.” This is where the service level agreement begins to take shape: the vendor offers a certain level of availability to the customer and guarantees it contractually. In almost all cases, this contract is the Service Level Agreement (SLA).
In addition to the agreed service level (the availability), the SLA defines what happens when the cloud vendor cannot provide that availability. In the agreement, this subject is covered under the compensation section. The agreement also defines the circumstances under which the cloud vendor is not liable for a disruption of service. That section details the cases where the unavailability of the service cannot be attributed to the cloud vendor and therefore will not be compensated.
The availability of the service defines two things: what the service is and what its “reachability” will be. The service is defined differently in Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) scenarios because the service offered differs. Broadly, the services offered and the availability defined for them are as follows:
Infrastructure-as-a-Service: The cloud vendor offers the datacenter infrastructure as the service. This is usually defined as the “fabric” and covers the processing, memory, storage, networking, virtual stack, the load balancers (if any), etc. The cloud vendor is responsible for keeping the fabric up and running. The “unavailability” defined in this case is the time it takes to bring a server (or instance) back online. Depending on the need and the service offering, the availability can be measured for single servers or can be pooled to cover all the servers in the SLA.
Platform-as-a-Service: The cloud vendor offers the functionality of the platform as the service. The availability covered is the “usability” or “reachability” of the platform to the customer for process execution, external connections and the correct and timely execution of commands. What is measured can range from the successful execution of commands to their execution time.
Software-as-a-Service: The service offered by the vendor is the availability of the application itself and the data. If you have an SLA for a 99% availability of an e-mail service, it means that 99% of the time you can access the website to check your e-mail (the application) and your e-mail (the data). SaaS is the scenario where the end users feel the disruption of the service most.
Once the service offered or requested is clearly defined, the next step is to define the availability. The availability of the service is almost always stated somewhere between 99% and 99.99% (there are vendors who offer 100% availability, but I see this as a misleading marketing effort: I cannot believe in 100% availability when giants like Microsoft, Google and Amazon cannot promise and deliver such uptime). In most cases the availability is measured on a monthly basis, starting on the first day of the month at 00:01 and ending on the last day at 23:59. These availability rates directly affect the pricing: the more aggressive the availability target, the higher the price. Prices rise far faster than the availability figure itself: going from 99% to 99.99% uptime raises the price by 30%-50%, not by 0.99%.
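As a rough illustration of the monthly measurement, here is a minimal Python sketch that turns a list of outage durations into a measured availability figure for one calendar month. The function name monthly_availability and the outage values are hypothetical, and the sketch assumes every minute of every outage counts against the vendor; a real SLA will define its own measurement window and exclusions.

```python
import calendar


def monthly_availability(year, month, outage_minutes):
    """Return the measured availability, in percent, for one calendar month.

    Assumption for illustration: all listed outages count as downtime.
    Real SLAs usually exclude some events (e.g. planned maintenance).
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    downtime = sum(outage_minutes)
    if downtime < 0 or downtime > total_minutes:
        raise ValueError("downtime must be between 0 and the length of the month")
    return 100.0 * (total_minutes - downtime) / total_minutes


# Hypothetical example: two incidents of 30 and 15 minutes in June 2023.
measured = monthly_availability(2023, 6, [30, 15])
meets_999 = measured >= 99.9  # compare against a 99.9% commitment

print(f"Measured availability: {measured:.4f}%")
print(f"Meets a 99.9% target: {meets_999}")
```

In a 30-day month a 99.9% target leaves room for 43.2 minutes of downtime, so the 45 minutes in this made-up example would fall just short of it.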
The most important thing is to match the business requirement with the availability and the price. An uptime of 99% means that the service will be unavailable for 1% of the time. Over a whole year, this is 3.65 days (365 x 1%). For 99.9% availability, the service will be unavailable for 0.365 days, which is 8.76 hours (this is the availability Microsoft offers with its Office 365 service). If you go further to an availability of 99.99%, the service will be unavailable for 0.0365 days, or 0.876 hours, or 52.56 minutes. Do the math and compare it with your business: if you are a small or medium business, would 9 hours of downtime over the whole year hurt you? Or would it be worth the additional payment to limit the downtime to just 1 hour for the entire year? Matching the business requirements and the price will be the basis of your SLA.
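The downtime arithmetic above is easy to reproduce for any availability level. The following is a small Python sketch; the function name allowed_downtime_hours is hypothetical, and a 365-day year is assumed, as in the figures above.

```python
def allowed_downtime_hours(availability_percent, period_hours=365 * 24):
    """Return the downtime (in hours) that a given availability allows.

    period_hours defaults to a 365-day year; pass 30 * 24 for a
    30-day month, for example.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return period_hours * (100 - availability_percent) / 100


# Yearly downtime budget for the three levels discussed in the text.
for level in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(level)
    print(f"{level}% uptime: {hours / 24:.4f} days, "
          f"{hours:.3f} hours, {hours * 60:.2f} minutes")
```

Passing a different period_hours gives the budget for the monthly window in which most SLAs are actually measured, which is the number to compare against your own tolerance for downtime.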
Now that the service you request and its availability are clearly defined, you need to focus on the exceptions and the vendor’s limitations, which I will discuss in my next article.