Managing the uptime of business critical Internet-based systems

By Derick Swart on 29 July 2013

The development and use of Internet-based applications have become prevalent due to the major improvements in reliability and performance that so-called “cloud computing” offers. Such applications are often referred to as “software as a service” or “SaaS” applications, because the end user is presented with a secure service that is accessed remotely (typically via an Internet browser), rather than with a physical installation at the end user’s site. Given the excellent uptime and performance that users typically enjoy, it is easy to assume that SaaS services will never leave one in the lurch. There are, however, a number of risks that arise from the use of SaaS. In this article we take an introductory look at two of these issues, namely managing service levels and business continuity.

The advantages of cloud computing are generally obtained by hosting the application across multiple physical data centres in a manner that is seamless to the end user, an approach made possible by virtualisation. These data centres are typically located in different geographical locations for increased business continuity and disaster recovery, from where they continuously mirror each other.

Without getting into too much technical detail, load balancers for instance actively spread the load over multiple physical servers for increased performance.  Typically hosting providers also make provision for computing resources to be allocated dynamically, depending on the load required at any particular time.  Resources are often billed on a time basis given the dynamic nature of their use.

Managing service levels

Agreeing on service levels provides a way to manage the expectations of the parties to an agreement as to the level of service that they can expect from each other. These service level commitments can include any measurable performance obligation, such as:
  • Uptime – the time that a system is available over a particular period;
  • Responsiveness – how long certain operations take to perform;
  • Response time – how quickly the service provider responds to issues;
  • Resolve time – how long it takes for the service provider to resolve an issue; and
  • Scheduled downtime – the amount of time the service is not available for scheduled maintenance.
In the absence of well-defined, measurable service level commitments, agreements often do not address the expectations of the parties adequately. Lawyerly words like “reasonable”, “best endeavours” and “without undue delay” are all very much open to interpretation. Building these value judgements into an agreement is perfectly legal, but defers their interpretation until later, often to the steps of the courthouse and beyond.
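
To make the idea of a measurable uptime commitment concrete, the short Python sketch below shows one way the figure could be calculated. The function names, the 30-day measurement period and the choice to leave scheduled downtime out of the measured period are our own assumptions for illustration; the agreement itself should define each of these.

```python
def uptime_percentage(period_minutes, unscheduled_down_minutes,
                      scheduled_down_minutes=0.0):
    """Return uptime as a percentage of the measured period.

    Assumption: scheduled maintenance is excluded from the measured
    period, so only unscheduled downtime counts against the provider.
    """
    if unscheduled_down_minutes < 0 or scheduled_down_minutes < 0:
        raise ValueError("downtime cannot be negative")
    measured = period_minutes - scheduled_down_minutes
    if measured <= 0:
        raise ValueError("measured period must be positive")
    if unscheduled_down_minutes > measured:
        raise ValueError("downtime exceeds the measured period")
    return 100.0 * (measured - unscheduled_down_minutes) / measured


def allowed_downtime_minutes(period_minutes, committed_uptime_pct):
    """Return the unscheduled downtime a given commitment permits."""
    return period_minutes * (100.0 - committed_uptime_pct) / 100.0


# Hypothetical example: a 30-day month measured in minutes.
MONTH_MINUTES = 30 * 24 * 60
print(allowed_downtime_minutes(MONTH_MINUTES, 99.9))
print(uptime_percentage(MONTH_MINUTES, unscheduled_down_minutes=120))
```

On these assumptions, a commitment of 99.9% over a 30-day month leaves the service provider roughly 43 minutes of unscheduled downtime, which illustrates why the precise definition of the measurement period and of scheduled downtime matters.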

In addition to service levels being measurable, it is important that regular reporting on compliance with service levels is a firm obligation on the party providing the service. Often only the service provider will have access to the information needed to report on service levels, and the obligation therefore typically falls to the service provider. Certain service levels may, however, be measurable by third-party automated services (e.g. uptime, loading time) and the parties may agree to use such services for accurate, independent reporting.
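
As a rough illustration of what such an automated service does, the Python sketch below requests a page, records whether it responded and how long it took, and summarises a series of checks as an availability percentage. The function names and the injectable `opener` parameter are hypothetical and exist only for this example; commercial monitoring services are considerably more sophisticated, checking from several locations at agreed intervals.

```python
import time
import urllib.request


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Request a URL once; return (responded_ok, seconds_taken).

    `opener` is a hypothetical hook so the check can be exercised
    without a network connection; by default the standard library's
    urlopen is used.
    """
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            ok = 200 <= response.status < 400
    except OSError:
        # Covers connection failures, timeouts and HTTP error statuses.
        ok = False
    return ok, time.monotonic() - started


def availability(results):
    """Return the percentage of checks that succeeded."""
    if not results:
        raise ValueError("no checks recorded")
    return 100.0 * sum(1 for ok in results if ok) / len(results)


# Hypothetical record of four checks, one of which failed.
print(availability([True, True, False, True]))
```

Whatever tool is used, the agreement should record how often checks are run and from where, since both affect the figure that is ultimately reported.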

Service levels allow the parties to address non-performance in a more dynamic manner, for instance by providing for service credits to become payable where service level commitments are not honoured. This is a good alternative to pursuing traditional breach of contract remedies, which are often “all or nothing” given the cost of formal litigation proceedings.
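
By way of illustration, a tiered service credit schedule could be expressed as in the Python sketch below. The thresholds, credit percentages and the 50% cap are entirely hypothetical and are not drawn from any particular agreement; in practice they are a matter for negotiation.

```python
# Hypothetical schedule: (minimum uptime %, credit as % of monthly fee).
CREDIT_TIERS = [(99.9, 0.0), (99.0, 10.0), (95.0, 25.0)]
# Hypothetical credit that applies below the lowest tier.
MAX_CREDIT_PCT = 50.0


def service_credit(uptime_pct, monthly_fee,
                   tiers=CREDIT_TIERS, max_credit_pct=MAX_CREDIT_PCT):
    """Return the credit due for a month with the given measured uptime."""
    # Walk the tiers from the highest threshold down and apply the
    # first one that the measured uptime satisfies.
    for minimum_uptime, credit_pct in sorted(tiers, reverse=True):
        if uptime_pct >= minimum_uptime:
            return monthly_fee * credit_pct / 100.0
    return monthly_fee * max_credit_pct / 100.0


# Hypothetical example: 99.5% uptime on a monthly fee of 1000.
print(service_credit(99.5, 1000.0))
```

A schedule of this kind gives both parties a predictable consequence for each level of under-performance, without either party having to establish a breach of contract.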

Disputes pertaining to service level commitments can often be resolved by way of expert determination, which is an alternative to arbitration.  In this case an independent party knowledgeable in the field determines the issue.  The parties are then bound to that determination in the absence of a manifest error.  This allows for a quick and cost-effective manner in which to get disputes resolved.

It is to be expected that the parties to a service level agreement will initially have a difference of opinion as to what the appropriate service level commitments would be. Addressing differences at the commencement of a business relationship makes sense, since it is better to discover at this point, rather than once a dispute has arisen, that expectations are not aligned.

Finally, care should be taken when reviewing exclusions to service level commitments. Typically the vendor would want to exclude circumstances outside of its control – sometimes referred to as “force majeure” events. The scope of these exclusions differs and it is therefore important to make sure that the exclusions are appropriate. The parties need to agree what will happen if any such excluded event lasts for a certain period of time.

Business continuity and disaster recovery 

Given the distributed architecture of cloud computing systems as aforesaid, the licensee would not have access to a physical installation on its infrastructure running at its site. It therefore typically has no control over the business continuity of a SaaS service. This substantially increases the risk, because not even the underlying raw data may be readily accessible. 

Where the SaaS provider has only a limited number of licensees, for instance where a licence per country is issued, it would not be unreasonable for the licensee to push for undertakings regarding documented maintenance and security obligations, as well as SaaS escrow arrangements. 

The aforementioned obligations are typically documented in the licence agreement and require the licensor to compile and maintain a list of actions and procedures in respect of the service to keep it from becoming vulnerable to defects, malware or downtime. In appropriate cases, the right of audit can be included, for instance to establish if industry-standard security or other specific obligations are adhered to. 

Escrow in the current context involves a trusted third party being placed in a position to release deposit materials to a licensee for business continuity purposes. SaaS escrow is to be distinguished from traditional software escrow in the sense that it moves the focus from availability of deposit materials to availability of the service.

The verification services provided by an escrow agent can broadly be placed on a spectrum from “passive” to “active”. With passive escrow, the agent will not verify that the materials deposited by the licensor will enable the licensee to spin up its own instance of the service, but at best verifies that the constituent parts appear to be there and that the integrity of the files is in order. Active escrow, on the other hand, requires the agent to certify that it can indeed restore the service using the deposit materials, either on the licensor’s infrastructure or on its own infrastructure.
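
The kind of check that passive escrow involves can be sketched in a few lines of Python. We assume here, purely for illustration, that the licensor supplies a manifest listing each deposited file together with its SHA-256 checksum; the function names and the manifest format are hypothetical and do not describe any particular escrow agent's process.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large deposits do not exhaust memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_deposit(root, manifest):
    """Compare deposited files against a manifest of expected checksums.

    `manifest` maps a relative file path to its expected SHA-256
    checksum (a hypothetical format chosen for this sketch).
    """
    root = Path(root)
    missing, corrupted = [], []
    for relative_path, expected in sorted(manifest.items()):
        candidate = root / relative_path
        if not candidate.is_file():
            missing.append(relative_path)
        elif sha256_of(candidate) != expected.lower():
            corrupted.append(relative_path)
    return {"missing": missing, "corrupted": corrupted}
```

A clean result from a check like this confirms only that the listed files are present and unaltered. It says nothing about whether the service can actually be restored from them, which is precisely the gap that active escrow is intended to close.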

Most licensors are nervous about active escrow, since it places third parties in a position to know and understand their systems. While agents performing active escrow would not run through the code line by line, they would certainly have access in a manner that no licensee would typically be allowed. 

While reputable SaaS escrow agents can propose template solutions, it is ultimately the responsibility of the licensor and licensee to agree upon an appropriate solution, primarily dictated by the downtime that is acceptable. The cost of escrow is very much related to the time within which the licensee must be able to spin up its own instance of the software from when a release condition is triggered. 

In conclusion 

In this post we provided some broad strategies and approaches to managing the uptime of business critical Internet-based systems. Experienced commercial and legal judgement is required in each particular case to ensure a well-balanced agreement that adequately places and manages the risk from these types of services.


Please note that our blog posts are informal commentaries on developments in the law as at the time of publication and not legal advice. You should place no reliance on our blog posts; we look forward to discussing your particular matter with you.