As businesses rely more heavily on applications to better serve customers and to differentiate themselves from competitors, IT faces increasing pressure to meet more demanding service-level agreements. In order to meet those demands, IT organizations must improve application health. Here are five key ITIL process-focused areas that can help IT organizations improve the performance and reliability of their applications.
1. Change Management
Implementing changes causes a significant proportion of incidents. You’ll often hear the generalization that 80% of incidents are caused by 20% of changes, but this is an application of the Pareto Principle rather than a hard-and-fast statistic. (However, Joe the IT Guy has cited his own experience of seeing an 80% reduction in incidents from implementing a change freeze.) A Salesforce upgrade, for example, can generate incidents because people aren’t aware of the changes and can’t find what they’re looking for, or because a newly added custom capability breaks existing workflows.
By rigorously following the standard ITIL change management process, organizations can significantly reduce change-related incidents and thereby improve application performance. We worked with an organization that had a 3:1 ratio of incidents to changes: every change applied to the production application infrastructure resulted in an average of three incidents. The organization hired a change manager, who investigated why so many changes were leading to incidents. The manager’s analysis found that 65% of the change-related incidents stemmed from changes to 10% of configuration items (CIs). The organization focused on those CIs and instituted a new process for handling changes to them. As a result, the ratio fell to one incident per change.
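The kind of analysis the change manager performed can be approximated with a simple Pareto count. The sketch below is illustrative only: the function name `top_incident_cis`, the CI names, and the sample data are assumptions, not details from the organization described above. It finds the smallest group of CIs that accounts for a chosen share of change-related incidents.

```python
from collections import Counter


def top_incident_cis(incident_cis, share=0.65):
    """Return the most incident-prone CIs that together cover at least
    `share` of all change-related incidents, plus the share they cover."""
    counts = Counter(incident_cis)
    total = sum(counts.values())
    selected, covered = [], 0
    for ci, n in counts.most_common():
        # Stop as soon as the CIs picked so far reach the target share.
        if total == 0 or covered / total >= share:
            break
        selected.append(ci)
        covered += n
    return selected, (covered / total if total else 0.0)


# Hypothetical data: one CI name per change-related incident.
incidents = ["crm-db"] * 5 + ["crm-web"] * 2 + ["mail-gw", "hr-app", "wiki"]
hot_cis, coverage = top_incident_cis(incidents)
print(hot_cis, round(coverage, 2))
```

In practice the input would come from an export of incidents linked to change records in your ITSM system, and the resulting short list of CIs is where a stricter change process is likely to pay off first.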
2. The Configuration Management Database (CMDB)
Within ITIL, the CMDB is the primary system that ties different processes together. For example, each incident that comes in should be tied to a CI. Similarly, changes and problems should also indicate which CIs they affect.
Using a CMDB to link all of this information together enables you to analyze the data and uncover high-value insights. When changes are being planned, you should be able to assess which CIs they’re going to affect. If you’re upgrading the memory or changing the hard disk on a server that’s running five applications, then you might have five different CIs that are affected by the change being made. You’ll want to make sure you can move those workloads to another server during the changes if downtime is not an option for those applications.
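To make the impact assessment concrete, here is a minimal sketch of how CI relationships could be walked to list everything affected by a change. The `DEPENDENTS` mapping, the CI names, and the `affected_cis` function are hypothetical; a real CMDB would expose these relationships through its own data model or API.

```python
from collections import deque

# Hypothetical CMDB relationships: each CI maps to the CIs that depend on it.
DEPENDENTS = {
    "server-01": ["app-billing", "app-crm", "app-hr", "app-wiki", "app-mail"],
    "app-crm": ["svc-sales-portal"],
}


def affected_cis(changed_ci, dependents):
    """Breadth-first walk returning every CI downstream of changed_ci."""
    seen = {changed_ci}
    queue = deque([changed_ci])
    order = []
    while queue:
        ci = queue.popleft()
        for dep in dependents.get(ci, []):
            if dep not in seen:  # guard against cycles and duplicates
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


impacted = affected_cis("server-01", DEPENDENTS)
print(impacted)
```

Here a change to one server surfaces the five applications it hosts, plus a service that depends on one of them, which is the list you would review before scheduling downtime or moving workloads.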
3. Standardized Processes
Among other things, ITIL provides a set of definitions for various IT processes. You can measure the organization’s performance by implementing a standardized process with discrete phases and defining the workflow in an ITSM system. An analytics application like Numerify enables you to track how long it takes to complete a process, as well as slice and dice data every which way to find the proverbial needle in the haystack. You can monitor when an item moves from one phase to another and identify bottlenecks that impact service levels. For example, a common finding is that the approval phase of the workflow tends to be a major bottleneck for meeting customer request fulfillment expectations.
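As a rough illustration of phase tracking, the sketch below computes the average time spent in each workflow phase from a log of phase transitions and picks out the slowest one. The event format, phase names, and timestamps are invented for the example and do not reflect how Numerify or any particular ITSM system stores data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical phase-transition log: (request id, phase entered, timestamp).
events = [
    ("REQ-1", "submitted", "2024-01-02T09:00"),
    ("REQ-1", "approval", "2024-01-02T10:00"),
    ("REQ-1", "fulfillment", "2024-01-04T10:00"),
    ("REQ-1", "closed", "2024-01-04T14:00"),
    ("REQ-2", "submitted", "2024-01-03T09:00"),
    ("REQ-2", "approval", "2024-01-03T09:30"),
    ("REQ-2", "fulfillment", "2024-01-04T09:30"),
    ("REQ-2", "closed", "2024-01-04T11:30"),
]


def mean_hours_per_phase(events):
    """Average hours spent in each phase, measured from entering a phase
    until entering the next one for the same request."""
    by_request = defaultdict(list)
    for req, phase, ts in events:
        by_request[req].append((datetime.fromisoformat(ts), phase))
    hours = defaultdict(list)
    for steps in by_request.values():
        steps.sort()  # chronological order within each request
        for (start, phase), (end, _next_phase) in zip(steps, steps[1:]):
            hours[phase].append((end - start).total_seconds() / 3600)
    return {phase: sum(v) / len(v) for phase, v in hours.items()}


averages = mean_hours_per_phase(events)
bottleneck = max(averages, key=averages.get)
print(averages, bottleneck)
```

With this sample data the approval phase dominates the elapsed time, mirroring the common finding mentioned above.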
4. Shift Left
When discussing ITSM, shift left is the concept of pushing incident resolution toward tier 1 support (or even customer self-service) in an effort to reduce the involvement of higher-paid experts and increase job satisfaction for the service desk staff. (See Karen Ferris’s All Things ITSM blog for more on this topic.) The goal behind shift left is to improve first call resolution while lowering costs. Improved first call resolution rates ensure that the end user’s experience with the application being supported is more positive, as resolution SLAs can be met more frequently.
Analytics can help you with your shift left efforts. For example, you can use text analytics to look at common terms used in tickets and identify the areas where shift left efforts would pay the highest dividends. You can then focus your automation efforts on incidents that take a long time to resolve, or those that ping-pong through the organization because nobody really knows how to resolve them. This can both reduce incoming incident volume and increase first call resolution rates.
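A very simple form of text analytics is counting how many tickets mention each term. The sketch below assumes ticket descriptions are available as plain strings; the stopword list, the sample tickets, and the `common_terms` function are illustrative assumptions, and a production approach would likely use a proper NLP library.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "to", "is", "my", "i", "in", "on", "and",
             "not", "cannot", "for", "after", "need"}


def common_terms(tickets, top_n=3):
    """Count how many tickets mention each term, ignoring stopwords."""
    counts = Counter()
    for text in tickets:
        # Use a set so a term counts once per ticket, however often it appears.
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(words - STOPWORDS)
    return counts.most_common(top_n)


tickets = [
    "Cannot reset my password",
    "Password expired, need reset",
    "VPN is not connecting after password reset",
    "Printer offline",
]
top_terms = common_terms(tickets, top_n=2)
print(top_terms)
```

In this sample, "password" and "reset" appear in three of the four tickets, which would point to self-service password reset as a strong shift left candidate.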
5. Root-Cause Analysis and Problem Management
Finally, a simple way to improve the reliability and quality of your applications is to actually resolve recurring issues rather than allowing them to resurface time and again. We know that IT staff are often busy, and workarounds can serve as quick fixes. However, if an issue keeps recurring, a problem record should be created in the ITSM system and the organization should investigate the underlying cause. Following the ITIL process and allocating the resources needed to fix the problem once and for all will reduce incidents and improve the health of the application.
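One way to spot recurring issues worth a root-cause investigation is to count incidents that share the same CI and category. The sketch below is a minimal example under assumed field names (`ci`, `category`) and an arbitrary threshold of three occurrences; your ITSM data and the right threshold will differ.

```python
from collections import Counter


def problem_candidates(incidents, threshold=3):
    """Return ((ci, category), count) pairs that recur at least
    `threshold` times, most frequent first."""
    counts = Counter((inc["ci"], inc["category"]) for inc in incidents)
    return [(key, n) for key, n in counts.most_common() if n >= threshold]


# Hypothetical incident records.
incidents = [
    {"ci": "app-crm", "category": "login failure"},
    {"ci": "app-crm", "category": "login failure"},
    {"ci": "app-crm", "category": "login failure"},
    {"ci": "app-crm", "category": "slow report"},
    {"ci": "app-hr", "category": "login failure"},
]
candidates = problem_candidates(incidents)
print(candidates)
```

Each pair returned is a candidate for a formal problem investigation rather than another workaround.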