Why Smart Incident Management Needs NLP (Part 1)
Better Incident Management is a critical area of focus for all IT organizations, given the potential of Incidents to disrupt business, harm brand reputation and reduce stakeholder confidence in IT teams. Consumerization of IT and a movement towards best-of-breed applications have resulted in an unprecedented volume and variety of Incidents. Not only do IT organizations have to be nimbler than ever to resolve Incidents and restore business-as-usual quickly, but they also have to be proactive in preventing Incidents from occurring at all.
As analytics become more prominent, many companies are focusing on performance metrics, data quality and process automation to improve efficiency and reduce cost in ITSM processes. Incident Management is a key part of that effort, but one question needs to be answered: “Are we leveraging all the data captured in the Incident process for analytics?”
Turning Better Incident Management into Smart Incident Management
Your ITSM systems of record carry valuable information about process performance that you can leverage to identify opportunities for improvement. This information falls into one of the following categories:
- Structured: The data most commonly used for dashboards, reports and analytics
- Unstructured: Free-form text fields, like description and work notes, which are often analyzed manually
- Semi-structured: Logs and events, usually analyzed using infrastructure and application monitoring tools
Structured data is often leveraged extensively for reporting and analytics, while semi-structured data is often leveraged to find the technical root cause. However, conventional analytics tend to underutilize unstructured data. This post focuses on the ways in which unstructured data can enhance structured data analytics.
Unstructured text fields are not used systematically or frequently for analysis because text analysis can be hard to execute at scale. The “signal-to-noise” ratio in text data is generally very low (especially in longer text fields like work notes or resolution notes), and the business user must cope with some ambiguity when looking at text-based insights. Additionally, when analyzing large numbers of Incidents, the sheer volume of text data can be a deterrent. Relational data formats and querying techniques are unsuitable for text analytics, which therefore requires different skills and a different technology stack. Finally, it can be difficult to juxtapose text-based insights with structured data analysis to present an actionable picture of the data.
However, it is worth overcoming these hurdles to apply text analytics to Incident data, as it contains the following important information not available in structured fields:
- Exact nature of the Incident or the issue
- Root cause or resolution steps for the Incident
- Similarity between Incidents and problems/changes beyond standard fields like CI, application, etc.
Given that most description text in Incidents is human-generated, we need to leverage Natural Language Processing (NLP) to parse, interpret and analyze this data.
What Is NLP and How Can You Apply It to Incident Data?
Natural Language Processing is a field of computer science that deals with algorithms and techniques that enable computers to process, understand and analyze human languages.
Here’s an example of how the Numerify System of Intelligence processes text and produces usable insights.
Key Capabilities of Numerify’s NLP Engine
Keyword feature extraction – Keyword extraction for Incident text needs to go beyond the standard tokenization and lemmatization generally used for text preprocessing. Terms such as IP addresses, email addresses, URLs and asset IDs are significant for analytics, so standard preprocessing must be extended to keep such tokens intact rather than splitting them apart. For example, terms like “192.168.67.45”, “firstname.lastname@example.org”, “/proj/mps33b/rev2” and “L343HH23” are quite common in Incident text and need special processing.
Domain-based stopwords – Off-the-shelf text analytics packages and libraries typically work with a standard English stopword list, such as those provided by Python NLTK or Stanford CoreNLP. Such a list is insufficient for ITSM and needs to be extended with domain context. For example, words like “issue” or “incident” occur very frequently across Incident text but are not part of the English stopword list. From an ITSM point of view, however, they are effectively stopwords because they add no new information to the analysis.
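To make the idea concrete, here is a minimal Python sketch of a tokenizer that keeps such terms whole. It is not Numerify's implementation: the pattern names, the regular expressions and the `tokenize` helper are all illustrative assumptions, and real Incident data would need patterns tuned to your own asset ID and path conventions.

```python
import re

# Illustrative patterns only (assumptions, not Numerify's rules).
# Order matters: at each position the first pattern that matches wins.
TOKEN_PATTERNS = [
    ("URL", r"https?://\S+"),
    ("EMAIL", r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    ("IP", r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    ("PATH", r"(?:/[A-Za-z0-9._-]+){2,}"),
    # Hypothetical asset ID shape: 6+ uppercase letters/digits, at least one of each.
    ("ASSET_ID", r"\b(?=[A-Z0-9]*[A-Z])(?=[A-Z0-9]*\d)[A-Z0-9]{6,}\b"),
    ("WORD", r"[A-Za-z]+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenize(text):
    """Return (kind, token) pairs, keeping special terms as single tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "WORD":
            value = value.lower()  # normalize ordinary words only
        tokens.append((kind, value))
    return tokens


sample = ("User firstname.lastname@example.org cannot reach 192.168.67.45 "
          "from /proj/mps33b/rev2 on L343HH23")
for kind, token in tokenize(sample):
    print(kind, token)
```

A plain word tokenizer would break the IP address, email address and path into fragments. Here each one survives as a single token that can be counted, filtered or matched across Incidents. Lemmatization of the ordinary words is left out to keep the sketch short.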
Leveraging our deep domain expertise and experience across Fortune 500 clients, we have compiled an ITSM domain-specific stopwords list which is a part of Numerify’s NLP Engine.
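As a rough illustration of how a domain stopword list layers on top of a general one, consider the Python sketch below. Both word sets are small, hand-written stand-ins chosen for this example. They are not Numerify's compiled list, and apart from “issue” and “incident” the ITSM entries are assumptions.

```python
import re

# Tiny stand-in for a general English stopword list
# (a real pipeline might start from the list shipped with NLTK).
ENGLISH_STOPWORDS = {"the", "a", "an", "is", "to", "and", "with", "for", "of", "on", "in", "has"}

# Hypothetical ITSM additions: frequent in Incident text, low in information.
ITSM_STOPWORDS = {"issue", "incident", "ticket", "user", "please", "reported"}

STOPWORDS = ENGLISH_STOPWORDS | ITSM_STOPWORDS


def content_terms(text):
    """Lowercase the text, split it into words and drop all stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


print(content_terms("User reported an issue with the VPN login"))
# ['vpn', 'login']
```

With only the English list, “user”, “reported” and “issue” would remain and dominate any frequency-based analysis. Adding the domain list leaves just the terms that describe what actually went wrong.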
Similarity identification – This is the core of the NLP engine: it isolates groups of similar Incidents based on their text. The core algorithm draws on industry-standard techniques from Topic Modelling, Entity Recognition and Information Retrieval, and has been fine-tuned for the ITSM context. It is generic and can be applied to process areas beyond Incident, such as problem or change request.
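The engine's actual algorithm is not described here, so the sketch below swaps in a much simpler, well-known Information Retrieval technique to show the general idea: TF-IDF weighting with cosine similarity. The function names and the sample Incident descriptions are made up for illustration.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [re.findall(r"[a-z0-9]+", d.lower()) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF, so terms found in every document keep a small weight.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(docs, index):
    """Return the index of the document most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [(cosine(vectors[index], v), i) for i, v in enumerate(vectors) if i != index]
    return max(scores)[1]


incidents = [
    "VPN connection drops after login",
    "Outlook mailbox full cannot send email",
    "VPN login fails connection timeout",
    "Printer on third floor is offline",
]
print(most_similar(incidents, 0))  # index of the other VPN incident
```

Pairwise scores like these can feed a clustering step that groups related Incidents. A production system would combine them with the token handling and domain stopwords described above, and likely with topic models and entity recognition as well.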