Why Smart Incident Management Needs NLP (Part 1)
Better Incident Management is a critical area of focus for all IT organizations, given the potential of Incidents to disrupt business, harm brand reputation and reduce stakeholder confidence in IT teams. Consumerization of IT and a movement towards best-of-breed applications have resulted in an unprecedented volume and variety of Incidents. IT organizations not only have to be nimbler than ever to resolve Incidents and restore business-as-usual quickly, but they also have to be proactive and prevent Incidents from occurring in the first place.
As analytics become more prominent, many companies are focusing on performance metrics, data quality and process automation for efficiency and cost reduction in ITSM processes. Incident Management is a key driver for that, but one question that begs to be answered is, “Are we leveraging all the data captured in the Incident process for analytics?”
Turning Better Incident Management into Smart Incident Management
Your ITSM systems of record carry valuable information about process performance that you can leverage to identify opportunities for improvement. This information can be categorized into one of the following:
- Structured: Most commonly used data for dashboards, reports and analytics
- Unstructured: Free-form text fields like description and work notes, which are often analyzed manually
- Semi-structured: Logs and events, usually analyzed using infrastructure and application monitoring tools
Structured data is leveraged extensively for reporting and analytics, while semi-structured data is often used to find the technical root cause. However, conventional analytics tend to underutilize unstructured data. This post focuses on the ways in which unstructured data can enhance structured data analytics.
Unstructured text fields are not used systematically or frequently for analysis because text analysis can be hard to execute at scale. The “signal-to-noise” ratio in text data is generally very low (especially in longer text fields like work notes or resolution notes), and the business user must cope with some ambiguity when looking at text-based insights. When analyzing large numbers of Incidents, the sheer volume of text can also be a deterrent. Relational data formats and querying techniques are unsuitable for text analytics, which therefore requires different skills and a different technology stack. Finally, it can be difficult to juxtapose text-based insights with structured data analysis to present an actionable picture of the data.
However, it is worth overcoming these hurdles to apply text analytics to Incident data, as it contains the following important information not available in structured fields:
- Exact nature of the Incident or the issue
- Root cause or resolution steps for the Incident
- Similarity between Incidents and problems/changes beyond standard fields like CI, application, etc.
Given that most description text in Incidents is human-generated, we need to leverage Natural Language Processing (NLP) to parse, interpret and analyze this data.
What Is NLP and How Can You Apply It to Incident Data?
Natural Language Processing is a field of computer science that deals with algorithms and techniques that enable computers to process, understand and analyze human languages.
Here’s an example of how the Numerify System of Intelligence processes text and produces usable insights.
Key Capabilities of Numerify’s NLP Engine
Keyword feature extraction – Keyword extraction for Incident text needs to go beyond the standard tokenization and lemmatization generally used for text preprocessing. Terms such as IP addresses, email addresses, URLs and asset IDs are significant for analytics, so standard preprocessing must be extended to handle these non-standard tokens. For example, terms like “192.168.67.45”, “email@example.com”, “/proj/mps33b/rev2” and “L343HH23” are common in Incident text and need special processing.
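To make the idea concrete, here is a minimal sketch of a tokenizer that keeps such terms intact instead of splitting them on punctuation. It is an illustration only, not Numerify's implementation: the regular expressions, the token type names and the `tokenize` function are all assumptions, and a production pattern set would need to cover many more formats.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive or official set).
# Order matters: specific patterns are listed first so they win over plain words.
TOKEN_PATTERN = re.compile(
    r"""
    (?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)
  | (?P<url>https?://[^\s,;]+)
  | (?P<ip>\b(?:\d{1,3}\.){3}\d{1,3}\b)
  | (?P<path>(?:/[\w.-]+){2,})
  | (?P<asset_id>\b(?=[A-Z0-9]*[A-Z])(?=[A-Z0-9]*\d)[A-Z0-9]{6,}\b)
  | (?P<word>[A-Za-z]+(?:'[A-Za-z]+)?)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return (token_type, token) pairs for one piece of Incident text.

    Special tokens (emails, URLs, IPs, paths, asset IDs) are kept verbatim;
    ordinary words are lowercased. Anything that matches no pattern, such as
    punctuation or bare numbers, is dropped.
    """
    tokens = []
    for match in TOKEN_PATTERN.finditer(text):
        kind = match.lastgroup
        value = match.group()
        if kind == "word":
            value = value.lower()
        tokens.append((kind, value))
    return tokens


sample = (
    "User email@example.com cannot reach 192.168.67.45 "
    "from /proj/mps33b/rev2 on laptop L343HH23."
)
for kind, value in tokenize(sample):
    print(f"{kind:>8}  {value}")
```

Tagging each token with its type lets later steps treat an IP address or asset ID as a single feature, or map all tokens of one type to a shared placeholder, depending on what the analysis needs.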
Domain-based Stopwords – Off-the-shelf text analytics packages and libraries typically work with a standard English stopwords list, such as the ones provided by Python NLTK or Stanford CoreNLP. Such a list is insufficient for ITSM and needs to be extended with domain context. For example, words like “issue” or “incident” occur very frequently across Incident text but are not part of the English stopwords list. From an ITSM point of view, however, they are effectively stopwords because they add no new information to the analysis.
Leveraging our deep domain expertise and experience across Fortune 500 clients, we have compiled an ITSM domain-specific stopwords list which is a part of Numerify’s NLP Engine.
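As a rough sketch of how a domain list extends a general one, the snippet below filters tokens against the union of both. The two word sets are small hypothetical stand-ins: a real general-purpose list (for example NLTK's) is much longer, and only “issue” and “incident” come from the text above; the other ITSM entries are assumptions for illustration.

```python
# Tiny stand-in for a general English stopword list (real lists are much longer).
ENGLISH_STOPWORDS = {
    "a", "an", "and", "been", "for", "has", "in", "is", "of", "on", "the", "to", "with",
}

# Hypothetical ITSM additions; "issue" and "incident" are the examples named
# in the text, the rest are assumed for illustration.
ITSM_STOPWORDS = {"issue", "incident", "ticket", "user", "reported", "please"}


def remove_stopwords(tokens, extra_stopwords=frozenset()):
    """Drop tokens found in the English list or in the optional domain list."""
    stopwords = ENGLISH_STOPWORDS | set(extra_stopwords)
    return [token for token in tokens if token.lower() not in stopwords]


tokens = "user reported an issue with the vpn client".split()
print(remove_stopwords(tokens))                  # general list only
print(remove_stopwords(tokens, ITSM_STOPWORDS))  # general + domain list
```

With only the general list, words such as “issue” survive and would dominate any frequency-based analysis; adding the domain list leaves just the terms that describe what actually went wrong.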
Similarity identification – This is the core of the NLP engine and performs the main task of isolating groups of similar Incidents based on text data. The core algorithm draws on the industry-standard techniques of Topic Modelling, Entity Recognition and Information Retrieval, and has been fine-tuned for the ITSM context. It is generic in nature and can be applied to process areas beyond Incidents, such as problems or change requests.
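The engine's actual algorithm is not described here, so the following is only a simplified stand-in for the general idea of grouping Incidents by text similarity. It uses TF-IDF weighting with cosine similarity, a basic Information Retrieval technique, plus a greedy grouping pass. The function names, the smoothed IDF formula and the 0.3 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse {term: weight} TF-IDF vectors."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        # Smoothed IDF: rarer terms get higher weight, no division by zero.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def group_similar(docs, threshold=0.3):
    """Greedy grouping: each document joins the first group whose seed
    (first member) is at least `threshold` similar, else starts a new group.
    Returns groups as lists of document indices."""
    vectors = tfidf_vectors(docs)
    groups = []
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine(vector, vectors[group[0]]) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])
    return groups


# Already tokenized and stopword-filtered Incident descriptions (made-up data).
incidents = [
    ["vpn", "client", "disconnects", "frequently"],
    ["outlook", "mailbox", "full"],
    ["vpn", "client", "fails", "connect"],
    ["mailbox", "quota", "exceeded", "outlook"],
]
print(group_similar(incidents))
```

On this made-up sample, the two VPN Incidents land in one group and the two mailbox Incidents in another, even though none of them share a structured field. A greedy pass like this depends on document order and on the threshold, which is one reason real systems lean on more robust methods such as topic models.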