You are surprised to see that your tax refund application has been stuck “in review” for nearly a month, while your father-in-law received his refund within two weeks of submitting his. An option to chat with a government representative could help resolve the delay by identifying minor issues or missing documents. But if that chat is automated, can it be relied upon to determine and explain the true reasons for such an unexpected delay? Can organizations rely on recent advances in large language models (LLMs) to deploy such a service? Would you trust an automatically generated explanation of why your tax return was delayed?

LLMs are a type of artificial intelligence (AI) program that can recognize and generate text, among other tasks. They are trained on vast amounts of text to interpret and produce human-like content. Organizations are rapidly adopting LLMs to automate many aspects of their operations, but this growth is accompanied by doubt about the tendency of LLMs to produce “hallucinations,” or incorrect information, due to a lack of inherent capacity to reason with understanding. A recent paper tackles this issue and tries to answer the question “How well can LLMs explain business processes?”

The rapid development of AI-based models has come with an inherent trust issue: the “black box” nature of these models hinders their wide adoption. Over the past years, this lack of trust has given rise to Explainable AI (XAI) techniques, developed to explain decisions and actions that are powered by AI models. Such techniques aim to give users a better understanding of the inner workings of AI, to check that the model makes its decisions adequately and reliably, with little to no inherent bias, and to assure users that the model will not fail when it encounters unforeseen circumstances in the future.

When applied to business processes, XAI techniques aim to explain the different factors affecting a certain condition in a business process. In our earlier tax refund example, an XAI explanation could point to reasons for the delay in the application, such as missing information in the application, a tax credit, or the tax amount.

However, contemporary XAI techniques are not adequate for producing faithful and correct explanations of business processes. They generally fail to express the constraints of the business process model; they usually do not include the rich information about contextual situations that affect process outcomes; they do not reflect the true causal execution dependencies among the activities in the business process; and their explanations are usually not given in a form that humans can readily interpret.

To this end, IBM Research, a partner in the EU AI4GOV project, introduces Situation-aware eXplainability (SAX), a framework for generating explanations for business processes that is meant to address these shortcomings. More specifically, an explanation generated with the framework, or a “SAX explanation” for short, is a causally sound explanation that takes into account the process context in which some condition occurred. Causally sound means that the explanation gives a faithful and logically entailed account of why the condition occurred, reflecting the genuine chain of business process executions that led to it. The context includes knowledge elements that were originally implicit, or part of the surrounding system, yet affected the choices made during process execution. The SAX framework provides a set of services that help derive SAX explanations automatically by leveraging existing LLMs. With these services, process-specific knowledge can be supplied as input to the LLM prompt before the interaction with it begins. This library of services is expected to be released as an open source tool at the end of the AI4GOV project.
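To make the idea of supplying process-specific knowledge ahead of the interaction more concrete, here is a minimal sketch in Python. It is not the SAX library, which has not been released; the names `ProcessKnowledge` and `build_prompt`, the section headings, and the tax refund details are all hypothetical and chosen only for illustration. The three knowledge types mirror the shortcomings listed above: process model constraints, causal execution dependencies, and context.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessKnowledge:
    """Hypothetical container for process-specific knowledge (not the SAX API)."""
    process_model: str = ""
    causal_dependencies: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)


def _bullets(items: List[str]) -> str:
    """Render a list of statements as plain-text bullet lines."""
    return "\n".join(f"- {item}" for item in items)


def build_prompt(knowledge: ProcessKnowledge, question: str) -> str:
    """Place the available knowledge before the user's question.

    Empty knowledge sections are skipped, so different combinations
    of knowledge inputs can be tried with the same function.
    """
    sections = []
    if knowledge.process_model:
        sections.append("Process model:\n" + knowledge.process_model)
    if knowledge.causal_dependencies:
        sections.append(
            "Causal execution dependencies:\n" + _bullets(knowledge.causal_dependencies)
        )
    if knowledge.context:
        sections.append("Context:\n" + _bullets(knowledge.context))
    sections.append("Question:\n" + question)
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Illustrative, made-up knowledge for the tax refund scenario.
    knowledge = ProcessKnowledge(
        process_model="submit application -> validate documents -> review -> pay refund",
        causal_dependencies=["A missing document lengthens the review step"],
        context=["The application was filed during the peak filing period"],
    )
    print(build_prompt(knowledge, "Why is my tax refund still in review?"))
```

The resulting string would then be sent to whichever LLM is in use. Leaving some fields empty yields prompts with different combinations of knowledge inputs.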

To assess the perceived quality of LLM-generated explanations about different conditions in business processes, IBM Research developed a dedicated scale and conducted a corresponding user study. In the study, users rated several measures of the quality of a variety of LLM explanations on a Likert scale. The LLM derived these explanations after different combinations of knowledge inputs were introduced into its prompt before the interaction. Our findings show that the input presented to the LLM helped guardrail its performance, yielding SAX explanations with better perceived fidelity. This improvement is moderated by the perception of trust and curiosity. At the same time, it comes at the cost of the perceived interpretability of the explanation.
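As a rough illustration of how ratings from such a study could be summarized, the sketch below averages Likert ratings per combination of knowledge inputs and quality measure. It is not the study's analysis method. The function name `mean_ratings`, the condition labels, and the 5-point scale are assumptions, and the ratings in the example are made-up placeholders, not results from the study.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# One response: (knowledge-input condition, quality measure, Likert rating).
Response = Tuple[str, str, int]


def mean_ratings(
    responses: Iterable[Response], scale_max: int = 5
) -> Dict[Tuple[str, str], float]:
    """Average the ratings for each (condition, measure) pair.

    scale_max is an assumption; the source does not state how many
    points the Likert scale had.
    """
    grouped = defaultdict(list)
    for condition, measure, rating in responses:
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside 1..{scale_max}")
        grouped[(condition, measure)].append(rating)
    return {key: mean(values) for key, values in grouped.items()}


if __name__ == "__main__":
    # Placeholder ratings for illustration only, NOT data from the study.
    placeholder = [
        ("no knowledge", "fidelity", 2),
        ("no knowledge", "fidelity", 3),
        ("process + causal + context", "fidelity", 4),
        ("process + causal + context", "fidelity", 5),
    ]
    for (condition, measure), avg in mean_ratings(placeholder).items():
        print(f"{condition} / {measure}: {avg}")
```

Comparing the averages across conditions for the same measure gives a first impression of how each combination of knowledge inputs is perceived. A moderation analysis, such as the one mentioned for trust and curiosity, would require a proper statistical model.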

The overall approach to generating explanations with an LLM, and the resulting user perception of the output, is depicted in the figure below.
