Understanding Customer Utterances Better
This is the last in the current run of posts covering my work to support a more “Natural Conversation” between humans and digital assistants. In this post I will explore the work my team and I did to leverage large language models to better understand and extract information from long, complex customer utterances.
Before I begin I want to highlight that, “out of the box”, watsonx Assistant has a LOT of powerful capabilities that address many aspects of this challenge. So why build something else? To support the broader concept of a “configuration driven” approach to defining customer “journeys”. In my other posts I have described how, using the work by Bob Moore, we created a framework in watsonx Assistant which allows conversational journeys to be expressed as configuration, which in turn represents one of Bob’s conversational patterns. Where information needs to be collected, the framework abstracts this collection away from the variable by using “data types” as the mechanism for driving the collection of data. In this way the framework only needs to know how to collect a type of data, not a specific variable. This means that variables can be defined in the configuration without needing to be explicitly defined to watsonx Assistant.
With that covered, let’s look at why we decided to explore this…
The requirement emerged from an initial investigation my team and I did around how to create a complex interview process using our “Natural Conversation Framework”. Although successful, this work highlighted that we were using a lot of “yes/no” style questions, which made the conversation feel less “natural”. Using the insights from this work we started to look at how we could ask customers more “open” questions and extract what we need from the potentially more complex responses.
Expanding on this a bit more, there are two key points where we can receive a block of information from a customer:
- When they ask their initial question e.g. “I need to transfer £100 from my savings account to my current account”
- When they are responding to a question e.g. when asked “Are you saving for anything just now?” they could respond with “Yes, I’ve been having a lot of problems with my car recently and I really think it’s time to change it, so I am saving up so I can replace it”.
In the first case we can see that the customer’s intent is to transfer some funds, and to do this we need to capture information on the source and target accounts along with the amount and the date of the transfer. In the utterance provided by the customer we can see that they have supplied the account and amount details, but we need to ask about the date. In the second case we have asked an open question about the customer’s saving plans and they have indicated that they are looking to save for a new car. Here we don’t want to provide a hard set of options for the customer to select from, but rather deduce their savings intent from the response. From a framework point of view these areas break down into the following aspects:
- Entity sweeping of initial utterance
- Data collection from a response to a question
To achieve these, we knew we had a clear definition of what was required from the customer based on the journey configuration JSON, so we decided to explore how we could use a large language model (LLM) to perform the extraction.
Looking at the two requirements, it was clear that the entity sweeping was really a process of collecting data for each relevant data item. Based on this we created the following architecture.
Before I dig into the details, let me recap the JSON structure we use in the NCF to describe an information collection (A2 pattern) journey. We will use the “refund” journey as an example:
{ "A2-refund": {
"action": {
"type": "rulesDefined",
"name": "rulesDefined"
},
"captureType": "dynamic",
"default": {
"action": null,
"confirm": "",
"clarify": "",
"response": "I can help you with a refund.",
"repeat": "",
"example": "",
"paraphrase": "",
"verify": false,
"complete": false,
"validation_question": "",
"rules": "refund-For-Actions",
"dataCapture": {
"numberOfItems": 5,
"dataToCapture": [
{
"action": null,
"variableName": "payment_mechanism",
"description": "How payment was made",
"justification": "So I can route the right way",
"confirm": false,
"response": "How did you pay?",
"repeat": "How did you pay?",
"example": "Card, Paypal, Transfer, Direct Debit.",
"paraphrase": "What did you use to pay?",
"type": "A2-payment-mechanism",
"options": [
"card",
"transfer",
"paypal",
"direct debit"
],
"collected": false,
"always_ask": false,
"required": "yes"
},
{
"action": null,
"variableName": "payment_target",
"description": "Target of the payment",
"justification": "I need this information so I can assess best approach for a refund.",
"confirm": false,
"response": "Who was the payment to?",
"repeat": null,
"example": null,
"paraphrase": "Where did the payment get sent to?",
"type": "A2-llm-payment-target",
"options": [
"OFN Bank",
"other",
"unknown"
],
"collected": false,
"always_ask": false,
"required": "no"
},
{
"action": null,
"variableName": "payment_suspicious",
"description": "Payment is suspicious",
"justification": null,
"confirm": false,
"response": "Do you think the payment was suspicious?",
"repeat": null,
"example": "Answer with 'yes' or 'no' ",
"paraphrase": null,
"type": "A2-llm-yes-no",
"options": [
"yes",
"no"
],
"collected": false,
"always_ask": false,
"required": "no"
},
{
"action": null,
"variableName": "closed_account",
"description": "Target account is closed",
"justification": null,
"confirm": false,
"response": "Was the payment to a closed account?",
"repeat": null,
"example": null,
"paraphrase": null,
"type": "A2-llm-account-status",
"options": [
"closed",
"closed account"
],
"collected": false,
"always_ask": false,
"required": "no"
},
{
"action": null,
"variableName": "direct_debit_paid",
"description": "Direct debit has been paid",
"justification": null,
"confirm": false,
"response": "Has the direct debit already left your account?",
"repeat": null,
"example": null,
"paraphrase": null,
"type": "A2-llm-yes-no",
"options": [
"yes",
"no"
],
"collected": false,
"always_ask": false,
"required": "no"
}
]
}
}
}
}
In this example we can see that five data items are described, and for each we have a description, question, type and variable name. In the case of the “llm” data types we also have a list of potential valid options. To support the “sweeper” functionality this JSON is loaded into Milvus as a set of embeddings grouped under the journey identifier.
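The loading step itself isn’t shown in this post, but as a rough sketch (the record layout and field names below are my own illustration, not the actual service code), each item in `dataToCapture` can be flattened into a text record that is then embedded and stored in Milvus under the journey identifier:

```python
def flatten_journey(journey_json: dict) -> list[dict]:
    """Flatten a journey's dataToCapture items into records ready for
    embedding and insertion into Milvus, grouped by the journey id.
    (Illustrative sketch, not the actual loader.)"""
    records = []
    for journey_id, journey in journey_json.items():
        for item in journey["default"]["dataCapture"]["dataToCapture"]:
            # Combine the fields the similarity search relies on.
            text = " ".join(filter(None, [
                item.get("description"),
                item.get("response"),
                " ".join(item.get("options", [])),
            ]))
            records.append({
                "journey_id": journey_id,          # grouping key in Milvus
                "variable_name": item["variableName"],
                "type": item["type"],
                "text": text,                      # string handed to the embedder
            })
    return records
```

Each record’s `text` field would then be run through an embedding model before insertion; for the refund journey above this yields five records, one per data item.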
Focusing on the “sweeper” first, let’s look at the execution path.
- Customer asks a question e.g. “I need a refund for a card payment which I think is suspicious”
- Question is passed to watsonx Assistant for handling
- watsonx Assistant determines the intent as “refund”
- Journey configuration loaded and returned
- watsonx Assistant executes configuration which is an A2. This starts with a data “sweep”
- watsonx Assistant calls LLM wrapper to process “sweep”
- LLM wrapper searches Milvus for relevant data items that can be detected in the customer’s utterance
- LLM wrapper selects up to 7 of the most “relevant” detected data items and calls the LLM to process them
- Using the customer utterance and the data item details (name, description, collection question and options) the LLM attempts to classify the utterance as one of the options assigned to the data item
- LLM wrapper returns the detected data options, or none where an option wasn’t confidently detected (or the item wasn’t selected during the Milvus search phase). In the given example input, payment_mechanism would be identified as “card” and payment_suspicious as “yes”.
- Detected data is populated in a data structure for returning to watsonx Assistant
- watsonx Assistant updates its context with the returned data and continues to execute the journey, requesting any missing data or, where all the data has been collected, jumping to the appropriate journey to meet the customer’s needs.
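The wrapper-side steps above (Milvus search, selection of up to seven items, per-item classification) can be sketched as follows. All names here are illustrative: `candidates` stands in for the ranked Milvus search results, and `keyword_classify` is a toy stand-in for the LLM call, which can only match literal option strings, whereas the real LLM would map phrasing like “which I think is suspicious” onto the configured “yes”/“no” options.

```python
def sweep(utterance: str, candidates: list[dict], classify, max_items: int = 7) -> dict:
    """Entity sweep: run the classifier over at most `max_items` of the
    most relevant data items and return what was detected.
    `classify` returns one of the item's options, or None when nothing
    is confidently detected."""
    return {item["variable_name"]: classify(utterance, item)
            for item in candidates[:max_items]}

def keyword_classify(utterance: str, item: dict):
    """Toy stand-in for the LLM: return the first option mentioned
    verbatim in the utterance, else None."""
    for option in item["options"]:
        if option in utterance.lower():
            return option
    return None

items = [
    {"variable_name": "payment_mechanism", "options": ["card", "paypal", "transfer"]},
    {"variable_name": "payment_suspicious", "options": ["suspicious"]},
]
result = sweep("I need a refund for a card payment which I think is suspicious",
               items, keyword_classify)
# payment_mechanism is detected as "card"; any item with no confident
# match comes back as None so the journey can ask for it explicitly.
```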
At the watsonx Assistant level this processing is handled in the A2 Entity Sweeper action. The first step calls the LLM Wrapper via the common extension point we use for all NCF calls.
The extension is configured to pass the following information.
As you can see, we pass details of the journey, the data to be collected and the input text. In addition we pass a flag indicating if this journey has been “jumped to”. If this flag is true then the journey wasn’t started with a complex utterance, so we don’t call the LLM processing. This is an optimisation to keep response times as fast as possible.
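In outline, the request body would carry something along these lines (field names here are illustrative, not the actual extension schema):

```json
{
  "journey": "A2-refund",
  "jumped_to": false,
  "input_text": "I need a refund for a card payment which I think is suspicious",
  "data_to_capture": [
    {
      "variableName": "payment_mechanism",
      "description": "How payment was made",
      "response": "How did you pay?",
      "options": ["card", "transfer", "paypal", "direct debit"]
    }
  ]
}
```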
Once the extension has been called successfully we update the watsonx Assistant context and continue to process the rest of the action.
Now let’s look at the collection of a specific data item.
- Following the sweep, watsonx Assistant needs to collect more information. The next item to be collected is defined as an LLM driven data type, so the question to collect the information is returned to the customer. Using the previous example, the refund journey needs to know what the payment_target was for a card payment.
- Customer answers the question e.g. “The payment was to yourselves”.
- watsonx Assistant calls the LLM wrapper passing the journey name, the customer’s response and the data item details (question and options)
- LLM Wrapper calls LLM passing the data collection details and customer response
- LLM attempts to classify the customer’s input as one of the options assigned to the data item. If it can’t, it returns that nothing was identified. The outcome is returned to the LLM wrapper. In the above case payment_target would be set to “OFN Bank”, as the LLM call is made with the LLM acting on behalf of “OFN Bank”.
- Result is returned to watsonx Assistant
- watsonx Assistant processes the result and takes the next appropriate action
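The production prompt isn’t part of this post, but the shape of the classification request can be sketched like this (the wording and function name are my own illustration). Note that telling the model it acts on behalf of “OFN Bank” is what lets a reply like “yourselves” resolve to that option:

```python
def build_classification_prompt(item: dict, customer_response: str,
                                brand: str = "OFN Bank") -> str:
    """Assemble an illustrative classification prompt for one data item:
    the collection question, the allowed options and the customer's reply."""
    options = ", ".join(item["options"])
    return (
        f"You are a digital assistant acting on behalf of {brand}.\n"
        f"Question asked: {item['response']}\n"
        f"Customer reply: {customer_response}\n"
        f"Classify the reply as exactly one of: {options}. "
        f"If none of the options applies, answer 'none'."
    )

prompt = build_classification_prompt(
    {"response": "Who was the payment to?",
     "options": ["OFN Bank", "other", "unknown"]},
    "The payment was to yourselves",
)
```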
In this case watsonx Assistant handles the processing via the A2 Data Collection and A2 LLM Option Sweep actions. The A2 Data Collection action is responsible for collecting a data item based on its declared “type”. Here you can see that we added a step to check for an LLM data type.
The conditions for entry to this step are a little involved, but they ensure that we are looking to call the LLM functionality and, if so, we jump to the A2 LLM Option Sweep action.
Again here we start by calling the extension to invoke the LLM via the LLM Wrapper service. The extension parameters passed are:
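As an illustration (again, not the actual schema), the body for a single-item collection call would look something like:

```json
{
  "journey": "A2-refund",
  "customer_response": "The payment was to yourselves",
  "data_item": {
    "variableName": "payment_target",
    "response": "Who was the payment to?",
    "options": ["OFN Bank", "other", "unknown"]
  }
}
```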
Once again we pass details about what needs to be collected, along with the customer’s response to the data collection question. When the extension completes, we use steps 3–6 to check if the customer has sought clarification or repetition of the question. If so, we process this in line with the NCF principles. If the data was successfully captured, the results are passed back to the A2 Data Collection action.
That concludes my walkthrough of the approach we have taken. Testing has proven this to be a robust and performant approach, and by using the concept of an LLM data type the number of calls to the LLM can be controlled, with the LLM only used in situations where it makes most sense. An additional benefit of our approach is that the call to the LLM is flexible enough to allow options to be amended dynamically. This does have implications for any deterministic rules supporting the journey flow, but flows can be designed to support this level of behaviour, though there are limitations around making a routing decision based on an option which is not pre-defined.