UniMRCP Plugin for ASAPP

ASAPP offers a plugin for speech recognition for the UniMRCP Server (UMS).

Speech-related clients use Media Resource Control Protocol (MRCP) to control media service resources including:

Text-to-Speech (TTS)
Automatic Speech Recognizers (ASR)

To connect clients with speech processing servers and manage the sessions between them, MRCP relies on other protocols to work. Also, MRCP defines the messages to control the media service resources and it also defines the messages that provide the status of the media service resources. Once established, the MRCP protocol exchange operates over the control session, allowing your organization to control the media processing resources on the speech resource server.

This plugin connects your IVR Platform into the AutoTranscribe Websocket. It is a fast solution for your organization to quickly integrate your IVR application into GenerativeAgent. By using the ASAPP UniMRCP Plugin, the GenerativeAgent receives text transcripts from your IVR. This way, your organization takes voice media off your IVR and into the ASAPP Cloud.

Before you Begin

Before you start integrating to GenerativeAgent, you need to:

Get your API Key Id and Secret For authentication, the UniMRCP server connects with AutoTranscribe using standard websocket authentication. The ASAPP UniMRCP Plugin does not handle authentication, but rather authentication is on your IVR’s side of the call. Your API credentials are used by the configuration document. For user identification or verification, you must handle it by the IVRs policies and flows.
Ensure your API key has been configured to access GenerativeAgent APIs and the AutoTranscribe WebSocket. Reach out to your ASAPP team if you are unsure about this.
Use ASAPPs ASR Make sure your IVR application uses the ASAPP ASR so AutoTranscribe can receive it and send transcripts to GenerativeAgent.
Configure Tasks and Functions. By using the Plugin, you still need to save customer info and messages. The GenerativeAgent can save that data by sending it into its Chat Core, but your organization can also save the messages either by calling the API or by saving the information from each event handler. Your IVR application is in control of when to call /analyze so the GenerativeAgent analyzes the transcripts and replies. The recommended configuration is to call /analyze every time an utterance or transcript is returned. Another approach is to call LLMBot when a complete thought/question is provided. Some organizations may find a good solution call /analyze and buffer up transcripts until the customer’s thought is complete.

Implementation steps:

Listen and Handle GenerativeAgent Events

Setup the UniMRCP ASAPP Plugin

Manage the Transcripts and send them to GenerativeAgent

Step 1: Listen and Handle GenerativeAgent Events

GenerativeAgent sends events during any conversation. All events for all conversations being evaluated by GenerativeAgent are sent through the single Server-Sent-Event (SSE) stream.. You need to listen and handle these events to enable GenerativeAgent to interact with your users.

Step 2: Setup the UniMRCP ASAPP Plugin

On your UniMRCP server, you need to install and configure the ASAPP UniMRCP Plugin.

Install the ASAPP UniMRCP Plugin

Go to ASAPP’s UniMCRP Plugin Public Documentation to install and see its usage

Use the Recommended Plugin Configuration

Fields & Parameters After you install the UniMCRP ASAPP Plugin, you need to configure the request fields so the prompts are sent in the best way and GenerativeAgent gets the most information available. Having the recommended configuration will ensure GenerativeAgent analyzes each prompt correctly. Here are the details for the fields with the recommended configuration: StartStream Request Fields

Field		Description	Default	Supported Values
sender	role (required)	A participant role, usually the customer or an agent for human participants.	n/a	”agent”, “customer”
sender	externalId (required)	Participant ID from the external system, it should be the same for all interactions of the same individual	n/a	”BL2341334”
language		IETF language tag	en-US	”en-US”
smartFormatting		Request for post processing: Inverse Text Normalization (convert spoken form to written form), e.g., ‘twenty two —> 22’. Auto punctuation and capitalization	true	true, false Recomended: true Interpreting transcripts will be more natural and predictable
detailedToken		Has no impact on UniMRCP	false	true, false Recommended: false IVR application does not utilize the word level details
audioRecordingAllowed		false: ASAPP will not record the audio true: ASAPP may record and store the audio for this conversation	false	true, false Recommended: true Allowing audio recording improves transcript accuracy over time
redactionOutput		If detailedToken is true along with value ‘redacted’ or ‘redacted_and_unredacted’, request will be rejected. If no redaction rules configured by the client for ‘redacted’ or ‘redacted_and_unredacted’, the request will be rejected. If smartFormatting is False, requests with value ‘redacted’ or ‘redacted_and_unredacted’ will be rejected.	redacted Recommended: unredacted	”redacted”, “unredacted”,“redacted_and_unredacted” Recommended: unredacted IVR application works better with full information available

Transcript Message Response Fields All Responses go to the MRCP Server, so the only visible return is a VXML return of the field.

Field		Description	Format	Example Syntax
utterance	text	The written text of the utterance. While an utterance can have multiple alternatives (e.g., ‘me two’ vs. ‘me too’) ASAPP provides only the most probable alternative only, based on model prediction confidence.	array	”Hi, my ID is 123.”

If the detailedToken in startStream request is set to true, additional fields are provided within the utterance array for each token:

Field	Subfield	Description	Format	Example Syntax
token	content	Text or punctuation	string	”is”, ”?“
	start	Start time (millisecond) of the token relative to the start of the audio input	integer	170
	end	End time (millisecond) audio boundary of the token relative to the start of the audio input, there may be silence after that, so it does not necessarily match with the startMs of the next token.	integer	200
	punctuationAfter	Optional, punctuation attached after the content	string	’.‘
	punctuationBefore	Optional, punctuation attached in front of the content	string	’“‘

Step 3: Manage Transcripts

You need to both pass the conversation transcripts to ASAPP, as well as request GenerativeAgent to analyze the conversation.

Create a Conversation

You need to create the conversation with GenerativeAgent for each IVR call. A conversation represents a thread of messages between an end user and one or more agents. GenerativeAgent evaluates and responds in a given conversation. Create a conversation providing your Ids for the conversation and customer:

curl -X POST 'https://api.sandbox.asapp.com/conversation/v1/conversations' \
--header 'asapp-api-id: <API KEY ID>' \
--header 'asapp-api-secret: <API TOKEN>' \
--header 'Content-Type: application/json' \
--data '{ 
  "externalId": "1",
  "customer": {   
    "externalId": "[Your id for the customer]",
    "name": "customer name" 
  },
  "timestamp": "2024-01-23T11:42:42Z"
}'

A successfully created conversation returns a status code of 200 and the conversation’s id.

{"id":"01HNE48VMKNZ0B0SG3CEFV24WM"}

As the conversation goes, it is possible to give GenerativeAgent more context of the conversation by using thetaskName and inputVariables attributes. You can also simulate Input Variables in the Previewer

curl --request POST \
  --url https://api.sandbox.asapp.com/generativeagent/v1/analyze \
  --header 'Content-Type: application/json' \
  --header 'asapp-api-id: <api-key>' \
  --header 'asapp-api-secret: <api-key>' \
  --data '{
  "conversationId": "01BX5ZZKBKACTAV9WEVGEMMVS0",
  "message": {
    "text": "Hello, I would like to upgrade my internet plan to GOLD.",
    "sender": {
      "role": "agent",
      "externalId": 123
    },
    "timestamp": "2021-11-23T12:13:14.555Z"
  },
  "taskName": "UpgradePlan",
  "inputVariables": {
    "context": "Customer called to upgrade their current plan to GOLD",
    "customer_info": {
      "current_plan": "SILVER",
      "customer_since": "2020-01-01"
    }
  }
}'

Gather transcripts and analyze conversations with GenerativeAgent

After you receive the conversation transcripts from the UniMRCP Plugin, you must call /analyze and other endpoints so GenerativeAgent evaluates the conversation and send a reply. You can decide when to call the GenerativeAgent, a common strategy is to define an immediate call after a transcript is returned from the MRCP client Additionally, the GenerativeAgent will make API Calls to your Organization depending on the Tasks and Functions that were configured for the Agent. Once you have the SSE stream connected and are receiving messages, you need to engage GenerativeAgent with a given conversation. All messages are sent through REST outside of the SSE channels. To have GenerativeAgent analyze a conversation, make a POST request to /analyze:

curl -X POST 'https://api.sandbox.asapp.com/generativeagent/v1/analyze' \
--header 'asapp-api-id: <API KEY ID>' \
--header 'asapp-api-secret: <API TOKEN>' \
--header 'Content-Type: application/json' \
--data '{
    "conversationId": "01HNE48VMKNZ0B0SG3CEFV24WM"
}'

GenerativeAgent evaluates the transcript at that moment of time to determine a response. GenerativeAgent is not aware of any additional transcript messages that are sent while processing. A successful response returns a 200 and the conversation Id.

{
  "conversationId":"01HNE48VMKNZ0B0SG3CEFV24WM"
}

GenerativeAgent’s response is communicated via the events. Analyze with Message You have the option to send a message when calling analyze.

curl -X POST 'https://api.sandbox.asapp.com/generativeagent/v1/analyze' \
--header 'asapp-api-id: <API KEY ID>' \
--header 'asapp-api-secret: <API TOKEN>' \
--header 'Content-Type: application/json' \
--data '{
    "conversationId": "01HNE48VMKNZ0B0SG3CEFV24WM",
    "message": {
        "text": "hello, can I see my bill?",
        "sender": {
            "externalId": "321",
            "role": "customer"
        },
        "timestamp": "2024-01-23T11:50:50Z"
    }
}'

A successful response returns a 200 status code the id of the conversation and the message that was created.

{
  "conversationId":"01HNE48VMKNZ0B0SG3CEFV24WM",
  "messageId":"01HNE6ZEAC94ENQT1VF2EPZE4Y"
}

Next Steps

With your system implemented into GenerativeAgent, sending messages and engage GenerativeAgent, you are ready to use GenerativeAgent. You may find these other pages helpful in using GenerativeAgent:

Getting Started

Products

Additional Features

UniMRCP Plugin for ASAPP

Before you Begin

Step 1: Listen and Handle GenerativeAgent Events

Step 2: Setup the UniMRCP ASAPP Plugin

Install the ASAPP UniMRCP Plugin

Use the Recommended Plugin Configuration

Step 3: Manage Transcripts

Create a Conversation

Gather transcripts and analyze conversations with GenerativeAgent

Next Steps

Configuring GenerativeAgent

Safety and Troubleshooting

Going Live

Getting Started

Products

Additional Features

​Before you Begin

​Step 1: Listen and Handle GenerativeAgent Events

​Step 2: Setup the UniMRCP ASAPP Plugin

​Install the ASAPP UniMRCP Plugin

​Use the Recommended Plugin Configuration

​Step 3: Manage Transcripts

​Create a Conversation

​Gather transcripts and analyze conversations with GenerativeAgent

​Next Steps

Configuring GenerativeAgent

Safety and Troubleshooting

Going Live

Before you Begin

Step 1: Listen and Handle GenerativeAgent Events

Step 2: Setup the UniMRCP ASAPP Plugin

Install the ASAPP UniMRCP Plugin

Use the Recommended Plugin Configuration

Step 3: Manage Transcripts

Create a Conversation

Gather transcripts and analyze conversations with GenerativeAgent

Next Steps