Genesys CTI User Forum
Genesys CTI User Forum => Genesys CTI Technical Discussion => Topic started by: Sosy on June 18, 2019, 04:46:01 PM
-
Hi guys,
We've been asked to develop an application that can take letters as input based on Italian city names pronounced by the callers (e.g. "R" for 'Roma').
It's the first time we use the Nuance ASR on Composer as it is freshly installed. I verified that using builtin grammars it actually works (eg. using "builtinNumber" I was able to detect numbers and store them in my output variable).
How can I build a proper grammar file in order to make Nuance ASR detect city names? Is there a specific grammar builder I can use? What is the correct Input block configuration when an external Grammar is being used? I'm referring to the samples provided in the Composer templates but I keep getting the "Speech Recognition Error was detected, Goodbye" message when I try to use the following grammar built with Composer:
[quote]<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="it-IT" root="codice" version="1.0" mode="voice">
<rule id="codice" scope="public">
<one-of>
<item>
<tag>out='Ancona'</tag>
<one-of>
<item>
<token>Ancona</token>
</item>
</one-of>
</item>
<item>
<tag>out='Roma'</tag>
<one-of>
<item>
<token>Roma</token>
</item>
</one-of>
</item>
</one-of>
</rule>
</grammar>[/quote]
Thanks for your replies!
-
Well you won't find such information on Composer as you are using a Nuance product. MCP is just a GW for it.
So you need to search for Nuance ASR samples. There is documentation for developers available at their site. Go for it, study it and practice a lot.
You can also ask on the Nuance forums.
-
And why don't just use Composer Grammar builder?
-
Hi Cavagnaro,
Thanks for your reply, you're always really helpful! :)
Sorry if I sound a little confused and maybe unprepared... that's actually what I am, because I've recently been assigned to this job with little to no knowledge of the matter, a lot of work to do and no time to do it ::) ::)
So please, help me understand... I received a lot of different and conflicting advice on the ASR. As far as I can tell, Nuance is the engine used for both TTS and ASR. TTS works fine through Composer but, as far as I can tell, the ASR is missing the proper grammar file it needs to recognize words.
First I need to understand whether I have to build a grammar, and whether this grammar can be built with the Composer Grammar Builder or needs something else (e.g. a Nuance grammar builder?)
[quote author=cavagnaro link=topic=11364.msg51831#msg51831 date=1560878695]And why don't just use Composer Grammar builder?[/quote]
from your message it seems that Composer Grammar builder should do the work, right?
I already tried to build a simple grammar (the one quoted in my previous message), but if I set this resource in the Input block in Composer, configured as follows, I get the error [i]"Speech Recognition Error was detected, Goodbye!"[/i]:
[quote]Grammar Type: externalGrammar
Input Grammar Dtmf:
Input Grammar Voice: 'cities.grxml'
Input Mode: voice
Slots:
[/quote]
What am I doing wrong?
Thanks and, again, sorry for my confusion.
-
Check the Nuance logs to see why it is not accepting the ASR grammar.
If you built it with the Composer Grammar Builder, don't forget to click the Default option at the root. Otherwise you will need to specify the section, as in name.grxml#cities
Check the Composer documentation.
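For later readers, a rough sketch of what the fragment advice means in plain SRGS/VoiceXML terms. This is hand-written VoiceXML, not what Composer generates: the form and field names are made up, and how Composer's Default option maps onto the grammar's root attribute is an assumption. In SRGS, a grammar referenced without a fragment starts from the rule named in root="..."; with a fragment you name the rule yourself:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1" xml:lang="it-IT">
  <!-- "ask_city" and "city" are hypothetical names used only for this sketch -->
  <form id="ask_city">
    <field name="city">
      <!-- Option 1: no fragment, so the recognizer starts from the rule
           named by the grammar's root attribute -->
      <grammar src="cities.grxml" type="application/srgs+xml"/>
      <!-- Option 2 (use instead of option 1): pick the rule explicitly
      <grammar src="cities.grxml#codice" type="application/srgs+xml"/>
      -->
      <prompt>Dica il nome di una città.</prompt>
    </field>
  </form>
</vxml>
```

If neither a root rule nor a fragment is available, the recognizer has no starting rule, which would be one plausible reason for the grammar being rejected.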
Sent from my SM-G9650 using Tapatalk
-
[quote]If you built it with Composer Grammar Builder, don't forget to click the Default option at the root.[/quote]
That's it! Now my grammar is working and the ASR recognizes the input! Thank you!
There's something I'm still missing though... when I try to print the output value I get [Default Object] instead of the word I pronounced (eg. Roma)
-
Read about output and shadow variables
-
Hi again,
it works now!
I was misreading the configuration of the Input block in the "HandleNBest" example and, in general, the usage of the shadow variables. Instead of using AppState.[i]nameoftheblock[/i]$.utterance I was using AppState.[i]nameofthevariable[/i]$.utterance.
Thank you!
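For reference, this is roughly how shadow variables look in plain VoiceXML. It is only a sketch: the field name "city" is hypothetical and stands in for the Input block name, and Composer's AppState.[i]nameoftheblock[/i]$ is presumably a wrapper around the same idea. The shadow variable hangs off the field (the block), not off whatever variable the result is later copied into:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1" xml:lang="it-IT">
  <form id="ask_city">
    <!-- "city" plays the role of the Input block name in this sketch -->
    <field name="city">
      <grammar src="cities.grxml" type="application/srgs+xml"/>
      <prompt>Dica il nome di una città.</prompt>
      <filled>
        <!-- city holds the semantic result; city$ is its shadow variable -->
        <log>value: <value expr="city"/></log>
        <log>utterance: <value expr="city$.utterance"/></log>
        <log>confidence: <value expr="city$.confidence"/></log>
      </filled>
    </field>
  </form>
</vxml>
```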
-
Is it possible to collect several words at once and store them in the same utterance (e.g. Roma Genova Torino Milano), or do I have to go back into the Input block for every word I want to collect?
I tried to pronounce three cities at once (in order: Ancona, Bari, Roma) but only Bari was stored in the variable.
Thanks
-
Depends on your logic
-
For this application the Caller is asked to spell an alphanumeric code using city names in place of letters, so for example to get a code like [b]GG12345FT8960[/b] the Caller has to pronounce something like: [i]"Genova Genova 1 2 3 4 5 Firenze Torino 8 9 6 0"[/i]
Then, using a script, the code is stored in a variable.
-
Then your grammar must include numbers and the city names as letters. Then the input will have a length property.
-
The problem there was that all the rules were written between <one-of></one-of> tags, so once the first word was captured all the other rules were ignored.
So, using a WIP grxml file with only four rules (for letters A, B, C and D), I was able to detect all of them and store them all in the same variable at once.
However, this works only if I pronounce ALL of the letters...
If I pronounce only two of the letters, the nomatch event is thrown.
I tried to use this code but it didn't work:
[quote]<!-- the rule reference to LetterA can occur zero, one or many times -->
<item repeat="0-"> <ruleref uri="#LetterA"/> </item>[/quote]
Do you have any suggestion?
Thank you
-
I don't understand what you are doing, or why you are messing with the grxml directly.
Post a screenshot of your grammar designer in Composer.
-
Hi,
Here's my Grammar Builder view, where I configured only 4 rules for testing.
[url=https://ibb.co/20zX2KZ][img]https://i.ibb.co/Ng0whYT/grammar-builder.jpg[/img][/url]
The Grammar Builder exports this .grxml:
[quote]<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="it-IT" root="Codice" version="1.0" mode="voice">
<rule id="Codice" scope="public">
<one-of>
<item>
<tag>out='A'</tag>
<one-of>
<item>
<token>A</token>
</item>
<item>
<token>Ancona</token>
</item>
</one-of>
</item>
<item>
<tag>out='B'</tag>
<one-of>
<item>
<token>B</token>
</item>
<item>
<token>Bi</token>
</item>
<item>
<token>Bari</token>
</item>
<item>
<token>Bologna</token>
</item>
</one-of>
</item>
<item>
<tag>out='C'</tag>
<one-of>
<item>
<token>C</token>
</item>
<item>
<token>Ci</token>
</item>
<item>
<token>Como</token>
</item>
</one-of>
</item>
<item>
<tag>out='D'</tag>
<one-of>
<item>
<token>D</token>
</item>
<item>
<token>Di</token>
</item>
<item>
<token>Domodossola</token>
</item>
</one-of>
</item>
</one-of>
</rule>
</grammar>[/quote]
It works but, as I said, it accepts only one input... and that's why I started to mess with the .grxml directly. I came up with this solution:
[quote]<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="it-IT" version="1.0" root="codice">
<rule id="A" scope="public">
<item repeat="0-5">
<one-of>
<item>Ancona</item>
<item>A</item>
</one-of>
</item>
</rule>
<rule id="B" scope="public">
<item repeat="0-5">
<one-of>
<item>Bologna</item>
<item>Bari</item>
<item>Bi</item>
<item>B</item>
</one-of>
</item>
</rule>
<rule id="C" scope="public">
<item repeat="0-5">
<one-of>
<item>Como</item>
<item>Ci</item>
<item>C</item>
</one-of>
</item>
</rule>
<rule id="D" scope="public">
<item repeat="0-5">
<one-of>
<item>Domodossola</item>
<item>Di</item>
<item>D</item>
</one-of>
</item>
</rule>
<!-- Reference by URI to a local rule -->
<rule id="codice" scope="public">
<item repeat="0-5">
<ruleref uri="#A" />
<ruleref uri="#B" />
<ruleref uri="#C" />
<ruleref uri="#D" />
</item>
</rule>
</grammar>[/quote]
It accepts multiple inputs and doesn't need all of them to be provided, BUT it only accepts inputs in order: if I pronounce "B, C" my variable is populated as AppState.Input = B C... on the other hand, if I pronounce "C, B" my variable is only populated as AppState.Input = C. I can't figure out if there's a way to go back to a previous rule.
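A possible explanation, offered as a guess rather than a diagnosis: in SRGS, elements written one after another form a sequence, so the rulerefs to #A, #B, #C and #D are tried in that order. On paper the outer repeat should allow a second pass for "C, B", but rules that can each match nothing (repeat="0-5") inside a repeated sequence are ambiguous, and a recognizer may not handle them the way one would hope. A more conventional formulation is a free choice repeated one or more times. The sketch below reuses the rule names from the grammar above and leaves out the output tags:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="it-IT"
         version="1.0" mode="voice" root="codice">
  <!-- Root: one or more symbols, each chosen freely, so any order is allowed -->
  <rule id="codice" scope="public">
    <item repeat="1-">
      <one-of>
        <item><ruleref uri="#A"/></item>
        <item><ruleref uri="#B"/></item>
        <item><ruleref uri="#C"/></item>
        <item><ruleref uri="#D"/></item>
      </one-of>
    </item>
  </rule>
  <!-- Each letter rule matches exactly one spoken form (no repeat here) -->
  <rule id="A">
    <one-of><item>Ancona</item><item>A</item></one-of>
  </rule>
  <rule id="B">
    <one-of><item>Bologna</item><item>Bari</item><item>Bi</item><item>B</item></one-of>
  </rule>
  <rule id="C">
    <one-of><item>Como</item><item>Ci</item><item>C</item></one-of>
  </rule>
  <rule id="D">
    <one-of><item>Domodossola</item><item>Di</item><item>D</item></one-of>
  </rule>
</grammar>
```

With this shape "C, B" and "B, C" are both single passes through the same loop, so there is no need to "go back" to a previous rule.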
-
I have created other grammars like yours, for example to recognize store names and their variations as spoken by users, or common street names, and they work fine.
So there must be something wrong with your Input block. Show its properties.
Also, enable the Nuance logs and check what the ASR is actually recognizing.
-
Maybe what you can do is set your Input max size to 1, then grab the captured digit, append it to an accumulator variable (c = c + captured) and loop until the length of c is what you want.
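In plain VoiceXML the loop could look something like the sketch below. Everything named here is an assumption for illustration: CODE_LENGTH, the variable code, the field symbol and the file symbol.grxml (a grammar that returns one character per recognition) are not from Composer. The length 13 simply matches a code like GG12345FT8960:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1" xml:lang="it-IT">
  <!-- Hypothetical names: CODE_LENGTH, code, symbol, symbol.grxml -->
  <var name="CODE_LENGTH" expr="13"/>
  <var name="code" expr="''"/>
  <form id="collect">
    <field name="symbol">
      <grammar src="symbol.grxml" type="application/srgs+xml"/>
      <prompt>Prossimo carattere.</prompt>
      <filled>
        <!-- c = c + captured -->
        <assign name="code" expr="code + symbol"/>
        <if cond="code.length &lt; CODE_LENGTH">
          <!-- Clearing the field makes the form visit it again -->
          <clear namelist="symbol"/>
        <else/>
          <log>code: <value expr="code"/></log>
          <exit/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

The trade-off is one recognition per character, so the caller has to wait for each prompt instead of spelling the whole code in one breath.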
-
Hi Cavagnaro, following is the Input Block configuration:
[url=https://ibb.co/4J2MX9b][img width=491 height=480]https://i.ibb.co/PtM4P32/inputblock.jpg[/img][/url]
I should ask the client whether the length of the code is always going to be the same... if that is the case, I can use a counter as you suggested... but I'm wondering whether the looping logic will still let callers spell the code without needing a long or unnatural pause between words.
-
I think the fastest way would be to ask Genesys and Nuance if there is already a builtin grammar for that in your language... many vendors have one.
You will spend about 3 days on that, and you can move on with your development in parallel.
About your concern, it shouldn't happen as long as recognition is smooth... in an ideal world, of course.
Another option... go with Google ASR
https://cloud.google.com/speech-to-text/?hl=es
And enable it on MCP
https://docs.genesys.com/Documentation/GVP/latest/GDG/CGCGC
-
Hi cavagnaro,
seems like I finally did it...!
For future reference, here's the grxml I came up with:
[quote]
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="it-IT" version="1.0" root="codice">
<!-- Reference by URI to a local rule -->
<rule id="codice" scope="public">
<one-of>
<item repeat="1-">
<item repeat="0-"><ruleref uri="#numeri" /></item>
<item repeat="0-"><ruleref uri="#lettere" /></item>
</item>
</one-of>
</rule>
<rule id="numeri" scope="private">
<one-of>
<item>Uno<tag>SWI_literal="1"</tag></item>
<item>Due<tag>SWI_literal="2"</tag></item>
<item>Tre<tag>SWI_literal="3"</tag></item>
<!-- etc. -->
</one-of>
</rule>
<rule id="lettere" scope="private">
<one-of>
<item>Ancona<tag>SWI_literal="A"</tag></item>
<item>A<tag>SWI_literal="A"</tag></item>
<item>Bologna<tag>SWI_literal="B"</tag></item>
<item>Bari<tag>SWI_literal="B"</tag></item>
<item>Bi<tag>SWI_literal="B"</tag></item>
<item>Como<tag>SWI_literal="C"</tag></item>
<item>Ci<tag>SWI_literal="C"</tag></item>
<!-- etc. -->
</one-of>
</rule>
</grammar>
[/quote]
I still have some problems tuning the recognizer... sometimes it ends the recognition while I'm still talking and misses some letters; other times it simply misinterprets the input. I'm currently working on the baseline.vxml file in the Nuance config folder... I hope I'll be able to tune it better in the end.
Thank you for support!
-
Nice! Thanks for sharing
Be sure to check your Nuance logs to see what it is identifying. Also try adjusting the ASR parameters in the VXML properties.
-
Hi, I'm back with another question...
We're discussing the possibility of letting the customer enter their address by speech when it IS NOT STORED in the client's database.
However, as far as I understand, this is not a job for ASR: if I can't tell what input to expect, because I don't know the customer's address, I won't be able to write a proper grammar.
I think in this case we should use a transcription engine... is that correct?
-
Correct
-
Hi Cavagnaro,
I'm still dealing with my ASR project... The grammar is currently working fine, but no matter which parameter I play with, it suddenly stops listening to my speech (listing characters) even if I'm not finished. From what I understand, it has something to do with the incomplete/complete timeout parameters, which I'm trying to set to 5s. However, I noticed from the MCP logs that when ASR is invoked it sets the parameters using [b]MRCPv2ClientLibrary.C[/b] or, at least, it seems so.
[quote]2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D 6724 01C00000 CMASRSession.C:1101 CMASRSession::HandleASREvent, msgType = 800, m_nModifySent = 0, m_RecognizerState = 0, m_MediaState = 1
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D 6724 01C00000 CMASRSession.C:435 CMASRSession::SendASRLogQueue
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D 6724 01C00000 CMASRSession.C:2215 CMASRSession::LoadASR
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:1390 In ASR Set Params
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2Session.C:78 Checking the request for 13
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2Session.C:153 Getting the ASR state
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2Session.C:165 State is valid.
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:1425 Calling Translate ASR SetParam Args
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2432 Setting Language it-IT
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2440 Setting Incompletetimeout 1000
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2447 Setting Complete 1000
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2464 Adjusting MaxSpeechTimeout to 60000
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2472 Setting Sensitivity 40
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2479 Setting SpeedVSAccurarcy 50
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2493 Setting NBest 1
2019-07-15 16:33:12.564 DBUG 007A01D2-1000A36D-asr-default-2267 6724 06B00000 MRCPV2ClientLibrary.C:2508 Setting Confidence Threshold 50[/quote]
I can't find anything about this. Do you know if it is a configuration file? If yes, where can I find it?
Thank you
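A note for later readers, hedged because nothing in the thread confirms it: MRCPV2ClientLibrary.C looks like a source file name printed by the logger rather than a configuration file, and the values it sets (Incompletetimeout, Complete, MaxSpeechTimeout, Sensitivity, SpeedVSAccuracy, Confidence Threshold) line up with the standard VoiceXML recognizer properties. If that reading is right, the 1000 ms values are per-recognition parameters, and the place to change them would be the properties of the VoiceXML page or of the Input block. A minimal sketch in plain VoiceXML, with made-up form, field and file names and example values:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1" xml:lang="it-IT">
  <form id="ask_code">
    <!-- "codice" and codice.grxml are hypothetical names for this sketch -->
    <field name="codice">
      <!-- Silence allowed after speech that does not yet form a complete match -->
      <property name="incompletetimeout" value="5s"/>
      <!-- Silence allowed after speech that already forms a complete match -->
      <property name="completetimeout" value="2s"/>
      <!-- Upper bound on the length of a single utterance -->
      <property name="maxspeechtimeout" value="60s"/>
      <grammar src="codice.grxml" type="application/srgs+xml"/>
      <prompt>Dica il codice.</prompt>
    </field>
  </form>
</vxml>
```

With a grammar that allows any number of symbols, every prefix of the code is already a complete match, so completetimeout is probably the value that cuts the caller off, more than incompletetimeout.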
-
As I told you before, check the Nuance logs
-
Hi all,
I'm reviving this old topic to ask you something about the Nuance result sent to MCP and Composer.
We're trying to deploy an application that works in the test environment to the production environment.
However, the production environment, which is supposed to be configured exactly like the test one, does not work with the application: when the TTS is supposed to read back what the client has spoken, it only says "Object Default".
What I can see is that, when the recognition result is filled in Composer, something differs between the two environments:
[quote]TEST Environment:
2020-02-26T17:14:02.230 Int 50076 014601E5-100046E7 400 filling :MSG_ASR_RICHIESTA_CONFERMA_spedizione.MSG_ASR_RICHIESTA_CONFERMA_spedizioneField:Field:CORRETTO
PRODUCTION Environment:
2020-02-26T16:37:02.258 Int 50076 012A01E5-10076C24 5300 filling :MSG_ASR_RICHIESTA_CONFERMA_spedizione.MSG_ASR_RICHIESTA_CONFERMA_spedizioneField:Field:{SWI_grammarName=session:0x0001609a;SWI_grammarName$={confidence=8400.000000};SWI_literal=CORRETTO;SWI_literal$={confidence=8400.000000};SWI_meaning=CORRETTO;SWI_meaning$={confidence=8400.000000}}
[/quote]
I already checked the Baseline.xml for Nuance, but they're identical...
What can be causing this issue?
Thanks in advance!
-
Hi,
we solved the issue; it seems I was looking at the wrong Baseline.xml. The right one was, indeed, wrong. For future reference:
To make the ASR service work correctly with GVP, you must edit the Nuance Recognizer file baseline.xml and comment out the fourth and fifth lines in the code sample below:
<param name="swirec_extra_nbest_keys">
<declaration group="result" type="string_set" set_by="default+api"> </declaration>
<value>SWI_meaning</value>
<value>[b]SWI_literal[/b]</value>
<value>[b]SWI_grammarName[/b]</value>
</param>
The characters to add to the code are marked in bold.
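Applied literally, the edit described above would leave the parameter looking roughly like this. It is a sketch reconstructed from the description, with the two values wrapped in XML comments, so check it against your own baseline.xml before copying:

```xml
<param name="swirec_extra_nbest_keys">
  <declaration group="result" type="string_set" set_by="default+api"> </declaration>
  <value>SWI_meaning</value>
  <!-- <value>SWI_literal</value> -->
  <!-- <value>SWI_grammarName</value> -->
</param>
```

This would be consistent with the production log above, where the field was filled with an object carrying SWI_grammarName and SWI_literal instead of the plain string CORRETTO.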
Thanks!
-
Weird, maybe you were sending the object and not the out property?