Building a 'Korean' AI Call Center with Watson Voice Agent...
© 2018 IBM Corporation
Yunho Maeng (맹윤호), IBM Watson & Cloud Platform
Technical Service Professional
Building a 'Korean' AI Call Center with Watson Voice Agent: Beyond the Chatbot
INDEX
▪ Speech to Text
▪ Text to Speech
▪ Watson Assistant
▪ SIP Service
▪ Watson Voice Agent
▪ Lessons Learned
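Disclaimer
✓ This session covers an AI call center that I built experimentally, as an individual, using 'Korean' functionality that is not yet officially supported.
✓ The hands-on steps may change with future updates to Watson Voice Agent or the SIP service.
✓ In a real deployment, the implementation may differ depending on the complexity of the data and business logic and on the IVR environment.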
Watson Speech to Text
Watson's Korean speech recognition API
- AM + LM Customization
- Lightly supervised learning
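Speech To Text
Narrowband (8 kHz) refers to conversation over the telephone and is the sampling rate used in call center projects and the like. Broadband (16 kHz) refers to ordinary speech and general conversation and is used in AI speakers and similar devices.
https://speech-to-text-demo.ng.bluemix.net/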
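The narrowband/broadband choice can be sketched as a small helper. This is a minimal illustration, not the service's own logic: the function name is hypothetical, the model names assume Watson's `<language>_<Band>Model` naming pattern (check the service's model list before relying on them), and treating everything below 16 kHz as narrowband is an assumption.

```python
def pick_stt_model(sample_rate_hz: int, language: str = "ko-KR") -> str:
    """Return a base-model name for the given audio sampling rate.

    Hypothetical helper: assumes the '<language>_<Band>Model' naming pattern.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    # Telephone audio (8 kHz) maps to the narrowband model;
    # audio sampled at 16 kHz or above maps to the broadband model.
    band = "Broadband" if sample_rate_hz >= 16000 else "Narrowband"
    return f"{language}_{band}Model"
```

For a call center, the audio arrives over the phone line, so the 8 kHz branch is the one that applies.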
Typical Speech To Text training data generation process
Either transcribe existing audio files or record existing transcripts
Data Generation
(Text + Audio)
✓ The text data is used to build the LM (Language Model) and the audio data to build the AM (Acoustic Model).
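Typical Speech To Text training data generation process
With lightly supervised learning, the audio files are learned from text supplied in advance, without a person having to mark time frames in the audio by hand.
✓ Shortens the time needed to produce training data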
Speech To Text : cURL
Create a custom model (returns a model ID called 'customization_id')
curl -X POST https://stream.watsonplatform.net/speech-to-text/api/v1/acoustic_customizations
Add a ZIP file containing audio (e.g. WAV files)
curl -X POST --data-binary @audiofiles.zip -H "Content-Type: application/zip" https://stream.watsonplatform.net/speech-to-text/api/v1/acoustic_customizations/{customization_id}/audio/audio-name
Start training
curl -X POST https://stream.watsonplatform.net/speech-to-text/api/v1/acoustic_customizations/{customization_id}/train
Query the customization service to find the status of training; it should return 'available' once training has completed successfully
curl -X GET https://stream.watsonplatform.net/speech-to-text/api/v1/acoustic_customizations/{customization_id}
Use the custom model with the STT service for recognition
curl "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?acoustic_customization_id={customization_id}"
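The "query until 'available'" step above can be sketched as a polling loop. This is only an illustration under stated assumptions: `get_status` is a hypothetical callable that wraps the GET request and returns the model's status string, and treating a `failed` status as an error is an assumption about the service's status values.

```python
import time
from typing import Callable


def wait_until_available(get_status: Callable[[], str],
                         poll_seconds: float = 10.0,
                         max_polls: int = 60,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the custom model reports 'available'.

    get_status is a hypothetical wrapper around the GET call shown above.
    Returns True once training has completed, False if max_polls is exhausted.
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "available":
            return True
        if status == "failed":  # assumption: the service reports failures this way
            raise RuntimeError("custom model training failed")
        sleep(poll_seconds)
    return False
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.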
Public API for STT customization
API Number | Method | API URL | Description
API-01 | POST | speech-to-text/api/v1/customizations | Create new custom model, returns a 'customization_id' GUID
API-02 | GET | speech-to-text/api/v1/customizations?language=en-US | List custom models for a language
API-03 | DELETE | /api/v1/customizations/{customization_id} | Delete custom model
API-04 | GET | /api/v1/customizations/{customization_id} | Query contents of custom model
API-05 | POST | /api/v1/customizations/{customization_id}/corpora/{corpus_name} | Adds training corpus data to custom model, names it 'corpus_name'
API-06 | GET | /api/v1/customizations/{customization_id}/corpora | Returns list of corpora in custom model
API-07 | DELETE | /api/v1/customizations/{customization_id}/corpora/{corpus_name} | Deletes corpus resource named 'corpus_name'
API-08 | POST | /api/v1/customizations/{customization_id}/train | Starts training session (creates custom model for use in STT service)
API-09 | POST | /api/v1/customizations/{customization_id}/words | Adds one or more custom words or words+pronunciations to model
API-10 | GET | /api/v1/customizations/{customization_id}/words | Returns all custom words added to model as well as out-of-vocab words found in the input corpus
API-11 | PUT | /api/v1/customizations/{customization_id}/words/{word_name} | Adds a single word 'word_name' to model
API-12 | GET | /api/v1/customizations/{customization_id}/words/{word_name} | Gets info about a single word 'word_name'
API-13 | DELETE | /api/v1/customizations/{customization_id}/words/{word_name} | Deletes a single word 'word_name'
API-14 | POST | /api/v1/customizations/{customization_id}/upgrade_model | Updates custom model to work with latest release of the STT service
API-15 | POST | /api/v1/customizations/{customization_id}/reset | Clears all corpora and words from model
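One plausible ordering of these calls for a language model (add a corpus, check it, train, then query the model) can be sketched as a list of method/path pairs. The helper name and the chosen ordering are assumptions for illustration; only the paths come from the table.

```python
from urllib.parse import quote

BASE = "/api/v1/customizations"


def lm_workflow(customization_id: str, corpus_name: str):
    """Return (method, path) pairs for a corpus-then-train workflow.

    Hypothetical helper; the step order is an assumption, the paths are
    taken from the API table above.
    """
    cid = quote(customization_id, safe="")
    corpus = quote(corpus_name, safe="")  # percent-encode spaces and slashes
    return [
        ("POST", f"{BASE}/{cid}/corpora/{corpus}"),  # API-05: add corpus
        ("GET", f"{BASE}/{cid}/corpora"),            # API-06: list corpora
        ("POST", f"{BASE}/{cid}/train"),             # API-08: start training
        ("GET", f"{BASE}/{cid}"),                    # API-04: query the model
    ]
```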
Watson Text to Speech
Watson's Korean speech synthesis API
- SSML
Text to Speech (Demo)
https://text-to-speech-demo.ng.bluemix.net/
The Korean version of TTS can be used by contacting IBM Korea directly, and can also be tried out through Aibril.
Text to Speech : cURL
Default /v1/synthesize (EN)
$ curl -X POST -u {username}:{password} --header "Content-Type: application/json" --data "{\"text\":\"hello world\"}" --output hello_world.ogg "https://stream.aibril-watson.kr/text-to-speech/api/v1/synthesize"
/v1/synthesize (KR), voice = youngmi or yuna
$ curl -X POST -u {username}:{password} --header "Content-Type: application/json" --data "{\"text\":\"hello world\"}" --output hello_world.ogg "https://stream.aibril-watson.kr/text-to-speech/api/v1/synthesize?voice=youngmi"
/v1/synthesize (KR) custom model create
$ curl -X POST -u {username}:{password} --header "Content-Type: application/json" --data "{\"name\":\"cURL Test\", \"language\":\"en-US\", \"description\":\"Customization test via cURL\"}" "https://stream.aibril-watson.kr/text-to-speech/api/v1/customizations"
$ curl -X GET -u {username}:{password} "https://stream.aibril-watson.kr/text-to-speech/api/v1/customizations/{customization_id}"
/v1/synthesize (KR) custom model call
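Quoting Korean text inside a shell-escaped JSON string is error-prone, so the same synthesize request can be assembled in Python instead. This is a sketch only: the function name is hypothetical, it builds the URL, headers, and body without sending anything, and authentication is left out.

```python
import json
from typing import Optional
from urllib.parse import urlencode

SYNTHESIZE_URL = "https://stream.aibril-watson.kr/text-to-speech/api/v1/synthesize"


def build_synthesize_request(text: str, voice: Optional[str] = None):
    """Return (url, headers, body) for a synthesize call; nothing is sent.

    Hypothetical helper mirroring the cURL commands above.
    """
    url = SYNTHESIZE_URL
    if voice:
        url += "?" + urlencode({"voice": voice})
    headers = {"Content-Type": "application/json"}
    # ensure_ascii=False keeps Hangul readable; the body is sent as UTF-8 bytes.
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return url, headers, body
```

The returned triple can be handed to whichever HTTP client is in use, together with the service credentials.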
Text to Speech
Watson API Explorer
https://watson-api-explorer.ng.bluemix.net/apis/text-to-speech-v1
Speech Synthesis Markup Language (SSML)
https://console.bluemix.net/docs/services/text-to-speech/SSML.html#ssml
With SSML you can control the pitch of the voice and use features such as sub alias to correct the pronunciation of specific words.
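The two SSML features mentioned here, pitch control and sub alias, can be sketched as small string builders. The helper names are hypothetical, and whether a given Korean voice honors each element is an assumption to verify against the SSML documentation linked above.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text: str, pitch: str) -> str:
    """Wrap text in a <prosody> element that sets the pitch."""
    return f"<prosody pitch={quoteattr(pitch)}>{escape(text)}</prosody>"


def sub_alias(written: str, spoken: str) -> str:
    """Have the written form read aloud as the spoken alias."""
    return f"<sub alias={quoteattr(spoken)}>{escape(written)}</sub>"


def speak(*fragments: str) -> str:
    """Join already-built SSML fragments inside a <speak> root element."""
    return "<speak>" + "".join(fragments) + "</speak>"
```

For example, `speak(sub_alias("IBM", "아이비엠"), prosody(" 콜센터입니다", "high"))` produces one SSML string that can be sent as the `text` value of a synthesize request.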
Watson Assistant (Formerly Watson Conversation)
Watson's Korean chatbot service
ChatBot UI/UX [5]
Image and touch elements like the ones below can no longer be used. Text only!
Watson Assistant
https://console.bluemix.net/docs/services/text-to-speech/SSML.html#ssml
Build the dialog flow stitch by stitch, with a farmer's diligence.
Watson Assistant
Add a credential for the Watson Assistant chatbot via [New Credential]
********************
SIP Service: Twilio
A service that answers phone calls
- Keypad-input IVR
SIP (Session Initiation Protocol) Service : Twilio
If you do not have a SIP service, you can test with a free trial account from a provider such as Twilio.
SIP (Session Initiation Protocol) Service : Twilio
SIP Trunking lets you implement a service that receives phone calls (inbound).
SIP (Session Initiation Protocol) Service : Twilio
Set the Voice Agent endpoint in the SIP service; a DR endpoint can be added if needed.
✓ Register the Voice Agent's SIP address under [Elastic SIP Trunking dashboard] - [Create SIP Trunk] - [Origination] - [SIP URI]
SIP (Session Initiation Protocol) Service : Twilio
On the [Numbers] tab, you can get a free US number via [Get Started with Phone Numbers] or purchase a Korean phone number via [Buy a Number].
✓ The newly issued number must be assigned under [Elastic SIP Trunking] - [Numbers].
✓ Call logs, billing, and so on can be checked through Elastic SIP.
If you want to test IVR (interactive voice response) functionality
Additional integration work with the existing IVR is required. On Twilio, this can currently be tested as an English-only, paid feature. [6]
Watson Voice Agent
Implementing a 'Korean' call center with Watson Voice Agent
Watson Voice Agent
https://www.youtube.com/watch?v=LcTAkRZzuwo
Watson Voice Agent
SIP Service
Phone call
Speech recognition
Chatbot
Speech synthesis
Watson Voice Agent
Enter the phone number issued by the SIP service, then add the credentials for Watson Assistant and Speech to Text.
**********
**********
Watson Voice Agent
Add the Text to Speech credential and a Cloudant DB for event forwarding.
**********
**********
**********
Call flow from a customer to a live agent
Integration with an IVR the company already uses, or with an enterprise-grade SBC, is also possible.
https://www.ibm.com/support/knowledgecenter/SS4U29/contactctr.html?view=embed
Event Forwarding (Last Updated: 2018-08-24)
Three kinds of event forwarding are supported: call detail record (CDR) events, transcription events, and Watson Assistant turn events.
Event Forwarding (Last Updated: 2018-08-24)
Events let you store Watson Voice Agent data in a DB and put it to use.
– Call detail record (CDR) event
Contains summary information about a single call, such as the start and end times, the reason the call ended, and details about the conversation transactions.
– IBM Watson™ Assistant turn event
Contains event information that includes the immediate JSON response Voice Gateway receives after each request to Watson Assistant.
– Transcription event
Issued each time an utterance is detected; includes the utterance text, confidence score, and session information. Supported in version 1.0.0.2 and later.
✓ Events can be sent as HTTP POST requests to integrate with a Splunk HTTP Event Collector (HEC) or a REST server. Events can also be stored in a NoSQL DB or an IBM Cloudant DB.
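As one example of putting stored events to use, transcription events could be scanned for utterances the recognizer was unsure about. This is a sketch with loudly hypothetical field names: `text` and `confidence` stand in for the utterance text and confidence score described above, and the real event format should be taken from the product documentation.

```python
def low_confidence_utterances(events, threshold=0.5):
    """Return the utterance texts whose confidence falls below the threshold.

    'text' and 'confidence' are hypothetical field names, not the
    documented event schema.
    """
    flagged = []
    for event in events:
        if event["confidence"] < threshold:
            flagged.append(event["text"])
    return flagged
```

Utterances flagged this way are natural candidates for transcription review and for additional STT training data.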
Cloudant DB
[Create Cloudant DB] - [Add Credential] - [LAUNCH CLOUDANT DASHBOARD] - [Create Database] (create three databases)
Event Forwarding (Last Updated: 2018-08-24)
Call detail record event format (Unit of Analysis: Call)
https://www.ibm.com/support/knowledgecenter/SS4U29/reportingcdr.html
Event Forwarding (Last Updated: 2018-08-24)
Watson Assistant turn event format (Unit of Analysis: Request)
https://www.ibm.com/support/knowledgecenter/SS4U29/reportingconvturn.html#cloudant-assistant
Event Forwarding (Last Updated: 2018-08-24)
Transcription event format (Unit of Analysis: Utterance)
https://www.ibm.com/support/knowledgecenter/SS4U29/rttconfig.html?view=embed
Deploying Voice Gateway on a Kubernetes cluster in IBM Cloud
GitHub: https://github.com/WASdev/sample.voice.gateway
Guide: https://www.ibm.com/support/knowledgecenter/en/SS4U29/deploybmix.html
Lessons Learned
Takeaways from the actual implementation
Now let's actually use it!
Shyly share the number with your teammates, too.
(Heart pounding)
Demo
Wait! How did we say the speech data gets trained?
Typical Speech To Text training data generation process
Either transcribe existing audio files or record existing transcripts
Data Generation
(Text + Audio)
✓ The text data is used to build the LM (Language Model) and the audio data to build the AM (Acoustic Model).
User Behavior
People do not use polite speech (jondaetmal) with an AI!
User Behavior
When people are told an AI agent is answering, their speech gets shorter and more informal!
Data
What if the training data and the real data are different?
Data
Overfitting!
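Overfitting
The larger the gap between the training data and the real data, the more likely it is that a model fitted closely to the training data will fail on real calls.
You worked hard on the mock exams, but what if the real college entrance exam (Suneung) has a question type you have never seen?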
Overfitting
The model was trained on polite speech, but what happens when users ask in informal speech (banmal)?
It was trained on long sentences, but what if users ask in short ones?
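The two mismatches raised here, politeness and sentence length, can be checked roughly before deployment by comparing training utterances with a sample of real ones. The sketch below is a crude heuristic, not a linguistic analysis: the ending list and function names are assumptions made for illustration.

```python
# Assumption: these endings are rough markers of polite speech (jondaetmal).
POLITE_ENDINGS = ("요", "니다", "니까")


def is_polite(utterance: str) -> bool:
    """Heuristic check: does the utterance end with a polite ending?"""
    stripped = utterance.rstrip(" .?!")
    return stripped.endswith(POLITE_ENDINGS)


def profile(utterances):
    """Summarize average length and share of polite utterances."""
    n = len(utterances)
    if n == 0:
        raise ValueError("need at least one utterance")
    return {
        "avg_chars": sum(len(u) for u in utterances) / n,
        "polite_ratio": sum(is_polite(u) for u in utterances) / n,
    }
```

If `profile(training_utterances)` and `profile(live_utterances)` differ sharply on either number, the training set probably does not reflect how callers actually speak.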
Could we make it sound as natural as possible and pretend it is not an AI?
California's disclosure requirement for AI agents [4]
https://www.pe.com/2018/10/30/best-in-law-must-you-disclose-what-your-bot-is-up-to-california-weighs-in-on-ai/
It codifies a duty to disclose when an AI is dealing with a person.
Background Noise
Filler sounds uttered by the user
e.g. 음 (um)... 어 (uh)... 에 (eh)....
+
Answers we need to hear from the user
e.g. 응 (yes). 어 (yeah). 예 (yes)
+
All kinds of background noise (and phone models vary widely)
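The overlap between fillers and real answers can be made concrete with a tiny classifier over recognized text. This is an illustrative sketch only: the word lists come from the examples on this slide, the function name is hypothetical, and a real system would also have to deal with the audio itself rather than just the transcript.

```python
FILLERS = {"음", "어", "에"}   # hesitation sounds from the slide
ANSWERS = {"응", "어", "예"}   # short affirmative answers from the slide


def classify_short_utterance(token: str) -> str:
    """Label a one-syllable utterance as filler, answer, ambiguous, or other."""
    token = token.strip(" .!?…")
    in_fillers = token in FILLERS
    in_answers = token in ANSWERS
    if in_fillers and in_answers:
        return "ambiguous"  # e.g. 어 can be a hesitation or a "yeah"
    if in_answers:
        return "answer"
    if in_fillers:
        return "filler"
    return "other"
```

An "ambiguous" result is a hint that the dialog should ask a confirming question instead of acting on the utterance.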
Background Noise
When the chatbot agent talks for too long, people stop listening.
==
People often cut the chatbot off, half on purpose and half by accident.
Unlike a conventional chatbot, the voice chatbot has to convey a lot of information in a short time, and the user has to spend more time explaining their intent.
The chatbot's utterances should get shorter, and the person's answers longer.
Prompts that guide the caller appropriately are needed!
When a call has a concrete, specific goal, it is understood much better.
The broader the goal, the harder it gets.
Project Examples
✓ IBM’s Watson to Listen in on 911 Calls
The Association of Public-Safety Communications Officials (APCO) International
IBM’s Watson to Listen in on 911 Calls https://www.meritalk.com/articles/ibms-watson-to-listen-in-on-911-calls/
Essextec CI delivers Cognitive Solutions with IBM GBS https://essextec.com/about/news/essextec-ci-delivers-cognitive-solutions-with-ibm-gbs
Association of Public-Safety Communications Officials (APCO) International Taps IBM Watson Capabilities to Evaluate and Report on Emergency Calls https://essextec.com/wp-content/uploads/2017/08/IBM-Watson-APCO-press-release.pdf
Project Examples
Mizuho Bank: Watson Technology installed in Call Centers preparing for next generation channels
https://www.mizuhobank.com/mizuho_fintech/news/watson/index.html
Project Examples
Invoca Call Center
Invoca (2016), Invoca embeds IBM’s Watson so call centers can better understand their customers
https://www.invoca.com/press/invoca-integrates-ibm-watson-real-time-voice-analytics/
https://marketingland.com/invoca-embeds-ibms-watson-call-centers-can-better-understand-customers-200376
https://blog.invoca.com/ibm-watson-speech-text-turns-phone-calls-invaluable-marketing-data/
Which company will be the next success story in Korea?
References
[1] National Tsing Hua Univ http://www.cs.nthu.edu.tw/~shwu/courses/ml/labs/08_CV_Ensembling/08_CV_Ensembling.html
[2] Penn State Univ https://quantdev.ssri.psu.edu/tutorials/cross-validation-tutorial
[3] Sigmoidal ML c https://sigmoidal.io/
[4] https://www.pe.com/2018/10/30/best-in-law-must-you-disclose-what-your-bot-is-up-to-california-weighs-in-on-ai/
[5] https://chatbotsmagazine.com/cheat-sheet-all-facebook-chatbot-interactions-4b14e4e00178
[6] https://www.twilio.com/docs/voice/tutorials/how-to-gather-user-input-via-keypad-python
[7] Watson Technology installed in Call Centers preparing for next generation channels https://www.mizuhobank.com/mizuho_fintech/news/watson/index.html
[8] Invoca (2016), Invoca embeds IBM’s Watson so call centers can better understand their customers https://www.invoca.com/press/invoca-integrates-ibm-watson-real-time-voice-analytics/ https://marketingland.com/invoca-embeds-ibms-watson-call-centers-can-better-understand-customers-200376 https://blog.invoca.com/ibm-watson-speech-text-turns-phone-calls-invaluable-marketing-data/
[9] IBM’s Watson to Listen in on 911 Calls https://www.meritalk.com/articles/ibms-watson-to-listen-in-on-911-calls/
[10] Essextec CI delivers Cognitive Solutions with IBM GBS https://essextec.com/about/news/essextec-ci-delivers-cognitive-solutions-with-ibm-gbs
[11] Association of Public-Safety Communications Officials (APCO) International Taps IBM Watson Capabilities to Evaluate and Report on Emergency Calls https://essextec.com/wp-content/uploads/2017/08/IBM-Watson-APCO-press-release.pdf
[12] https://console.bluemix.net/docs/services/text-to-speech/SSML.html#ssml