I want to extend my Watson Assistant so that it fetches information from an external source (via an HTTP REST API service call), giving me richer interactions in my dialogs. For example, bringing the weather forecast into my chatbot conversation.
When using the IBM Watson Speech to Text (STT) and Text to Speech (TTS) services for my Cognitive Candy project, I started off using the WAV file format. That was the easy choice, since WAV is a raw audio format requiring no additional software for encoding. Continue reading “Comparison of WAV, FLAC and OGG audio formats: size and latency”
A lesson learned from my Cognitive Candy project is that Candy’s response time is a key factor for a great user experience. When people talked to Candy, they expected ‘her’ to respond at the same cadence a person would. People’s excitement and engagement seemed to drop off quickly if the response time was too long. Continue reading “Improve Watson Text to Speech latency by 99% with Caching”
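To give a feel for the caching idea, here is a minimal sketch, not the code from the full post. It assumes a hypothetical `synthesize` callable that turns text into audio bytes (for example, a wrapper around a Text to Speech request) and stores each result on disk, keyed by a hash of the text. The names `cached_speech`, `synthesize` and `cache_dir` are my own placeholders for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_speech(text: str, synthesize: Callable[[str], bytes], cache_dir: Path) -> bytes:
    """Return audio for `text`, calling `synthesize` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on a hash of the exact text, so identical phrases share one file.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.audio"
    if path.exists():
        # Cache hit: skip the (slow) synthesis call entirely.
        return path.read_bytes()
    # Cache miss: synthesize once, then store the result for next time.
    audio = synthesize(text)
    path.write_bytes(audio)
    return audio
```

With something like this, a phrase that Candy says often only pays the synthesis cost the first time; later requests are served from local disk.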
In this article, I describe how I implemented Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx. Continue reading “Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx”