When using the IBM Watson Speech to Text (STT) and Text to Speech (TTS) services for my Cognitive Candy project, I started off with the WAV file format. That was the easy choice, since WAV typically holds uncompressed PCM audio and needs no additional encoding software. Continue reading “Comparison of WAV, FLAC and OGG audio formats: size and latency”
Serverless Node-Red applications with OpenWhisk and Docker
IBM recently launched a service called OpenWhisk, a distributed compute service that executes application logic in response to events. The most notable advantages of such a serverless framework are: Continue reading “Serverless Node-Red applications with OpenWhisk and Docker”
Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx
In this article, I describe how I implemented Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx. Continue reading “Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx”