In this article, I describe how I implemented Wake-Up-Word Speech Recognition on a Raspberry Pi with PocketSphinx.
Why do you need a Wake-Up-Word?
- So you can wake up your device by talking to it out loud, without having to press any buttons. You see this feature in the Amazon Echo (“Alexa”), Apple (“Hey Siri”), and Google (“OK Google”). I did it for my candy machine; I’d say “Hey Candy.”
Why not use online speech recognition?
Two reasons: cost and connectivity.
- Cost: an online speech recognition service such as IBM Watson Speech to Text costs $0.02/min. If you run it 24/7 for a month, your bill would be $864/mo.
- 1 month * 30 days/month * 24 hours/day * 60 min/hour = 43,200 minutes/month; 43,200 min * $0.02/min = $864/mo
- Connectivity
- What if the internet is down? Your device should still offer some bare-bones functionality offline. Consider registering words such as ‘help.’
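The cost arithmetic above is easy to check, or to rerun with a different rate. This is a minimal sketch: the $0.02/min price is the one quoted above, the 30-day month is an assumption, and the function name `monthly_cost` is made up for illustration.

```python
# Rough monthly bill for streaming audio 24/7 to a per-minute speech service.
PRICE_PER_MINUTE = 0.02  # USD per minute, the rate quoted in the article
DAYS_PER_MONTH = 30      # assumption: a 30-day month


def monthly_cost(price_per_minute, days=DAYS_PER_MONTH):
    """Return the cost in USD of running the service non-stop for `days` days."""
    minutes = days * 24 * 60  # days -> hours -> minutes
    return minutes * price_per_minute


print("minutes per month:", DAYS_PER_MONTH * 24 * 60)
print("cost per month: $%.2f" % monthly_cost(PRICE_PER_MINUTE))
```

Detecting the wake-up-word on the device and only calling the online service afterwards shrinks the billed minutes to the time you actually spend talking.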
Well, enough background. Here’s how to get it working:
Step 1:
- The microphone must be the “first” audio device for it to work with PocketSphinx.
- Most instructions out there were written for the Debian Wheezy release. If you have the newer Jessie release, follow this to set up the microphone order:
- http://raspberrypi.stackexchange.com/questions/40831/how-do-i-configure-my-sound-for-jasper-on-raspbian-jessie
Step 2:
- Now that you have the mic priority configured, you can follow these instructions:
- http://cmusphinx.sourceforge.net/wiki/raspberrypi
- Update: before you run make on the Pi, do this: (reference)
- $ sudo apt-get install python-dev swig
- $ sudo apt-get install autoconf libtool automake
- $ ./autogen.sh
Step 3:
- Here’s how to run it:
- $ pocketsphinx_continuous -inmic yes -keyphrase "hey candy" -kws_threshold 1e-20
- If you need to tweak those parameters, check out the blog post listed under References (#1).
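To make the device actually do something when it hears the phrase, one option is to launch the command above from a script and watch its output. The following is only a sketch under a few assumptions: that pocketsphinx_continuous writes each detection as a line of text on stdout (with its log messages on stderr), that the binary is on the PATH, and that the helper names `is_wake_word` and `listen` are invented for this example.

```python
import subprocess

KEYPHRASE = "hey candy"


def is_wake_word(line, keyphrase=KEYPHRASE):
    """Return True if one line of recognizer output contains the keyphrase."""
    return keyphrase in line.strip().lower()


def listen(on_wake, keyphrase=KEYPHRASE, threshold="1e-20"):
    """Run pocketsphinx_continuous and call on_wake() for each detection.

    Assumption: detections arrive as plain text lines on stdout.
    """
    cmd = [
        "pocketsphinx_continuous",
        "-inmic", "yes",
        "-keyphrase", keyphrase,
        "-kws_threshold", threshold,
    ]
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the recognizer's log chatter
        universal_newlines=True,    # read stdout as text, line by line
    )
    try:
        for line in proc.stdout:
            if is_wake_word(line, keyphrase):
                on_wake()
    finally:
        proc.terminate()


# Example use on the Pi (blocks until interrupted):
# listen(lambda: print("Wake-up-word heard"))
```

Keeping the matching logic in its own small function means it can be tried on canned lines of text without a microphone attached.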
Lessons learned / misc notes:
- These steps were done on 5/11/2016; some instructions may be outdated.
- ‘omxplayer’ cannot output audio to USB devices. [to do: add reference]
- Detection performance was inconsistent. To do: add more testing on performance
Next Steps:
- Trim the dictionary file to see if performance improves.
- Post a Node-RED flow showing this feature as part of a bigger solution.
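For the dictionary-trimming idea, here is a rough sketch of what that could look like. It assumes the dictionary uses the usual CMU Sphinx layout of one entry per line (the word first, then its phonemes, with alternate pronunciations written like `word(2)`). The function name `trim_dictionary` and the sample entries are made up for illustration, and whether a trimmed dictionary actually improves detection is still untested.

```python
import re


def trim_dictionary(lines, keep_words):
    """Keep only the dictionary entries whose word is in keep_words."""
    keep = {w.lower() for w in keep_words}
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Alternate pronunciations look like "word(2)"; drop the "(2)" suffix.
        word = re.sub(r"\(\d+\)$", "", parts[0]).lower()
        if word in keep:
            kept.append(line.rstrip("\n"))
    return kept


# Illustrative entries only, not copied from a real dictionary file.
sample = [
    "candy K AE N D IY",
    "hello HH AH L OW",
    "hey HH EY",
    "hey(2) HH AY",
]

for entry in trim_dictionary(sample, ["hey", "candy"]):
    print(entry)
```

The trimmed entries could then be written to a new file and passed to pocketsphinx_continuous with its -dict option.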
References:
#1 Blog: Raspberry Pi 2 – Speech Recognition on device – https://wolfpaulus.com/journal/embedded/raspberrypi2-sr/