A Node.js Binding for PocketSphinx
This module aims to allow basic speech recognition on portable devices through the use of PocketSphinx. It is possible to either use buffer chunks for live recognition or pass audio buffers with the complete audio to PocketSphinx to decode.
To build the Node.js extension, you first have to install Sphinxbase and PocketSphinx, preferably from their repositories. Once the module is built, it can be used like this:
var PocketSphinx = require('pocketsphinx');

// Create a recognizer; every option value must have the correct type
var ps = new PocketSphinx.Recognizer({
  '-hmm': '/file/path',
  '-dict': '/file/path',
  '-nfft': 512,
  '-remove_silence': false
});

// Called whenever a hypothesis is available
ps.on('hyp', function(err, hypothesis, score){
  if(err) console.error(err);
  console.log('Hypothesis: ', hypothesis);
});

// Register searches under a name of your choice
ps.addKeyphraseSearch("keyphrase_name", "keyphrase");
ps.addKeywordsSearch("keyword_name", "/file/path");
ps.addGrammarSearch("grammar_name", "/file/path");
ps.addNgramSearch("ngram_name", "/file/path");

// Select the active search by name
ps.search = "keyphrase_name";

ps.start();
ps.write(data); // data: a buffer containing the audio to decode
ps.stop();
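For live recognition, audio usually arrives in chunks, for example from a microphone or a pipe. The following is a minimal sketch that forwards each chunk to a recognizer using the start/write/stop sequence from the example above. The helper name `feedStream` is hypothetical and not part of this module, and `source` is assumed to be a Node.js readable stream that emits raw audio buffers in the format the recognizer was configured for.

```javascript
// Hypothetical helper (not part of the pocketsphinx module): forwards every
// chunk emitted by a readable stream to a recognizer-like object and stops
// the recognizer when the stream ends or fails.
function feedStream(source, recognizer) {
  recognizer.start();

  source.on('data', function (chunk) {
    recognizer.write(chunk);
  });

  source.on('end', function () {
    recognizer.stop();
  });

  source.on('error', function (err) {
    console.error(err);
    recognizer.stop();
  });
}
```

With a configured recognizer `ps`, a call such as `feedStream(process.stdin, ps)` would then decode whatever raw audio is piped into the process.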
You can pass any valid PocketSphinx parameter that you could also pass to, for example, pocketsphinx_continuous, but you have to make sure that you use the correct type for each parameter. If, for example, you pass '-samprate' as a string like '44100.0', the configuration will fail. The following table lists some of the most commonly needed options with their required data types:
option | type | default | description |
---|---|---|---|
-samprate | float | 44100.0 | The sample rate of the passed data |
-hmm | string | modelDirectory + "/en-us/en-us" | The hmm model directory |
-dict | string | modelDirectory + "/en-us/cmudict-en-us.dict" | The dictionary file |
-nfft | int | 2048 | The FFT size (nfft) |

For more options, see the manual of pocketsphinx_continuous with $ man pocketsphinx_continuous
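Since a wrongly typed option makes the configuration fail, it can help to check the options before handing them to the Recognizer. The following is a minimal sketch of such a check for the options in the table above. `checkOptionTypes` and `EXPECTED_TYPES` are hypothetical names and not part of this module. JavaScript has a single number type, so the sketch can only tell numbers from strings and, for '-nfft', additionally require a whole number.

```javascript
// Hypothetical pre-flight check (not part of the pocketsphinx module).
// Expected JavaScript types for the options listed in the table above.
var EXPECTED_TYPES = {
  '-samprate': 'number', // float
  '-hmm': 'string',
  '-dict': 'string',
  '-nfft': 'number'      // int
};

// Returns a list of human-readable problems. An empty list means that all
// known options have the expected type; unknown options are not checked.
function checkOptionTypes(options) {
  var problems = [];
  Object.keys(options).forEach(function (key) {
    if (!Object.prototype.hasOwnProperty.call(EXPECTED_TYPES, key)) return;
    var expected = EXPECTED_TYPES[key];
    var value = options[key];
    if (typeof value !== expected) {
      problems.push(key + ' should be a ' + expected + ', got ' + typeof value);
    } else if (key === '-nfft' && !Number.isInteger(value)) {
      problems.push(key + ' should be an integer');
    }
  });
  return problems;
}
```

Passing `{ '-samprate': '44100.0' }` reports one problem, while `{ '-samprate': 44100.0, '-nfft': 512 }` reports none.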
The PocketSphinx object itself has the following properties:
- Recognizer(options, [hyp]) - Creates a new Recognizer instance
- modelDirectory - The default model directory
- fromFloat(buffer):buffer - Resamples JavaScript audio buffers for use with PocketSphinx
A Recognizer instance has the following methods:
- on(event, function) - Attaches an event handler (overwrites old event handlers for this event)
- off(event) - Removes an event handler
- start() - Starts the decoder
- stop() - Stops the decoder
- restart() - Restarts the decoder
- reconfig(options, [hyp]) - Reconfigures the decoder without having to reload it
- silenceDetection(enabled) - Disables or enables silence detection (Default: enabled)
- addKeyphraseSearch(name, keyphrase) - Adds a keyphrase search
- addKeywordsSearch(name, keywordFile) - Adds a keyword search
- addGrammarSearch(name, jsgfFile) - Adds a JSGF search
- addNgramSearch(name, nGramFile) - Adds an n-gram search
- write(buffer) - Decodes a complete audio buffer
- writeSync(buffer) - Decodes the next audio buffer chunk
- lookupWords(array):object - Returns an object with the properties in (an object with the words found in the dictionary as keys and their phonetic transcriptions as values) and out (an array with the out-of-dictionary words)
- addWords(object) - Adds the phonetic transcriptions from object to the dictionary (key = word, value = transcription)
- free() - Releases all resources associated with the decoder.
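To make the data shapes used by lookupWords and addWords concrete, the sketch below mimics them with a plain object standing in for the dictionary. The function names `lookupWordsSketch` and `addWordsSketch` and the sample transcriptions are illustrative assumptions, not the module's implementation; the real methods operate on the dictionary loaded by the decoder.

```javascript
// Illustrative stand-in for the decoder's dictionary: word -> transcription.
var dictionary = {
  hello: 'HH AH L OW',
  world: 'W ER L D'
};

// Mimics the described return shape of lookupWords: `in` maps known words to
// their transcription, `out` lists the words missing from the dictionary.
function lookupWordsSketch(dict, words) {
  var result = { in: {}, out: [] };
  words.forEach(function (word) {
    if (Object.prototype.hasOwnProperty.call(dict, word)) {
      result.in[word] = dict[word];
    } else {
      result.out.push(word);
    }
  });
  return result;
}

// Mimics the described input shape of addWords: key = word, value = transcription.
function addWordsSketch(dict, entries) {
  Object.keys(entries).forEach(function (word) {
    dict[word] = entries[word];
  });
}
```

A typical flow would be to look up the words of a phrase first and then add transcriptions only for the words reported in out.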
The following events are currently supported:

event | parameters | description |
---|---|---|
hyp | error, hypothesis, score | When a hypothesis is available. score is the path score corresponding to the returned string. |
hypFinal | error, hypothesis, isFinal | When a hypothesis is available. isFinal indicates whether the hypothesis has reached a final state in the grammar. |
start | none | When decoding started. |
stop | none | When decoding stopped. |
speechDetected | none | When speech was detected the first time. |
silenceDetected | none | When silence was detected after speech. |
error | error | When an error occurred. |
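Because on overwrites any handler previously attached for the same event, only one handler per event is active at a time, unlike Node's EventEmitter, where handlers accumulate. The following is a minimal sketch of such a one-handler-per-event registry. The name `SingleHandlerRegistry` is hypothetical, and the module's internals may well differ; the sketch only illustrates the described behaviour.

```javascript
// Hypothetical illustration (not the module's implementation) of a registry
// that keeps at most one handler per event name.
function SingleHandlerRegistry() {
  this.handlers = Object.create(null);
}

// Attaching a handler replaces whatever was registered for the event before.
SingleHandlerRegistry.prototype.on = function (event, handler) {
  this.handlers[event] = handler;
};

// Removes the handler for the event, if any.
SingleHandlerRegistry.prototype.off = function (event) {
  delete this.handlers[event];
};

// Calls the current handler with the remaining arguments.
// Returns false when no handler is registered.
SingleHandlerRegistry.prototype.emit = function (event) {
  var handler = this.handlers[event];
  if (!handler) return false;
  handler.apply(null, Array.prototype.slice.call(arguments, 1));
  return true;
};
```

In practice this means that code needing several reactions to, say, hyp should fan out from a single handler instead of calling on repeatedly.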
To specify a search, you can use one of the add functions listed in the methods section above and then assign the search's name to the instance's search accessor like so:
ps.addGrammarSearch('someSearch', 'digits.gram');
ps.search = 'someSearch';
ps.start();
Or you can pass e.g. a language model file, a jsgf grammar or a keyword file directly with the Recognizer options or reconfigure the Recognizer with such options at runtime:
var ps = new PocketSphinx.Recognizer({
  '-lm': '/path/myFancyLanguageModel' // This adds an nGram search
});
ps.on('hyp', function(err, hypothesis, score) {
  if(err) console.error(err);
  console.log('Hypothesis: ', hypothesis);
});
ps.start();
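Reconfiguring at runtime goes through the reconfig method instead of the constructor. Whether reconfig merges the new options with the earlier ones or replaces them is not stated here, so the following sketch plays it safe and always passes the complete option set. `reconfigureWith` is a hypothetical helper and not part of this module.

```javascript
// Hypothetical helper (not part of the pocketsphinx module): merges override
// options into the options the recognizer was created with and passes the
// complete set to reconfig. Returns the merged options for later reuse.
function reconfigureWith(recognizer, baseOptions, overrides) {
  var merged = {};
  [baseOptions, overrides].forEach(function (source) {
    Object.keys(source).forEach(function (key) {
      merged[key] = source[key];
    });
  });
  recognizer.reconfig(merged);
  return merged;
}
```

A call such as `reconfigureWith(ps, options, { '-lm': '/path/anotherLanguageModel' })` would then switch to a different language model, where the path is a placeholder.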