enable hot-word boosting #3297
Conversation
Can we not limit that to the C client? It's very likely people will want to use this part of the API from elsewhere, and in the current state, it's completely unknown whether this works or not.
if (!hot_words_.empty()) {
  // increase prob of prefix for every word
  // that matches a word in the hot-words list
  for (std::string word : ngram) {
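The quoted C++ hunk iterates over the words of the current n-gram and adjusts the prefix score for each match against the hot-words list. A rough Python sketch of that lookup (the dict-based `hot_words` structure and function name here are illustrative, not the actual decoder code):

```python
def boosted_score(base_log_score, ngram, hot_words):
    """Add the configured boost for every n-gram word found in hot_words.

    base_log_score: log-likelihood from the language model.
    ngram: list of words in the current n-gram.
    hot_words: dict mapping word -> additive log-space boost.
    """
    score = base_log_score
    for word in ngram:
        if word in hot_words:
            score += hot_words[word]
    return score

# Note a word appearing in several n-grams of the same beam gets
# boosted once per n-gram, so the effect compounds within a beam.
```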
have you measured perf impact with scorers?
perf as in WER? or perf as in latency?
latency
I have not measured the latency effects yet, no.
Are there any TC jobs that do this, or should I profile locally? What do you recommend?
Unfortunately, you'd have to do it locally. Using perf should be quite easy.
lissyx left a comment
Please expose it in the API as a real list of words, and please add:
- basic CI testing for that feature
- usage in different bindings would really be a good thing (Python, JS, .Net, Java) if you can
Also, it looks like your current code breaks training and the CTC decoder, so please fix that.
This isn't how log probabilities work: you're making exponential changes to the probability here. exp(-3.5) ~= 0.03 and exp(-1.75) ~= 0.17. This, combined with the fact that a single word will be boosted several times in the same beam as it appears in multiple n-grams, makes it hard to reason about the behavior of the coefficient. It should probably be an additive factor (a multiplication in probability space).
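The reviewer's point can be checked numerically: scaling a log-probability changes the underlying probability exponentially, while adding to it is an ordinary multiplication in probability space. A minimal sketch using only the standard library:

```python
import math

log_p = -3.5                       # LM log-likelihood of a word sequence
p = math.exp(log_p)                # probability-space value, ~0.0302

# Scaling the log-probability (the scheme being critiqued): halving
# -3.5 to -1.75 multiplies the probability by roughly 5.7x.
scaled = math.exp(log_p * 0.5)     # ~0.1738

# Adding a boost in log space is a plain multiplicative factor in
# probability space, which is far easier to reason about:
boost = 1.0
boosted = math.exp(log_p + boost)  # equals p * math.exp(boost)
```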
@JRMeyer To keep your API simpler, I suggest you move to a single entry point. This entry point would add a new hot word. Depending on the use case, it could also be cool to expose (though I'm unsure it is really required) an entry point that would simply re-init the set of hot words. With this API, you could more easily expose and update all our bindings.
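The shape of the suggested API can be sketched as a small class holding word-to-boost entries. The method names mirror the add/erase/clear surface discussed in this thread but are illustrative only, not the actual DeepSpeech bindings:

```python
class HotWords:
    """Hypothetical sketch of the proposed hot-word API surface."""

    def __init__(self):
        self._boosts = {}          # word -> additive log-space boost

    def add_hot_word(self, word, boost):
        # Adding the same word again simply overwrites its boost.
        self._boosts[word] = boost

    def erase_hot_word(self, word):
        # Remove a single entry; raises KeyError if the word is absent.
        del self._boosts[word]

    def clear_hot_words(self):
        # Re-initialize the whole set of hot words in one call.
        self._boosts.clear()


words = HotWords()
words.add_hot_word("friend", 1.5)
words.add_hot_word("enemy", -3.0)
words.erase_hot_word("enemy")
```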
Even though my initial intuition was wrong about how the boosting compounds, I still like the UX. Namely, if you're using this feature and trying to find the right boosting coefficient for your data, you know to sweep between 0 and 1, which isn't hard. With an additive effect, the search space is now unbounded.
carlfm01 left a comment
Nice @JRMeyer, just missing the following on the IDeepSpeech interface:
/// <summary>
/// Add a hot-word.
/// </summary>
/// <param name="aWord">The word to boost.</param>
/// <param name="aBoost">The boost coefficient.</param>
/// <exception cref="ArgumentException">Thrown on failure.</exception>
public void AddHotWord(string aWord, float aBoost);
/// <summary>
/// Erase entry for a hot-word.
/// </summary>
/// <param name="aWord">The word whose entry to erase.</param>
/// <exception cref="ArgumentException">Thrown on failure.</exception>
public void EraseHotWord(string aWord);
/// <summary>
/// Clear all hot-words.
/// </summary>
/// <exception cref="ArgumentException">Thrown on failure.</exception>
public void ClearHotWords();
I set these as
Sorry, I forgot to delete the public; you did it right.
lissyx left a comment
This is now looking quite good; just fix the Android test execution, and make sure to squash into one commit.
@lissyx -- it's one commit, but all the previous commit messages got appended into the one commit message :/ It doesn't look pretty, but yes, it is one commit.
This PR enables hot-word boosting (immediate support in the C and Python clients) with the new flag --hot_words. The flag takes a string of words and their respective boosts, separated by commas and colons, like so: --hot_words "friend:1.5,enemy:20.4". The boost takes a floating point number between -inf and inf.

The boosting is applied as an addition to the negative log likelihood of a candidate word sequence, given by the KenLM language model. Since the LM probability is a negative log value, at 0.0 we have 100% likelihood, and at negative infinity we have 0% likelihood. As such, we will always get some negative number from the KenLM model.

For example, if KenLM returns -3.5 as the likelihood for the word sequence "i like cheese", and we add 3.0 to this number, we get -0.5, thereby increasing the likelihood of that sequence. On the other hand, if we add -3.0 to the likelihood, we decrease the likelihood of that sequence. Adding a negative number as a boost will make the decoder "avoid" certain words.
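The arithmetic above is just an addition in log space, which corresponds to multiplying the probability by exp(boost). A minimal sketch (the helper function is illustrative, not the actual decoder code):

```python
import math

def apply_boost(kenlm_log_likelihood, boost):
    # The boost is added directly to the (negative) log-likelihood.
    return kenlm_log_likelihood + boost

score = apply_boost(-3.5, 3.0)     # -0.5: the sequence became more likely
penalty = apply_boost(-3.5, -3.0)  # -6.5: the sequence became less likely

# In probability space the boost is a multiplicative factor:
# exp(-0.5) ~= 0.61 vs the original exp(-3.5) ~= 0.03.
```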