Grammar Bug: Sometimes words are only partially recognized #2496

Open
LazoVelko opened this issue Oct 17, 2024 · 1 comment

Problem

If my grammar contains a word such as escape, whisper sometimes outputs only its first few letters (esc) instead of the whole word. The expected behavior is that a word is recognized only in its entirety.

How to Reproduce (example 1)

Go into examples/command and make a simple single-line grammar: root ::= " escape". Now if you say "escape", it will sometimes print esc instead of the whole word escape. If you say "essk", it also prints esc, but the expected behavior is to print nothing, because "essk" is not a valid command.

How to Reproduce (example 2)

Another example is to set the grammar to root ::= " caps". If you say "cap", it prints cap (without the s). The expected behavior is to print nothing, because cap is an invalid command; only caps (with the s) should be accepted.
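
The distinction both examples rely on (a complete match versus a mere prefix of the grammar's literal) can be sketched as a small check. This is a minimal Python sketch, not code from whisper.cpp: the function name classify and the labels it returns are hypothetical, and it assumes a grammar consisting of a single literal such as root ::= " escape".

```python
def classify(output: str, literal: str) -> str:
    """Classify recognizer output against a single-literal grammar.

    `literal` is the quoted string from a grammar such as root ::= " escape".
    Hypothetical helper for illustration; not part of whisper.cpp.
    """
    if output == literal:
        return "complete"   # the whole word was produced
    if output and literal.startswith(output):
        return "partial"    # a strict prefix such as " esc": the reported bug
    if not output:
        return "empty"      # nothing printed: expected for invalid input
    return "invalid"        # text the grammar could not have produced


print(classify(" escape", " escape"))  # complete
print(classify(" esc", " escape"))     # partial
print(classify(" cap", " caps"))       # partial
print(classify("", " escape"))         # empty
```

Under this framing, the report is that "partial" results are emitted when only "complete" or "empty" should be possible.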

My Setup

I'm running examples/command with my custom grammar on a Windows 10 machine via GPU/CUDA, and I get the same problem whether I use ggml-small or ggml-large-v2.

Temporary Workaround Issue

I can remove invalid words in post-processing, but the problem is that these erroneous words prematurely cut off recognition of any commands that should come after them. For example, with a longer sequence of commands such as "please escape and log out", if escape is incorrectly output as esc, then everything that comes after it is omitted from the output.
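
The workaround and its limitation can be sketched as follows. This is a minimal Python sketch under stated assumptions: filter_commands and the word set are hypothetical, and the transcripts are hand-written stand-ins for the behavior described above, not captured whisper output.

```python
VALID_WORDS = {"please", "escape", "and", "log", "out"}  # assumed vocabulary


def filter_commands(transcript: str, valid_words: set[str]) -> list[str]:
    """Keep only whole words that belong to the command vocabulary."""
    return [word for word in transcript.lower().split() if word in valid_words]


# Hand-written stand-ins, not real recognizer output.
expected = "please escape and log out"
truncated = "please esc"  # recognition stopped at the partial word

print(filter_commands(expected, VALID_WORDS))
print(filter_commands(truncated, VALID_WORDS))
```

Filtering drops esc from the truncated transcript, but it cannot restore "and log out", because those words never reached the output in the first place.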

Notes

I noticed that user @ulatekh also experienced this problem in #2127 (comment) and #2047 (comment). I created this issue in response to this comment: #2127 (comment).

philipag commented Feb 24, 2025

I see the same thing. For example, the following minimal grammar, used on normal dialogue in which the word "fixed" is mentioned once (and which transcribes correctly without a grammar), produces "f".

root ::= init word
init ::= " "
word ::= ("fixed")

I also notice that timestamps (with timestamps enabled, of course) appear to be wrong when a grammar is used. The timestamp that whisper returns for the "f" is wrong.
