Large model starts to repeat itself / gets stuck on a phrase #924

Open
timmermansjoy opened this issue May 15, 2023 · 6 comments

@timmermansjoy

Hi there. Using the large model, it sometimes goes into a loop and just repeats one sentence for the rest of the transcript. I'm running the following command:

./main -m models/ggml-large.bin "$output_file" -t 11 -l nl --output-txt --print-colors --best-of 3

The audio file is 20 minutes long, but I have seen this with other files as well.

[Screenshot: transcript output looping on the same sentence]

@mrfragger

I have something similar happen. Using the medium.en model, transcription worked until the 13h 57m mark and then produced nothing but ". . . . ." until the 25-hour mark, at which point I killed it. It was a 36h 2m audio segment. I'm going to keep audio segments to around 30 hours to hopefully avoid this issue.
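
For splitting long recordings into chunks, one option is ffmpeg's segment muxer. A rough sketch, where the input name, output pattern, and chunk length (30 hours, i.e. 108000 seconds) are just placeholder assumptions:

ffmpeg -i input.wav -f segment -segment_time 108000 -c copy chunk_%03d.wav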

@dhx

dhx commented May 20, 2023

Possibly related to: openai/whisper#1253

@jingyibo123

This can be easily reproduced with the sample:

./main -m ./models/ggml-large-v3-q5_0.bin -f samples/gb1.wav
whisper_init_from_file_with_params_no_state: loading model from './models/ggml-large-v3-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 8
whisper_model_load: qntvr         = 2
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load:      CPU buffer size =  1080.97 MB
whisper_model_load: model size    = 1080.47 MB
whisper_init_state: kv self size  =  220.20 MB
whisper_init_state: kv cross size =  245.76 MB
whisper_init_state: compute buffer (conv)   =   32.42 MB
whisper_init_state: compute buffer (encode) =  212.42 MB
whisper_init_state: compute buffer (cross)  =    9.38 MB
whisper_init_state: compute buffer (decode) =   99.24 MB

system_info: n_threads = 1 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | 

main: processing 'samples/gb1.wav' (3179927 samples, 198.7 sec), 1 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.980 --> 00:00:08.720]   My fellow Americans, this day has brought terrible news and great sadness to our country.
[00:00:08.720 --> 00:00:17.280]   At 9:00 this morning, Mission Control in Houston lost contact with our space shuttle Columbia.
[00:00:17.280 --> 00:00:24.640]   A short time later, debris was seen falling from the skies above Texas.
[00:00:24.640 --> 00:00:27.200]   The Columbia is lost.
[00:00:27.200 --> 00:00:29.860]   There are no survivors.
[00:00:29.860 --> 00:00:32.920]   On board was a crew of seven.
[00:00:32.920 --> 00:00:39.760]   Colonel Rick Husband, Lieutenant Colonel Michael Anderson, Commander Laurel Clark, Captain
[00:00:39.760 --> 00:00:50.120]   David Brown, Commander William McCool, Dr. Kulpna Shavla, and Ilan Ramon, a colonel in
[00:00:50.120 --> 00:00:52.780]   the Israeli Air Force.
[00:00:52.780 --> 00:00:59.720]   These men and women assumed great risk in the service to all humanity in an age when
[00:00:59.720 --> 00:01:03.100]   flight has come to seem almost routine.
[00:01:03.100 --> 00:01:08.720]   It is easy to overlook the dangers of travel by rocket and the difficulties of navigating
[00:01:08.720 --> 00:01:12.580]   the fierce outer atmosphere of the Earth.
[00:01:12.580 --> 00:01:19.220]   These astronauts knew the dangers, and they faced them willingly, knowing they had a high
[00:01:19.220 --> 00:01:22.940]   and noble purpose in life.
[00:01:22.940 --> 00:01:29.580]   Because of their courage and daring and idealism, we will miss them all the more.
[00:01:29.580 --> 00:01:36.360]   All Americans today are thinking as well of the families of these men and women who
[00:01:36.360 --> 00:01:40.440]   have been given this sudden shock and grief.
[00:01:40.440 --> 00:01:42.340]   You're not alone.
[00:01:42.340 --> 00:01:45.420]   Our entire nation grieves with you.
[00:01:45.420 --> 00:01:52.340]   And those you loved will always have the respect and gratitude of this country.
[00:01:52.340 --> 00:01:57.060]   The cause in which they died will continue.
[00:01:57.060 --> 00:01:59.440]   Mankind is led into the darkness.
[00:01:59.440 --> 00:02:02.200]   But we will not be left behind.
[00:02:02.200 --> 00:02:04.200]   We will be led into the darkness.
[00:02:04.200 --> 00:02:06.200]   We will be led into the darkness.
[00:02:06.200 --> 00:02:08.200]   We will be led into the darkness.
[00:02:08.200 --> 00:02:10.200]   We will be led into the darkness.
[00:02:10.200 --> 00:02:12.200]   We will be led into the darkness.
[00:02:12.200 --> 00:02:14.200]   We will be led into the darkness.
[00:02:14.200 --> 00:02:16.200]   We will be led into the darkness.
[00:02:16.200 --> 00:02:18.200]   We will be led into the darkness.
[00:02:18.200 --> 00:02:20.200]   We will be led into the darkness.
[00:02:20.200 --> 00:02:22.200]   We will be led into the darkness.
[00:02:22.200 --> 00:02:24.200]   We will be led into the darkness.
[00:02:24.200 --> 00:02:26.200]   We will be led into the darkness.
[00:02:26.200 --> 00:02:28.200]   We will be led into the darkness.
[00:02:28.200 --> 00:02:29.300]   We will be led into the darkness.

@mtrazzi

mtrazzi commented Jan 21, 2024

Any updates on this? I had the same problem using the large-v3 model.

@Lavrikov

Try the -mc 0 flag. It prevents the previous segment's text from being added as a prompt for the next one.
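
For example, applied to the reproduction command from the earlier comment (a sketch reusing that model path and sample file):

./main -m ./models/ggml-large-v3-q5_0.bin -f samples/gb1.wav -mc 0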

@brbrainerd

brbrainerd commented Jul 6, 2024

> Try the -mc 0 flag. It prevents the previous segment's text from being added as a prompt for the next one.

Great solution. I thought I'd post a Python function that removes sequentially repeated subtitle lines, in case you'd like to keep your token history. It has worked successfully on a media library of ~14,000 videos:

import logging
import re

def check_repeated_lines(vtt_file):
    """Check for and remove sequentially repeated subtitle lines in a VTT file."""
    logging.debug(f"Checking for repeated lines in {vtt_file}")
    with open(vtt_file, 'r') as file:
        content = file.readlines()

    cleaned_content = []
    previous_line = None

    i = 0
    while i < len(content):
        line = content[i]
        # A cue timing line, e.g. "00:01:02.340 --> 00:01:05.420"
        if re.match(r'^[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}$', line):
            cleaned_content.append(line)
            if i + 1 < len(content):
                next_line = content[i + 1].strip()
                if next_line == previous_line:
                    # Repeated cue: drop its timestamp and don't keep its text
                    cleaned_content.pop()
                else:
                    cleaned_content.append(content[i + 1])
                previous_line = next_line
            i += 2  # Move past the timestamp and its text line
        else:
            cleaned_content.append(line)
            i += 1

    # Remove any remaining blank lines
    cleaned_content = [line for line in cleaned_content if line.strip()]

    with open(vtt_file, 'w') as file:
        file.writelines(cleaned_content)

    logging.debug(f"Finished cleaning repeated lines in {vtt_file}")
    return False
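
A minimal way to invoke it (the path here is hypothetical; logging must be configured for the debug messages to appear):

# Example usage with a hypothetical file path
logging.basicConfig(level=logging.DEBUG)
check_repeated_lines("/path/to/subtitles.vtt")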
