Large model starts to repeat itself / gets stuck on a phrase #924
Possibly related to: openai/whisper#1253
This can be easily reproduced with the sample:

./main -m ./models/ggml-large-v3-q5_0.bin -f samples/gb1.wav

whisper_init_from_file_with_params_no_state: loading model from './models/ggml-large-v3-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51866
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1280
whisper_model_load: n_text_head = 20
whisper_model_load: n_text_layer = 32
whisper_model_load: n_mels = 128
whisper_model_load: ftype = 8
whisper_model_load: qntvr = 2
whisper_model_load: type = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs = 100
whisper_model_load: CPU buffer size = 1080.97 MB
whisper_model_load: model size = 1080.47 MB
whisper_init_state: kv self size = 220.20 MB
whisper_init_state: kv cross size = 245.76 MB
whisper_init_state: compute buffer (conv) = 32.42 MB
whisper_init_state: compute buffer (encode) = 212.42 MB
whisper_init_state: compute buffer (cross) = 9.38 MB
whisper_init_state: compute buffer (decode) = 99.24 MB
system_info: n_threads = 1 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |
main: processing 'samples/gb1.wav' (3179927 samples, 198.7 sec), 1 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.980 --> 00:00:08.720] My fellow Americans, this day has brought terrible news and great sadness to our country.
[00:00:08.720 --> 00:00:17.280] At 9:00 this morning, Mission Control in Houston lost contact with our space shuttle Columbia.
[00:00:17.280 --> 00:00:24.640] A short time later, debris was seen falling from the skies above Texas.
[00:00:24.640 --> 00:00:27.200] The Columbia is lost.
[00:00:27.200 --> 00:00:29.860] There are no survivors.
[00:00:29.860 --> 00:00:32.920] On board was a crew of seven.
[00:00:32.920 --> 00:00:39.760] Colonel Rick Husband, Lieutenant Colonel Michael Anderson, Commander Laurel Clark, Captain
[00:00:39.760 --> 00:00:50.120] David Brown, Commander William McCool, Dr. Kulpna Shavla, and Ilan Ramon, a colonel in
[00:00:50.120 --> 00:00:52.780] the Israeli Air Force.
[00:00:52.780 --> 00:00:59.720] These men and women assumed great risk in the service to all humanity in an age when
[00:00:59.720 --> 00:01:03.100] flight has come to seem almost routine.
[00:01:03.100 --> 00:01:08.720] It is easy to overlook the dangers of travel by rocket and the difficulties of navigating
[00:01:08.720 --> 00:01:12.580] the fierce outer atmosphere of the Earth.
[00:01:12.580 --> 00:01:19.220] These astronauts knew the dangers, and they faced them willingly, knowing they had a high
[00:01:19.220 --> 00:01:22.940] and noble purpose in life.
[00:01:22.940 --> 00:01:29.580] Because of their courage and daring and idealism, we will miss them all the more.
[00:01:29.580 --> 00:01:36.360] All Americans today are thinking as well of the families of these men and women who
[00:01:36.360 --> 00:01:40.440] have been given this sudden shock and grief.
[00:01:40.440 --> 00:01:42.340] You're not alone.
[00:01:42.340 --> 00:01:45.420] Our entire nation grieves with you.
[00:01:45.420 --> 00:01:52.340] And those you loved will always have the respect and gratitude of this country.
[00:01:52.340 --> 00:01:57.060] The cause in which they died will continue.
[00:01:57.060 --> 00:01:59.440] Mankind is led into the darkness.
[00:01:59.440 --> 00:02:02.200] But we will not be left behind.
[00:02:02.200 --> 00:02:04.200] We will be led into the darkness.
[00:02:04.200 --> 00:02:06.200] We will be led into the darkness.
[00:02:06.200 --> 00:02:08.200] We will be led into the darkness.
[00:02:08.200 --> 00:02:10.200] We will be led into the darkness.
[00:02:10.200 --> 00:02:12.200] We will be led into the darkness.
[00:02:12.200 --> 00:02:14.200] We will be led into the darkness.
[00:02:14.200 --> 00:02:16.200] We will be led into the darkness.
[00:02:16.200 --> 00:02:18.200] We will be led into the darkness.
[00:02:18.200 --> 00:02:20.200] We will be led into the darkness.
[00:02:20.200 --> 00:02:22.200] We will be led into the darkness.
[00:02:22.200 --> 00:02:24.200] We will be led into the darkness.
[00:02:24.200 --> 00:02:26.200] We will be led into the darkness.
[00:02:26.200 --> 00:02:28.200] We will be led into the darkness.
[00:02:28.200 --> 00:02:29.300] We will be led into the darkness.
Any updates on this? I had the same problem using the large-v3 model.
Try the -mc 0 flag. It keeps the previously decoded text from being added as a prompt for the next segment.
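For example, rerunning the reproduction command from above with context carry-over disabled:

./main -m ./models/ggml-large-v3-q5_0.bin -f samples/gb1.wav -mc 0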
Great solution. I thought I'd post a Python function that removes sequentially repeated lines in case you would like to keep your token history. It's worked successfully on a media library with ~14,000 videos:
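A minimal sketch of such a filter (the function name, the timestamp-stripping regex, and the file handling here are assumptions, not necessarily the original code):

```python
import re

# Matches the "[hh:mm:ss.mmm --> hh:mm:ss.mmm]" prefix that main prints
# in front of each transcript line (absent in plain --output-txt files).
TIMESTAMP = re.compile(r"^\[[^\]]*\]\s*")

def drop_sequential_repeats(lines):
    """Drop lines whose text (ignoring any timestamp prefix) is identical
    to the text of the line immediately before them."""
    result = []
    prev = None
    for line in lines:
        text = TIMESTAMP.sub("", line).strip()
        if text and text == prev:
            continue  # sequential repeat: skip it
        result.append(line)
        prev = text
    return result

# Usage: print a cleaned copy of a saved transcript.
with open("transcript.txt", encoding="utf-8") as f:
    cleaned = drop_sequential_repeats(f.read().splitlines())
print("\n".join(cleaned))
```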
Hi there. Using the large model, it sometimes starts to loop and just repeats one sentence for the rest of the transcript. I run it with the following command:

./main -m models/ggml-large.bin "$output_file" -t 11 -l nl --output-txt --print-colors --best-of 3

The audio file is 20 minutes long, but I have seen this with other files as well.