Guided Decoding #178


Open
johnml1135 opened this issue Jul 18, 2023 · 9 comments
Labels
research Research topics

Comments

@johnml1135
Collaborator

From the papers out there, determine the best path forward, then research and implement guided decoding. Assess the improvement in BLEU score and in users' assessment of quality. Address concerns with languages that attach prefixes and suffixes to proper names and key terms.

@johnml1135 johnml1135 added the enhancement New feature or request label Jul 18, 2023
@johnml1135 johnml1135 moved this to 🆕 New in Serval Jul 18, 2023
@johnml1135
Collaborator Author

johnml1135 commented Jul 24, 2023

@johnml1135
Collaborator Author

We should likely just use the Hugging Face implementation from 2022 (see links above), but it may need to be modified, for a few reasons:

  • It does not use alignment to decide when to add tokens.
  • It cannot handle wildcards, which are useful for many of the languages we work with, though we may be able to build wildcard support on top of disjunctive constraints.
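For reference, here is a minimal sketch of invoking the Hugging Face constrained beam search, using `DisjunctiveConstraint` to enumerate inflected surface forms up front as a possible stand-in for true wildcard support. The checkpoint name, source sentence, and key-term variants are illustrative assumptions, not part of our pipeline:

```python
# Sketch only: constrained beam search via transformers (>= 4.17).
# Checkpoint, language codes, and key-term variants are hypothetical examples.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, DisjunctiveConstraint

model_name = "facebook/nllb-200-distilled-600M"  # illustrative NLLB checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A disjunctive constraint forces the output to contain at least one of the
# listed token sequences; enumerating inflected forms of a key term here is
# one way to approximate a wildcard like "Jerusalem*".
surface_forms = ["Jerusalem", "Jerusalems"]  # hypothetical variants
constraint = DisjunctiveConstraint(
    [tokenizer(form, add_special_tokens=False).input_ids for form in surface_forms]
)

inputs = tokenizer("Example source sentence.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    constraints=[constraint],
    # NLLB needs the target language as the forced first target token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    num_beams=5,  # constrained decoding requires beam search
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that this does nothing about alignment-guided token placement; the constraint only guarantees that one of the surface forms appears somewhere in the output.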

@johnml1135 johnml1135 added this to the 1.4 NMT Dynamic Suggestions milestone Jul 27, 2023
@ddaspit
Collaborator

ddaspit commented Aug 3, 2023

I added preliminary support for using HF constrained beam search to silnlp. From the experiments I have run, it doesn't work very well.

@johnml1135
Collaborator Author

@ddaspit - do you know why the tests didn't go well? Do you have the results documented somewhere? Is the issue that key terms with asterisks don't work well, that their algorithm is poor, or that certain languages don't do well with this? Is it worth more research now, or do we want someone else to lead the charge? Do we need to add alignment data to enhance it? LILT appears to have gotten this working well enough to integrate into their main offering, so I am inclined to believe it can be made advantageous.

@ddaspit
Collaborator

ddaspit commented Aug 4, 2023

There definitely seems to be something wrong with the implementation in HF. Here is an issue that describes the problems I was seeing.

@ddaspit ddaspit moved this from 🆕 New to 📋 Backlog in Serval Sep 4, 2023
@johnml1135 johnml1135 removed this from Serval Dec 1, 2023
@johnml1135
Collaborator Author

While we have not implemented it, this approach may do better than the current Hugging Face implementation: https://arxiv.org/pdf/2112.08726.pdf - with this code: https://github.com/GXimingLu/a_star_neurologic.

@ddaspit ddaspit moved this to 🆕 New in SIL-NLP Research Jun 7, 2024
@ddaspit ddaspit removed this from the Research 2024 Q2 milestone Jun 7, 2024
@ddaspit ddaspit removed their assignment Jun 7, 2024
@ddaspit ddaspit added research Research topics and removed enhancement New feature or request labels Jun 7, 2024
@ddaspit ddaspit changed the title Research Guided Decoding Guided Decoding Jun 7, 2024
@ddaspit ddaspit moved this from 🆕 New to 📋 Backlog in SIL-NLP Research Jun 7, 2024
@johnml1135
Collaborator Author

johnml1135 commented Feb 7, 2025

It appears that the old "constraint" decoding is no longer supported; instead, GenerationConfig now offers sequence_bias and repetition_penalty.

I strongly suspect that this new set of updates is worth another round of research to see how well it fares.
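To make the comparison concrete, here is a sketch of biasing generation toward (or away from) key terms with `sequence_bias`. Unlike a hard constraint, this only nudges the beam scores. The checkpoint name, phrases, and bias magnitudes below are illustrative assumptions:

```python
# Sketch only: soft key-term biasing via the sequence_bias generation argument.
# Checkpoint, language code, phrases, and bias values are hypothetical examples.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"  # illustrative NLLB checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def bias_for(phrase: str, bias: float) -> dict:
    # sequence_bias maps a tuple of token ids to a float that is added to the
    # logits when that token sequence is about to complete; positive values
    # encourage the sequence, negative values discourage it.
    ids = tuple(tokenizer(phrase, add_special_tokens=False).input_ids)
    return {ids: bias}

sequence_bias = {}
sequence_bias.update(bias_for("Jerusalem", 8.0))     # encourage a key term
sequence_bias.update(bias_for("Hierosolyma", -8.0))  # discourage a variant

inputs = tokenizer("Example source sentence.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    sequence_bias=sequence_bias,
    repetition_penalty=1.2,  # the other GenerationConfig knob mentioned above
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    num_beams=5,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the bias is soft, there is no guarantee the key term actually appears in the output; that is the main behavioral difference from the old constrained decoding.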

@johnml1135
Collaborator Author

Actually, I took a look at sequence_bias, and it is a bit different from constrained decoding - potentially aimed more at LLMs than at NLLB - but it may still be worth another look. When we last checked constrained decoding, it was verifiably broken, and the Hugging Face team does not plan to fix or maintain it. If we want something that will work today and tomorrow, we should investigate sequence_bias.

@ddaspit
Collaborator

ddaspit commented Feb 7, 2025

Sequence bias looks interesting. Hopefully, it works better than constrained decoding. At some point, we should update the existing code to use sequence bias instead.
