Reasoning effort experiments #12339
-
I found simple prompt engineering can go a long way toward influencing the R1 distills. If you can't inject a think bootstrap as I showed in #11351, it might be possible to just add something like "limit thinking as much as possible" to your prompt. Examples: sally.txt = "Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have? Use step by step reasoning to solve problem."

Freestyle think (R1 7B), very verbose:
Limit thinking with prompt:
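
As a concrete sketch of this prompt-only approach, something like the following against a llama.cpp server's OpenAI-compatible endpoint should reproduce it (the server address, model name, and temperature are placeholders, not the exact setup used above):

```python
# Minimal sketch: the sally.txt question sent through llama.cpp's OpenAI-compatible
# chat completions endpoint with the extra "limit thinking" instruction appended.
# Server address, model name, and temperature are assumptions.
import requests

question = ("Sally (a girl) has 3 brothers. Each brother has 2 sisters. "
            "How many sisters does Sally have? "
            "Use step by step reasoning to solve problem. "
            "Limit thinking as much as possible.")

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",      # assumed server address
    json={
        "model": "deepseek-r1-distill-qwen-7b",       # placeholder model name
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.6,
    },
    timeout=600,
)
print(resp.json()["choices"][0]["message"]["content"])
```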
Another way to do this, which is deterministic, is to use triggered prompt injection on one of the continuation phrases, as I discussed in #11351. I added a TRIG function to my experimental server to accomplish this; it is guaranteed to halt the thinking process. In this example I trigger on one of the continuation phrases once at least 256 tokens have already been generated. In general, TRIG can be an array of trigger records covering a whole set of the continuation phrases the model uses. This will most likely reduce accuracy on harder problems, though, since the model's intended generation is truncated and there is no guarantee the thinking isn't being cut off before the (harder) problem has actually been solved.
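
That TRIG implementation lives in the experimental server, but the trigger-then-inject idea can be approximated from the client side. A rough sketch against a stock llama.cpp /completion endpoint follows; this is not the actual TRIG code, and the server address, continuation phrases, and wrap-up wording are all assumptions:

```python
# Rough illustration only: not the TRIG feature from the experimental server,
# just one way to approximate trigger-then-inject client-side with the stock
# llama.cpp /completion endpoint.
import requests

URL = "http://localhost:8080/completion"            # assumed server address
CONTINUATIONS = ["Wait,", "Alternatively,", "Hmm,"]  # phrases that restart a thinking loop
WRAPUP = "\nTime to stop deliberating and give the final answer.\n</think>\n\n"
MIN_TOKENS = 256                                     # let the model think at least this much

def generate(prompt, n_predict=2048, stop=None):
    # prompt must already be formatted with the model's chat template,
    # since /completion does not apply one.
    payload = {"prompt": prompt, "n_predict": n_predict}
    if stop:
        payload["stop"] = stop
    r = requests.post(URL, json=payload, timeout=600)
    return r.json()["content"]

def solve(formatted_prompt):
    text = formatted_prompt + "<think>\n"
    # Phase 1: let the model think freely for roughly MIN_TOKENS tokens.
    # (A real implementation would also stop here if </think> shows up early.)
    text += generate(text, n_predict=MIN_TOKENS)
    # Phase 2: continue until the next continuation phrase appears, then cut there.
    text += generate(text, stop=CONTINUATIONS)
    # Phase 3: splice in the wrap-up so the think block is forcibly closed...
    text += WRAPUP
    # ...and let the model produce the final answer after the injected </think>.
    return generate(text)
```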
-
Thanks for your suggestions @steampunque. I'm using the OAI chat completions endpoint of vanilla llama.cpp, so this may limit what I can do.

I think one of your suggestions is to supply an incomplete thinking prelude, containing hints about how to think, which the autoregressive model will continue as if it had generated it itself. This feels to me like a refinement of simply giving a system prompt in chat completions. (I tried that superficially and didn't notice much of a difference, but I must confess I didn't try very hard.) To do it this way, I presumably would need to use the (non-chat) completions endpoint. I'm not hugely familiar with how OAI chat completions decomposes into lower-level completions, so it might be hard to adapt my chat+instruct+grammar based approach.

The other suggestion appears to involve triggers: during generation, upon seeing some state (e.g. certain tokens appear, or a certain length of thinking is reached), hijack the current completion and insert extra text like "Hang on, I've been thinking too long, I'd better wrap this up" (crudely speaking). I'm not sure how you are doing this.
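
For concreteness, here is a sketch of what the thinking-prelude idea might look like against the raw /completion endpoint; the hand-rolled DeepSeek-R1 chat template and the prelude wording are assumptions, so the model's actual template should be checked before relying on this:

```python
# Sketch of the "incomplete thinking prelude" idea via the raw /completion
# endpoint (not chat completions).  The chat template string and prelude text
# are assumptions, not taken from the thread above.
import requests

question = ("Sally (a girl) has 3 brothers. Each brother has 2 sisters. "
            "How many sisters does Sally have?")

# Prelude the model will continue as if it had written it itself.
prelude = "<think>\nI need to keep this short, so I will reason as briefly as possible.\n"

prompt = "<｜begin▁of▁sentence｜><｜User｜>" + question + "<｜Assistant｜>" + prelude

resp = requests.post("http://localhost:8080/completion",   # assumed server address
                     json={"prompt": prompt, "n_predict": 512},
                     timeout=600)
print(prelude + resp.json()["content"])  # full think block plus the answer
```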
-
I'm trying to find a way to influence thinking time in DeepSeek distills that emit `<think>...</think>` before the main output, similar to OAI's reasoning_effort.

Motivation: I want to limit the amount of thinking. It often thinks for too long, but I don't want to disable it altogether (I can already do that with grammar).
I came across this:
#11351
And this:
https://www.reddit.com/r/LocalLLaMA/comments/1j85snw/experimental_control_the_thinking_effort_of_qwq/
I've been playing with llama.cpp to try and get this to work. I'm using the built-in web chat interface with a custom JSON config.
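The config is a logit_bias entry on the `</think>` token; a representative example (the bias value is whichever number is being swept in the experiments below):

```json
{
  "logit_bias": [["</think>", -1.5]]
}
```

(A positive bias should make the closing tag more likely, i.e. shorter thinking; a negative bias should make it less likely, i.e. longer thinking.)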
(I note it has a different structure to the OAI property of the same name).
It doesn't seem to have any effect, until I increase it to about 15 or 20, when I suddenly get an endless stream of tokens.
Similarly, -15 makes it never generate `</think>`. You get the same amount of thinking, but it just ends without the closing tag, and there is no regular output.
(I guess the threshold of "interpreted as infinity" is around +/- 15 or so. That's fine.)
Smaller numbers (1.5, zero, -1.5) seem to have no effect whatsoever.
As an experiment, I also tried asking "list 10 colours" with an analogous bias against the colour word itself.
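Something along these lines (the exact value shouldn't matter, as long as it's strongly negative):

```json
{
  "logit_bias": [["Red", -15], ["red", -15]]
}
```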
And I got plenty of Red and red in both the thinking and regular output! Something's not working as I'd expect.


(I also tried with the numeric token IDs instead of strings. No difference.)
This is a build from source from about a week ago (commit 5e43f10).