ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty #615

talhalatifkhan · 2023-08-15T19:10:52Z

Discussed in #614

Originally posted by talhalatifkhan August 16, 2023
I am trying to make sure that my output follows a JSON format every time. I stumbled upon jsonformer, and from there I stumbled upon grammar-based sampling. I used json-schema-to-grammar.py to convert the JSON schema.

I want to know whether grammar-based sampling is meant for this specific purpose and, if so, how I should use it.

JSON schema

json_schema = {
    "type": "object",
    "properties": {
        "Stage": {
            "type": "string",
            "enum": ["first", "second"]
        },
        "Task Finished": {"type": "boolean"},
        "Statement": {"type": "string"},
        "Assistant": {"type": "string"}
    }
}

Llama grammar

space ::= " "?
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space 
Stage ::= "\"first\"" | "\"second\""
boolean ::= ("true" | "false") space
root ::= "{" space "\"Assistant\"" space ":" space string "," space "\"Stage\"" space ":" space Stage "," space "\"Statement\"" space ":" space string "," space "\"Task Finished\"" space ":" space boolean "}" space

Here is my code

from llama_cpp import Llama, LlamaGrammar

fs_template = """
You are a precise AI comparer. Your task is to match the user's intent to the statements in the context and confirm if the identified intent is correct.
Your responses should strictly follow the format below:
    Stage: [print 'first']
    User Intent: [insert user intent statement here]
    Task Finished: [insert boolean value based on whether user intent is confirmed]
    Assistant: [insert Assistant response here]


Adhere to the following instructions to complete the task:
1. Start by trying to match the user's question to the statements in the context.
2. If you identify the matching statement to the user's question then confirm it from the user.
3. If the user's intent is unclear or doesn't match the context, ask follow-up questions by providing the options in the context.
4. Once you have confirmed the user intent, set "Task Finished: True" and proceed with your response.
5. You will fail your task if the output generated does not follow the format mentioned above.

Context: (only knowledge base you have)
------------
sample context
-----------
"""

schema = '''
space ::= " "?
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space 
Stage ::= "\"first\"" | "\"second\""
boolean ::= ("true" | "false") space
root ::= "{" space "\"Assistant\"" space ":" space string "," space "\"Stage\"" space ":" space Stage "," space "\"Statement\"" space ":" space string "," space "\"Task Finished\"" space ":" space boolean "}" space
'''


def get_prompt(question: str, chat_history: list,
               system_prompt: str) -> str:
    texts = [f'[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n']
    for user_input, response in chat_history:
        texts.append(f'{user_input.strip()} [/INST] {response.strip()} </s><s> [INST] ')
    texts.append(f'{question.strip()} [/INST]')
    return ''.join(texts)


history = []
prompt = get_prompt("user query", history, fs_template)

grammar = LlamaGrammar.from_string(grammar=schema, verbose=True)
print(grammar)
client = Llama(
    model_path="model/llama-2-13b-chat.ggmlv3.q8_0.bin",
    n_ctx=4098,
    n_threads=16,
    last_n_tokens_size=70,
)

answer = client(
    prompt,
    grammar=grammar,
    stream=False,
    temperature=0.0,
    top_p=0.95,
    top_k=50,
    repeat_penalty=1.3,
    max_tokens=4000,
)
print(answer)

This is the error I am getting

parse: error parsing grammar: expecting newline or end at \] |
        "\" (["\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* """ space 
Stage ::= ""first"" | ""second""
boolean ::= ("true" | "false") space
root ::= "{" space ""Assistant"" space ":" space string "," space ""Stage"" space ":" space Stage "," space ""Statement"" space ":" space string "," space ""Task Finished"" space ":" space boolean "}" space

Traceback (most recent call last):
  File "/home/talha/CloudWhisper/jformer.py", line 49, in <module>
    grammar = LlamaGrammar.from_string(grammar=schema,verbose=True)
  File "/home/talha/.local/lib/python3.10/site-packages/llama_cpp/llama_grammar.py", line 66, in from_string
    raise ValueError(
ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty


c0sogi · 2023-08-17T11:53:38Z

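Oh, I see. That's not a bug. Just put r in front of the opening ''' so that the schema becomes a raw string.
Otherwise Python processes the backslash escapes (\" becomes " and \\ becomes \) before the grammar parser sees them, which corrupts your schema.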

So, the correct version of your schema should be:

schema = r'''
space ::= " "?
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space 
Stage ::= "\"first\"" | "\"second\""
boolean ::= ("true" | "false") space
root ::= "{" space "\"Assistant\"" space ":" space string "," space "\"Stage\"" space ":" space Stage "," space "\"Statement\"" space ":" space string "," space "\"Task Finished\"" space ":" space boolean "}" space
'''
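To see why the r prefix matters, here is a minimal sketch in plain Python (no llama_cpp needed) that writes one rule from the grammar both ways. The variable names are only for illustration.

```python
# The same grammar rule, once as a normal string and once as a raw string.
normal_rule = '''Stage ::= "\"first\"" | "\"second\""'''
raw_rule = r'''Stage ::= "\"first\"" | "\"second\""'''

# In the normal string, Python consumes each \" escape and leaves a bare ",
# which is the doubled-quote text that shows up in the parser error.
print(normal_rule)

# In the raw string, the backslashes survive, so the grammar parser
# receives the escapes it expects.
print(raw_rule)

# The character class from the string rule is affected in the same way:
# \\ collapses to a single backslash unless the string is raw.
normal_class = '''[^"\\]'''
raw_class = r'''[^"\\]'''
print(len(normal_class), len(raw_class))
```

The first print shows the rule with the backslashes stripped, matching the text in the error output; the second shows the rule exactly as typed.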

imaurer · 2023-08-17T14:34:42Z

An issue that doesn't seem to be impacting the functionality, but my grammar doesn't get redisplayed in the output logs:

from_string grammar:
print_grammar: error printing grammar: malformed rule, does not end with LLAMA_GRETYPE_END: 0

Otherwise, this feature works great. Any idea when it will be released so I can pip install via pypi?

c0sogi · 2023-08-17T14:41:02Z

> An issue that doesn't seem to be impacting the functionality, but my grammar doesn't get redisplayed in the output logs:
> from_string grammar:
> print_grammar: error printing grammar: malformed rule, does not end with LLAMA_GRETYPE_END: 0
> Otherwise, this feature works great. Any idea when it will be released so I can pip install via pypi?

The printing error should be resolved with this:
#621

talhalatifkhan · 2023-08-17T15:27:51Z

Thank you @c0sogi, it's working fine now.
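Once the grammar loads, the generated text can be checked as JSON. The following is only a sketch: it assumes the completion returned by client(prompt, grammar=grammar) is a dict with the generated text under ["choices"][0]["text"]. The helper parse_completion, the EXPECTED_KEYS set, and the fake_answer value are hypothetical names made up for illustration.

```python
import json

# Keys that the root rule of the grammar in this thread always emits.
EXPECTED_KEYS = {"Assistant", "Stage", "Statement", "Task Finished"}


def parse_completion(completion: dict) -> dict:
    """Parse the generated text of a completion dict as JSON (hypothetical helper)."""
    # Assumed shape: {"choices": [{"text": "..."}]}
    text = completion["choices"][0]["text"]
    # json.loads raises json.JSONDecodeError if generation stopped early
    # (for example at max_tokens) and left the object incomplete.
    data = json.loads(text)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hypothetical stand-in for `answer`, so the sketch runs without a model.
fake_answer = {
    "choices": [
        {
            "text": '{ "Assistant": "Which option did you mean?", '
                    '"Stage": "first", "Statement": "sample", '
                    '"Task Finished": false}'
        }
    ]
}

result = parse_completion(fake_answer)
print(result["Stage"], result["Task Finished"])
```

In real use, `answer` from the script above would be passed in place of `fake_answer`. The grammar constrains which tokens can be sampled, but a completion cut short by the token limit can still be incomplete, so the parse step is worth keeping.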


gjmulder added the bug Something isn't working label Aug 16, 2023

c0sogi mentioned this issue Aug 17, 2023

Add grammar-based sampling #572

Merged

talhalatifkhan closed this as completed Aug 17, 2023

antoine-lizee pushed a commit to antoine-lizee/llama-cpp-python that referenced this issue Oct 30, 2023

make : fix darwin f16c flags check (abetlen#615)

1f0414f

...there was no check. ported upstream from zanussbaum/gpt4all.cpp#2 (I dont see any clean path for upstream patches)
