Commit 74fe95d

Merge remote-tracking branch 'github/develop' into fix/colang-2-runtime-issues
2 parents 1b8ea70 + 48b5b96 commit 74fe95d

172 files changed, +8835 -799 lines changed

CHANGELOG-Colang.md

Lines changed: 3 additions & 0 deletions
@@ -8,12 +8,15 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 
 ### Added
 
+* [#673](https://github.com/NVIDIA/NeMo-Guardrails/pull/673) Add support for new Colang 2 keyword `deactivate`.
+
 ### Changed
 
 * [#669](https://github.com/NVIDIA/NeMo-Guardrails/pull/669) Merged (and removed) utils library file with core library.
 
 ### Fixed
 
+* [#672](https://github.com/NVIDIA/NeMo-Guardrails/pull/672) Fixes a event group match bug (e.g. `match $flow_ref.Finished() or $flow_ref.Failed()`)
 * [#699](https://github.com/NVIDIA/NeMo-Guardrails/pull/699) Fix issues with ActionUpdated events and user utterance action extraction.
 
 ## [2.0-beta.2] - 2024-07-25

README.md

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@ NeMo Guardrails integrates seamlessly with LangChain. You can easily wrap a guar
 
 Evaluating the safety of a LLM-based conversational application is a complex task and still an open research question. To support proper evaluation, NeMo Guardrails provides the following:
 
-1. An [evaluation tool](./nemoguardrails/eval/README.md), i.e. `nemoguardrails evaluate`, with support for topical rails, fact-checking, moderation (jailbreak and output moderation) and hallucination.
+1. An [evaluation tool](nemoguardrails/evaluate/README.md), i.e. `nemoguardrails evaluate`, with support for topical rails, fact-checking, moderation (jailbreak and output moderation) and hallucination.
 2. An experimental [red-teaming interface](https://docs.nvidia.com/nemo/guardrails/security/red-teaming.html).
 3. Sample LLM Vulnerability Scanning Reports, e.g, [ABC Bot - LLM Vulnerability Scan Results](https://docs.nvidia.com/nemo/guardrails/evaluation/llm-vulnerability-scanning.html)
 

docs/colang_2/language_reference/more-on-flows.rst

Lines changed: 46 additions & 5 deletions
@@ -12,11 +12,13 @@ More on Flows
 
 In section :ref:`Defining Flows<defining-flows>` we learned the core mechanisms of flows. In this section will look at more advanced topics that are related to flows.
 
+.. _more-on-flows-activate-a-flow:
+
 ----------------------------------------
 Activate a Flow
 ----------------------------------------
 
-We already have seen the ``start`` and ``await`` keywords to trigger a flow. We are now introducing the third keyword ``activate`` that can start a flow. The difference to ``start`` lies in the behavior of the flow when it has finished or failed. If a flow was activated it will always automatically restart a new instance of the flow as soon as it has ended.
+We already have seen the ``start`` and ``await`` keywords to trigger a flow. We are now introducing the third keyword ``activate`` that can start a flow. The difference to ``start`` lies in the behavior of the flow when it has finished or failed. If a flow was activated it will always automatically restart a new instance of the flow as soon as it has ended. Furthermore, a specific flow configuration (with identical flow parameters) can only be activated once and will not start new instances, even if activated multiple times.
 
 .. important::
 Flow activation statement syntax definition:
@@ -25,8 +27,8 @@ We already have seen the ``start`` and ``await`` keywords to trigger a flow. We
 
 activate <Flow> [and <Flow>]…
 
-- Currently, reference assignments for activated flows is not supported since the instance will change after a restart
-- Only flow and-groups are supported
+- Reference assignments for activated flows is not supported since the instance will change after a restart
+- Only and-groups are supported and not or-groups
 
 Examples:
 
@@ -35,6 +37,10 @@ We already have seen the ``start`` and ``await`` keywords to trigger a flow. We
 # Activate a single flow
 activate handling user presents
 
+# Activate two different instances of the same flow with parameters
+activate handling user said "Hi"
+activate handling user said "Bye"
+
 # Activate a group of flows
 activate handling user presents and handling question repetition 5.0
 
@@ -82,10 +88,10 @@ Running this example you will see the bot responding with "Hello again" as long
 
 In contrast, you can only say "Bye" once before you restart the story.
 
-Activating a flow enables you to keep matching the interaction event sequence against the pattern defined in the flow, even if the pattern previously successfully matched the interaction event sequence or failed. This is often used for passively observing a pattern to update a state or to trigger actions that don't compete with the main interaction pattern.
+Activating a flow enables you to keep matching the interaction event sequence against the pattern defined in the flow, even if the pattern previously successfully matched the interaction event sequence (finished) or failed. Since the same flow configuration can only be activated once, you can use the flow activation directly wherever you require the flow's functionality. This on demand pattern is better than activating it once in the beginning before you actually know if it is needed.
 
 .. important::
-Activating a flow will start a flow and automatically restart it when it has ended to match to reoccurring interaction patterns.
+Activating a flow will start a flow and automatically restart it when it has ended (finished or failed) to match to reoccurring interaction patterns.
 
 .. important::
 The main flow behaves also like an activated flow. As soon as it reaches the end it will restart automatically.
@@ -120,6 +126,8 @@ See, how the main flow does not require any match statement at the end and will
 .. important::
 An activated flow that immediately finished (does not wait for any event) will only be run once and will stay activated.
 
+.. _more-on-flows-start-a-new-flow-instance:
+
 ----------------------------------------
 Start a new Flow Instance
 ----------------------------------------
@@ -206,6 +214,39 @@ Since the first instance already started a new instance (second one) it will not
 .. note::
 You can think of the ``start_new_flow_instance`` label being at the end of each activated flow. Defining it in a different position will move it up from the default position at the end.
 
+.. _more-on-flows-deactivate-a-flow:
+
+----------------------------------------
+Deactivate a Flow
+----------------------------------------
+
+An activated flow will usually stay alive since it always restarts when it finishes or fails. To deactivate an activated flow you can use the `deactivate` keyword:
+
+.. important::
+Flow deactivation statement syntax definition:
+
+.. code-block:: text
+
+deactivate <Flow>
+
+Examples:
+
+.. code-block:: colang
+
+# Deactivate a single flow
+deactivate handling user presents
+
+# Deactivate two different instances of the same flow with different parameters
+deactivate handling user said "Hi"
+deactivate handling user said "Bye"
+
+Under the hood the `deactivate` keyword will abort the flow and disable the restart. It is a shortcut for this statement:
+
+.. code-block:: colang
+
+send StopFlow(flow_id="flow name", deactivate=True)
+
+
 .. _more-on-flows-override-flows:
 
 ---------------
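
Reviewer note on the `more-on-flows.rst` changes above: the new text says that a flow configuration (same flow, same parameters) can only be activated once, that an activated flow restarts whenever it finishes or fails, and that `deactivate` aborts the flow and disables the restart. The following is a toy Python model of just that bookkeeping, written to make the three rules concrete. It is not the NeMo Guardrails runtime; the class `ToyFlowRegistry` and its method names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFlowRegistry:
    """Toy model of activate/deactivate bookkeeping (NOT the real Colang runtime)."""

    # Maps a flow configuration (name, parameters) to the number of instances started so far.
    active: dict = field(default_factory=dict)

    def activate(self, name: str, *params) -> bool:
        """Activate a configuration; return True only if a new instance was started."""
        key = (name, params)
        if key in self.active:
            # Same flow with identical parameters is already activated: nothing new starts.
            return False
        self.active[key] = 1
        return True

    def on_flow_ended(self, name: str, *params) -> bool:
        """Called when an instance finished or failed; restart it if still activated."""
        key = (name, params)
        if key not in self.active:
            # Deactivated (or never activated): no automatic restart.
            return False
        self.active[key] += 1
        return True

    def deactivate(self, name: str, *params) -> None:
        """Abort the flow and disable the automatic restart."""
        self.active.pop((name, params), None)


registry = ToyFlowRegistry()
print(registry.activate("handling user said", "Hi"))       # True
print(registry.activate("handling user said", "Hi"))       # False
print(registry.activate("handling user said", "Bye"))      # True
print(registry.on_flow_ended("handling user said", "Hi"))  # True
registry.deactivate("handling user said", "Hi")
print(registry.on_flow_ended("handling user said", "Hi"))  # False
```

Keying the registry on the pair (flow name, parameters) is what lets `handling user said "Hi"` and `handling user said "Bye"` coexist as two activations, while a repeated activation of either one is a no-op.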

docs/evaluation/README.md

Lines changed: 5 additions & 5 deletions
@@ -45,7 +45,7 @@ For the initial evaluation experiments for dialog rails, we have used two datase
 
 The datasets were transformed into a NeMo Guardrails app by defining canonical forms for each intent, specific dialogue flows, and even bot messages (for the _chit-chat_ dataset alone).
 The two datasets have a large number of user intents, thus dialog rails. One of them is very generic and has higher-grained intents (_chit-chat_), while the _banking_ dataset is domain-specific and more fine-grained.
-More details about running the dialog rails evaluation experiments and the evaluation datasets are available [here](./../../nemoguardrails/eval/data/topical/README.md).
+More details about running the dialog rails evaluation experiments and the evaluation datasets are available [here](../../nemoguardrails/evaluate/data/topical/README.md).
 
 Preliminary evaluation results follow next. In all experiments, we have chosen to have a balanced test set with at most 3 samples per intent.
 For both datasets, we have assessed the performance for various LLMs and also for the number of samples (`k = all, 3, 1`) per intent that are indexed in the vector database.
@@ -163,15 +163,15 @@ Here is a list of arguments that you can use to configure the fact-checking rail
 - `output-dir`: The directory to save the output to. The default is `eval_outputs/factchecking`.
 - `write-outputs`: Whether to write the outputs to a file or not. The default is `True`.
 
-More details on how to set up the data in the right format and run the evaluation on your own dataset can be found [here](./../../nemoguardrails/eval/data/factchecking/README.md).
+More details on how to set up the data in the right format and run the evaluation on your own dataset can be found [here](../../nemoguardrails/evaluate/data/factchecking/README.md).
 
 #### Evaluation Results
 
 Evaluation Date - Nov 23, 2023 (Mar 7 2024 for `gemini-1.0-pro`).
 
 We evaluate the performance of the fact-checking rail on the [MSMARCO](https://huggingface.co/datasets/ms_marco) dataset using the Self-Check and the AlignScore approaches. To build the dataset, we randomly sample 100 (question, correct answer, evidence) triples, and then, for each triple, build a non-factual or incorrect answer to yield 100 (question, incorrect answer, evidence) triples.
 
-We breakdown the performance into positive entailment accuracy and negative entailment accuracy. Positive entailment accuracy is the accuracy of the model in correctly identifying answers that are grounded in the evidence passage. Negative entailment accuracy is the accuracy of the model in correctly identifying answers that are **not** supported in the evidence. Details on how to create synthetic negative examples can be found [here](./../../nemoguardrails/eval/data/factchecking/README.md)
+We breakdown the performance into positive entailment accuracy and negative entailment accuracy. Positive entailment accuracy is the accuracy of the model in correctly identifying answers that are grounded in the evidence passage. Negative entailment accuracy is the accuracy of the model in correctly identifying answers that are **not** supported in the evidence. Details on how to create synthetic negative examples can be found [here](../../nemoguardrails/evaluate/data/factchecking/README.md)
 
 | Model | Positive Entailment Accuracy | Negative Entailment Accuracy | Overall Accuracy | Average Time Per Checked Fact (ms) |
 |------------------------|------------------------------|------------------------------|------------------|------------------------------------|
@@ -224,7 +224,7 @@ To evaluate the output moderation rail only, use the following command:
 
 ```nemoguardrails evaluate moderation --check-input False --config=path/to/guardrails/config```
 
-More details on how to set up the data in the right format and run the evaluation on your own dataset can be found [here](./../../nemoguardrails/eval/data/moderation/README.md).
+More details on how to set up the data in the right format and run the evaluation on your own dataset can be found [here](../../nemoguardrails/evaluate/data/moderation/README.md).
 
 #### Evaluation Results
 
@@ -315,7 +315,7 @@ To evaluate the hallucination rail on your own dataset, you can follow the creat
 
 #### Evaluation Results
 
-To evaluate the hallucination rail, we manually curate a set of [questions](./../../nemoguardrails/eval/data/hallucination/sample.txt) which mainly consists of questions with a false premise, i.e., questions that cannot have a correct answer.
+To evaluate the hallucination rail, we manually curate a set of [questions](../../nemoguardrails/evaluate/data/hallucination/sample.txt) which mainly consists of questions with a false premise, i.e., questions that cannot have a correct answer.
 
 For example, the question "What is the capital of the moon?" has a false premise since the moon does not have a capital. Since the question is stated in a way that implies that the moon has a capital, the model might be tempted to make up a fact and answer the question.
 
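
Reviewer note on the fact-checking context lines in `docs/evaluation/README.md`: the text defines positive entailment accuracy (grounded answers correctly accepted) and negative entailment accuracy (unsupported answers correctly rejected). A minimal Python sketch of that breakdown follows. The function name `entailment_accuracies` and the boolean encoding are hypothetical, and treating overall accuracy as the fraction of all correct decisions is an assumption, since the README excerpt does not spell it out.

```python
def entailment_accuracies(predictions, labels):
    """Split accuracy into positive and negative entailment accuracy.

    Both arguments are equal-length lists of bools where True means
    "the answer is supported by the evidence" (assumed encoding).
    """
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")

    # Predictions made for answers that really are grounded in the evidence.
    on_positive = [p for p, l in zip(predictions, labels) if l]
    # Predictions made for answers that are NOT supported by the evidence.
    on_negative = [p for p, l in zip(predictions, labels) if not l]
    if not on_positive or not on_negative:
        raise ValueError("need at least one positive and one negative example")

    positive_acc = sum(on_positive) / len(on_positive)
    negative_acc = sum(not p for p in on_negative) / len(on_negative)
    # Assumption: overall accuracy is the share of all decisions that were correct.
    overall_acc = sum(p == l for p, l in zip(predictions, labels)) / len(labels)
    return positive_acc, negative_acc, overall_acc


labels = [True, True, False, False]
predictions = [True, False, False, False]
print(entailment_accuracies(predictions, labels))  # (0.5, 1.0, 0.75)
```

With a balanced set such as the 100 correct and 100 incorrect triples described above, the overall figure under this assumption is simply the mean of the two per-class accuracies.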

docs/getting_started/installation-guide.md

Lines changed: 1 addition & 1 deletion
@@ -91,7 +91,7 @@ NeMo Guardrails is under active development and the main branch always contains
 The `nemoguardrails` package also defines the following extra dependencies:
 
 - `dev`: packages required by some extra Guardrails features for developers, such as the **autoreload** feature.
-- `eval`: packages used for the Guardrails [evaluation tools](../../nemoguardrails/eval/README.md).
+- `eval`: packages used for the Guardrails [evaluation tools](../../nemoguardrails/evaluate/README.md).
 - `openai`: installs the latest `openai` package supported by NeMo Guardrails.
 - `sdd`: packages used by the [sensitive data detector](../user_guides/guardrails-library.md#sensitive-data-detection) integrated in NeMo Guardrails.
 - `all`: installs all extra packages.

docs/user_guides/advanced/embedding-search-providers.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,7 @@ core:
 use_batching: False
 max_batch_size: 10
 max_batch_hold: 0.01
+search_threshold: None
 cache:
 enabled: False
 key_generator: md5
@@ -31,6 +32,7 @@ knowledge_base:
 use_batching: False
 max_batch_size: 10
 max_batch_hold: 0.01
+search_threshold: None
 cache:
 enabled: False
 key_generator: md5

docs/user_guides/cli.md

Lines changed: 56 additions & 10 deletions
@@ -3,17 +3,20 @@
 **NOTE: THIS SECTION IS WORK IN PROGRESS.**
 
 ## Guardrails CLI
+
 For testing purposes, the Guardrails toolkit provides a command line chat that can be used to interact with the LLM.
+
 ```
 > nemoguardrails chat --config examples/ [--verbose] [--verbose-llm-calls]
 ```
+
 ## Options
+
 - `--config`: The configuration that should be used. Can be a folder or a .co/.yml file.
 - `--verbose`: In verbose mode, detailed debugging information is also shown.
 - `--verbose-llm-calls`: In verbose LLM calls mode, the debugging information includes the entire prompt that is sent to the LLM and the completion.
 
-
-7. You should now be able to invoke the `nemoguardrails` CLI.
+You should now be able to invoke the `nemoguardrails` CLI.
 
 ```bash
 > nemoguardrails --help
@@ -31,11 +34,14 @@ For testing purposes, the Guardrails toolkit provides a command line chat that c
 Commands:
 actions-server Starts a NeMo Guardrails actions server.
 chat Starts an interactive chat session.
+convert Convert a Colang 1.0 directory to Colang 2.0 format.
+evaluate Run an evaluation task.
 server Starts a NeMo Guardrails server.
 ```
 
 You can also use the `--help` flag to learn more about each of the `nemoguardrails` commands:
 
+#### actions-server
 ```bash
 > nemoguardrails actions-server --help
 
@@ -48,6 +54,7 @@ For testing purposes, the Guardrails toolkit provides a command line chat that c
 --help Show this message and exit.
 ```
 
+#### chat
 ```bash
 > nemoguardrails chat --help
 
@@ -82,15 +89,54 @@ For testing purposes, the Guardrails toolkit provides a command line chat that c
 [default: None]
 --help Show this message and exit.
 ```
+#### server
+```bash
+> nemoguardrails server --help
+
+Usage: nemoguardrails server [OPTIONS]
+
+Starts a NeMo Guardrails server.
+
+Options:
+--port INTEGER The port that the server should listen on. [default: 8000]
+--config TEXT Path to a directory containing multiple configuration sub-folders.
+--verbose --no-verbose: If the server should be verbose and output detailed logs including prompts. [default: no-verbose]
+--disable-chat-ui --no-disable-chat-ui Weather the ChatUI should be disabled [default: no-disable-chat-ui]
+--auto-reload --no-auto-reload Enable auto reload option. [default: no-auto-reload]
+--prefix TEXT A prefix that should be added to all server paths. Should start with '/'.
+--help Show this message and exit.
+```
 
-```bash
-> nemoguardrails server --help
+#### evaluate
+```bash
+> nemoguardrails evaluate --help
 
-Usage: nemoguardrails server [OPTIONS]
+Usage: nemoguardrails evaluate [OPTIONS] COMMAND [ARGS]...
 
-Starts a NeMo Guardrails server.
+Options:
+--help: Show this message and exit.
 
-Options:
---port INTEGER The port that the server should listen on. [default: 8000]
---help Show this message and exit.
-```
+Commands:
+fact-checking: Evaluate the performance of the fact-checking rails defined in a Guardrails application.
+hallucination: Evaluate the performance of the hallucination rails defined in a Guardrails application.
+moderation: Evaluate the performance of the moderation rails defined in a Guardrails application.
+topical: Evaluates the performance of the topical rails defined in a Guardrails application. Computes accuracy for canonical form detection, next step generation, and next bot message generation. Only a single Guardrails application can be specified in the config option.
+```
+
+#### convert
+```bash
+> nemoguardrails convert --help
+
+Usage: nemoguardrails convert [OPTIONS] PATH
+
+Convert a Colang 1.0 directory to Colang 2.0.
+
+Arguments:
+path TEXT The path to the file or directory to migrate. [default: None] [required]
+
+Options:
+--verbose --no-verbose If the migration should be verbose and output detailed logs. [default: no-verbose]
+--validate --no-validate If the migration should validate the output using Colang Parser. [default: no-validate]
+--use-active-decorator --no-use-active-decorator If the migration should use the active decorator. [default: use-active-decorator]
+--help Show this message and exit.
+```
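
Reviewer note on the `docs/user_guides/cli.md` changes: the chat command documented there takes `--config` plus the optional `--verbose` and `--verbose-llm-calls` flags. A small Python sketch of assembling that command line for scripting is shown below. The helper name `build_chat_command` is hypothetical; only the command and flag names come from the documentation in this diff, and the sketch builds the argument list without executing anything.

```python
import shlex


def build_chat_command(config, verbose=False, verbose_llm_calls=False):
    """Return the argv list for `nemoguardrails chat` (hypothetical helper)."""
    command = ["nemoguardrails", "chat", "--config", config]
    if verbose:
        command.append("--verbose")
    if verbose_llm_calls:
        command.append("--verbose-llm-calls")
    return command


# Render the argument list as a shell-safe string for logging or copy-paste.
print(shlex.join(build_chat_command("examples/", verbose=True)))
# nemoguardrails chat --config examples/ --verbose
```

Returning a list rather than a string keeps the result suitable for `subprocess.run` without shell quoting concerns, assuming the `nemoguardrails` CLI is installed in the environment.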
