
Commit c5da691

docs: remove NeMo Service (nemollm) documentation (#1077)
1 parent: 940d691

File tree: 3 files changed (+21, −60 lines)


docs/evaluation/README.md (−1)
@@ -248,7 +248,6 @@ These results are using the _Simple_ prompt defined in the LLM Self-Checking met
  | gpt-3.5-turbo-instruct | 78 | 0 | 97 |
  | gpt-3.5-turbo | 70 | 0 | 100 |
  | text-davinci-003 | 80 | 0 | 97 |
- | nemollm-43b | 88 | 0 | 84 |
  | gemini-1.0-pro | 63 | 36<sup>*</sup> | 97 |

  <sup>*</sup> Note that as of Mar 13, 2024 `gemini-1.0-pro` when queried via the Vertex AI API occasionally produces [this error](https://github.com/GoogleCloudPlatform/generative-ai/issues/344). Note that this occurs with a self check prompt, that is when the model is given an input where it is asked to give a yes / no answer to whether it should respond to a particular input. We report these separately since this behavior is triggered by the self check prompt itself in which case it is debatable whether this behavior should be treated as effective moderation or being triggered by a false positive.

docs/user-guides/configuration-guide.md (+1, −39)
@@ -91,7 +91,7 @@ To use any of the providers, you must install additional packages; when you firs
  ```

  ```{important}
- Although you can instantiate any of the previously mentioned LLM providers, depending on the capabilities of the model, the NeMo Guardrails toolkit works better with some providers than others. The toolkit includes prompts that have been optimized for certain types of models, such as `openai` and `nemollm`. For others, you can optimize the prompts yourself following the information in the [LLM Prompts](#llm-prompts) section.
+ Although you can instantiate any of the previously mentioned LLM providers, depending on the capabilities of the model, the NeMo Guardrails toolkit works better with some providers than others. The toolkit includes prompts that have been optimized for certain types of models, such as models provided by `openai` or `llama3` models. For others, you can optimize the prompts yourself following the information in the [LLM Prompts](#llm-prompts) section.
  ```

  #### Using LLMs with Reasoning Traces
@@ -197,44 +197,6 @@ models:
  base_url: http://your_base_url
  ```

- #### NeMo LLM Service
-
- In addition to the LLM providers supported by LangChain, NeMo Guardrails also supports NeMo LLM Service. For example, to use the GPT-43B-905 model as the main LLM, you should use the following configuration:
-
- ```yaml
- models:
-   - type: main
-     engine: nemollm
-     model: gpt-43b-905
- ```
-
- You can also use customized NeMo LLM models for specific tasks, e.g., self-checking the user input or the bot output. For example:
-
- ```yaml
- models:
-   # ...
-   - type: self_check_input
-     engine: nemollm
-     model: gpt-43b-002
-     parameters:
-       tokens_to_generate: 10
-       customization_id: 6e5361fa-f878-4f00-8bc6-d7fbaaada915
- ```
-
- You can specify additional parameters when using NeMo LLM models using the `parameters` key. The supported parameters are:
-
- - `temperature`: the temperature that should be used for making the calls;
- - `api_host`: points to the NeMo LLM Service host (default 'https://api.llm.ngc.nvidia.com');
- - `api_key`: the NeMo LLM Service key that should be used;
- - `organization_id`: the NeMo LLM Service organization ID that should be used;
- - `tokens_to_generate`: the maximum number of tokens to generate;
- - `stop`: the list of stop words that should be used;
- - `customization_id`: if a customization is used, the id should be specified.
-
- The `api_host`, `api_key`, and `organization_id` are fetched automatically from the environment variables `NGC_API_HOST`, `NGC_API_KEY`, and `NGC_ORGANIZATION_ID`, respectively.
-
- For more details, please refer to the NeMo LLM Service documentation and check out the [NeMo LLM example configuration](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/examples/configs/llm/nemollm/README.md).
-
  #### TRT-LLM

  NeMo Guardrails also supports connecting to a TRT-LLM server.
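With the `nemollm` engine documentation removed by this commit, a configuration that previously pointed at NeMo LLM Service needs to target one of the remaining providers. As a minimal sketch of the same `models` structure shown in the removed section (assuming an OpenAI account with `OPENAI_API_KEY` set in the environment; the model name is illustrative only):

```yaml
# Hypothetical replacement for a former `engine: nemollm` entry.
# Any LangChain-supported engine can be substituted; `openai` is shown here.
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
```

The `type`, `engine`, and `model` keys are the same ones used throughout the configuration guide; only the engine-specific `parameters` (such as `customization_id`) go away with the NeMo LLM Service integration.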

docs/user-guides/llm-support.md (+20, −20)
@@ -20,26 +20,26 @@ Any new LLM available in Guardrails should be evaluated using at least this set
  The following tables summarize the LLM support for the main features of NeMo Guardrails, focusing on the different rails available out of the box.
  If you want to use an LLM and you cannot see a prompt in the [prompts folder](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/llm/prompts), please also check the configuration defined in the [LLM examples' configurations](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/examples/configs/llm/README.md).

- | Feature | gpt-3.5-turbo-instruct | text-davinci-003 | nemollm-43b | llama-2-13b-chat | falcon-7b-instruct | gpt-3.5-turbo | gpt-4 | gpt4all-13b-snoozy | vicuna-7b-v1.3 | mpt-7b-instruct | dolly-v2-3b | HF Pipeline model |
- |----------------------------------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------|----------------------|----------------------|----------------------|----------------------|------------------------------------|
- | Dialog Rails | ✔ (0.74) | ✔ (0.83) | ✔ (0.82) | ✔ (0.77) | ✔ (0.76) | ❗ (0.45) || ❗ (0.54) | ❗ (0.54) | ❗ (0.50) | ❗ (0.40) |_(DEPENDS ON MODEL)_ |
- | • Single LLM call | ✔ (0.83) | ✔ (0.81) | | |||||||||
- | • Multi-step flow generation | _EXPERIMENTAL_ | _EXPERIMENTAL_ |||| |||||||
- | Streaming ||| | - | - ||| - | - | - | - ||
- | Hallucination detection (SelfCheckGPT with AskLLM) |||||| |||||||
- | AskLLM rails | | | | | | | | | | | | |
- | • Jailbreak detection | ✔ (0.88) | ✔ (0.88) | ✔ (0.86) | || ✔ (0.85) |||||||
- | • Output moderation ||| | || ✔ (0.85) |||||||
- | • Fact-checking | ✔ (0.81) | ✔ (0.82) | ✔ (0.81) | ✔ (0.80) || ✔ (0.83) ||||||_(DEPENDS ON MODEL)_ |
- | AlignScore fact-checking _(LLM independent)_ | ✔ (0.89) ||||| |||||||
- | ActiveFence moderation _(LLM independent)_ |||||| |||||||
- | Llama Guard moderation _(LLM independent)_ |||||| |||||||
- | Got It AI RAG TruthChecker _(LLM independent)_ |||||| |||||||
- | Patronus Lynx RAG Hallucination detection _(LLM independent)_ |||||| |||||||
- | GCP Text Moderation _(LLM independent)_ |||||| |||||||
- | Patronus Evaluate API _(LLM independent)_ |||||| |||||||
- | Fiddler Fast Faithfulness Hallucination Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔
- | Fiddler Fast Safety & Jailbreak Detection _(LLM independent)_ |||||| |||||||
+ | Feature | gpt-3.5-turbo-instruct | text-davinci-003 | llama-2-13b-chat | falcon-7b-instruct | gpt-3.5-turbo | gpt-4 | gpt4all-13b-snoozy | vicuna-7b-v1.3 | mpt-7b-instruct | dolly-v2-3b | HF Pipeline model |
+ |----------------------------------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------|----------------------|----------------------|----------------------|----------------------|------------------------------------|
+ | Dialog Rails | ✔ (0.74) | ✔ (0.83) | ✔ (0.77) | ✔ (0.76) | ❗ (0.45) || ❗ (0.54) | ❗ (0.54) | ❗ (0.50) | ❗ (0.40) |_(DEPENDS ON MODEL)_ |
+ | • Single LLM call | ✔ (0.83) | ✔ (0.81) ||||||||||
+ | • Multi-step flow generation | _EXPERIMENTAL_ | _EXPERIMENTAL_ ||||||||||
+ | Streaming ||| - | - ||| - | - | - | - ||
+ | Hallucination detection (SelfCheckGPT with AskLLM) ||||||||||||
+ | AskLLM rails | | | | | | | | | | | |
+ | • Jailbreak detection | ✔ (0.88) | ✔ (0.88) ||| ✔ (0.85) |||||||
+ | • Output moderation ||||| ✔ (0.85) |||||||
+ | • Fact-checking | ✔ (0.81) | ✔ (0.82) | ✔ (0.80) || ✔ (0.83) ||||||_(DEPENDS ON MODEL)_ |
+ | AlignScore fact-checking _(LLM independent)_ | ✔ (0.89) |||||||||||
+ | ActiveFence moderation _(LLM independent)_ ||||||||||||
+ | Llama Guard moderation _(LLM independent)_ ||||||||||||
+ | Got It AI RAG TruthChecker _(LLM independent)_ ||||||||||||
+ | Patronus Lynx RAG Hallucination detection _(LLM independent)_ ||||||||||||
+ | GCP Text Moderation _(LLM independent)_ ||||||||||||
+ | Patronus Evaluate API _(LLM independent)_ ||||||||||||
+ | Fiddler Fast Faithfulness Hallucination Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔
+ | Fiddler Fast Safety & Jailbreak Detection _(LLM independent)_ ||||||||||||

  Table legend: