docs: remove NeMo Service (nemollm) related documentation #1077

Merged 1 commit on Apr 10, 2025
1 change: 0 additions & 1 deletion docs/evaluation/README.md
@@ -248,7 +248,6 @@ These results are using the _Simple_ prompt defined in the LLM Self-Checking method
| gpt-3.5-turbo-instruct | 78 | 0 | 97 |
| gpt-3.5-turbo | 70 | 0 | 100 |
| text-davinci-003 | 80 | 0 | 97 |
| nemollm-43b | 88 | 0 | 84 |
| gemini-1.0-pro | 63 | 36<sup>*</sup> | 97 |

<sup>*</sup> Note that as of Mar 13, 2024, `gemini-1.0-pro` queried via the Vertex AI API occasionally produces [this error](https://github.com/GoogleCloudPlatform/generative-ai/issues/344). This occurs with a self-check prompt, that is, when the model is asked for a yes / no answer on whether it should respond to a particular input. We report these cases separately because the behavior is triggered by the self-check prompt itself, in which case it is debatable whether it should be treated as effective moderation or as a false positive.
40 changes: 1 addition & 39 deletions docs/user-guides/configuration-guide.md
@@ -91,7 +91,7 @@ To use any of the providers, you must install additional packages; when you first
```

```{important}
Although you can instantiate any of the previously mentioned LLM providers, depending on the capabilities of the model, the NeMo Guardrails toolkit works better with some providers than others. The toolkit includes prompts that have been optimized for certain types of models, such as `openai` and `nemollm`. For others, you can optimize the prompts yourself following the information in the [LLM Prompts](#llm-prompts) section.
Although you can instantiate any of the previously mentioned LLM providers, depending on the capabilities of the model, the NeMo Guardrails toolkit works better with some providers than others. The toolkit includes prompts that have been optimized for certain types of models, such as `openai` and `llama3` models. For others, you can optimize the prompts yourself following the information in the [LLM Prompts](#llm-prompts) section.
```
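For instance, a minimal `models` entry in `config.yml` that selects an OpenAI model as the main LLM follows the same pattern as the other engine examples in this guide (the model name below is illustrative):

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
```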

#### Using LLMs with Reasoning Traces
Expand Down Expand Up @@ -197,44 +197,6 @@ models:
base_url: http://your_base_url
```

#### NeMo LLM Service

In addition to the LLM providers supported by LangChain, NeMo Guardrails also supports NeMo LLM Service. For example, to use the GPT-43B-905 model as the main LLM, you should use the following configuration:

```yaml
models:
- type: main
engine: nemollm
model: gpt-43b-905
```

You can also use customized NeMo LLM models for specific tasks, e.g., self-checking the user input or the bot output. For example:

```yaml
models:
# ...
- type: self_check_input
engine: nemollm
model: gpt-43b-002
parameters:
tokens_to_generate: 10
customization_id: 6e5361fa-f878-4f00-8bc6-d7fbaaada915
```

You can specify additional parameters when using NeMo LLM models using the `parameters` key. The supported parameters are:

- `temperature`: the temperature that should be used for making the calls;
- `api_host`: points to the NeMo LLM Service host (default: `https://api.llm.ngc.nvidia.com`);
- `api_key`: the NeMo LLM Service key that should be used;
- `organization_id`: the NeMo LLM Service organization ID that should be used;
- `tokens_to_generate`: the maximum number of tokens to generate;
- `stop`: the list of stop words that should be used;
- `customization_id`: if a customization is used, the id should be specified.

The `api_host`, `api_key`, and `organization_id` are fetched automatically from the environment variables `NGC_API_HOST`, `NGC_API_KEY`, and `NGC_ORGANIZATION_ID`, respectively.

For more details, please refer to the NeMo LLM Service documentation and check out the [NeMo LLM example configuration](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/examples/configs/llm/nemollm/README.md).

#### TRT-LLM

NeMo Guardrails also supports connecting to a TRT-LLM server.
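A sketch of such a configuration, following the pattern of the other engines in this guide (the `trt_llm` engine name and the placeholder model name are assumptions, since the full section is truncated in this diff):

```yaml
models:
  - type: main
    engine: trt_llm  # assumed engine name for the TRT-LLM connector
    model: your-model-name  # placeholder; replace with the model served by your TRT-LLM server
```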
40 changes: 20 additions & 20 deletions docs/user-guides/llm-support.md
@@ -20,26 +20,26 @@ Any new LLM available in Guardrails should be evaluated using at least this set
The following tables summarize the LLM support for the main features of NeMo Guardrails, focusing on the different rails available out of the box.
If you want to use an LLM and you cannot see a prompt for it in the [prompts folder](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/llm/prompts), please also check the configuration defined in the [LLM example configurations](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/examples/configs/llm/README.md).

| Feature | gpt-3.5-turbo-instruct | text-davinci-003 | nemollm-43b | llama-2-13b-chat | falcon-7b-instruct | gpt-3.5-turbo | gpt-4 | gpt4all-13b-snoozy | vicuna-7b-v1.3 | mpt-7b-instruct | dolly-v2-3b | HF Pipeline model |
|----------------------------------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------|----------------------|----------------------|----------------------|----------------------|------------------------------------|
| Dialog Rails | ✔ (0.74) | ✔ (0.83) | ✔ (0.82) | ✔ (0.77) | ✔ (0.76) | ❗ (0.45) | ❗ | ❗ (0.54) | ❗ (0.54) | ❗ (0.50) | ❗ (0.40) | ❗ _(DEPENDS ON MODEL)_ |
| • Single LLM call | ✔ (0.83) | ✔ (0.81) | ✔ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Multi-step flow generation | _EXPERIMENTAL_ | _EXPERIMENTAL_ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| Streaming | ✔ | ✔ | ✔ | - | - | ✔ | ✔ | - | - | - | - | ✔ |
| Hallucination detection (SelfCheckGPT with AskLLM) | ✔ | ✔ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| AskLLM rails | | | | | | | | | | | | |
| • Jailbreak detection | ✔ (0.88) | ✔ (0.88) | ✔ (0.86) | ✖ | ✖ | ✔ (0.85) | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Output moderation | ✔ | ✔ | ✔ | ✖ | ✖ | ✔ (0.85) | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Fact-checking | ✔ (0.81) | ✔ (0.82) | ✔ (0.81) | ✔ (0.80) | ✖ | ✔ (0.83) | ✖ | ✖ | ✖ | ✖ | ✖ | ❗ _(DEPENDS ON MODEL)_ |
| AlignScore fact-checking _(LLM independent)_ | ✔ (0.89) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| ActiveFence moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Llama Guard moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Got It AI RAG TruthChecker _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Patronus Lynx RAG Hallucination detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| GCP Text Moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Patronus Evaluate API _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Fiddler Fast Faithfulness Hallucination Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Fiddler Fast Safety & Jailbreak Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Feature | gpt-3.5-turbo-instruct | text-davinci-003 | llama-2-13b-chat | falcon-7b-instruct | gpt-3.5-turbo | gpt-4 | gpt4all-13b-snoozy | vicuna-7b-v1.3 | mpt-7b-instruct | dolly-v2-3b | HF Pipeline model |
|----------------------------------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------|----------------------|----------------------|----------------------|----------------------|------------------------------------|
| Dialog Rails | ✔ (0.74) | ✔ (0.83) | ✔ (0.77) | ✔ (0.76) | ❗ (0.45) | ❗ | ❗ (0.54) | ❗ (0.54) | ❗ (0.50) | ❗ (0.40) | ❗ _(DEPENDS ON MODEL)_ |
| • Single LLM call | ✔ (0.83) | ✔ (0.81) | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Multi-step flow generation | _EXPERIMENTAL_ | _EXPERIMENTAL_ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| Streaming | ✔ | ✔ | - | - | ✔ | ✔ | - | - | - | - | ✔ |
| Hallucination detection (SelfCheckGPT with AskLLM) | ✔ | ✔ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| AskLLM rails | | | | | | | | | | | |
| • Jailbreak detection | ✔ (0.88) | ✔ (0.88) | ✖ | ✖ | ✔ (0.85) | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Output moderation | ✔ | ✔ | ✖ | ✖ | ✔ (0.85) | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| • Fact-checking | ✔ (0.81) | ✔ (0.82) | ✔ (0.80) | ✖ | ✔ (0.83) | ✖ | ✖ | ✖ | ✖ | ✖ | ❗ _(DEPENDS ON MODEL)_ |
| AlignScore fact-checking _(LLM independent)_ | ✔ (0.89) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| ActiveFence moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Llama Guard moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Got It AI RAG TruthChecker _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Patronus Lynx RAG Hallucination detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| GCP Text Moderation _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Patronus Evaluate API _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Fiddler Fast Faithfulness Hallucination Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Fiddler Fast Safety & Jailbreak Detection _(LLM independent)_ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |

Table legend:
