Commit a0c334a

optimize model router&mapper (#1866)
1 parent 9e6bd6d commit a0c334a

4 files changed: +35 -28 lines changed

plugins/wasm-cpp/extensions/model_mapper/plugin.cc

+3 -2
@@ -44,8 +44,8 @@ static RegisterContextFactory register_ModelMapper(
 namespace {

 constexpr std::string_view SetDecoderBufferLimitKey =
-    "SetRequestBodyBufferLimit";
-constexpr std::string_view DefaultMaxBodyBytes = "10485760";
+    "set_decoder_buffer_limit";
+constexpr std::string_view DefaultMaxBodyBytes = "104857600";

 }  // namespace

@@ -166,6 +166,7 @@ FilterHeadersStatus PluginRootContext::onHeader(
   }
   removeRequestHeader(Wasm::Common::Http::Header::ContentLength);
   setFilterState(SetDecoderBufferLimitKey, DefaultMaxBodyBytes);
+  LOG_INFO(absl::StrCat("SetRequestBodyBufferLimit: ", DefaultMaxBodyBytes));
   return FilterHeadersStatus::StopIteration;
 }
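
For scale, the new default of 104857600 bytes is 100 MiB, ten times the previous 10485760 bytes (10 MiB). The sketch below checks that arithmetic at compile time and shows one way a decimal byte-count string such as `DefaultMaxBodyBytes` could be parsed. The names `kMiB` and `parseByteLimit` are hypothetical, introduced only for illustration; they are not part of the plugin, and how the host actually interprets the filter-state value is not shown in this commit.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical constant for illustration: bytes in one mebibyte.
constexpr std::uint64_t kMiB = 1024 * 1024;

// The old and new defaults from the diff, checked at compile time.
static_assert(10 * kMiB == 10485760, "old default is 10 MiB");
static_assert(100 * kMiB == 104857600, "new default is 100 MiB");

// Hypothetical helper (an assumption, not plugin code): parses a decimal
// byte-count string and rejects empty input or trailing non-digit characters.
std::optional<std::uint64_t> parseByteLimit(std::string_view text) {
  std::uint64_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  const auto result = std::from_chars(first, last, value);
  if (result.ec != std::errc() || result.ptr != last) {
    return std::nullopt;
  }
  return value;
}
```

Keeping the limit as a plain decimal string matches how the diff passes it to `setFilterState`; a stricter parser like this one would reject values such as `"10MB"` rather than silently truncating them.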

plugins/wasm-cpp/extensions/model_router/README.md

+3 -3
@@ -1,14 +1,14 @@
-# Feature Description
+## Feature Description
 The `model-router` plugin implements routing based on the model parameter in the LLM protocol

-# Configuration Fields
+## Configuration Fields

 | Name | Data Type | Requirement | Default Value | Description |
 | ----------- | --------------- | ----------------------- | ------ | ------------------------------------------- |
 | `modelKey` | string | Optional | model | Location of the model parameter in the request body |
 | `addProviderHeader` | string | Optional | - | Which request header receives the provider name parsed from the model parameter |
 | `modelToHeader` | string | Optional | - | Which request header receives the model parameter directly |
-| `enableOnPathSuffix` | array of string | Optional | ["/v1/chat/completions"] | Only takes effect for requests with these specific path suffixes |
+| `enableOnPathSuffix` | array of string | Optional | ["/v1/chat/completions"] | Only takes effect for requests with these specific path suffixes; can be set to "*" to match all paths |

 ## Runtime Properties
plugins/wasm-cpp/extensions/model_router/README_EN.md

+21 -21
@@ -1,31 +1,31 @@
-## Function Description
-The `model-router` plugin implements the function of routing based on the model parameter in the LLM protocol.
+## Feature Description
+The `model-router` plugin implements routing functionality based on the model parameter in LLM protocols.

 ## Configuration Fields

-| Name | Data Type | Filling Requirement | Default Value | Description |
-| ----------- | --------------- | ----------------------- | ------ | ------------------------------------------- |
-| `modelKey` | string | Optional | model | The location of the model parameter in the request body |
-| `addProviderHeader` | string | Optional | - | Which request header to place the provider name parsed from the model parameter |
-| `modelToHeader` | string | Optional | - | Which request header to directly place the model parameter |
-| `enableOnPathSuffix` | array of string | Optional | ["/v1/chat/completions"] | Only effective for requests with these specific path suffixes |
+| Name | Data Type | Requirement | Default Value | Description |
+| ----------- | --------------- | ----------------------- | ------ | ------------------------------------------- |
+| `modelKey` | string | Optional | model | Location of the model parameter in the request body |
+| `addProviderHeader` | string | Optional | - | Which request header to add the provider name parsed from the model parameter |
+| `modelToHeader` | string | Optional | - | Which request header to directly add the model parameter to |
+| `enableOnPathSuffix` | array of string | Optional | ["/v1/chat/completions"] | Only effective for requests with these specific path suffixes, can be configured as "*" to match all paths |

-## Runtime Attributes
+## Runtime Properties

 Plugin execution phase: Authentication phase
 Plugin execution priority: 900

 ## Effect Description

-### Routing Based on the model Parameter
+### Routing Based on Model Parameter

-The following configuration is required:
+The following configuration is needed:

 ```yaml
 modelToHeader: x-higress-llm-model
 ```

-The plugin will extract the model parameter from the request and set it in the x-higress-llm-model request header, which can be used for subsequent routing. For example, the original LLM request body:
+The plugin extracts the model parameter from the request and sets it to the x-higress-llm-model request header for subsequent routing. For example, the original LLM request body is:

 ```json
 {
@@ -35,29 +35,29 @@ The plugin will extract the model parameter from the request and set it in the x
     "stream": false,
     "messages": [{
       "role": "user",
-      "content": "What is the GitHub address of the main repository for the higress project"
+      "content": "What is the GitHub address of the Higress project's main repository?"
     }],
     "presence_penalty": 0,
     "temperature": 0.7,
     "top_p": 0.95
 }
 ```

-After processing by this plugin, the following request header (which can be used for route matching) will be added:
+After processing by this plugin, the following request header will be added (can be used for route matching):

 x-higress-llm-model: qwen-long

-### Extracting the provider Field from the model Parameter for Routing
+### Extracting Provider Field from Model Parameter for Routing

-> Note that this mode requires the client to specify the provider using a `/` separator in the model parameter.
+> Note that this mode requires the client to specify the provider in the model parameter using the `/` delimiter

-The following configuration is required:
+The following configuration is needed:

 ```yaml
 addProviderHeader: x-higress-llm-provider
 ```

-The plugin will extract the provider part (if present) from the model parameter in the request and set it in the x-higress-llm-provider request header, which can be used for subsequent routing, and rewrite the model parameter to the model name part. For example, the original LLM request body:
+The plugin extracts the provider part (if any) from the model parameter in the request, sets it to the x-higress-llm-provider request header for subsequent routing, and rewrites the model parameter to only contain the model name part. For example, the original LLM request body is:

 ```json
 {
@@ -67,15 +67,15 @@ The plugin will extract the provider part (if present) from the model parameter
     "stream": false,
     "messages": [{
       "role": "user",
-      "content": "What is the GitHub address of the main repository for the higress project"
+      "content": "What is the GitHub address of the Higress project's main repository?"
     }],
     "presence_penalty": 0,
     "temperature": 0.7,
     "top_p": 0.95
 }
 ```

-After processing by this plugin, the following request header (which can be used for route matching) will be added:
+After processing by this plugin, the following request header will be added (can be used for route matching):

 x-higress-llm-provider: dashscope

@@ -89,7 +89,7 @@ The original LLM request body will be changed to:
     "stream": false,
     "messages": [{
       "role": "user",
-      "content": "What is the GitHub address of the main repository for the higress project"
+      "content": "What is the GitHub address of the Higress project's main repository?"
     }],
     "presence_penalty": 0,
     "temperature": 0.7,
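
The provider extraction described in the README above (a `/` delimiter inside the model parameter) can be sketched as follows. This is a minimal illustration, not the plugin's code: `ParsedModel` and `splitProvider` are hypothetical names, the input value `dashscope/qwen-long` is inferred from the header and model values shown in the README, and splitting on the first `/` is an assumption about how ambiguous inputs would be handled.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for illustration only.
struct ParsedModel {
  std::optional<std::string> provider;  // empty when no "/" is present
  std::string model;                    // model name with the provider removed
};

// Hypothetical helper: splits "provider/model" at the first "/".
// When there is no "/", the whole value is treated as the model name.
ParsedModel splitProvider(std::string_view model_param) {
  const auto pos = model_param.find('/');
  if (pos == std::string_view::npos) {
    return ParsedModel{std::nullopt, std::string(model_param)};
  }
  return ParsedModel{std::string(model_param.substr(0, pos)),
                     std::string(model_param.substr(pos + 1))};
}
```

With this sketch, `splitProvider("dashscope/qwen-long")` yields the provider `dashscope` (the value that would go into the `x-higress-llm-provider` header) and the model `qwen-long` (the value the body would be rewritten to), while a plain `qwen-long` leaves the provider unset.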

plugins/wasm-cpp/extensions/model_router/plugin.cc

+8 -2
@@ -44,8 +44,8 @@ static RegisterContextFactory register_ModelRouter(
 namespace {

 constexpr std::string_view SetDecoderBufferLimitKey =
-    "SetRequestBodyBufferLimit";
-constexpr std::string_view DefaultMaxBodyBytes = "10485760";
+    "set_decoder_buffer_limit";
+constexpr std::string_view DefaultMaxBodyBytes = "104857600";

 }  // namespace

@@ -137,6 +137,11 @@ FilterHeadersStatus PluginRootContext::onHeader(
   }
   bool enable = false;
   for (const auto& enable_suffix : rule.enable_on_path_suffix_) {
+    // Support wildcard "*" to enable for all paths
+    if (enable_suffix == "*") {
+      enable = true;
+      break;
+    }
     if (absl::EndsWith({path.c_str(), uri_end}, enable_suffix)) {
       enable = true;
       break;
@@ -153,6 +158,7 @@ FilterHeadersStatus PluginRootContext::onHeader(
   }
   removeRequestHeader(Wasm::Common::Http::Header::ContentLength);
   setFilterState(SetDecoderBufferLimitKey, DefaultMaxBodyBytes);
+  LOG_INFO(absl::StrCat("SetRequestBodyBufferLimit: ", DefaultMaxBodyBytes));
   return FilterHeadersStatus::StopIteration;
 }
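
The wildcard check added to the suffix loop above can be exercised in isolation with a standard-library-only sketch. `pathEnabled` is a hypothetical helper, not a function in the plugin. It assumes that `uri_end` in the diff marks where the query string begins, so the query is stripped before matching, and it uses a manual suffix comparison in place of `absl::EndsWith`.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not plugin code): returns true when the request path,
// ignoring any query string, ends with one of the configured suffixes, or
// when the suffix list contains the "*" wildcard.
bool pathEnabled(std::string_view path,
                 const std::vector<std::string>& suffixes) {
  // Assumption: matching ignores the query string, so drop everything
  // from the first '?' onward.
  const auto query_pos = path.find('?');
  if (query_pos != std::string_view::npos) {
    path = path.substr(0, query_pos);
  }
  for (const auto& suffix : suffixes) {
    // Wildcard: enabled for every path.
    if (suffix == "*") {
      return true;
    }
    // Manual equivalent of an "ends with" check.
    if (path.size() >= suffix.size() &&
        path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0) {
      return true;
    }
  }
  return false;
}
```

Checking the wildcard before the suffix comparison mirrors the order in the diff: a single `"*"` entry short-circuits the loop regardless of the other configured suffixes.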
