[ai-json-resp] Extract JSON from LLM, Validate with Schema, Ensure Valid JSON, Auto-Retry #1236

Merged
merged 26 commits into from
Sep 3, 2024
Changes from 13 commits
Commits
47f67be
init for ai-json-resp
Suchun-sv Aug 21, 2024
fa3ab93
add errcode and errmsg
Suchun-sv Aug 22, 2024
90d7bae
Merge branch 'main' into ai-json-resp-fix
Suchun-sv Aug 22, 2024
f10f3e4
add error message
Suchun-sv Aug 22, 2024
8ecee8f
add support for the openai compatible service
Suchun-sv Aug 22, 2024
5a6df3c
test qwen, passed
Suchun-sv Aug 23, 2024
daa88b0
add support for serviceUrl && update README.md
Suchun-sv Aug 23, 2024
0413faf
update README.md
Suchun-sv Aug 23, 2024
5bff01d
add verification for json schema depth
Suchun-sv Aug 23, 2024
3e8d365
Merge branch 'alibaba:main' into ai-json-resp-fix
Suchun-sv Aug 23, 2024
84ac205
Merge branch 'main' into ai-json-resp-fix
Suchun-sv Aug 26, 2024
afb3f34
Optimize variable naming
Suchun-sv Aug 26, 2024
eab1419
Optimize variable naming
Suchun-sv Aug 26, 2024
2edd4c9
fix bugs
Suchun-sv Aug 27, 2024
e629d7d
Merge branch 'main' into ai-json-resp-fix
Suchun-sv Aug 27, 2024
f105ccd
fix bugs
Suchun-sv Aug 30, 2024
10cd442
Merge branch 'main' into ai-json-resp-fix
Suchun-sv Aug 30, 2024
9a308ba
fix bugs
Suchun-sv Aug 30, 2024
9b1c688
Merge branch 'main' into ai-json-resp-fix
CH3CHO Aug 30, 2024
af43538
fix bugs
Suchun-sv Aug 30, 2024
0f6e903
add comment
Suchun-sv Aug 30, 2024
97bcca1
delete maxDepth config
Suchun-sv Sep 1, 2024
55b56ee
change info when maxDepth exceeded
Suchun-sv Sep 1, 2024
9403655
Merge branch 'main' into ai-json-resp-fix
Suchun-sv Sep 1, 2024
9dfedce
change JSON to Json to ensure the consistency
Suchun-sv Sep 1, 2024
6e53578
Update README.md
johnlanni Sep 3, 2024
199 changes: 199 additions & 0 deletions plugins/wasm-go/extensions/ai-json-resp/README.md
@@ -0,0 +1,199 @@
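## Introduction

**Note**

> The data-plane proxy wasm version must be 0.2.100 or later.
>

> When compiling, add the version tag, for example: tinygo build -o main.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer proxy_wasm_version_0_2_100" ./
>

LLM response structuring plugin. It structures the AI response according to a default or user-configured JSON Schema so that downstream plugins can process it. Note that only `non-streaming responses` are currently supported.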

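### Configuration

| Name | Type | Requirement | Default | **Description** |
| --- | --- | --- | --- | --- |
| serviceName | str | required | - | Name of the AI service, or of a gateway service that supports AI-Proxy |
| serviceDomain | str | optional | - | Domain or IP address of the AI service, or of a gateway service that supports AI-Proxy |
| servicePath | str | optional | '/v1/chat/completions' | Base path of the AI service, or of a gateway service that supports AI-Proxy |
| serviceUrl | str | optional | - | URL of the AI service, or of a gateway service that supports AI-Proxy. The plugin extracts the domain and path from it and uses them to fill in serviceDomain or servicePath when they are not configured |
| servicePort | int | optional | 443 | Gateway service port |
| serviceTimeout | int | optional | 50000 | Default request timeout |
| maxRetry | int | optional | 3 | Number of retries when the answer cannot be extracted and formatted correctly |
| contentPath | str | optional | "choices.0.message.content" | gjson path used to extract the result from the LLM answer |
| jsonSchema | str (json) | optional | - | JSON Schema to validate against. If empty, the plugin only checks that the response is valid JSON and returns it |
| enableSwagger | bool | optional | false | Whether to enable validation with the Swagger protocol |
| enableOas3 | bool | optional | true | Whether to enable validation with the Oas3 protocol |
| jsonSchemaMaxDepth | int | optional | 5 | Because of plugin performance limits, and to keep recursion from exhausting resources, the maximum supported JSON Schema depth must be specified. A Schema deeper than this is not used to validate the response |
| rejectOnDepthExceeded | bool | optional | false | If true, the plugin returns an error immediately when the JSON Schema depth exceeds jsonSchemaMaxDepth. If false, the JSON Schema is still used to build the LLM prompt and execution continues |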

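### Request and Response Parameters

- **Request parameters**: The plugin accepts requests in the OpenAI request format, containing the `model` and `messages` fields. `model` is the AI model name and `messages` is the list of conversation messages. Each message has a `role` field (the message role) and a `content` field (the message content).
  ```json
  {
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "give me a api doc for add the variable x to x+5"}
    ]
  }
  ```
  For other request parameters, refer to the documentation of the configured AI service or gateway service.
- **Response parameters**:
  - Returns a `JSON response` that satisfies the constraints of the defined JSON Schema.
  - If no JSON Schema is defined, returns a valid `JSON response`.
  - If an internal error occurs, returns `{ "Code": 10XX, "Msg": "error message" }`.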

## Request Example

```bash
curl -X POST "http://localhost:8001/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "give me a api doc for add the variable x to x+5"}
    ]
  }'
```

## Response Examples
### Normal Response
Under normal conditions, the plugin returns JSON data that has been validated against the JSON Schema. If no JSON Schema is configured, it returns data that is valid JSON.
```json
{
  "apiVersion": "1.0",
  "request": {
    "endpoint": "/add_to_five",
    "method": "POST",
    "port": 8080,
    "headers": {
      "Content-Type": "application/json"
    },
    "body": {
      "x": 7
    }
  }
}
```

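### Error Response
When an error occurs, the response status code is `500` and the body is a JSON error message with two fields: the error code `Code` and the error message `Msg`.
```json
{
  "Code": 1006,
  "Msg": "retry count exceed max retry count"
}
```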

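### Error Codes
| Error code | Description |
| --- | --- |
| 1001 | The configured JSON Schema is not valid JSON |
| 1002 | The configured JSON Schema failed to compile: it is not a valid JSON Schema, or its depth exceeds jsonSchemaMaxDepth while rejectOnDepthExceeded is true |
| 1003 | No valid JSON could be extracted from the response |
| 1004 | The response is an empty string |
| 1005 | The response does not conform to the JSON Schema definition |
| 1006 | The retry count exceeded the maximum limit |
| 1007 | The response content could not be obtained; the upstream service may be misconfigured, or the contentPath used to get the content may be wrong |
| 1008 | serviceDomain is empty; note that serviceDomain and serviceUrl cannot both be empty |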

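## Service Configuration
The plugin needs an upstream service to be configured so that it can retry automatically when an error occurs. The supported configurations are mainly an `AI service that supports the OpenAI interface` or a `local gateway service`.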

### AI Service That Supports the OpenAI Interface
Taking qwen as an example, the basic configuration is as follows:

YAML configuration:
```yaml
serviceName: qwen
serviceDomain: dashscope.aliyuncs.com
apiKey: "[Your API Key]"
servicePath: /compatible-mode/v1/chat/completions
jsonSchema:
  title: ReasoningSchema
  type: object
  properties:
    reasoning_steps:
      type: array
      items:
        type: string
      description: The reasoning steps leading to the final conclusion.
    answer:
      type: string
      description: The final answer, taking into account the reasoning steps.
  required:
    - reasoning_steps
    - answer
  additionalProperties: false
```

JSON configuration:
```json
{
  "serviceName": "qwen",
  "serviceUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
  "apiKey": "[Your API Key]",
  "jsonSchema": {
    "title": "ActionItemsSchema",
    "type": "object",
    "properties": {
      "action_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {
              "type": "string",
              "description": "Description of the action item."
            },
            "due_date": {
              "type": ["string", "null"],
              "description": "Due date for the action item, can be null if not specified."
            },
            "owner": {
              "type": ["string", "null"],
              "description": "Owner responsible for the action item, can be null if not specified."
            }
          },
          "required": ["description", "due_date", "owner"],
          "additionalProperties": false
        },
        "description": "List of action items from the meeting."
      }
    },
    "required": ["action_items"],
    "additionalProperties": false
  }
}
```

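### Local Gateway Service
To reuse services that are already configured, the plugin also supports a local gateway service. For example, if the gateway already has the [AI-proxy service](../ai-proxy/README.md) configured, the plugin can be configured directly as follows:
1. Create a service with the fixed IP 127.0.0.1, for example localservice.static
```yaml
- name: outbound|10000||localservice.static
  connect_timeout: 30s
  type: LOGICAL_DNS
  dns_lookup_family: V4_ONLY
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: outbound|8001||localservice.static
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 127.0.0.1
              port_value: 10000
```
2. Add the service configuration for localservice.static to the configuration file
```yaml
serviceName: localservice
serviceDomain: 127.0.0.1
servicePort: 10000
```
3. Automatic extraction of the request Path, Header, and other information
The plugin automatically extracts the Path, Header, and other information from the request, so the AI service does not have to be configured a second time.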
21 changes: 21 additions & 0 deletions plugins/wasm-go/extensions/ai-json-resp/go.mod
@@ -0,0 +1,21 @@
module github.com/alibaba/higress/plugins/wasm-go/extensions/hello-world

go 1.18

replace github.com/alibaba/higress/plugins/wasm-go => ../..

require (
	github.com/alibaba/higress/plugins/wasm-go v1.4.2
	github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f
)

require (
	github.com/google/uuid v1.3.0 // indirect
	github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520 // indirect
	github.com/magefile/mage v1.14.0 // indirect
	github.com/santhosh-tekuri/jsonschema v1.2.4 // indirect
	github.com/tidwall/gjson v1.14.3 // indirect
	github.com/tidwall/match v1.1.1 // indirect
	github.com/tidwall/pretty v1.2.0 // indirect
	github.com/tidwall/resp v0.1.1 // indirect
)
26 changes: 26 additions & 0 deletions plugins/wasm-go/extensions/ai-json-resp/go.sum
@@ -0,0 +1,26 @@
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/google/uuid v1.3.0 h1:t6JiXgmwXMjEs8VusXIJk2BXHsn+wx8BZdTaoZ5fu7I=
github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520 h1:IHDghbGQ2DTIXHBHxWfqCYQW1fKjyJ/I7W1pMyUDeEA=
github.com/higress-group/nottinygc v0.0.0-20231101025119-e93c4c2f8520/go.mod h1:Nz8ORLaFiLWotg6GeKlJMhv8cci8mM43uEnLA5t8iew=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240226064518-b3dc4646a35a h1:luYRvxLTE1xYxrXYj7nmjd1U0HHh8pUPiKfdZ0MhCGE=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240226064518-b3dc4646a35a/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240318034951-d5306e367c43/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240327114451-d6b7174a84fc/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f h1:ZIiIBRvIw62gA5MJhuwp1+2wWbqL9IGElQ499rUsYYg=
github.com/higress-group/proxy-wasm-go-sdk v0.0.0-20240711023527-ba358c48772f/go.mod h1:hNFjhrLUIq+kJ9bOcs8QtiplSQ61GZXtd2xHKx4BYRo=
github.com/magefile/mage v1.14.0 h1:6QDX3g6z1YvJ4olPhT1wksUcSa/V0a1B+pJb73fBjyo=
github.com/magefile/mage v1.14.0/go.mod h1:z5UZb/iS3GoOSn0JgWuiw7dxlurVYTu+/jHXqQg881A=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/santhosh-tekuri/jsonschema v1.2.4 h1:hNhW8e7t+H1vgY+1QeEQpveR6D4+OwKPXCfD2aieJis=
github.com/santhosh-tekuri/jsonschema v1.2.4/go.mod h1:TEAUOeZSmIxTTuHatJzrvARHiuO9LYd+cIxzgEHCQI4=
github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk=
github.com/tidwall/gjson v1.14.3 h1:9jvXn7olKEHU1S9vwoMGliaT8jq1vJ7IH/n9zD9Dnlw=
github.com/tidwall/gjson v1.14.3/go.mod h1:/wbyibRr2FHMks5tjHJ5F8dMZh3AcwJEMf5vlfC0lxk=
github.com/tidwall/match v1.1.1 h1:+Ho715JplO36QYgwN9PGYNhgZvoUSc9X2c80KVTi+GA=
github.com/tidwall/match v1.1.1/go.mod h1:eRSPERbgtNPcGhD8UCthc6PmLEQXEWd3PRB5JTxsfmM=
github.com/tidwall/pretty v1.2.0 h1:RWIZEg2iJ8/g6fDDYzMpobmaoGh5OLl4AXtGUGPcqCs=
github.com/tidwall/pretty v1.2.0/go.mod h1:ITEVvHYasfjBbM0u2Pg8T2nJnzm8xPwvNhhsoaGGjNU=
github.com/tidwall/resp v0.1.1 h1:Ly20wkhqKTmDUPlyM1S7pWo5kk0tDu8OoC/vFArXmwE=
github.com/tidwall/resp v0.1.1/go.mod h1:3/FrruOBAxPTPtundW0VXgmsQ4ZBA0Aw714lVYgwFa0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=