feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support #1248
base: main
Commits:
update
update: note: do not use TLS when using the HTTP protocol
update: add lobechat
add: makefile for ai-proxy
fix bugs
fix bugs
fix: redis connection
fix: dashvector and dashscope cluster
fix: change vdb collection
feat: add chroma logic
docs: add API description
update: no callback version
fix: change to callback
fix: finish chrome
remove: key
update: gitignore
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1248      +/-   ##
==========================================
+ Coverage   35.91%   44.22%   +8.31%
==========================================
  Files          69       75       +6
  Lines       11576     9823    -1753
==========================================
+ Hits         4157     4344     +187
+ Misses       7104     5150    -1954
- Partials      315      329      +14
fix: remove key
…to feat/chroma
@@ -66,7 +66,8 @@ func onHttpRequestHeader(ctx wrapper.HttpContext, pluginConfig config.PluginConf
apiName := getOpenAiApiName(path.Path)
if apiName == "" {
log.Debugf("[onHttpRequestHeader] unsupported path: %s", path.Path)
_ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
// _ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
log.Debugf("[onHttpRequestHeader] no send response")
Review comment: Why was this changed?
@@ -0,0 +1,4 @@
.DEFAULT:
Review comment: Files used only for testing should not be committed.
@@ -1,5 +1,5 @@
# File generated by hgctl. Modify as required.

docker-compose-test/
Review comment: Seeing this .gitignore entry next to the docker-compose-test/envoy.yaml committed above is a bit comical...
}

func (c *ProviderConfig) FromJson(json gjson.Result) {
	c.typ = json.Get("VectorStoreProviderType").String()
Review comment: The capitalization of these field names could be made more consistent.
Review comment: Also, could the timeout be configured in one place? What is the benefit of giving each provider its own timeout field? And how was each provider's default timeout determined?
c.PineconeThreshold = json.Get("PineconeThreshold").Float()
if c.PineconeThreshold == 0 {
	c.PineconeThreshold = 0.5
}
Review comment: I think this config-parsing logic could be split out into each provider's initializer. Please discuss among yourselves how best to structure this part of the code.
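The restructuring suggested above could look roughly like the sketch below, where each provider parses only its own fields. Everything here is an assumption for illustration: `providerInitializer`, `pineconeInitializer`, `initializers`, and `newProvider` are hypothetical names, and a plain `map[string]any` stands in for the `gjson.Result` used in the PR so that the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// providerInitializer is a hypothetical interface; the PR does not define it.
type providerInitializer interface {
	// parseConfig reads only the fields that belong to this provider.
	parseConfig(raw map[string]any) error
}

// pineconeInitializer holds the Pinecone-specific settings (hypothetical type).
type pineconeInitializer struct {
	threshold float64
}

func (p *pineconeInitializer) parseConfig(raw map[string]any) error {
	// A missing or non-numeric value yields 0, which falls back to the default.
	v, _ := raw["PineconeThreshold"].(float64)
	if v == 0 {
		v = 0.5 // same default as in the diff hunk above
	}
	p.threshold = v
	return nil
}

// initializers maps a provider type name to a factory for its initializer.
var initializers = map[string]func() providerInitializer{
	"pinecone": func() providerInitializer { return &pineconeInitializer{} },
}

// newProvider looks up the initializer for typ and lets it parse its own config.
func newProvider(typ string, raw map[string]any) (providerInitializer, error) {
	factory, ok := initializers[typ]
	if !ok {
		return nil, errors.New("unknown vector store provider: " + typ)
	}
	p := factory()
	if err := p.parseConfig(raw); err != nil {
		return nil, err
	}
	return p, nil
}

func main() {
	p, err := newProvider("pinecone", map[string]any{})
	fmt.Println(p.(*pineconeInitializer).threshold, err)
}
```

Under this layout the shared `FromJson` would only read the provider type and delegate the rest, so adding a provider would not touch the common parsing code.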
return
}

d.client.Post(
Review comment: The error returned by Post needs to be handled.
},
requestBody,
func(statusCode int, responseHeaders http.Header, responseBody []byte) {
	log.Infof("Query embedding response: %d, %s", statusCode, responseBody)
Review comment: This would be better logged at Debug level.
Review comment: The vector and embedding code is fairly generic. I suggest moving it out of the ai-cache directory and into the ai-utils directory under wasm-go.
Ⅰ. Describe what this PR did
This PR extends the functionality of the ai-cache plugin, enabling more efficient AI application development by introducing vector similarity-based caching and recall mechanisms.
Ⅱ. Does this pull request fix one issue?
Please refer to issues #1040 and #1041.
Ⅲ. Why don't you add test cases (unit test/integration test)?
Test cases will be added later.
Ⅳ. Describe how to verify it
After filling in the apikey and ChromaCollectionID in docker-compose-test/envoy.yaml, execute the following commands:
cd docker-compose-test/
docker compose up
Then test it by accessing the LLM via cURL:
Ⅴ. Special notes for reviews