feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support #1248

Draft
wants to merge 16 commits into base: main

Conversation

EnableAsync
Contributor

@EnableAsync EnableAsync commented Aug 25, 2024

Ⅰ. Describe what this PR did

This PR extends the functionality of the ai-cache plugin, enabling more efficient AI application development by introducing vector similarity-based caching and recall mechanisms.

Ⅱ. Does this pull request fix one issue?

Please refer to issues #1040 and #1041.

Ⅲ. Why don't you add test cases (unit test/integration test)?

Test cases will be added later.

Ⅳ. Describe how to verify it

After filling in the apikey and ChromaCollectionID in docker-compose-test/envoy.yaml, run the following commands:

cd docker-compose-test/
docker compose up

Then test it by accessing the LLM via cURL:

curl http://172.17.0.1:10000/v1/chat/completions -X POST -d '{"model":"gpt-4o-mini","messages":[{"content":"今天中午吃什么","role":"user"}]}' -H "Content-Type: application/json"

Ⅴ. Special notes for reviews

johnlanni and others added 8 commits August 1, 2024 15:09
update

update: note that TLS must not be used when using the HTTP protocol

update: add lobechat

add: makefile for ai-proxy

fix bugs

fix bugs

fix: redis connection

fix: dashvector and dashscope cluster

fix: change vdb collection

feat: add chroma logic

docs: add API documentation

update: no callback version

fix: change to callback

fix: finish chrome

remove: key

update: gitignore
@CLAassistant

CLAassistant commented Aug 25, 2024

CLA assistant check
All committers have signed the CLA.

@EnableAsync EnableAsync changed the title feat: Add Chroma vector database support to ai-cache WASM Plugin feat: Add ai-cache WASM Plugin Aug 25, 2024
@EnableAsync EnableAsync changed the title feat: Add ai-cache WASM Plugin feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support Aug 25, 2024
@codecov-commenter

codecov-commenter commented Sep 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 44.22%. Comparing base (ef31e09) to head (a40f5e9).
Report is 89 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1248      +/-   ##
==========================================
+ Coverage   35.91%   44.22%   +8.31%     
==========================================
  Files          69       75       +6     
  Lines       11576     9823    -1753     
==========================================
+ Hits         4157     4344     +187     
+ Misses       7104     5150    -1954     
- Partials      315      329      +14     

see 90 files with indirect coverage changes

@@ -66,7 +66,8 @@ func onHttpRequestHeader(ctx wrapper.HttpContext, pluginConfig config.PluginConf
 	apiName := getOpenAiApiName(path.Path)
 	if apiName == "" {
 		log.Debugf("[onHttpRequestHeader] unsupported path: %s", path.Path)
-		_ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
+		// _ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
+		log.Debugf("[onHttpRequestHeader] no send response")
Collaborator

Why was this changed?

@@ -0,0 +1,4 @@
.DEFAULT:
Collaborator

Files used only for testing should not be committed.

@@ -1,5 +1,5 @@
# File generated by hgctl. Modify as required.

docker-compose-test/
Collaborator

Seeing this .gitignore entry next to the docker-compose-test/envoy.yaml committed above is a bit amusing...

}

func (c *ProviderConfig) FromJson(json gjson.Result) {
	c.typ = json.Get("VectorStoreProviderType").String()
Collaborator

The capitalization of these field names could be made more consistent.

Collaborator

Also, could timeout be configured in one place? What is the benefit of each provider having its own timeout field? And how was each provider's default timeout determined?

c.PineconeThreshold = json.Get("PineconeThreshold").Float()
if c.PineconeThreshold == 0 {
	c.PineconeThreshold = 0.5
}
Collaborator

I think this config parsing logic could be split into each Provider's initializer. You could discuss among yourselves how best to structure this code.
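A possible shape for the per-provider initializer suggested above is sketched below. It is an illustration, not the PR's code: `rawConfig` (a plain map standing in for `gjson.Result`), `providerInitializer`, `pineconeInitializer`, and `NewProvider` are hypothetical names. Only the `VectorStoreProviderType` and `PineconeThreshold` keys and the 0.5 default come from the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// rawConfig is a stand-in for gjson.Result so the sketch needs no third-party imports.
type rawConfig map[string]any

// Provider is a minimal hypothetical provider interface.
type Provider interface {
	Type() string
}

// providerInitializer parses and validates only the fields of one provider.
type providerInitializer interface {
	Parse(raw rawConfig) (Provider, error)
}

type pineconeProvider struct {
	threshold float64
}

func (p *pineconeProvider) Type() string { return "pinecone" }

type pineconeInitializer struct{}

// Parse owns the Pinecone-specific keys and defaults.
func (pineconeInitializer) Parse(raw rawConfig) (Provider, error) {
	threshold, ok := raw["PineconeThreshold"].(float64)
	if !ok {
		// Default taken from the diff. Checking presence (rather than == 0)
		// keeps an explicitly configured 0 intact.
		threshold = 0.5
	}
	return &pineconeProvider{threshold: threshold}, nil
}

// initializers maps the provider type string to its initializer.
var initializers = map[string]providerInitializer{
	"pinecone": pineconeInitializer{},
}

// NewProvider reads only the shared type field, then delegates the rest.
func NewProvider(raw rawConfig) (Provider, error) {
	typ, _ := raw["VectorStoreProviderType"].(string)
	if typ == "" {
		return nil, errors.New("missing VectorStoreProviderType")
	}
	initializer, ok := initializers[typ]
	if !ok {
		return nil, fmt.Errorf("unknown vector store provider: %s", typ)
	}
	return initializer.Parse(raw)
}

func main() {
	p, err := NewProvider(rawConfig{"VectorStoreProviderType": "pinecone"})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("created provider:", p.Type())
}
```

Adding another backend then means registering one more initializer rather than growing a shared `FromJson` method.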

return
}

d.client.Post(
Collaborator

The error returned by Post needs to be handled.
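To illustrate the point about the ignored return value: if the dispatch itself fails, the response callback never runs, so the caller has to be told through the returned error instead. The sketch below uses a hypothetical `poster` interface and `queryEmbedding` helper that only mimic the callback style visible in the diff. The path `/embeddings` and the exact `Post` signature are assumptions, not the wrapper's documented API.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// responseCallback mirrors the callback shape visible in the diff.
type responseCallback func(statusCode int, responseHeaders http.Header, responseBody []byte)

// poster is a hypothetical stand-in for the HTTP client used in the plugin.
type poster interface {
	Post(path string, headers [][2]string, body []byte, cb responseCallback) error
}

// failingClient simulates a dispatch failure: it returns an error
// and never invokes the callback.
type failingClient struct{}

func (failingClient) Post(string, [][2]string, []byte, responseCallback) error {
	return errors.New("cluster not found")
}

// queryEmbedding reports both dispatch errors and bad responses through onDone.
func queryEmbedding(client poster, body []byte, onDone func(result []byte, err error)) {
	err := client.Post(
		"/embeddings", // hypothetical path
		[][2]string{{"Content-Type", "application/json"}},
		body,
		func(statusCode int, _ http.Header, responseBody []byte) {
			if statusCode != http.StatusOK {
				onDone(nil, fmt.Errorf("embedding service returned status %d", statusCode))
				return
			}
			onDone(responseBody, nil)
		},
	)
	if err != nil {
		// The request was never sent, so the callback above will not fire.
		onDone(nil, fmt.Errorf("dispatch embedding request: %w", err))
	}
}

func main() {
	queryEmbedding(failingClient{}, []byte(`{"input":"hello"}`), func(_ []byte, err error) {
		fmt.Println("error:", err)
	})
}
```

In a plugin, the error branch is also where the paused request would be resumed or a fallback response sent, so the request does not hang.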

},
requestBody,
func(statusCode int, responseHeaders http.Header, responseBody []byte) {
log.Infof("Query embedding response: %d, %s", statusCode, responseBody)
Collaborator

Debug level would be a better fit for this log.

Collaborator

@johnlanni johnlanni left a comment

The vector and embedding code is fairly generic. Consider moving it out of the ai-cache directory into an ai-utils directory under wasm-go.

6 participants