feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support #1248

EnableAsync · 2024-08-25T06:46:25Z

Ⅰ. Describe what this PR did

This PR extends the functionality of the ai-cache plugin, enabling more efficient AI application development by introducing vector similarity-based caching and recall mechanisms.
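
As a rough illustration of what "vector similarity-based recall" means here, the sketch below looks up a cached answer by cosine similarity against a threshold. It is not the plugin's code: `cacheEntry`, `cosineSimilarity`, and `recall` are hypothetical names, and a real deployment would delegate the nearest-neighbour search to the vector database rather than scan entries in memory.

```go
package main

import (
	"fmt"
	"math"
)

// cacheEntry pairs a stored query embedding with the cached LLM answer.
// Hypothetical type for illustration; not taken from the plugin.
type cacheEntry struct {
	embedding []float64
	answer    string
}

// cosineSimilarity returns the cosine of the angle between a and b,
// or 0 when the lengths differ, the input is empty, or a norm is zero.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// recall returns the answer of the most similar entry whose similarity
// is at least threshold; ok is false on a cache miss.
func recall(entries []cacheEntry, query []float64, threshold float64) (answer string, ok bool) {
	best := threshold
	for _, e := range entries {
		if s := cosineSimilarity(e.embedding, query); s >= best {
			best, answer, ok = s, e.answer, true
		}
	}
	return answer, ok
}

func main() {
	entries := []cacheEntry{
		{embedding: []float64{1, 0, 0}, answer: "cached answer A"},
		{embedding: []float64{0, 1, 0}, answer: "cached answer B"},
	}
	answer, ok := recall(entries, []float64{0.9, 0.1, 0}, 0.8)
	fmt.Println(answer, ok)
}
```

On a miss the request would be forwarded to the LLM and the new embedding and answer stored for later recall.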

Ⅱ. Does this pull request fix one issue?

Please refer to issues #1040 and #1041.

Ⅲ. Why don't you add test cases (unit test/integration test)?

Test cases will be added later.

Ⅳ. Describe how to verify it

After filling in the apikey and ChromaCollectionID in docker-compose-test/envoy.yaml, run the following commands:

cd docker-compose-test/
docker compose up

Then test it by accessing the LLM via cURL:

curl http://172.17.0.1:10000/v1/chat/completions -X POST -d '{"model":"gpt-4o-mini","messages":[{"content":"今天中午吃什么","role":"user"}]}' -H "Content-Type: application/json"

Ⅴ. Special notes for reviews

update
update: do not use TLS when using the HTTP protocol
update: add lobechat
add: makefile for ai-proxy
fix bugs
fix bugs
fix: redis connection
fix: dashvector and dashscope cluster
fix: change vdb collection
feat: add chroma logic
docs: add API documentation
update: no callback version
fix: change to callback
fix: finish chrome
remove: key
update: gitignore

CLAassistant · 2024-08-25T06:46:32Z

All committers have signed the CLA.

codecov-commenter · 2024-09-06T03:34:11Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 44.22%. Comparing base (ef31e09) to head (a40f5e9).
Report is 89 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1248      +/-   ##
==========================================
+ Coverage   35.91%   44.22%   +8.31%     
==========================================
  Files          69       75       +6     
  Lines       11576     9823    -1753     
==========================================
+ Hits         4157     4344     +187     
+ Misses       7104     5150    -1954     
- Partials      315      329      +14

see 90 files with indirect coverage changes


CH3CHO · 2024-09-08T06:03:10Z

plugins/wasm-go/extensions/ai-proxy/main.go

@@ -66,7 +66,8 @@ func onHttpRequestHeader(ctx wrapper.HttpContext, pluginConfig config.PluginConf
 	apiName := getOpenAiApiName(path.Path)
 	if apiName == "" {
 		log.Debugf("[onHttpRequestHeader] unsupported path: %s", path.Path)
-		_ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
+		// _ = util.SendResponse(404, "ai-proxy.unknown_api", util.MimeTypeTextPlain, "API not found: "+path.Path)
+		log.Debugf("[onHttpRequestHeader] no send response")


Why is this change needed?

CH3CHO · 2024-09-08T06:03:44Z

plugins/wasm-go/extensions/ai-proxy/Makefile

@@ -0,0 +1,4 @@
+.DEFAULT:


Test-only files should not be committed.

CH3CHO · 2024-09-08T06:04:56Z

plugins/wasm-go/extensions/ai-cache/.gitignore

@@ -1,5 +1,5 @@
 # File generated by hgctl. Modify as required.
-
+docker-compose-test/


Seen alongside the docker-compose-test/envoy.yaml committed above, this .gitignore entry is rather amusing...

CH3CHO · 2024-09-08T07:30:56Z

plugins/wasm-go/extensions/ai-cache/vector/provider.go

+}
+
+func (c *ProviderConfig) FromJson(json gjson.Result) {
+	c.typ = json.Get("VectorStoreProviderType").String()


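The casing of these field names could be made more consistent.

Also, could timeout be configured in one place? What is the benefit of giving each provider its own timeout field? And how is each provider's default timeout determined?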
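
One possible shape for the suggestion above, as a sketch only: every key uses the same lowerCamelCase style, and a single `timeoutMs` applies to whichever provider is selected. The key names, the `providerConfig` type, and the 10-second default are assumptions for illustration, and the standard library's `encoding/json` stands in for the gjson calls used in the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// defaultTimeout is a hypothetical shared default; the PR does not state one.
const defaultTimeout = 10 * time.Second

// providerConfig uses one casing style for every key and a single timeout
// shared by all providers. Field and key names are illustrative.
type providerConfig struct {
	Type      string  `json:"type"`
	Threshold float64 `json:"threshold"`
	TimeoutMs int64   `json:"timeoutMs"`
}

// timeout returns the configured timeout, falling back to the shared default
// when the value is unset or not positive.
func (c providerConfig) timeout() time.Duration {
	if c.TimeoutMs <= 0 {
		return defaultTimeout
	}
	return time.Duration(c.TimeoutMs) * time.Millisecond
}

// parseProviderConfig decodes the raw JSON and rejects a missing provider type.
func parseProviderConfig(raw []byte) (providerConfig, error) {
	var c providerConfig
	if err := json.Unmarshal(raw, &c); err != nil {
		return providerConfig{}, fmt.Errorf("parse provider config: %w", err)
	}
	if c.Type == "" {
		return providerConfig{}, fmt.Errorf("provider type is required")
	}
	return c, nil
}

func main() {
	c, err := parseProviderConfig([]byte(`{"type":"chroma","threshold":0.5}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Type, c.timeout())
}
```

With one shared field, the default timeout is defined in a single place instead of being chosen separately per provider.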

CH3CHO · 2024-09-08T07:34:39Z

plugins/wasm-go/extensions/ai-cache/vector/provider.go

+	c.PineconeThreshold = json.Get("PineconeThreshold").Float()
+	if c.PineconeThreshold == 0 {
+		c.PineconeThreshold = 0.5
+	}


I think this config-parsing logic could be split out into each Provider's initializer. You could discuss among yourselves how best to structure this code.
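
A minimal sketch of that structure, assuming a registry keyed by provider type: each initializer parses and defaults only its own settings, so the shared config type no longer needs fields such as `PineconeThreshold`. The `provider` interface, `initializers` map, and JSON key names are hypothetical, and `encoding/json` is used here in place of gjson.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// provider is a hypothetical minimal interface for a vector store backend.
type provider interface {
	Name() string
	Threshold() float64
}

// initializer parses and validates the settings of exactly one provider.
type initializer func(raw []byte) (provider, error)

type pineconeProvider struct {
	threshold float64
}

func (p *pineconeProvider) Name() string       { return "pinecone" }
func (p *pineconeProvider) Threshold() float64 { return p.threshold }

// newPinecone owns the Pinecone-specific defaults, such as the 0.5 threshold
// that the reviewed snippet applies when the value is unset.
func newPinecone(raw []byte) (provider, error) {
	var cfg struct {
		Threshold float64 `json:"threshold"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("pinecone config: %w", err)
	}
	if cfg.Threshold == 0 {
		cfg.Threshold = 0.5
	}
	return &pineconeProvider{threshold: cfg.Threshold}, nil
}

// initializers maps a provider type to its initializer; adding a backend
// means adding one entry here and one file with its own parsing logic.
var initializers = map[string]initializer{
	"pinecone": newPinecone,
}

// newProvider reads only the type field and delegates the rest.
func newProvider(raw []byte) (provider, error) {
	var head struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(raw, &head); err != nil {
		return nil, fmt.Errorf("provider config: %w", err)
	}
	newFn, ok := initializers[head.Type]
	if !ok {
		return nil, fmt.Errorf("unsupported provider type %q", head.Type)
	}
	return newFn(raw)
}

func main() {
	p, err := newProvider([]byte(`{"type":"pinecone"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name(), p.Threshold())
}
```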

CH3CHO · 2024-09-08T07:36:18Z

plugins/wasm-go/extensions/ai-cache/vector/weaviate.go

+		return
+	}
+
+	d.client.Post(


The error returned by Post needs to be handled.
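
A sketch of what handling that error could look like. The `httpClient` interface and `responseCallback` type below are simplified stand-ins written for this example, not the SDK's actual definitions, and the request path is illustrative; the point is only that a failed dispatch is returned to the caller instead of being dropped.

```go
package main

import (
	"errors"
	"fmt"
)

// responseCallback and httpClient are simplified, hypothetical stand-ins
// for the plugin SDK's HTTP client types.
type responseCallback func(statusCode int, responseBody []byte)

type httpClient interface {
	Post(path string, headers [][2]string, body []byte, cb responseCallback) error
}

// failingClient always fails to dispatch, simulating an unreachable cluster.
type failingClient struct{}

func (failingClient) Post(string, [][2]string, []byte, responseCallback) error {
	return errors.New("dispatch failed")
}

// queryEmbedding dispatches the query. A dispatch failure is returned to the
// caller; a non-200 response is reported through onDone.
func queryEmbedding(c httpClient, body []byte, onDone func(result []byte, err error)) error {
	err := c.Post(
		"/v1/graphql", // illustrative path
		[][2]string{{"Content-Type", "application/json"}},
		body,
		func(statusCode int, responseBody []byte) {
			if statusCode != 200 {
				onDone(nil, fmt.Errorf("unexpected status %d", statusCode))
				return
			}
			onDone(responseBody, nil)
		},
	)
	if err != nil {
		return fmt.Errorf("dispatch query: %w", err)
	}
	return nil
}

func main() {
	err := queryEmbedding(failingClient{}, []byte(`{}`), func([]byte, error) {})
	fmt.Println("error:", err)
}
```

Without the check, a failed dispatch means the callback never fires and the request would wait with no indication of what went wrong.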

CH3CHO · 2024-09-08T07:38:20Z

plugins/wasm-go/extensions/ai-cache/vector/weaviate.go

+		},
+		requestBody,
+		func(statusCode int, responseHeaders http.Header, responseBody []byte) {
+			log.Infof("Query embedding response: %d, %s", statusCode, responseBody)


Debug level would be better for this log.

johnlanni

The vector and embedding code is fairly generic; I suggest moving it out of the ai-cache directory into an ai-utils directory under wasm-go.

johnlanni and others added 8 commits August 1, 2024 15:09

fix bugs

4f7bfbd

fix bugs

0f9e816

fix bugs

ff1bce6

fix conflict

f2a9ff6

Merge branch 'alibaba:main' into main

5cbae03

alter some errors

27b2f71

fix: embedding error

130f2ee

EnableAsync changed the title ~~feat: Add Chroma vector database support to ai-cache WASM Plugin~~ feat: Add ai-cache WASM Plugin Aug 25, 2024

EnableAsync changed the title ~~feat: Add ai-cache WASM Plugin~~ feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support Aug 25, 2024

EnableAsync and others added 3 commits August 25, 2024 17:20

feat: add elasticsearch

3d7e85c

Merge branch 'alibaba:main' into feat/chroma

57bc863

add: makefile for weaviate

d6c643f

EnableAsync added 5 commits September 6, 2024 15:32

feat: add weaviate

3f3a1bc

feat: add pinecone

71cc25b

fix: remove key

Merge branch 'feat/chroma' of https://github.com/Suchun-sv/higress in…

65aafbd

…to feat/chroma

fix: format

bfaed4c

update

a40f5e9

CH3CHO requested changes Sep 8, 2024

johnlanni requested changes Sep 12, 2024

johnlanni mentioned this pull request Sep 12, 2024

[ai-cache] Implement a WASM plugin for LLM result retrieval based on vector similarity #1290

Open
