
Better generation of optimized search query #641

Open
Vegoo89 opened this issue Sep 16, 2023 · 7 comments · May be fixed by #653
Comments

@Vegoo89

Vegoo89 commented Sep 16, 2023

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

This is not really a bug, so I am marking it as a feature request.

We are productionizing this PoC for corporate usage and found a few things that make the bot work better and more smoothly, and generate more predictable queries to send to Cognitive Search - at least in our tests.

Right now in the code, the optimized search query in chatreadretrieveread.py is generated by gluing together:

prompt -> few shots -> whole history -> user query prefixed by "Generate search query for: "

This works pretty well; however, on longer conversation chains we found that the query can get messy, as after the few shots there is the real conversation history - with full question answers - which seems out of place here.
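
For reference, a minimal sketch of what that assembly might look like (the prompt text, few-shot content, and helper name here are illustrative, not the actual chatreadretrieveread.py code):

```python
# Illustrative sketch only - not the actual chatreadretrieveread.py implementation.
QUERY_PROMPT = (
    "Generate a search query for Azure Cognitive Search based on the conversation below."
)

# Hypothetical few-shot examples; the real ones live in the prompt definition.
FEW_SHOTS = [
    {"role": "user", "content": "What does a product manager do?"},
    {"role": "assistant", "content": "product manager responsibilities"},
]

def build_query_messages(chat_history: list[dict], user_question: str) -> list[dict]:
    """Current behavior: the whole conversation history, including the bot's
    long answers, is spliced between the few shots and the new question."""
    return (
        [{"role": "system", "content": QUERY_PROMPT}]
        + FEW_SHOTS
        + chat_history  # full Q&A turns - this is where the noise creeps in
        + [{"role": "user", "content": f"Generate search query for: {user_question}"}]
    )
```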

We came up with a simple idea: keep the history of user questions and the queries generated by the bot as a separate field in the request and response, which allows us to bounce these between client and backend and keep the backend stateless.

So in the end - after implementation - generation of the optimized search query messages would look like this:

prompt -> few shots -> query messages history (user queries along with the optimized query responses from OpenAI) -> user query prefixed by "Generate search query for: "

If you think this approach sounds good, I can open a PR with the proposed changes. Thanks!

@shulkx

shulkx commented Sep 18, 2023

I am very interested in your suggestion because I have encountered the same problem you mentioned. Can you describe in detail the structure you built for the query messages history (user queries along with the optimized query responses from OpenAI)?

@Vegoo89
Author

Vegoo89 commented Sep 18, 2023

Currently, the query messages history keeps only the user questions + the optimized search queries from the OpenAI endpoint. Example (I wrote this one by hand just now; it isn't copied from actual queries):

[
  {
    "role": "user",
    "content": "what is abc?"
  },
  {
    "role": "assistant",
    "content": "abc definition"
  },
  {
    "role": "user",
    "content": "what is def?"
  },
  {
    "role": "assistant",
    "content": "def definition"
  },
  {
    "role": "user",
    "content": "define both"
  },
  {
    "role": "assistant",
    "content": "definition of abc and def"
  }
]

As stated, we glue the prompt + few shots at the start and add the current user query at the end (with the prefix). After that, the prefix is no longer present in the 'real' query messages history.
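
Assembling the query-generation messages from that history would then look roughly like this (a sketch reusing the illustrative QUERY_PROMPT and FEW_SHOTS names from the earlier sketch; query_messages mirrors the JSON structure above):

```python
def build_query_messages(query_messages: list[dict], user_question: str) -> list[dict]:
    """Proposed behavior: splice in only the question/optimized-query pairs,
    never the bot's full answers."""
    return (
        [{"role": "system", "content": QUERY_PROMPT}]
        + FEW_SHOTS
        + query_messages  # history shaped like the JSON example above
        + [{"role": "user", "content": f"Generate search query for: {user_question}"}]
    )

# The backend stays stateless: the generated query is returned to the client,
# appended to query_messages there, and echoed back with the next request.
```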

@pamelafox
Collaborator

Thanks so much for sharing your approach!
I'm going to CC @srbalakr from the ACS team, who worked most recently on the query generation, for their thoughts. I think PRs are always great to share with the community, even those that don't get merged, but if this produces overall better response quality across many queries/knowledge bases, then we may want it in main.

@srbalakr
Collaborator

Yes, please share the PR. I have also put up a PR to stabilize the generation for lengthy chats using function calls; it should address most of the concerns.
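
For context, here's a rough sketch of what a function-calling approach could look like with the 0.x openai SDK; the function name and schema are illustrative, not necessarily what that PR uses:

```python
import json
import openai

# Illustrative function schema: the model "calls" a search function, so the
# optimized query comes back as a structured argument instead of free text.
search_function = {
    "name": "search_sources",
    "description": "Retrieve relevant documents from the Azure Cognitive Search index",
    "parameters": {
        "type": "object",
        "properties": {
            "search_query": {
                "type": "string",
                "description": "Query string to send to the search index",
            }
        },
        "required": ["search_query"],
    },
}

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=query_messages,      # assembled as in the sketches above
    functions=[search_function],
    function_call={"name": "search_sources"},  # force the structured output
    temperature=0.0,
)
arguments = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
search_query = arguments["search_query"]
```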

github-actions bot commented

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.

@github-actions github-actions bot added the Stale label Nov 19, 2023
@pamelafox pamelafox removed the Stale label Nov 19, 2023
@pamelafox
Collaborator

Re-opened; I'm still interested in this. I don't have a multi-turn evaluation setup yet, only single-turn (as you can see in #967), so I haven't been able to evaluate this change programmatically.

github-actions bot commented

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.

@github-actions github-actions bot added the Stale label Jan 27, 2024