Add support for radial search in exact search #2174

Open
wants to merge 3 commits into
base: feature/build-vector-ds-greedily

Conversation

VijayanB
Member

@VijayanB VijayanB commented Oct 1, 2024

Description

When a threshold value is set, the k-NN plugin will not create the graph. A search request triggered during that time therefore falls back to exact search, which returns valid results. However, radial search was never included as part of exact search, so a radial search request breaks once a threshold is added. This commit introduces a new method that accepts a min score and returns the documents whose score is greater than that min score, similar to how radial search is performed by native engines. This search is independent of the engine, although among the native engines radial search is supported only for FAISS.
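
To illustrate the behavior described above, here is a minimal sketch of an exact (brute-force) radial search. It is not the plugin's implementation: the class `ExactRadialSearchSketch`, its methods, and the in-memory `Map` of vectors are hypothetical, and the scoring formula (squared L2 distance mapped to `1 / (1 + d^2)`) is an assumption chosen only so that a higher score means a closer vector. The strict `>` comparison follows the wording "greater than min score".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the k-NN plugin.
final class ExactRadialSearchSketch {

    // Assumed scoring: squared L2 distance mapped into (0, 1], higher means closer.
    static float score(float[] query, float[] vector) {
        float squaredDistance = 0f;
        for (int i = 0; i < query.length; i++) {
            float diff = query[i] - vector[i];
            squaredDistance += diff * diff;
        }
        return 1f / (1f + squaredDistance);
    }

    // Scores every document (no graph needed) and keeps those strictly above minScore.
    static Map<Integer, Float> radialSearch(Map<Integer, float[]> docs, float[] query, float minScore) {
        Map<Integer, Float> docToScore = new LinkedHashMap<>();
        for (Map.Entry<Integer, float[]> entry : docs.entrySet()) {
            float docScore = score(query, entry.getValue());
            if (docScore > minScore) {
                docToScore.put(entry.getKey(), docScore);
            }
        }
        return docToScore;
    }
}
```

Because every document is scored directly, the result does not depend on any engine-specific graph files, which is why this path can serve radial queries while the threshold prevents graph creation.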

Related Issues

Part of #1942

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@VijayanB
Member Author

VijayanB commented Oct 1, 2024

This depends on #2136

@VijayanB
Member Author

VijayanB commented Oct 1, 2024

I am skipping the changelog for now. I will update it when merging back to main.

@VijayanB VijayanB marked this pull request as draft October 1, 2024 21:13
@VijayanB VijayanB force-pushed the support-exact-search-for-radius branch 3 times, most recently from e0ab310 to 0b33a1b Compare October 1, 2024 21:54
@VijayanB VijayanB marked this pull request as ready for review October 1, 2024 21:56
 */
private boolean isExactSearchRequire(final LeafReaderContext context, final int filterIdsCount, final int annResultCount) {
    if (annResultCount == 0 && isMissingNativeEngineFiles(context)) {
        log.debug("Perform exact search after approximate search since no native engine files are available");
Member

Is this case unexpected? Is that why we log at debug level?

Member Author

Since this is in the query path, the recommendation is to avoid info-level logging.

@VijayanB VijayanB force-pushed the support-exact-search-for-radius branch 3 times, most recently from dc81e8f to 230f286 Compare October 2, 2024 01:29
Contributor

@shatejas shatejas left a comment

There don't seem to be any updates to the unit tests. Will they be added in another iteration?

@VijayanB VijayanB requested a review from shatejas October 2, 2024 23:35
When a threshold value is set, the k-NN plugin will not create the
graph. A search request triggered during that time therefore falls
back to exact search, which returns valid results. However, radial
search was never included as part of exact search, so a radial search
request breaks once a threshold is added. This commit introduces a new
method that accepts a min score and returns the documents whose score
is greater than that min score, similar to how radial search is
performed by native engines. This search is independent of the engine,
although among the native engines radial search is supported only for
FAISS.

Signed-off-by: Vijayan Balasubramanian <[email protected]>
@VijayanB VijayanB force-pushed the support-exact-search-for-radius branch 4 times, most recently from 6c47860 to abc6e69 Compare October 3, 2024 05:55
@VijayanB
Member Author

VijayanB commented Oct 3, 2024

There don't seem to be any updates to the unit tests. Will they be added in another iteration?

Added unit test

Comment on lines +766 to +767
@SneakyThrows
public void testRadialSearch_whenNoEngineFiles_thenPerformExactSearch() {
Collaborator

Though I see we are using this test to check exact search for radial search, it avoids a unit test for ExactSearcher with the radial search threshold. Consider mocking the ExactSearcher class and all its invocations to make this a proper unit test. Since exact search is now getting more and more extra logic, do you think we should have a separate unit test class for the ExactSearcher class?

Member Author

@VijayanB VijayanB Oct 3, 2024

Good point. I can create a GitHub issue for the refactor. I see two options: 1) keep this PR as it is and refactor as part of the GH issue, since more than 3 test cases use exact search, or 2) do it only for radial search in this PR, and move the others later as part of the GH issue.

Collaborator

I like option 2. Let's do it at least for this test case, and create a GH issue for the test refactor too.

Member Author

Updated this PR. Will create a GH issue.

Signed-off-by: Vijayan Balasubramanian <[email protected]>
@VijayanB VijayanB force-pushed the support-exact-search-for-radius branch 2 times, most recently from 7466091 to efa7cb3 Compare October 3, 2024 19:07
Signed-off-by: Vijayan Balasubramanian <[email protected]>
@VijayanB VijayanB force-pushed the support-exact-search-for-radius branch from efa7cb3 to e51ae11 Compare October 3, 2024 19:11
@@ -71,15 +106,17 @@ private Map<Integer, Float> scoreAllDocs(KNNIterator iterator) throws IOExceptio
return docToScore;
}

-private Map<Integer, Float> searchTopK(KNNIterator iterator, int k) throws IOException {
+private Map<Integer, Float> searchTopCandidates(KNNIterator iterator, int limit, @NonNull Predicate<Float> filterScore)
Member

Curious why the k variable is renamed to limit?

Member Author

@VijayanB VijayanB Oct 3, 2024

This method will also be called by radialSearch, so I renamed the parameter from k to the more generic limit.
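
As a rough illustration of why a generic `limit` plus a score predicate can serve both callers, here is a sketch of one bounded-heap routine shared by top-k and radial search. It only mirrors the shape of the `searchTopCandidates(..., int limit, Predicate<Float> filterScore)` signature shown in the diff. The class name is hypothetical, and taking a map of precomputed scores instead of a `KNNIterator` is a simplification; the actual method body may differ.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Predicate;

// Hypothetical illustration only; input is a map of precomputed scores, not a KNNIterator.
final class TopCandidatesSketch {

    static Map<Integer, Float> searchTopCandidates(Map<Integer, Float> scoredDocs, int limit, Predicate<Float> filterScore) {
        // Min-heap by score: the weakest retained candidate sits at the head and is evicted first.
        Comparator<Map.Entry<Integer, Float>> byScore = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<Integer, Float>> heap = new PriorityQueue<>(byScore);
        for (Map.Entry<Integer, Float> entry : scoredDocs.entrySet()) {
            if (!filterScore.test(entry.getValue())) {
                continue; // radial search rejects scores at or below the min score here
            }
            heap.offer(entry);
            if (heap.size() > limit) {
                heap.poll(); // keep at most `limit` best candidates
            }
        }
        Map<Integer, Float> docToScore = new HashMap<>();
        for (Map.Entry<Integer, Float> entry : heap) {
            docToScore.put(entry.getKey(), entry.getValue());
        }
        return docToScore;
    }
}
```

Under these assumptions, a top-k caller would pass `k` as the limit with a predicate that accepts every score, while a radial caller would pass a maximum result size as the limit with a predicate such as `score -> score > minScore`.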

5 participants