Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NOT sort by score option #28

Closed
peterbe opened this issue Jul 28, 2021 · 14 comments
Closed

NOT sort by score option #28

peterbe opened this issue Jul 28, 2021 · 14 comments
Assignees
Labels
enhancement New feature or request released Usually used to mark an enhancement or a bug as resolved and released in a pkg

Comments

@peterbe
Copy link

peterbe commented Jul 28, 2021

The find method always sorts by .score.
What I need is that it doesn't sort at all but leaves the order of the original list.

So, looking at the Basic usage that example returns:

Output: lisp, kotlin, elixir

but what I need it to be is:

Output: kotlin, elixir, lisp

because that's the order they appear in the list option:

const list = ['go', 'javascript', 'python', 'rust', 
              'swift', 'kotlin', 'elixir', 'java', 
              'lisp', 'v', 'zig', 'nim', 'rescript', 
              'd', 'haskell']

The reason for needing this is that my list that I search on are sorted by "popularity".
If the user only type something really short like f I really want to favor the popular items first because the assumed likelihood that it's what the user expects to find.

So ideal would be something like:

const entries = fzf.find('li', {sort: (item) => item})

or

const entries = fzf.find('li', {sortByScore: false})

so that it only "filters" the list.

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

To give context, on MDN Web docs you can fuzzy search the document URIs by starting your autocomplete search with a / character (and no spaces).
Screen Shot 2021-07-27 at 8 25 19 PM

Try the full Elasticsearch search: https://developer.mozilla.org/en-US/search?q=foreach
There are a lot of documents that match that but it seems most people are looking for this one according to our pageviews analytics.

But our current implementation isn't great and fzf looks like a really good candidate to move to.

@ajitid
Copy link
Owner

ajitid commented Jul 28, 2021

Hi @peterbe! Optionally providing a sorter is something that I do have in mind and is already noted here (first point). There are few perf related experiments that I'm doing, after which this task will be picked up.

our current implementation isn't great

What issues are you facing with the current implementation? Is it fuzzy matching, less relevant results on searching document URIs, performance or something else?

Also, could you please point me to the code where the current fuzzy finding implementation is written?

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

Also, could you please point me to the code where the current fuzzy finding implementation is written?

It's here: https://github.com/mdn/yari/blob/main/client/src/fuzzy-search.ts
It's quite a mouthful because our code is called fuzzy-search.ts and it depends on 2 third-party libraries:

  • fuzzysearch
  • fuzzyjs

One of them is fast and I use that to reduce down the list of 10,000+ strings down to 1,000 strings. That weeds out the ones that don't even match. Then, the second one is used to get the positions so that I can build the React code for highlighting.

My hope is that fzf could replace almost all of that and generally yield better results.

My very very rough experimentation with switch to fzf is here: mdn/yari@main...peterbe:experimenting-with-fzf-for-fuzzy-search

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

What issues are you facing with the current implementation? Is it fuzzy matching, less relevant results on searching document URIs, performance or something else?

Yeah, relevance. I'd rather display the "best" document ahead of any that "matches" better.
We're doing the same with our full site-search. There, you can actually change the sort order between "best", "relevance", and "popularity" where "best" means a combination of "relevance" and "popularity".
See sample search

VSCode does something similar and different.
Screen Shot 2021-07-28 at 9 30 23 AM
There, they make two result lists: recently opened files, all other files.
But in this example, they show testing.yml ahead of package.json (both contain g) because I think I touched testing.yml more recently which is how I think they rank "popularity".

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

Funny and timely! I just discovered a bug in our existing code (that doesn't use fzf).
I filed this for ourselves: mdn/yari#4346
I don't know how or where it regressed but I'll need to worry about that a little.

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

Ok. So I set down to try to figure out what the heck is going on with mdn/yari#4346 and how I can fix it with the existing stack.
Then it occurred to me, why not try to jump over to fzf straight away it already seems more powerful and convenient.
So I cooked up mdn/yari#4347

@ajitid I know it's a big ask but would you be interested in helping to review that. Or at least test it or give it a quick look-over.
The mdn/yari repo is the public (MPL-2.0) repo for MDN Web Docs.

@ajitid
Copy link
Owner

ajitid commented Jul 28, 2021

Reviewed it and the code looks fine.

From mdn/yari#4347 (comment):

I realized I had to rewrite how we use fuzzyjs but I've also experimented with fzf and it's proven to be very performant.

I won't be able to confirm as I haven't tested fzf against fuzzyjs + fuzzysearch though I've seen some initial tests done here.

@peterbe
Copy link
Author

peterbe commented Jul 28, 2021

That was not a fair comparison for the record. :)

By the way, I think contributing a PR to this code base would be very feasible now that I've browsed the code a bit.
But it would be nice to have your "blessing" in what you think the new option should be called. Some options:

  • sortByScore: boolean
  • sort override
  • disableSortByScore: boolean

@ajitid
Copy link
Owner

ajitid commented Jul 28, 2021

I'm already working on a branch https://github.com/ajitid/fzf-for-js/tree/feat/skip-score-sorting that will add a boolean sort.

@ajitid
Copy link
Owner

ajitid commented Aug 1, 2021

sort option is now available in v0.3.2. This can be used to disable sorting by score:

const fzf = new Fzf(list, {
  sort: false
})

@ajitid ajitid closed this as completed Aug 1, 2021
@ajitid ajitid added the released Usually used to mark an enhancement or a bug as resolved and released in a pkg label Aug 1, 2021
@ajitid
Copy link
Owner

ajitid commented Aug 1, 2021

I think contributing a PR to this code base would be very feasible

Thanks @peterbe! While we've merged the (sort) feature, this project still needs contributions in particular areas, most notably performance. Any contribution, either in the form of a PR or a suggestion would be helpful. Cheers!

@peterbe
Copy link
Author

peterbe commented Aug 2, 2021

@ajitid You're going to hate me for this but, on some basic tests I did, I'm now going to "reverse" my opinion on the default sort function in fzf. Check these out:

sort=false
sort=false

sort=true
sort=true

What happens is if you let the sort be by the documents that have any match, it misses out on those searches where a lot of the characters are together.

@ajitid ajitid reopened this Aug 2, 2021
@peterbe
Copy link
Author

peterbe commented Aug 2, 2021

I didn't for you to re-open it. What I meant was that after some more testing it certainly "feels" like it's better to use fzf's sorting rather than relying on the natural order of the documents that it searches.

@ajitid
Copy link
Owner

ajitid commented Aug 2, 2021

Okay, I'll close it for now then.

@ajitid ajitid closed this as completed Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request released Usually used to mark an enhancement or a bug as resolved and released in a pkg
Projects
None yet
Development

No branches or pull requests

2 participants