Search-a-licious

πŸŠπŸ”Ž A pluggable search service for large collections of objects (like Open Food Facts)

NOTE: This is a prototype that is evolving quickly to become more generic, more robust, and much richer in functionality.

This API is currently in development. Read the Search-a-licious roadmap architecture notes to understand where we are headed.

Organization

This repository contains a Lit/JS frontend and a Python (FastAPI) backend, which this README covers.

Backend

The main file is api.py, and the schema is in models/product.py.

A CLI is available to perform common tasks.

Running the project on your machine

Note: the Makefile will align the container user id with your own uid for a smooth editing experience.

Before running the services, you need to make sure that your system mmap count is high enough for Elasticsearch to run. You can do this by running:

sudo sysctl -w vm.max_map_count=262144
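
You can check the current value with:

sysctl vm.max_map_count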

Then build the services with:

make build

Start the containers:

docker compose up -d
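
To check that the containers came up correctly, you can list them and tail the search service logs:

docker compose ps
docker compose logs -f api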

Note

You may encounter a permission error if your user is not part of the docker group, in which case you should either add it or modify the Makefile to prefix all docker and docker compose commands with sudo. Also note that the update container may crash because it is not connected to any Redis; this is expected in local development (see the --skip-updates option below).

Docker spins up:

  • Two Elasticsearch nodes
  • Elasticvue
  • The search service on port 8000
  • Redis on port 6379
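
Once the stack is up, you can smoke-test it (assuming the default ports above, plus Elasticsearch on 9200 as used later in this README):

# Elasticsearch cluster health
curl http://127.0.0.1:9200/_cluster/health
# search service home page
curl -I http://localhost:8000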

You will then need to import from a JSONL dump (see instructions below).

Development

Pre-requisites

Installing Docker
Installing Direnv

For Linux and macOS users, you can follow our tutorial to install direnv.1

Get your user id and group id by running id -u and id -g in your terminal. Add a .envrc file at the root of the project with the following content:

export USER_GID=<your_user_gid>
export USER_UID=<your_user_uid>

export CONFIG_PATH=data/config/openfoodfacts.yml
export OFF_API_URL=https://world.openfoodfacts.org
export ALLOWED_ORIGINS='http://localhost,http://127.0.0.1,https://*.openfoodfacts.org,https://*.openfoodfacts.net' 
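
If you use direnv, allow it to load the new file:

direnv allow
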
Installing Pre-commit

You can follow this tutorial to install pre-commit on your machine.

Configuring mmap

Be sure that your system mmap count is high enough for Elasticsearch to run. You can do this by running:

sudo sysctl -w vm.max_map_count=262144

To make the change permanent, add the line vm.max_map_count=262144 to the /etc/sysctl.conf file and run sudo sysctl -p to apply it. This ensures the modified value of vm.max_map_count is retained even after a system reboot; without it, the value resets to its default when the system restarts.
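
For example, on most Linux systems:

echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p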

Running your local instance using Docker

Now you can run the project with docker compose up. Then, in another shell, run make tsc_watch to compile the frontend continuously. Keep both running for the next installation steps and while developing.
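
That is, in two shells:

# shell 1: run the whole stack
docker compose up

# shell 2: compile the frontend in watch mode
make tsc_watch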

Exploring Elasticsearch data

Importing data into your development environment

  • Import Taxonomies: make import-taxonomies
  • Import products (see the verification snippet after this list):
    # get some sample data
    curl https://world.openfoodfacts.org/data/exports/products.random-modulo-10000.jsonl.gz --output data/products.random-modulo-10000.jsonl.gz
    gzip -d data/products.random-modulo-10000.jsonl.gz
    # we skip updates because we are not connected to any redis
    make import-dataset filepath='products.random-modulo-10000.jsonl' args='--skip-updates'
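
After the import finishes, you can check that documents actually landed in Elasticsearch using the standard cat API (the exact index name depends on your configuration):

curl 'http://127.0.0.1:9200/_cat/indices?v'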

Pages

Now you can go to:

  • http://localhost:8000 for a simple search page that does not use Lit components, or
  • http://localhost:8000/static/off.html for the Lit components search page

To look into the data, you may use Elasticvue: go to http://127.0.0.1:8080/ and connect to the cluster at http://127.0.0.1:9200, named docker-cluster (unless you changed the environment variables).

Pre-Commit

This repo uses pre-commit (https://pre-commit.com/) to enforce code styling, etc. To use it:

pre-commit install

To run the checks without committing:

pre-commit run
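
By default, pre-commit run only checks files that are staged; to run every hook against the whole repository:

pre-commit run --all-files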

Debugging the backend app

To debug the backend app:

  • stop the API instance: docker compose stop api
  • add a pdb.set_trace() call at the point where you want to break,
  • then launch docker compose run --rm --use-aliases api uvicorn app.api:app --proxy-headers --host 0.0.0.0 --port 8000 --reload (the --use-aliases flag keeps the container reachable under the service's usual network aliases)

Running the full import (45-60 min)

To import data from the JSONL export, download the dataset in the data folder, then run:

make import-dataset filepath='products.jsonl.gz'

If you get errors, try adding more RAM (12 GB works well if you have that to spare), or slow down the indexing process by setting num_processes to 1 in the command above.
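
For instance, assuming the importer exposes this as a --num-processes argument (check the CLI help for the exact flag name):

make import-dataset filepath='products.jsonl.gz' args='--num-processes 1'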

Typical import time is 45-60 minutes.

If you want to skip updates (e.g. because you don't have Redis installed), use:

make import-dataset filepath='products.jsonl.gz' args="--skip-updates"

You should also import taxonomies:

make import-taxonomies

Using the sort script

See How to use scripts

Thank you to our sponsors!

This project has received financial support from the NGI Search (Next Generation Internet) program, funded by the πŸ‡ͺπŸ‡Ί European Commission. Thank you for supporting Open Source, Open Data, and the Commons.

NGI Search logo · European flag

Footnotes

  1. For Windows users, the .envrc is only taken into account by the make commands.