# Neural search demo

With Qdrant + BERT + FastAPI

This repository contains the code for the neural search for startups demo.
The demo is based on the vector search engine Qdrant.
## Requirements

Install the Python requirements:

```bash
pip install poetry
poetry install
```

You will also need Docker and docker-compose.
## Quick Start

To launch this demo locally, you first need to download the data.

The original data comes from https://www.startups-list.com/. You can download it with the following command:

```bash
wget https://storage.googleapis.com/generall-shared-data/startups_demo.json -P data/
```
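The downloaded file holds the startup records to be indexed. Here is a minimal parsing sketch, assuming a JSON Lines layout (one JSON object per line) with fields such as `name` and `description` — the field names are assumptions, so check the actual file after downloading:

```python
import json

# Stand-in for the first lines of data/startups_demo.json; the real
# file's fields may differ -- "name" and "description" are assumptions.
sample = "\n".join([
    json.dumps({"name": "Acme AI", "description": "ML tooling for startups"}),
    json.dumps({"name": "DataCo", "description": "Analytics platform"}),
])

# Parse one JSON object per line and build the texts to be embedded.
startups = [json.loads(line) for line in sample.splitlines() if line.strip()]
texts = [f'{s["name"]}. {s["description"]}' for s in startups]
print(texts[0])  # Acme AI. ML tooling for startups
```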
To launch the service locally, use:

```bash
docker-compose -f docker-compose-local.yaml up
```
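The compose file is what wires the demo's services together. A hypothetical minimal version for the Qdrant service alone, assuming the default image and ports (the repo's actual `docker-compose-local.yaml` may differ):

```yaml
# Hypothetical minimal compose file -- illustration only, not the
# repo's actual docker-compose-local.yaml. 6333/6334 are Qdrant's
# default REST and gRPC ports.
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"  # REST API
      - "6334:6334"  # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
```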
Once the service is running, you can upload the initial data to the search engine:

```bash
# Init neural index
python -m qdrant_demo.init_collection_startups
```

After a successful upload, the neural search API will be available at http://localhost:8000/docs.
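Conceptually, the neural index encodes each startup description as a vector and answers a query by ranking the stored vectors by similarity; Qdrant performs this ranking at scale. A toy sketch of the idea, with hard-coded stand-in vectors in place of real BERT embeddings:

```python
import math

# Toy "index": startup name -> embedding. The 3-dimensional vectors
# here are made-up stand-ins for BERT sentence embeddings.
index = {
    "Acme AI":   [0.9, 0.1, 0.0],
    "GreenFarm": [0.1, 0.9, 0.2],
    "FinPay":    [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-in embedding of a query such as "machine learning tools".
query = [0.8, 0.2, 0.1]
ranked = sorted(index, key=lambda name: cosine(index[name], query), reverse=True)
print(ranked[0])  # Acme AI
```

In the demo itself, the embeddings come from a BERT-family model and the ranking is done by Qdrant rather than in application code.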
You can play with the data in the following Colab Notebook.
## Start with Crunchbase data

Alternatively, you can use the larger dataset of companies provided by Crunchbase.
You will need to register at https://www.crunchbase.com/ and get an API key.

```bash
# Download data
wget 'https://api.crunchbase.com/odm/v4/odm.tar.gz?user_key=<CRUNCHBASE-API-KEY>' -O odm.tar.gz
```
Decompress the data and put `organizations.csv` into the `./data` folder:

```bash
# Decompress data
tar -xvf odm.tar.gz
mv odm/organizations.csv ./data
```
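Before indexing, the Crunchbase export is read as a CSV file. A minimal parsing sketch, assuming columns such as `name` and `short_description` — these column names are assumptions, so check the header of the actual export:

```python
import csv
import io

# Stand-in for the first rows of data/organizations.csv; the real
# export's columns may differ -- "name" and "short_description" are
# assumptions.
sample = io.StringIO(
    "name,short_description\n"
    "Acme AI,ML tooling for startups\n"
    "DataCo,Analytics platform\n"
)

rows = list(csv.DictReader(sample))
texts = [f'{r["name"]}: {r["short_description"]}' for r in rows]
print(len(texts))  # 2
```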
After that, you can index the Crunchbase data into Qdrant:

```bash
# Init neural index
python -m qdrant_demo.init_collection_crunchbase
```