Neural search demo

With Qdrant + BERT + FastAPI

This repository contains the code for the Neural Search for Startups demo.

The demo is based on the vector search engine Qdrant.

Requirements

Install the Python requirements:

pip install poetry
poetry install

You will also need Docker and docker-compose.

Quick Start

To launch this demo locally, you will need to download the data first.

The source of the original data is https://www.startups-list.com/

You can download the data via the following command:

wget https://storage.googleapis.com/generall-shared-data/startups_demo.json -P data/
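
If you want to sanity-check the download, here is a minimal Python sketch (assuming the file is newline-delimited JSON, one startup record per line):

import json

# Read the first record of the downloaded file and list its fields
with open("data/startups_demo.json") as f:
    first_record = json.loads(f.readline())

print(list(first_record.keys()))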

To launch the service locally, use

docker-compose -f docker-compose-local.yaml up
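
To confirm that Qdrant itself is reachable, you can hit its HTTP API from Python (a quick check, assuming the compose file maps Qdrant's default HTTP port 6333 to the host):

from urllib.request import urlopen

# Qdrant answers on its root endpoint with its name and version
with urlopen("http://localhost:6333/") as response:
    print(response.read().decode())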

After the service has started, you can upload the initial data to the search engine.

# Init neural index
python -m qdrant_demo.init_collection_startups

After a successful upload, the neural search API will be available at http://localhost:8000/docs.
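
Besides the interactive docs, you can query the API from Python. A minimal sketch, assuming a GET search endpoint that takes a q query parameter (the exact route may differ; check http://localhost:8000/docs):

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical route; the real path is listed in the OpenAPI docs at /docs
params = urlencode({"q": "payments startup"})
with urlopen(f"http://localhost:8000/api/search?{params}") as response:
    print(json.dumps(json.load(response), indent=2))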

You can play with the data in the accompanying Colab notebook.


Start with Crunchbase data

Alternatively, you can use a larger dataset of companies provided by Crunchbase.

You will need to register at https://www.crunchbase.com/ and get an API key.

# Download data
wget 'https://api.crunchbase.com/odm/v4/odm.tar.gz?user_key=<CRUNCHBASE-API-KEY>' -O odm.tar.gz

Decompress the data and put organizations.csv into the ./data folder.

# Decompress data
tar -xvf odm.tar.gz
mv odm/organizations.csv ./data

After that, you can index the Crunchbase data into Qdrant.

# Init neural index
python -m qdrant_demo.init_collection_crunchbase
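
Both init scripts follow the same pattern: read the records, embed their text with FastEmbed, and upload them to a Qdrant collection. A rough sketch of that pattern using the FastEmbed integration in qdrant-client (the collection and field names here are illustrative, not the ones the scripts actually use):

from qdrant_client import QdrantClient

# Connect to the Qdrant instance started by docker-compose
# (assumes the default HTTP port 6333 is exposed on the host)
client = QdrantClient("http://localhost:6333")

# `add` embeds the documents with FastEmbed's default model and upserts them
client.add(
    collection_name="demo_collection",  # illustrative name
    documents=["A startup building neural search for e-commerce."],
    metadata=[{"name": "Example Inc.", "city": "Berlin"}],
)

# `query` embeds the query text the same way and returns the closest records
hits = client.query(
    collection_name="demo_collection",
    query_text="semantic search companies",
    limit=3,
)
for hit in hits:
    print(hit.score, hit.metadata)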