# Neural search demo
## With Qdrant + BERT + FastAPI
This repository contains the code for the Neural Search for startups [demo](https://demo.qdrant.tech).
The demo is based on the vector search engine [Qdrant](https://github.com/qdrant/qdrant).
## Requirements
Install the Python requirements:
```
pip install poetry
poetry install
```
You will also need [Docker](https://docs.docker.com/get-docker/) and [docker-compose](https://docs.docker.com/compose/install/).
## Quick Start <a href="https://replit.com/new/github/qdrant/qdrant_demo"><img align="right" src="https://replit.com/badge/github/qdrant/qdrant_demo" alt="Run on Repl.it"></a>
To launch this demo locally, you first need to download the data.
The original data comes from [https://www.startups-list.com/](https://www.startups-list.com/).
You can download the data via the following command:
```bash
wget https://storage.googleapis.com/generall-shared-data/startups_demo.json -P data/
```
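Before uploading, it can help to confirm that the download is intact. The sketch below assumes the file is stored as JSON Lines (one JSON object per line), which this README does not state; `peek_records` is a hypothetical helper and not part of `qdrant_demo`.

```python
import json
from pathlib import Path


def peek_records(path, limit=3):
    """Return the first `limit` records, ASSUMING one JSON object per line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if len(records) >= limit:
                break
    return records


# Path matches the `-P data/` target of the wget command above.
data_file = Path("data/startups_demo.json")
if data_file.exists():
    for record in peek_records(data_file):
        print(sorted(record.keys()))
else:
    print(f"{data_file} not found; run the wget command first")
```

If the file turns out to be a single JSON array instead, `json.loads` will fail on the first line, which is itself a useful signal that the assumption does not hold.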
To launch the service locally, run:
```
docker-compose -f docker-compose-local.yaml up
```
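The upload step that follows needs the search engine to be reachable. As a rough sketch, a script could poll the port until it accepts connections. Port `6333` is Qdrant's default HTTP port, but whether `docker-compose-local.yaml` exposes it on the host is an assumption here; `wait_for_port` is a hypothetical helper.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False if `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 6333)` would block for up to 30 seconds. An open port only shows that something is listening, not that the service has finished initializing.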
After the service has started, you can upload the initial data to the search engine:
```
# Init neural index
python -m qdrant_demo.init_collection_startups
```
After a successful upload, the neural search API will be available at [http://localhost:8000/docs](http://localhost:8000/docs).
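Once the service is up, the API can also be called from a script. The sketch below is only an illustration: the `/api/search` path and the `q` query parameter are assumptions that this README does not document, so confirm the actual route in the interactive docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # host and port taken from the docs URL above


def build_search_url(query, base_url=BASE_URL, path="/api/search"):
    # ASSUMPTION: the route and the parameter name are hypothetical; check /docs.
    return f"{base_url}{path}?{urlencode({'q': query})}"


def search(query, timeout=10):
    """Send the query to the running service and decode the JSON response."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `search("solar energy")` would return the decoded JSON body; its exact shape depends on the service and is not specified here.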
You can play with the data in the following [Colab Notebook](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing)
## Start with Crunchbase data
Alternatively, you can use a larger dataset of companies provided by [Crunchbase](https://www.crunchbase.com/).
You will need to register at [https://www.crunchbase.com/](https://www.crunchbase.com/) and get an API key.
```bash
# Download data
wget 'https://api.crunchbase.com/odm/v4/odm.tar.gz?user_key=<CRUNCHBASE-API-KEY>' -O odm.tar.gz
```
Decompress the data and move `organizations.csv` into the `./data` folder.
```bash
# Decompress data
tar -xvf odm.tar.gz
mv odm/organizations.csv ./data
```
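A quick sanity check on the extracted file can catch a truncated download before indexing. This is a minimal sketch using only the standard library; it makes no assumption about which columns `organizations.csv` contains, and `csv_header_and_count` is a hypothetical helper.

```python
import csv
from pathlib import Path


def csv_header_and_count(path):
    """Return (column names, number of data rows) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count


# Path matches the `mv` target in the step above.
csv_file = Path("data/organizations.csv")
if csv_file.exists():
    header, rows = csv_header_and_count(csv_file)
    print(f"{rows} rows, columns: {header}")
else:
    print(f"{csv_file} not found; extract the archive first")
```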
After that, you can index the Crunchbase data into Qdrant:
```bash
# Init neural index
python -m qdrant_demo.init_collection_crunchbase
```