new: update readme to use fastembed, update startups filename

George Panchuk 2024-01-06 13:03:08 +01:00
parent 10beb7b12f
commit 614953ddf4
2 changed files with 10 additions and 8 deletions


@@ -18,16 +18,15 @@ You will also need [Docker](https://docs.docker.com/get-docker/) and [docker-com
## Quick Start <a href="https://replit.com/new/github/qdrant/qdrant_demo"><img align="right" src="https://replit.com/badge/github/qdrant/qdrant_demo" alt="Run on Repl.it"></a>
-To launch this demo locally you will need to prepare data first.
+To launch this demo locally you will need to download data first.
The source of the original data is [https://www.startups-list.com/](https://www.startups-list.com/)
-Code for initial data preparation could be found in [Colab Notebook](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing).
+You can download the data via the following command:
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing)
-After evaluating Colab you should get startup
-records in file `./data/startups.json` and encoded vectors in file `./data/startup_vectors.npy`
+```bash
+wget https://storage.googleapis.com/generall-shared-data/startups_demo.json
+```
To launch service locally, use
@@ -42,9 +41,12 @@ After service is started you can upload initial data to the search engine.
python -m qdrant_demo.init_collection_startups
```
After a successful upload, neural search API will be available at [http://localhost:8000/docs](http://localhost:8000/docs)
+You can play with the data in the following [Colab Notebook](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing).
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing)
## Start with Crunchbase data


@@ -16,7 +16,7 @@ def upload_embeddings():
client.set_model(EMBEDDINGS_MODEL)
-payload_path = os.path.join(DATA_DIR, 'startups.json')
+payload_path = os.path.join(DATA_DIR, 'startups_demo.json')
payload = []
documents = []
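
Note on the hunk above: it is cut off right after `payload` and `documents` are initialised, so the loading loop is not visible here. As a rough sketch only, the two lists could be filled along these lines. The helper name `load_startups`, the one-JSON-object-per-line layout, and the `description` field are assumptions for illustration and are not confirmed by this diff.

```python
import json
import os
import tempfile


def load_startups(path, text_field="description"):
    """Read a JSON-lines file and split it into payloads and document texts.

    Assumption: one JSON object per line, with the text to embed stored
    under `text_field`. Neither detail is confirmed by the diff.
    """
    payload, documents = [], []
    with open(path, encoding="utf-8") as fd:
        for line in fd:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # The text goes to `documents`; the remaining fields stay as payload.
            documents.append(record.pop(text_field))
            payload.append(record)
    return payload, documents


# Tiny self-check on made-up records (not real dataset rows).
with tempfile.TemporaryDirectory() as data_dir:
    sample_path = os.path.join(data_dir, "startups_demo.json")
    with open(sample_path, "w", encoding="utf-8") as fd:
        fd.write('{"name": "Acme", "description": "Rockets for coyotes"}\n')
        fd.write('{"name": "Globex", "description": "Hypothetical widgets"}\n')
    payload, documents = load_startups(sample_path)
```

Keeping the text separate from the rest of each record mirrors the two lists in the hunk; whether the real function removes the text field from the payload is not visible here.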