docs: add config in readme
parent 5c9f8ae42e
commit 34c762f7e7
|
@ -20,7 +20,7 @@ jobs:
|
||||||
core.setFailed('token are not equivalent!')
|
core.setFailed('token are not equivalent!')
|
||||||
if: github.event.inputs.release_token != env.release_token
|
if: github.event.inputs.release_token != env.release_token
|
||||||
env:
|
env:
|
||||||
release_token: ${{ secrets.LCSERVE_RELEASE_TOKEN }}
|
release_token: ${{ secrets.VECTORDB_RELEASE_TOKEN }}
|
||||||
|
|
||||||
update-docker:
|
update-docker:
|
||||||
needs: token-check
|
needs: token-check
|
||||||
|
@ -32,7 +32,7 @@ jobs:
|
||||||
token: ${{ secrets.JINA_DEV_BOT }}
|
token: ${{ secrets.JINA_DEV_BOT }}
|
||||||
inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
|
inputs: '{ "release_token": "${{ env.release_token }}", "triggered_by": "TAG"}'
|
||||||
env:
|
env:
|
||||||
release_token: ${{ secrets.LCSERVE_RELEASE_TOKEN }}
|
release_token: ${{ secrets.VECTORDB_RELEASE_TOKEN }}
|
||||||
|
|
||||||
regular-release:
|
regular-release:
|
||||||
needs: token-check
|
needs: token-check
|
||||||
|
|
41
README.md
|
@ -34,8 +34,8 @@ use and develop vector databases.
|
||||||
- Serverless capacity: `vectordb` can be deployed in the cloud in serverless mode, allowing you to save resources and have the data available only when needed.
|
- Serverless capacity: `vectordb` can be deployed in the cloud in serverless mode, allowing you to save resources and have the data available only when needed.
|
||||||
|
|
||||||
- Multiple ANN algorithms: `vectordb` contains different implementations of ANN algorithms. These are the ones offered so far (we plan to integrate more):
|
- Multiple ANN algorithms: `vectordb` contains different implementations of ANN algorithms. These are the ones offered so far (we plan to integrate more):
|
||||||
- Exact NN Search: Implements Simple Nearest Neighbour Algorithm.
|
- InMemoryExactNNVectorDB (Exact NN Search): Implements Simple Nearest Neighbour Algorithm.
|
||||||
- HNSWLib: Based on [HNSWLib](https://github.com/nmslib/hnswlib)
|
- HNSWVectorDB (based on HNSW): Based on [HNSWLib](https://github.com/nmslib/hnswlib)
|
||||||
|
|
||||||
<!--(THIS CAN BE SHOWN WHEN FILTER IS ENABLED)- Filter capacity: `vectordb` allows you to have filters on top of the ANN search. -->
|
<!--(THIS CAN BE SHOWN WHEN FILTER IS ENABLED)- Filter capacity: `vectordb` allows you to have filters on top of the ANN search. -->
|
||||||
|
|
||||||
|
@ -43,7 +43,7 @@ use and develop vector databases.
|
||||||
|
|
||||||
## 🏁 Getting Started
|
## 🏁 Getting Started
|
||||||
|
|
||||||
To get started with Vector Database, follow these steps. This example uses `HNSWVecDB`:
|
To get started with Vector Database, follow these steps. This example uses `InMemoryExactNNVectorDB`:
|
||||||
|
|
||||||
1. Install `vectordb`:
|
1. Install `vectordb`:
|
||||||
|
|
||||||
|
@ -62,10 +62,10 @@ class MyTextDoc(TextDoc):
|
||||||
|
|
||||||
Make sure that the schema has a field `embedding` as a `tensor` type with shape annotation as in the example.
|
Make sure that the schema has a field `embedding` as a `tensor` type with shape annotation as in the example.
|
||||||
|
|
||||||
3. Use any of the pre-built databases with the document schema (InMemoryExactNNVectorDB or HNSWLibDB):
|
3. Use any of the pre-built databases with the document schema (InMemoryExactNNVectorDB or HNSWVectorDB):
|
||||||
|
|
||||||
```python
|
```python
|
||||||
from vectordb import InMemoryExactNNVectorDB, HNSWLibDB
|
from vectordb import InMemoryExactNNVectorDB, HNSWVectorDB
|
||||||
db = InMemoryExactNNVectorDB[MyTextDoc](workspace='./workspace_path')
|
db = InMemoryExactNNVectorDB[MyTextDoc](workspace='./workspace_path')
|
||||||
|
|
||||||
db.index(inputs=DocList[MyTextDoc]([MyTextDoc(text=f'index {i}', embedding=np.random.rand(128)) for i in range(1000)]))
|
db.index(inputs=DocList[MyTextDoc]([MyTextDoc(text=f'index {i}', embedding=np.random.rand(128)) for i in range(1000)]))
|
||||||
|
@ -210,6 +210,37 @@ You can then list and delete your deployed DBs with `jc` command:
|
||||||
|
|
||||||
## ⚙️ Configure
|
## ⚙️ Configure
|
||||||
|
|
||||||
|
Here you can find the list of parameters you can use to configure the behavior for each of the `VectorDB` types.
|
||||||
|
|
||||||
|
### InMemoryExactNNVectorDB
|
||||||
|
|
||||||
|
This database type performs an exhaustive search over the embeddings and therefore has very few configuration settings:
|
||||||
|
|
||||||
|
- workspace: The folder where the required data will be persisted.
|
||||||
|
|
||||||
|
```python
|
||||||
|
InMemoryExactNNVectorDB[MyDoc](workspace='./vectordb')
|
||||||
|
InMemoryExactNNVectorDB[MyDoc].serve(workspace='./vectordb')
|
||||||
|
```
|
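To illustrate what "exhaustive search" means here, the following is a minimal pure-Python sketch. It is not the `vectordb` implementation: the function names `l2_distance` and `exact_nn_search` are hypothetical and chosen only for this example. It compares the query against every stored embedding and keeps the closest ones.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_nn_search(
    query: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    limit: int = 3,
) -> List[Tuple[int, float]]:
    """Score every stored embedding against the query (no index structure)
    and return the `limit` closest ones as (position, distance) pairs."""
    scored = [(i, l2_distance(query, emb)) for i, emb in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:limit]


# Illustrative data only.
embeddings = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = exact_nn_search([1.0, 1.0], embeddings, limit=2)
```

Because every embedding is visited on each query, the cost grows linearly with the number of indexed documents, which is why there is nothing to tune beyond where the data is stored.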
||||||
|
|
||||||
|
### HNSWVectorDB
|
||||||
|
|
||||||
|
This database implements Approximate Nearest Neighbour search based on the HNSW algorithm, using [HNSWLib](https://github.com/nmslib/hnswlib).
|
||||||
|
|
||||||
|
It contains more configuration options:
|
||||||
|
|
||||||
|
- workspace: The folder where the required data will be persisted.
|
||||||
|
|
||||||
|
The remaining options tune the performance and accuracy of the NN search algorithm. You can find more details in the [HNSWLib README](https://github.com/nmslib/hnswlib):
|
||||||
|
|
||||||
|
- space: name of the space, related to the similarity metric used (can be one of "l2", "ip", or "cosine"), default: "l2"
|
||||||
|
- max_elements: Initial capacity of the index, which is increased dynamically, default: 1024
|
||||||
|
- ef_construction: parameter that controls speed/accuracy trade-off during the index construction, default: 200
|
||||||
|
- ef: parameter controlling query time/accuracy trade-off, default: 10
|
||||||
|
- M: parameter that defines the maximum number of outgoing connections in the graph, default: 16
|
||||||
|
- allow_replace_deleted: enables replacing of deleted elements with new added ones, default: False
|
||||||
|
- num_threads: default number of threads to use while `index` and `search` are used, default: 1
|
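As a rough sketch, the options and defaults listed above can be collected and sanity-checked before use. The `HNSWConfig` helper below is hypothetical and not part of `vectordb`. Passing the options as keyword arguments next to `workspace` is also an assumption, based on how `workspace` is passed in the `InMemoryExactNNVectorDB` example.

```python
from dataclasses import asdict, dataclass

# Similarity spaces named in the option list above.
VALID_SPACES = ("l2", "ip", "cosine")


@dataclass
class HNSWConfig:
    """Hypothetical helper (not part of vectordb) holding the documented defaults."""

    space: str = "l2"
    max_elements: int = 1024
    ef_construction: int = 200
    ef: int = 10
    M: int = 16
    allow_replace_deleted: bool = False
    num_threads: int = 1

    def __post_init__(self) -> None:
        # Reject values that the documented options do not allow.
        if self.space not in VALID_SPACES:
            raise ValueError(f"space must be one of {VALID_SPACES}, got {self.space!r}")
        for name in ("max_elements", "ef_construction", "ef", "M", "num_threads"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")

    def as_kwargs(self) -> dict:
        """Return the options as a plain dict of keyword arguments."""
        return asdict(self)


# Override two options and keep the documented defaults for the rest.
config = HNSWConfig(space="cosine", ef=50)
kwargs = config.as_kwargs()

# Assumption: the options are accepted as keyword arguments, e.g.
# db = HNSWVectorDB[MyDoc](workspace='./vectordb', **kwargs)
```

A larger `ef` generally trades query speed for accuracy, so it is a natural first option to adjust when recall is too low.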
||||||
|
|
||||||
## 🛣️ Roadmap
|
## 🛣️ Roadmap
|
||||||
|
|
||||||
We have big plans for the future of Vector Database! Here are some of the features we have in the works:
|
We have big plans for the future of Vector Database! Here are some of the features we have in the works:
|
||||||
|
|
|
@ -84,7 +84,7 @@ class InMemoryExactNNIndexer(TypedExecutor):
|
||||||
|
|
||||||
def close(self):
|
def close(self):
|
||||||
if self._index_file_path is not None:
|
if self._index_file_path is not None:
|
||||||
self._indexer.persist(self._index_file_path)
|
self._indexer.persist()
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|