@@ -4,19 +4,24 @@ Extracted Literal Search is a simple method for Retrieval Augmented Generation (
For small datasets where most of the data fields are known literals, this method is easier to implement and may give better results than similarity search over embeddings.
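To make the idea concrete, here is a minimal sketch of searching by known literals. It assumes the method amounts to finding known field values in the user's query and then filtering records by exact match; that reading is an assumption, and this is not the package's implementation. The dataset, the field names, and the functions `known_literals`, `extract_literals`, and `literal_search` are all hypothetical, and the extraction step, which would normally be delegated to the language model, is replaced by plain substring matching.

```python
from typing import Dict, List, Set

# Hypothetical toy dataset; field names and values are illustrative only.
RECORDS: List[Dict[str, str]] = [
    {"name": "Alice", "city": "Berlin", "role": "engineer"},
    {"name": "Bob", "city": "Paris", "role": "designer"},
    {"name": "Carol", "city": "Berlin", "role": "designer"},
]


def known_literals(records: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Collect the set of known values for every field in the dataset."""
    literals: Dict[str, Set[str]] = {}
    for record in records:
        for field, value in record.items():
            literals.setdefault(field, set()).add(value)
    return literals


def extract_literals(query: str, literals: Dict[str, Set[str]]) -> Dict[str, str]:
    """Stand-in for the extraction step (assumption: an LLM would do this).

    Uses case-insensitive substring matching and keeps at most one
    value per field.
    """
    lowered = query.lower()
    found: Dict[str, str] = {}
    for field, values in literals.items():
        for value in sorted(values):
            if value.lower() in lowered:
                found[field] = value
                break
    return found


def literal_search(query: str, records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the records whose fields equal every literal found in the query."""
    filters = extract_literals(query, known_literals(records))
    return [r for r in records if all(r[f] == v for f, v in filters.items())]


if __name__ == "__main__":
    for match in literal_search("Which designers work in Berlin?", RECORDS):
        print(match)
```

The matching records would then be passed to the model as context. Note that in this sketch a query containing no known literal applies no filter and returns every record, which is only tolerable because the dataset is assumed to be small.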
## Installation
Make sure you have `poetry` installed.
> pip install --user poetry
Install the package.
> poetry install
Export your Hugging Face token.
> export HUGGING_FACE_HUB_TOKEN=<TOKEN>
Run the vLLM server, for example with a quantized Llama-2 model.
> poetry run python -m outlines.serve.serve --model="TheBloke/Llama-2-7b-Chat-GPTQ" -q gptq
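Once the server is running, a quick request is a useful check that the setup works. The sketch below assumes the server listens on `127.0.0.1:8000` and accepts a JSON body with a `prompt` field and an optional `schema` field on `POST /generate`; verify the address and the payload shape against the `outlines` version you installed. The names `build_request` and `generate` are hypothetical helpers, not part of this package.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumption: default host, port, and endpoint of the server started above.
SERVER_URL = "http://127.0.0.1:8000/generate"


def build_request(
    prompt: str,
    schema: Optional[Dict[str, Any]] = None,
    url: str = SERVER_URL,
) -> urllib.request.Request:
    """Build a POST request carrying the prompt and an optional JSON schema."""
    payload: Dict[str, Any] = {"prompt": prompt}
    if schema is not None:
        payload["schema"] = schema
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, schema: Optional[Dict[str, Any]] = None) -> Any:
    """Send the request and return the decoded JSON body of the response."""
    with urllib.request.urlopen(build_request(prompt, schema)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the server from the previous step to be running.
    print(generate("What is the capital of France?"))
```

Only the standard library is used, so the check adds no dependency to the project.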