TabbyML / tabby
- Monday, April 10, 2023 at 00:14:34
Self-hosted AI coding assistant
An open-source, on-premises alternative to GitHub Copilot.
Warning: Tabby is still in the alpha phase.
NOTE: Tabby requires a Pascal or newer NVIDIA GPU.
Before running Tabby, make sure the NVIDIA Container Toolkit is installed. We recommend NVIDIA drivers compatible with CUDA version 11.8 or higher.
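If you want to sanity-check the GPU setup first, the snippet below is a common way to do it; the CUDA image tag is only an example, and any CUDA base image will work.
# Check the driver and CUDA version on the host
nvidia-smi
# Verify Docker can reach the GPU through the NVIDIA Container Toolkit
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi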
# Create the data dir and grant ownership to uid 1000 (Tabby runs as uid 1000 inside the container)
mkdir -p data/hf_cache && chown -R 1000 data
docker run \
--gpus all \
-it --rm \
-v "./data:/data" \
-v "./data/hf_cache:/home/app/.cache/huggingface" \
-p 5000:5000 \
-e MODEL_NAME=TabbyML/J-350M \
-e MODEL_BACKEND=triton \
--name=tabby \
tabbyml/tabby
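On first start the container will typically download the model weights into the mounted hf_cache directory, which can take a while. Since the command above runs the container attached (-it), you can follow progress from a second terminal; the container name tabby comes from the --name flag above.
# Follow Tabby's logs while the model downloads and loads
docker logs -f tabby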
You can then query the server using the /v1/completions endpoint:
curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
"prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
}'
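The server replies with JSON; piping it through jq (if installed) makes it easier to read. The prompt here is just a placeholder, and the exact response schema is described by the OpenAPI documentation mentioned below.
# Same request, pretty-printed with jq
curl -s -X POST http://localhost:5000/v1/completions \
  -H 'Content-Type: application/json' \
  --data '{"prompt": "def fib(n):"}' | jq .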
We also provide an interactive playground in the admin panel at localhost:5000/_admin.
For cloud deployment with SkyPilot, see deployment/skypilot/README.md.
Tabby opens a FastAPI server at localhost:5000, which embeds OpenAPI documentation for the HTTP API.
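Assuming Tabby keeps FastAPI's default documentation routes, the interactive docs live at /docs and the machine-readable schema at /openapi.json, so you can list the available endpoints like this:
# List the HTTP API paths from the OpenAPI schema (requires jq)
curl -s http://localhost:5000/openapi.json | jq '.paths | keys'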
Go to the development directory.
make dev
or
make dev-triton # Turn on the Triton backend (for CUDA env developers)