scaleapi / llm-engine
- Friday, July 21, 2023, 00:00:10
Scale LLM Engine public repository
The open source engine for fine-tuning and serving large language models.
Scale's LLM Engine is the easiest way to customize and serve LLMs. In LLM Engine, models can be accessed via Scale's hosted version or by using the Helm charts in this repository to run model inference and fine-tuning in your own infrastructure.
pip install scale-llm-engine
Foundation models are emerging as the building blocks of AI. However, deploying these models to the cloud and fine-tuning them are expensive operations that require infrastructure and ML expertise. It is also difficult to maintain over time as new models are released and new techniques for both inference and fine-tuning are made available.
LLM Engine is a Python library, CLI, and Helm chart that provides everything you need to serve and fine-tune foundation models, whether you use Scale's hosted infrastructure or do it in your own cloud infrastructure using Kubernetes.
Navigate to Scale Spellbook to first create an account, and then grab your API key on the Settings page. Set this API key as the SCALE_API_KEY environment variable by adding the following line to your .zshrc or .bash_profile:

export SCALE_API_KEY="[Your API key]"
If you run into an "Invalid API Key" error, you may need to run the `. ~/.zshrc` command to re-read your updated .zshrc.
With your API key set, you can now send LLM Engine requests using the Python client. Try out this starter code:
from llmengine import Completion

response = Completion.create(
    model="falcon-7b-instruct",
    prompt="I'm opening a pancake restaurant that specializes in unique pancake shapes, colors, and flavors. List 3 quirky names I could name my restaurant.",
    max_new_tokens=100,
    temperature=0.2,
)
print(response.output.text)
You should see a successful completion of your given prompt!
What's next? Visit the LLM Engine documentation pages for more on the Completion and FineTune APIs and how to use them.
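As a taste of the FineTune API, here is a minimal sketch of launching a fine-tuning job. The base model name and training-file URL are illustrative placeholders, not values from this README; consult the FineTune documentation for the exact models and parameters supported.

```python
from llmengine import FineTune

# Launch a fine-tuning job against a hosted base model.
# "llama-7b" and the S3 URL below are placeholder values for illustration;
# the training file is expected to contain prompt/response pairs.
response = FineTune.create(
    model="llama-7b",
    training_file="s3://my-bucket/path/to/training-dataset.csv",
)

# The returned job ID can be used to poll the job's status later.
print(response.id)
```

Like Completion.create, this call requires SCALE_API_KEY to be set; the job runs asynchronously, so you would poll it (e.g. via FineTune.get with the returned ID) until it completes.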