meta-llama / llama-models
Utilities intended for use with Llama models.
🤗 Models on Hugging Face | Blog | Website | Get Started
Llama is an accessible, open large language model (LLM) designed for developers, researchers, and businesses to build, experiment, and responsibly scale their generative AI ideas. Part of a foundational system, it serves as a bedrock for innovation in the global community. A few key aspects:
Our mission is to empower individuals and industry through this opportunity while fostering an environment of discovery and ethical AI advancements. The model weights are licensed for researchers and commercial entities, upholding the principles of openness.
| Model | Launch date | Model sizes | Context length | Tokenizer | Acceptable use policy | License | Model card |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Llama 2 | 7/18/2023 | 7B, 13B, 70B | 4K | SentencePiece | Use Policy | License | Model Card |
| Llama 3 | 4/18/2024 | 8B, 70B | 8K | TikToken-based | Use Policy | License | Model Card |
| Llama 3.1 | 7/23/2024 | 8B, 70B, 405B | 128K | TikToken-based | Use Policy | License | Model Card |
| Llama 3.2 | 9/25/2024 | 1B, 3B | 128K | TikToken-based | Use Policy | License | Model Card |
| Llama 3.2-Vision | 9/25/2024 | 11B, 90B | 128K | TikToken-based | Use Policy | License | Model Card |
To download the model weights and tokenizer:

1. Visit the Meta Llama website.
2. Read and accept the license.
3. Once your request is approved, you will receive a signed URL via email.
4. Install the Llama CLI: `pip install llama-stack`. (<-- Start here if you have already received an email.)
5. Run `llama model list` to show the latest available models and determine the model ID you wish to download. **NOTE**: If you want older versions of models, run `llama model list --show-all` to show all available Llama models.
6. Run `llama download --source meta --model-id CHOSEN_MODEL_ID`.
7. Pass the URL provided when prompted to start the download.

Remember that the links expire after 24 hours and a certain number of downloads. You can always re-request a link if you start seeing errors such as `403: Forbidden`.
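Put together, the steps above amount to a short shell session. A sketch only: the model ID below is an example (substitute the ID reported by `llama model list`), and the commands are printed rather than executed so the script is safe to run as-is.

```shell
#!/bin/bash
# Sketch of the download flow above. MODEL_ID is an example placeholder.
MODEL_ID="Llama3.1-8B-Instruct"

# Printed for review rather than executed:
echo "pip install llama-stack"                              # 1. install the llama CLI
echo "llama model list"                                     # 2. pick a model ID
echo "llama download --source meta --model-id ${MODEL_ID}"  # 3. paste the signed URL when prompted
```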
You need to install the following dependencies (in addition to the `requirements.txt` in the root directory of this repository) to run the models:

```bash
pip install torch fairscale fire blobfile
```

After installing the dependencies, you can run the example scripts (within the `llama_models/scripts/` sub-directory) as follows:

```bash
#!/bin/bash

CHECKPOINT_DIR=~/.llama/checkpoints/Meta-Llama3.1-8B-Instruct
PYTHONPATH=$(git rev-parse --show-toplevel) torchrun llama_models/scripts/example_chat_completion.py $CHECKPOINT_DIR
```
The above script should be used with an Instruct (Chat) model. For a Base model, use the script `llama_models/scripts/example_text_completion.py`. Note that you can use these scripts with both the Llama3 and Llama3.1 series of models.
For running larger models with tensor parallelism, modify the invocation as follows:

```bash
#!/bin/bash

NGPUS=8
PYTHONPATH=$(git rev-parse --show-toplevel) torchrun \
  --nproc_per_node=$NGPUS \
  llama_models/scripts/example_chat_completion.py $CHECKPOINT_DIR \
  --model_parallel_size $NGPUS
```
For more flexibility in running inference (including running FP8 inference), please see the Llama Stack
repository.
We also provide downloads on Hugging Face, in both transformers and native `llama3` formats. On Hugging Face, the native weights live in each model repo's `original` folder. You can also download them from the command line after `pip install huggingface-hub`:

```bash
huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include "original/*" --local-dir meta-llama/Meta-Llama-3.1-8B-Instruct
```
**NOTE**: The original native weights of meta-llama/Meta-Llama-3.1-405B are not available through this Hugging Face repo.
To use with transformers, the following pipeline snippet will download and cache the weights:

```python
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)
```
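Llama 3.1 Instruct is a chat model, so the pipeline expects a list of role/content messages rather than a bare prompt string. A minimal sketch of the input format (the generation call itself is commented out, since it requires the downloaded weights and a GPU):

```python
# Chat-style input for a transformers text-generation pipeline:
# a list of role/content dicts, as used by Llama 3.1 Instruct.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

# With the pipeline from the snippet above, generation would look like:
# outputs = pipeline(messages, max_new_tokens=64)
# print(outputs[0]["generated_text"])
```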
You can install this repository as a package with `pip install llama-models`.
Llama models are a new technology that carries potential risks with use. Testing conducted to date has not — and could not — cover all scenarios. To help developers address these risks, we have created the Responsible Use Guide.
Please report any software “bug” or other problems with the models through one of the following means:
Common questions are answered in the FAQ, which will be updated over time as new questions arise.