NVIDIA / GenerativeAIExamples
- Sunday, January 7, 2024, 00:00:12
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
State-of-the-art Generative AI examples that are easy to deploy, test, and extend. All examples run on the high-performance NVIDIA CUDA-X software stack and NVIDIA GPUs.
Generative AI Examples uses resources from the NVIDIA NGC AI Development Catalog.
Sign up for a free NGC developer account to access these resources.
A Retrieval Augmented Generation (RAG) pipeline embeds multimodal data, such as documents, images, and video, into a database connected to a Large Language Model (LLM). RAG lets users chat with their own data through an LLM.
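The embed, store, retrieve, and prompt flow described above can be sketched in a few lines of plain Python. This is a toy illustration only, not the code used by the examples in this repository: `embed`, `cosine`, `ToyVectorStore`, and `build_prompt` are hypothetical names, the bag-of-words vector stands in for a real embedding model such as e5-large-v2, and the in-memory list stands in for a vector database such as Milvus or FAISS. The final LLM call is left out; the sketch stops at the augmented prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store (assumption: stands in for a vector database)."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, str]] = []

    def add(self, doc: str) -> None:
        # Embed once at ingestion time and keep the vector next to the text.
        self._rows.append((embed(doc), doc))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Embed the query and return the k most similar documents.
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Prepend the retrieved passages to the question before it goes to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


store = ToyVectorStore()
store.add("Milvus is the vector database used by the Linux developer RAG example.")
store.add("The Windows developer RAG example stores vectors in FAISS.")
store.add("llama2-13b is the LLM used in the developer examples.")

question = "Which vector database does the Linux example use?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the prompt would then be sent to the LLM, and the retrieval step would use dense embeddings with an approximate nearest-neighbor index rather than exact term overlap.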
Name | Description | LLM | Framework | Multi-GPU | Multi-node | Embedding | TRT-LLM | Triton | VectorDB | K8s |
---|---|---|---|---|---|---|---|---|---|---|
Linux developer RAG | Single VM, single GPU | llama2-13b | Langchain + Llama Index | No | No | e5-large-v2 | Yes | Yes | Milvus | No |
Windows developer RAG | RAG on Windows | llama2-13b | Llama Index | No | No | NA | Yes | No | FAISS | NA |
Developer LLM Operator for Kubernetes | Single node, single GPU | llama2-13b | Langchain + Llama Index | No | No | e5-large-v2 | Yes | Yes | Milvus | Yes |
NVIDIA LLMs are optimized for building enterprise generative AI applications.
Name | Description | Type | Context Length | Example | License |
---|---|---|---|---|---|
nemotron-3-8b-qa-4k | Q&A LLM customized on knowledge bases | Text Generation | 4096 | No | NVIDIA AI Foundation Models Community License Agreement |
nemotron-3-8b-chat-4k-steerlm | Best out-of-the-box chat model with flexible alignment at inference | Text Generation | 4096 | No | NVIDIA AI Foundation Models Community License Agreement |
nemotron-3-8b-chat-4k-rlhf | Best out-of-the-box chat model performance | Text Generation | 4096 | No | NVIDIA AI Foundation Models Community License Agreement |
In each of the READMEs, we indicate the level of support provided.
We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!