vxcontrol / pentagi
✨ Fully autonomous AI agents system capable of performing complex penetration testing tasks
Join the Community! Connect with security researchers, AI enthusiasts, and fellow ethical hackers. Get support, share insights, and stay updated with the latest PentAGI developments.
PentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. The project is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests.
A video overview of PentAGI is also available.

```mermaid
flowchart TB
classDef person fill:#08427B,stroke:#073B6F,color:#fff
classDef system fill:#1168BD,stroke:#0B4884,color:#fff
classDef external fill:#666666,stroke:#0B4884,color:#fff
pentester["Security Engineer<br/>(User of the system)"]
pentagi["PentAGI<br/>(Autonomous penetration testing system)"]
target["target-system<br/>(System under test)"]
llm["llm-provider<br/>(OpenAI/Anthropic/Ollama/Bedrock/Gemini/Custom)"]
search["search-systems<br/>(Google/DuckDuckGo/Tavily/Traversaal/Perplexity/Searxng)"]
langfuse["langfuse-ui<br/>(LLM Observability Dashboard)"]
grafana["grafana<br/>(System Monitoring Dashboard)"]
pentester --> |Uses HTTPS| pentagi
pentester --> |Monitors AI HTTPS| langfuse
pentester --> |Monitors System HTTPS| grafana
pentagi --> |Tests Various protocols| target
pentagi --> |Queries HTTPS| llm
pentagi --> |Searches HTTPS| search
pentagi --> |Reports HTTPS| langfuse
pentagi --> |Reports HTTPS| grafana
class pentester person
class pentagi system
class target,llm,search,langfuse,grafana external
linkStyle default stroke:#ffffff,color:#ffffff
```
```mermaid
graph TB
subgraph Core Services
UI[Frontend UI<br/>React + TypeScript]
API[Backend API<br/>Go + GraphQL]
DB[(Vector Store<br/>PostgreSQL + pgvector)]
MQ[Task Queue<br/>Async Processing]
Agent[AI Agents<br/>Multi-Agent System]
end
subgraph Knowledge Graph
Graphiti[Graphiti<br/>Knowledge Graph API]
Neo4j[(Neo4j<br/>Graph Database)]
end
subgraph Monitoring
Grafana[Grafana<br/>Dashboards]
VictoriaMetrics[VictoriaMetrics<br/>Time-series DB]
Jaeger[Jaeger<br/>Distributed Tracing]
Loki[Loki<br/>Log Aggregation]
OTEL[OpenTelemetry<br/>Data Collection]
end
subgraph Analytics
Langfuse[Langfuse<br/>LLM Analytics]
ClickHouse[ClickHouse<br/>Analytics DB]
Redis[Redis<br/>Cache + Rate Limiter]
MinIO[MinIO<br/>S3 Storage]
end
subgraph Security Tools
Scraper[Web Scraper<br/>Isolated Browser]
PenTest[Security Tools<br/>20+ Pro Tools<br/>Sandboxed Execution]
end
UI --> |HTTP/WS| API
API --> |SQL| DB
API --> |Events| MQ
MQ --> |Tasks| Agent
Agent --> |Commands| PenTest
Agent --> |Queries| DB
Agent --> |Knowledge| Graphiti
Graphiti --> |Graph| Neo4j
API --> |Telemetry| OTEL
OTEL --> |Metrics| VictoriaMetrics
OTEL --> |Traces| Jaeger
OTEL --> |Logs| Loki
Grafana --> |Query| VictoriaMetrics
Grafana --> |Query| Jaeger
Grafana --> |Query| Loki
API --> |Analytics| Langfuse
Langfuse --> |Store| ClickHouse
Langfuse --> |Cache| Redis
Langfuse --> |Files| MinIO
classDef core fill:#f9f,stroke:#333,stroke-width:2px,color:#000
classDef knowledge fill:#ffa,stroke:#333,stroke-width:2px,color:#000
classDef monitoring fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef analytics fill:#bfb,stroke:#333,stroke-width:2px,color:#000
classDef tools fill:#fbb,stroke:#333,stroke-width:2px,color:#000
class UI,API,DB,MQ,Agent core
class Graphiti,Neo4j knowledge
class Grafana,VictoriaMetrics,Jaeger,Loki,OTEL monitoring
class Langfuse,ClickHouse,Redis,MinIO analytics
class Scraper,PenTest tools
```
```mermaid
erDiagram
Flow ||--o{ Task : contains
Task ||--o{ SubTask : contains
SubTask ||--o{ Action : contains
Action ||--o{ Artifact : produces
Action ||--o{ Memory : stores
Flow {
string id PK
string name "Flow name"
string description "Flow description"
string status "active/completed/failed"
json parameters "Flow parameters"
timestamp created_at
timestamp updated_at
}
Task {
string id PK
string flow_id FK
string name "Task name"
string description "Task description"
string status "pending/running/done/failed"
json result "Task results"
timestamp created_at
timestamp updated_at
}
SubTask {
string id PK
string task_id FK
string name "Subtask name"
string description "Subtask description"
string status "queued/running/completed/failed"
string agent_type "researcher/developer/executor"
json context "Agent context"
timestamp created_at
timestamp updated_at
}
Action {
string id PK
string subtask_id FK
string type "command/search/analyze/etc"
string status "success/failure"
json parameters "Action parameters"
json result "Action results"
timestamp created_at
}
Artifact {
string id PK
string action_id FK
string type "file/report/log"
string path "Storage path"
json metadata "Additional info"
timestamp created_at
}
Memory {
string id PK
string action_id FK
string type "observation/conclusion"
vector embedding "Vector representation"
text content "Memory content"
timestamp created_at
}
```
```mermaid
sequenceDiagram
participant O as Orchestrator
participant R as Researcher
participant D as Developer
participant E as Executor
participant VS as Vector Store
participant KB as Knowledge Base
Note over O,KB: Flow Initialization
O->>VS: Query similar tasks
VS-->>O: Return experiences
O->>KB: Load relevant knowledge
KB-->>O: Return context
Note over O,R: Research Phase
O->>R: Analyze target
R->>VS: Search similar cases
VS-->>R: Return patterns
R->>KB: Query vulnerabilities
KB-->>R: Return known issues
R->>VS: Store findings
R-->>O: Research results
Note over O,D: Planning Phase
O->>D: Plan attack
D->>VS: Query exploits
VS-->>D: Return techniques
D->>KB: Load tools info
KB-->>D: Return capabilities
D-->>O: Attack plan
Note over O,E: Execution Phase
O->>E: Execute plan
E->>KB: Load tool guides
KB-->>E: Return procedures
E->>VS: Store results
E-->>O: Execution status
```
```mermaid
graph TB
subgraph "Long-term Memory"
VS[(Vector Store<br/>Embeddings DB)]
KB[Knowledge Base<br/>Domain Expertise]
Tools[Tools Knowledge<br/>Usage Patterns]
end
subgraph "Working Memory"
Context[Current Context<br/>Task State]
Goals[Active Goals<br/>Objectives]
State[System State<br/>Resources]
end
subgraph "Episodic Memory"
Actions[Past Actions<br/>Commands History]
Results[Action Results<br/>Outcomes]
Patterns[Success Patterns<br/>Best Practices]
end
Context --> |Query| VS
VS --> |Retrieve| Context
Goals --> |Consult| KB
KB --> |Guide| Goals
State --> |Record| Actions
Actions --> |Learn| Patterns
Patterns --> |Store| VS
Tools --> |Inform| State
Results --> |Update| Tools
VS --> |Enhance| KB
KB --> |Index| VS
classDef ltm fill:#f9f,stroke:#333,stroke-width:2px,color:#000
classDef wm fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef em fill:#bfb,stroke:#333,stroke-width:2px,color:#000
class VS,KB,Tools ltm
class Context,Goals,State wm
class Actions,Results,Patterns em
```
The chain summarization system manages conversation context growth by selectively summarizing older messages. This is critical for preventing token limits from being exceeded while maintaining conversation coherence.
```mermaid
flowchart TD
A[Input Chain] --> B{Needs Summarization?}
B -->|No| C[Return Original Chain]
B -->|Yes| D[Convert to ChainAST]
D --> E[Apply Section Summarization]
E --> F[Process Oversized Pairs]
F --> G[Manage Last Section Size]
G --> H[Apply QA Summarization]
H --> I[Rebuild Chain with Summaries]
I --> J{Is New Chain Smaller?}
J -->|Yes| K[Return Optimized Chain]
J -->|No| C
classDef process fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef decision fill:#bfb,stroke:#333,stroke-width:2px,color:#000
classDef output fill:#fbb,stroke:#333,stroke-width:2px,color:#000
class A,D,E,F,G,H,I process
class B,J decision
class C,K output
```
The algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types including tool calls and their responses. All summarization operations maintain critical conversation flow while reducing context size.
| Parameter | Environment Variable | Default | Description |
|---|---|---|---|
| Preserve Last | `SUMMARIZER_PRESERVE_LAST` | `true` | Whether to keep all messages in the last section intact |
| Use QA Pairs | `SUMMARIZER_USE_QA` | `true` | Whether to use the QA pair summarization strategy |
| Summarize Human in QA | `SUMMARIZER_SUM_MSG_HUMAN_IN_QA` | `false` | Whether to summarize human messages in QA pairs |
| Last Section Size | `SUMMARIZER_LAST_SEC_BYTES` | `51200` | Maximum byte size for the last section (50KB) |
| Max Body Pair Size | `SUMMARIZER_MAX_BP_BYTES` | `16384` | Maximum byte size for a single body pair (16KB) |
| Max QA Sections | `SUMMARIZER_MAX_QA_SECTIONS` | `10` | Maximum QA pair sections to preserve |
| Max QA Size | `SUMMARIZER_MAX_QA_BYTES` | `65536` | Maximum byte size for QA pair sections (64KB) |
| Keep QA Sections | `SUMMARIZER_KEEP_QA_SECTIONS` | `1` | Number of recent QA sections to keep without summarization |
Assistant instances can use customized summarization settings to fine-tune context management behavior:
| Parameter | Environment Variable | Default | Description |
|---|---|---|---|
| Preserve Last | `ASSISTANT_SUMMARIZER_PRESERVE_LAST` | `true` | Whether to preserve all messages in the assistant's last section |
| Last Section Size | `ASSISTANT_SUMMARIZER_LAST_SEC_BYTES` | `76800` | Maximum byte size for the assistant's last section (75KB) |
| Max Body Pair Size | `ASSISTANT_SUMMARIZER_MAX_BP_BYTES` | `16384` | Maximum byte size for a single body pair in assistant context (16KB) |
| Max QA Sections | `ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS` | `7` | Maximum QA sections to preserve in assistant context |
| Max QA Size | `ASSISTANT_SUMMARIZER_MAX_QA_BYTES` | `76800` | Maximum byte size for the assistant's QA sections (75KB) |
| Keep QA Sections | `ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS` | `3` | Number of recent QA sections to preserve without summarization |
The assistant summarizer configuration provides more memory for context retention compared to the global settings, preserving more recent conversation history while still ensuring efficient token usage.
```bash
# Default values for global summarizer logic
SUMMARIZER_PRESERVE_LAST=true
SUMMARIZER_USE_QA=true
SUMMARIZER_SUM_MSG_HUMAN_IN_QA=false
SUMMARIZER_LAST_SEC_BYTES=51200
SUMMARIZER_MAX_BP_BYTES=16384
SUMMARIZER_MAX_QA_SECTIONS=10
SUMMARIZER_MAX_QA_BYTES=65536
SUMMARIZER_KEEP_QA_SECTIONS=1
# Default values for assistant summarizer logic
ASSISTANT_SUMMARIZER_PRESERVE_LAST=true
ASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800
ASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384
ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7
ASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800
ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3
```

The architecture of PentAGI is designed to be modular, scalable, and secure. Here are the key components:
- Core Services
- Knowledge Graph
- Monitoring Stack
- Analytics Platform
- Security Tools
- Memory Systems
The system uses Docker containers for isolation and easy deployment, with separate networks for core services, monitoring, and analytics to ensure proper security boundaries. Each component is designed to scale horizontally and can be configured for high availability in production environments.
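Once the stacks are up, one quick way to sanity-check this isolation is to list the project networks and see which containers sit on each; the network names used here (pentagi-network, langfuse-network, observability-network) are taken from the compose files referenced later in this README:

```bash
# List the compose-managed networks and show the containers attached to the core network
docker network ls --filter name=network
docker network inspect pentagi-network --format '{{range .Containers}}{{.Name}} {{end}}'
```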
PentAGI provides an interactive installer with a terminal-based UI for streamlined configuration and deployment. The installer guides you through system checks, LLM provider setup, search engine configuration, and security hardening.
Supported Platforms: prebuilt installers are available from pentagi.com/downloads (the example below targets Linux amd64).

Quick Installation (Linux amd64):
```bash
# Create installation directory
mkdir -p pentagi && cd pentagi
# Download installer
wget -O installer.zip https://pentagi.com/downloads/linux/amd64/installer-latest.zip
# Extract
unzip installer.zip
# Run interactive installer
./installer
```

Prerequisites & Permissions:
The installer requires appropriate privileges to interact with the Docker API for proper operation. By default, it uses the Docker socket (/var/run/docker.sock) which requires either:
Option 1 (Recommended for production): Run the installer as root:
```bash
sudo ./installer
```

Option 2 (Development environments): Grant your user access to the Docker socket by adding them to the docker group:
```bash
# Add your user to the docker group
sudo usermod -aG docker $USER
# Log out and log back in, or activate the group immediately
newgrp docker
# Verify Docker access (should run without sudo)
docker ps
```

Warning
Membership in the docker group grants root-equivalent privileges. Only do this for trusted users in controlled environments. For production deployments, consider using rootless Docker mode or running the installer with sudo.
The installer will, among other steps, create a `.env` file with optimal defaults.

For Production & Enhanced Security:
For production deployments or security-sensitive environments, we strongly recommend using a distributed two-node architecture where worker operations are isolated on a separate server. This prevents untrusted code execution and network access issues on your main system.
๐ See detailed guide: Worker Node Setup
The two-node setup isolates all worker operations on a dedicated node, away from your main system.
To install manually, create a working directory and download the configuration templates:

```bash
mkdir pentagi && cd pentagi

# Copy .env.example to .env or download it
curl -o .env https://raw.githubusercontent.com/vxcontrol/pentagi/master/.env.example

# Copy the example provider configs (example.custom.provider.yml, example.ollama.provider.yml) or download them
curl -o example.custom.provider.yml https://raw.githubusercontent.com/vxcontrol/pentagi/master/examples/configs/custom-openai.provider.yml
curl -o example.ollama.provider.yml https://raw.githubusercontent.com/vxcontrol/pentagi/master/examples/configs/ollama-llama318b.provider.yml
```

Then edit the `.env` file. At a minimum, configure at least one LLM provider:

```bash
# Required: At least one of these LLM providers
OPEN_AI_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GEMINI_API_KEY=your_gemini_key
# Optional: AWS Bedrock provider (enterprise-grade models)
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID=your_aws_access_key
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key
# Optional: Local LLM provider (zero-cost inference)
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL=your_model_name
# Optional: Additional search capabilities
DUCKDUCKGO_ENABLED=true
GOOGLE_API_KEY=your_google_key
GOOGLE_CX_KEY=your_google_cx
TAVILY_API_KEY=your_tavily_key
TRAVERSAAL_API_KEY=your_traversaal_key
PERPLEXITY_API_KEY=your_perplexity_key
PERPLEXITY_MODEL=sonar-pro
PERPLEXITY_CONTEXT_SIZE=medium
# Searxng meta search engine (aggregates results from multiple sources)
SEARXNG_URL=http://your-searxng-instance:8080
SEARXNG_CATEGORIES=general
SEARXNG_LANGUAGE=
SEARXNG_SAFESEARCH=0
SEARXNG_TIME_RANGE=
## Graphiti knowledge graph settings
GRAPHITI_ENABLED=true
GRAPHITI_TIMEOUT=30
GRAPHITI_URL=http://graphiti:8000
GRAPHITI_MODEL_NAME=gpt-5-mini
# Neo4j settings (used by Graphiti stack)
NEO4J_USER=neo4j
NEO4J_DATABASE=neo4j
NEO4J_PASSWORD=devpassword
NEO4J_URI=bolt://neo4j:7687
# Assistant configuration
ASSISTANT_USE_AGENTS=false # Default value for agent usage when creating new assistants
```

To improve security, also change the following settings in the `.env` file:

- `COOKIE_SIGNING_SALT` - salt for cookie signing; change it to a random value
- `PUBLIC_URL` - public URL of your server (e.g. https://pentagi.example.com)
- `SERVER_SSL_CRT` and `SERVER_SSL_KEY` - custom paths to your existing SSL certificate and key for HTTPS (these paths should be mounted as volumes in the docker-compose.yml file)
- `SCRAPER_PUBLIC_URL` - public URL for the scraper if you want to use a different scraper server for public URLs
- `SCRAPER_PRIVATE_URL` - private URL for the scraper (the local scraper server in the docker-compose.yml file, used to access local URLs)
- `PENTAGI_POSTGRES_USER` and `PENTAGI_POSTGRES_PASSWORD` - PostgreSQL credentials
- `NEO4J_USER` and `NEO4J_PASSWORD` - Neo4j credentials (for the Graphiti knowledge graph)

If you want to use the `.env` file in VSCode or other IDEs via the envFile option, strip the inline comments first:

```bash
perl -i -pe 's/\s+#.*$//' .env
```

Finally, download the compose file and start the stack:

```bash
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose.yml
docker compose up -d
```

Visit localhost:8443 to access the PentAGI Web UI (default credentials: admin@pentagi.com / admin).
Note
If you encounter an error about pentagi-network, observability-network, or langfuse-network, run docker-compose.yml first to create these networks, and only then run docker-compose-langfuse.yml, docker-compose-graphiti.yml, and docker-compose-observability.yml to use the Langfuse, Graphiti, and Observability services.
You have to set at least one Language Model provider (OpenAI, Anthropic, Gemini, AWS Bedrock, or Ollama) to use PentAGI. AWS Bedrock provides enterprise-grade access to multiple foundation models from leading AI companies, while Ollama provides zero-cost local inference if you have sufficient computational resources. Additional API keys for search engines are optional but recommended for better results.
The LLM_SERVER_* environment variables are an experimental feature and may change in the future. For now, you can use them to specify a custom LLM server URL and a single model for all agent types.
PROXY_URL is a global proxy URL for all LLM providers and external search systems. You can use it to isolate PentAGI from external networks.
The docker-compose.yml file runs the PentAGI service as the root user because it needs access to docker.sock for container management. If you use a TCP/IP connection to Docker instead of the socket file, you can remove root privileges and use the default pentagi user for better security.
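A minimal sketch of that setup, assuming your Docker daemon already exposes a TCP endpoint (the host and port below are placeholders, and DOCKER_HOST is the same variable used in the development configuration later in this document):

```bash
# .env: point the backend's Docker SDK client at a TCP endpoint instead of /var/run/docker.sock
DOCKER_HOST=tcp://docker-host:2375
# With this in place, the docker.sock volume and the root user override in docker-compose.yml can be removed
```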
PentAGI allows you to configure default behavior for assistants:
| Variable | Default | Description |
|---|---|---|
| `ASSISTANT_USE_AGENTS` | `false` | Controls the default value for agent usage when creating new assistants |
The ASSISTANT_USE_AGENTS setting affects the initial state of the "Use Agents" toggle when creating a new assistant in the UI:
- `false` (default): New assistants are created with agent delegation disabled by default
- `true`: New assistants are created with agent delegation enabled by default

Note that users can always override this setting by toggling the "Use Agents" button in the UI when creating or editing an assistant. This environment variable only controls the initial default state.
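For example, to have new assistants start with delegation enabled, set the variable in your `.env`:

```bash
ASSISTANT_USE_AGENTS=true  # new assistants start with the "Use Agents" toggle enabled
```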
When using custom LLM providers with the LLM_SERVER_* variables, you can fine-tune the reasoning format used in requests:
| Variable | Default | Description |
|---|---|---|
| `LLM_SERVER_URL` | | Base URL for the custom LLM API endpoint |
| `LLM_SERVER_KEY` | | API key for the custom LLM provider |
| `LLM_SERVER_MODEL` | | Default model to use (can be overridden in provider config) |
| `LLM_SERVER_CONFIG_PATH` | | Path to the YAML configuration file for agent-specific models |
| `LLM_SERVER_PROVIDER` | | Provider name prefix for model names (e.g., `openrouter`, `deepseek` for LiteLLM proxy) |
| `LLM_SERVER_LEGACY_REASONING` | `false` | Controls reasoning format in API requests |
| `LLM_SERVER_PRESERVE_REASONING` | `false` | Preserve reasoning content in multi-turn conversations (required by some providers) |
The LLM_SERVER_PROVIDER setting is particularly useful when using LiteLLM proxy, which adds a provider prefix to model names. For example, when connecting to Moonshot API through LiteLLM, models like kimi-2.5 become moonshot/kimi-2.5. By setting LLM_SERVER_PROVIDER=moonshot, you can use the same provider configuration file for both direct API access and LiteLLM proxy access without modifications.
The LLM_SERVER_LEGACY_REASONING setting affects how reasoning parameters are sent to the LLM:
- `false` (default): Uses the modern format where reasoning is sent as a structured object with a max_tokens parameter
- `true`: Uses the legacy format with a string-based reasoning_effort parameter

This setting is important when working with different LLM providers as they may expect different reasoning formats in their API requests. If you encounter reasoning-related errors with custom providers, try changing this setting.
The LLM_SERVER_PRESERVE_REASONING setting controls whether reasoning content is preserved in multi-turn conversations:
- `false` (default): Reasoning content is not preserved in conversation history
- `true`: Reasoning content is preserved and sent in subsequent API calls

This setting is required by some LLM providers (e.g., Moonshot) that return errors like "thinking is enabled but reasoning_content is missing in assistant tool call message" when reasoning content is not included in multi-turn conversations. Enable this setting if your provider requires reasoning content to be preserved.
PentAGI supports Ollama for local LLM inference, providing zero-cost operation and enhanced privacy:
| Variable | Default | Description |
|---|---|---|
| `OLLAMA_SERVER_URL` | | URL of your Ollama server |
| `OLLAMA_SERVER_MODEL` | `llama3.1:8b-instruct-q8_0` | Default model for inference |
| `OLLAMA_SERVER_CONFIG_PATH` | | Path to custom agent configuration file |
| `OLLAMA_SERVER_PULL_MODELS_TIMEOUT` | `600` | Timeout for model downloads (seconds) |
| `OLLAMA_SERVER_PULL_MODELS_ENABLED` | `false` | Auto-download models on startup |
| `OLLAMA_SERVER_LOAD_MODELS_ENABLED` | `false` | Query server for available models |
Configuration examples:
```bash
# Basic Ollama setup with default model
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0
# Production setup with auto-pull and model discovery
OLLAMA_SERVER_URL=http://ollama-server:11434
OLLAMA_SERVER_PULL_MODELS_ENABLED=true
OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900
OLLAMA_SERVER_LOAD_MODELS_ENABLED=true
# Custom configuration with agent-specific models
OLLAMA_SERVER_CONFIG_PATH=/path/to/ollama-config.yml
# Default configuration file inside docker container
OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml
```

Performance Considerations:
- Model discovery (`OLLAMA_SERVER_LOAD_MODELS_ENABLED=true`): adds 1-2s startup latency querying the Ollama API
- Auto-pull (`OLLAMA_SERVER_PULL_MODELS_ENABLED=true`): the first startup may take several minutes downloading models
- Pull timeout (e.g. `OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900`): 15 minutes, in seconds

PentAGI requires models with larger context windows than the default Ollama configurations. You need to create custom models with an increased num_ctx parameter through Modelfiles. While typical agent workflows consume around 64K tokens, PentAGI uses a 110K context size for safety margin and for handling complex penetration testing scenarios.
Important: The num_ctx parameter can only be set during model creation via Modelfile - it cannot be changed after model creation or overridden at runtime.
Create a Modelfile named Modelfile_qwen3_32b_fp16_tc:
```
FROM qwen3:32b-fp16
PARAMETER num_ctx 110000
PARAMETER temperature 0.3
PARAMETER top_p 0.8
PARAMETER min_p 0.0
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
```

Build the custom model:
```bash
ollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc
```

Create a Modelfile named Modelfile_qwq_32b_fp16_tc:

```
FROM qwq:32b-fp16
PARAMETER num_ctx 110000
PARAMETER temperature 0.2
PARAMETER top_p 0.7
PARAMETER min_p 0.0
PARAMETER top_k 40
PARAMETER repeat_penalty 1.2
```

Build the custom model:
```bash
ollama create qwq:32b-fp16-tc -f Modelfile_qwq_32b_fp16_tc
```

Note: The QwQ 32B FP16 model requires approximately 71.3 GB VRAM for inference. Ensure your system has sufficient GPU memory before attempting to use this model.
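To confirm that the increased context window was baked into a custom model, you can inspect it with `ollama show` (a verification step, not something PentAGI itself requires):

```bash
# The parameters section of the output should list num_ctx 110000
ollama show qwen3:32b-fp16-tc
```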
These custom models are referenced in the pre-built provider configuration files (ollama-qwen332b-fp16-tc.provider.yml and ollama-qwq32b-fp16-tc.provider.yml) that are included in the Docker image at /opt/pentagi/conf/.
PentAGI supports OpenAI's advanced language models, including the latest reasoning-capable o-series models designed for complex analytical tasks:
| Variable | Default | Description |
|---|---|---|
| `OPEN_AI_KEY` | | API key for OpenAI services |
| `OPEN_AI_SERVER_URL` | `https://api.openai.com/v1` | OpenAI API endpoint |
Configuration examples:
```bash
# Basic OpenAI setup
OPEN_AI_KEY=your_openai_api_key
OPEN_AI_SERVER_URL=https://api.openai.com/v1
# Using with proxy for enhanced security
OPEN_AI_KEY=your_openai_api_key
PROXY_URL=http://your-proxy:8080
```

The OpenAI provider offers cutting-edge capabilities, including the reasoning-capable o-series models designed for complex analytical tasks.
The system automatically selects appropriate OpenAI models based on task complexity, optimizing for both performance and cost-effectiveness.
PentAGI integrates with Anthropic's Claude models, known for their exceptional safety, reasoning capabilities, and sophisticated understanding of complex security contexts:
| Variable | Default | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | | API key for Anthropic services |
| `ANTHROPIC_SERVER_URL` | `https://api.anthropic.com/v1` | Anthropic API endpoint |
Configuration examples:
```bash
# Basic Anthropic setup
ANTHROPIC_API_KEY=your_anthropic_api_key
ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1
# Using with proxy for secure environments
ANTHROPIC_API_KEY=your_anthropic_api_key
PROXY_URL=http://your-proxy:8080
```

The Anthropic provider delivers Claude's strong safety, reasoning capabilities, and sophisticated understanding of complex security contexts.
The system leverages Claude's advanced understanding of security contexts to provide thorough and responsible penetration testing guidance.
PentAGI supports Google's Gemini models through the Google AI API, offering state-of-the-art reasoning capabilities and multimodal features:
| Variable | Default | Description |
|---|---|---|
| `GEMINI_API_KEY` | | API key for Google AI services |
| `GEMINI_SERVER_URL` | `https://generativelanguage.googleapis.com` | Google AI API endpoint |
Configuration examples:
```bash
# Basic Gemini setup
GEMINI_API_KEY=your_gemini_api_key
GEMINI_SERVER_URL=https://generativelanguage.googleapis.com
# Using with proxy
GEMINI_API_KEY=your_gemini_api_key
PROXY_URL=http://your-proxy:8080
```

The Gemini provider offers state-of-the-art reasoning capabilities and multimodal features.
The system automatically selects appropriate Gemini models based on agent requirements, balancing performance, capabilities, and cost-effectiveness.
PentAGI integrates with Amazon Bedrock, offering access to a wide range of foundation models from leading AI companies including Anthropic, AI21, Cohere, Meta, and Amazon's own models:
| Variable | Default | Description |
|---|---|---|
| `BEDROCK_REGION` | `us-east-1` | AWS region for the Bedrock service |
| `BEDROCK_ACCESS_KEY_ID` | | AWS access key ID for authentication |
| `BEDROCK_SECRET_ACCESS_KEY` | | AWS secret access key for authentication |
| `BEDROCK_SESSION_TOKEN` | | AWS session token as an alternative way to authenticate |
| `BEDROCK_SERVER_URL` | | Optional custom Bedrock endpoint URL |
Configuration examples:
```bash
# Basic AWS Bedrock setup with credentials
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID=your_aws_access_key
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key
# Using with proxy for enhanced security
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID=your_aws_access_key
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key
PROXY_URL=http://your-proxy:8080
# Using custom endpoint (for VPC endpoints or testing)
BEDROCK_REGION=us-east-1
BEDROCK_ACCESS_KEY_ID=your_aws_access_key
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key
BEDROCK_SERVER_URL=https://bedrock-runtime.us-east-1.amazonaws.com
```

Important
AWS Bedrock Rate Limits Warning
The default PentAGI configuration for AWS Bedrock uses two primary models:
- us.anthropic.claude-sonnet-4-20250514-v1:0 (for most agents) - 2 requests per minute for new AWS accounts
- us.anthropic.claude-3-5-haiku-20241022-v1:0 (for simple tasks) - 20 requests per minute for new AWS accounts

These default rate limits are extremely restrictive for comfortable penetration testing scenarios and will significantly impact your workflow. We strongly recommend requesting increased quotas for these models in your AWS account before running PentAGI against real targets.
Without adequate rate limits, you may experience frequent delays, timeouts, and degraded testing performance.
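One way to review your current limits before a long engagement is the AWS Service Quotas API. This sketch assumes AWS CLI v2 with the same credentials and region configured above; the quota-name filter is only a guess at how the Bedrock quotas are labeled, so adjust it to what your account actually shows:

```bash
# List Amazon Bedrock quotas whose names mention per-minute request limits
aws service-quotas list-service-quotas \
  --service-code bedrock \
  --region us-east-1 \
  --query "Quotas[?contains(QuotaName, 'requests per minute')].[QuotaName,Value]" \
  --output table
```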
The AWS Bedrock provider gives access to foundation models from Anthropic, AI21, Cohere, Meta, and Amazon through a single AWS-managed service.
The system automatically selects appropriate Bedrock models based on task complexity and requirements, leveraging the full spectrum of available foundation models for optimal security testing results.
Warning
Converse API Requirements
PentAGI uses the Amazon Bedrock Converse API for model interactions, which requires models that support specific Converse API features (tool use in particular).
Before selecting models, verify their feature support in the AWS documentation: "Supported models and model features".
Note: AWS credentials can also be provided through IAM roles, environment variables, or AWS credential files following standard AWS SDK authentication patterns. Ensure your AWS account has appropriate permissions for Amazon Bedrock service access.
For advanced configuration options and detailed setup instructions, please visit our documentation.
Langfuse provides advanced capabilities for monitoring and analyzing AI agent operations.
Change the following settings in the `.env` file:

- `LANGFUSE_POSTGRES_USER` and `LANGFUSE_POSTGRES_PASSWORD` - Langfuse PostgreSQL credentials
- `LANGFUSE_CLICKHOUSE_USER` and `LANGFUSE_CLICKHOUSE_PASSWORD` - ClickHouse credentials
- `LANGFUSE_REDIS_AUTH` - Redis password
- `LANGFUSE_SALT` - Salt for hashing in the Langfuse Web UI
- `LANGFUSE_ENCRYPTION_KEY` - Encryption key (32 bytes in hex)
- `LANGFUSE_NEXTAUTH_SECRET` - Secret key for NextAuth
- `LANGFUSE_INIT_USER_EMAIL` - Admin email
- `LANGFUSE_INIT_USER_PASSWORD` - Admin password
- `LANGFUSE_INIT_USER_NAME` - Admin username
- `LANGFUSE_INIT_PROJECT_PUBLIC_KEY` - Project public key (also used on the PentAGI side)
- `LANGFUSE_INIT_PROJECT_SECRET_KEY` - Project secret key (also used on the PentAGI side)
- `LANGFUSE_S3_ACCESS_KEY_ID` - S3 access key ID
- `LANGFUSE_S3_SECRET_ACCESS_KEY` - S3 secret access key

Then set the Langfuse connection settings in the `.env` file:

```bash
LANGFUSE_BASE_URL=http://langfuse-web:3000
LANGFUSE_PROJECT_ID= # default: value from ${LANGFUSE_INIT_PROJECT_ID}
LANGFUSE_PUBLIC_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_PUBLIC_KEY}
LANGFUSE_SECRET_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_SECRET_KEY}
```

Download the Langfuse compose file and start the stack:

```bash
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-langfuse.yml
docker compose -f docker-compose.yml -f docker-compose-langfuse.yml up -d
```

Visit localhost:4000 to access the Langfuse Web UI with credentials from the `.env` file:
- `LANGFUSE_INIT_USER_EMAIL` - Admin email
- `LANGFUSE_INIT_USER_PASSWORD` - Admin password

For detailed system operation tracking, integration with monitoring tools is available.
Set the following in the `.env` file:

```bash
OTEL_HOST=otelcol:8148
```

Download the observability compose file and start the stack:

```bash
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-observability.yml
docker compose -f docker-compose.yml -f docker-compose-observability.yml up -d
```

Visit localhost:3000 to access the Grafana Web UI.
Note
If you want to use the Observability stack with Langfuse, you need to enable the integration in the `.env` file by setting LANGFUSE_OTEL_EXPORTER_OTLP_ENDPOINT to http://otelcol:4318.
To run all available stacks together (Langfuse, Graphiti, and Observability):
```bash
docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d
```

You can also register aliases for these commands in your shell to run them faster:

```bash
alias pentagi="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml"
alias pentagi-up="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d"
alias pentagi-down="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml down"
```

PentAGI integrates with Graphiti, a temporal knowledge graph system powered by Neo4j, to provide advanced semantic understanding and relationship tracking for AI agent operations. The vxcontrol fork provides custom entity and edge types that are specific to pentesting purposes.
Graphiti automatically extracts and stores structured knowledge from agent interactions, building a graph of entities, relationships, and temporal context. This enables richer long-term memory and relationship-aware retrieval for the agents.
The Graphiti knowledge graph is optional and disabled by default. To enable it:
Set the following variables in the `.env` file:

```bash
## Graphiti knowledge graph settings
GRAPHITI_ENABLED=true
GRAPHITI_TIMEOUT=30
GRAPHITI_URL=http://graphiti:8000
GRAPHITI_MODEL_NAME=gpt-5-mini
# Neo4j settings (used by Graphiti stack)
NEO4J_USER=neo4j
NEO4J_DATABASE=neo4j
NEO4J_PASSWORD=devpassword
NEO4J_URI=bolt://neo4j:7687
# OpenAI API key (required by Graphiti for entity extraction)
OPEN_AI_KEY=your_openai_api_key
```

Then start the Graphiti stack alongside PentAGI and check that it is healthy:

```bash
# Download the Graphiti compose file if needed
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-graphiti.yml
# Start PentAGI with Graphiti
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d

# Check service health
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml ps graphiti neo4j
# View Graphiti logs
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml logs -f graphiti
# Access Neo4j Browser (optional)
# Visit http://localhost:7474 and login with NEO4J_USER/NEO4J_PASSWORD
# Access Graphiti API (optional, for debugging)
# Visit http://localhost:8000/docs for Swagger API documentation
```

Note
The Graphiti service is defined in docker-compose-graphiti.yml as a separate stack. You must run both compose files together to enable the knowledge graph functionality. The pre-built Docker image vxcontrol/graphiti:latest is used by default.
When enabled, PentAGI automatically captures entities and relationships from agent interactions and stores them in the knowledge graph.
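If you want to peek at what has been captured, you can query Neo4j directly from the host; the service name `neo4j` and the credentials below are assumptions based on the defaults shown above, not a documented interface:

```bash
# Count stored nodes per label in the knowledge graph
docker compose -f docker-compose.yml -f docker-compose-graphiti.yml exec neo4j \
  cypher-shell -u neo4j -p devpassword \
  "MATCH (n) RETURN labels(n) AS label, count(*) AS nodes ORDER BY nodes DESC;"
```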
OAuth integration with GitHub and Google allows users to authenticate using their existing accounts on these platforms, which simplifies sign-in and account management.
To use GitHub OAuth, create a new OAuth application in your GitHub account and set GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET in the .env file.
To use Google OAuth, create a new OAuth application in your Google account and set GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in the .env file.
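A minimal `.env` sketch for both providers; the client IDs and secrets are placeholders you obtain from the respective developer consoles:

```bash
# GitHub OAuth application credentials
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret

# Google OAuth application credentials
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
```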
PentAGI allows you to configure Docker image selection for executing various tasks. The system automatically chooses the most appropriate image based on the task type, but you can constrain this selection by specifying your preferred images:
| Variable | Default | Description |
|---|---|---|
| `DOCKER_DEFAULT_IMAGE` | `debian:latest` | Default Docker image for general tasks and ambiguous cases |
| `DOCKER_DEFAULT_IMAGE_FOR_PENTEST` | `vxcontrol/kali-linux` | Default Docker image for security/penetration testing tasks |
When these environment variables are set, AI agents will be limited to the image choices you specify. This is particularly useful when you need predictable, pre-approved execution environments.
Configuration examples:
```bash
# Using a custom image for general tasks
DOCKER_DEFAULT_IMAGE=mycompany/custom-debian:latest
# Using a specialized image for penetration testing
DOCKER_DEFAULT_IMAGE_FOR_PENTEST=mycompany/pentest-tools:v2.0
```

Note
If a user explicitly specifies a particular Docker image in their task, the system will try to use that exact image, ignoring these settings. These variables only affect the system's automatic image selection process.
Run `cd backend && go mod download` once to install the needed packages.

To generate swagger files, run:

```bash
swag init -g ../../pkg/server/router.go -o pkg/server/docs/ --parseDependency --parseInternal --parseDepth 2 -d cmd/pentagi
```

Before running it, install the swag package:

```bash
go install github.com/swaggo/swag/cmd/swag@v1.8.7
```

To generate graphql resolver files, run:

```bash
go run github.com/99designs/gqlgen --config ./gqlgen/gqlgen.yml
```

After that you can see the generated files in the pkg/graph folder.

To generate ORM methods (database package) from the sqlc configuration:

```bash
docker run --rm -v $(pwd):/src -w /src --network pentagi-network -e DATABASE_URL="{URL}" sqlc/sqlc generate -f sqlc/sqlc.yml
```

To generate the Langfuse SDK from the OpenAPI specification:

```bash
fern generate --local
```

To install fern-cli:

```bash
npm install -g fern-api
```

To run tests: `cd backend && go test -v ./...`
Run `cd frontend && npm install` once to install the needed packages.

To generate graphql files, run `npm run graphql:generate`, which uses the graphql-codegen.ts file.

Be sure that you have graphql-codegen installed globally:

```bash
npm install -g graphql-codegen
```

After that you can run:

- `npm run prettier` to check if your code is formatted correctly
- `npm run prettier:fix` to fix it
- `npm run lint` to check if your code is linted correctly
- `npm run lint:fix` to fix it

To generate SSL certificates you need to run `npm run ssl:generate`, which uses the generate-ssl.ts file, or they will be generated automatically when you run `npm run dev`.
Edit the configuration for the backend in the .vscode/launch.json file:

- `DATABASE_URL` - PostgreSQL database URL (e.g. postgres://postgres:postgres@localhost:5432/pentagidb?sslmode=disable)
- `DOCKER_HOST` - Docker SDK API (e.g. for macOS DOCKER_HOST=unix:///Users/<my-user>/Library/Containers/com.docker.docker/Data/docker.raw.sock)

Optional:

- `SERVER_PORT` - Port to run the server (default: 8443)
- `SERVER_USE_SSL` - Enable SSL for the server (default: false)

Edit the configuration for the frontend in the .vscode/launch.json file:

- `VITE_API_URL` - Backend API URL. Omit the URL scheme (e.g., localhost:8080, NOT http://localhost:8080)
- `VITE_USE_HTTPS` - Enable SSL for the server (default: false)
- `VITE_PORT` - Port to run the server (default: 8000)
- `VITE_HOST` - Host to run the server (default: 0.0.0.0)

Run the command(s) in the backend folder:

- Load the `.env` file to set environment variables, e.g. `source .env`
- Run `go run cmd/pentagi/main.go` to start the server

Note
The first run can take a while as dependencies and docker images need to be downloaded to setup the backend environment.
Run the command(s) in the frontend folder:

- `npm install` to install the dependencies
- `npm run dev` to run the web app
- `npm run build` to build the web app

Open your browser and visit the web app URL.
PentAGI includes a powerful utility called ctester for testing and validating LLM agent capabilities. This tool helps ensure your LLM provider configurations work correctly with different agent types, allowing you to optimize model selection for each specific agent role.
The utility features parallel testing of multiple agents, detailed reporting, and flexible configuration options.
If you've cloned the repository and have Go installed:
```bash
# Default configuration with .env file
cd backend
go run cmd/ctester/*.go -verbose
# Custom provider configuration
go run cmd/ctester/*.go -config ../examples/configs/openrouter.provider.yml -verbose
# Generate a report file
go run cmd/ctester/*.go -config ../examples/configs/deepinfra.provider.yml -report ../test-report.md
# Test specific agent types only
go run cmd/ctester/*.go -agents simple,simple_json,primary_agent -verbose
# Test specific test groups only
go run cmd/ctester/*.go -groups basic,advanced -verbose
```

If you prefer to use the pre-built Docker image without setting up a development environment:
```bash
# Using Docker to test with default environment
docker run --rm -v $(pwd)/.env:/opt/pentagi/.env vxcontrol/pentagi /opt/pentagi/bin/ctester -verbose
# Test with your custom provider configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
-v $(pwd)/my-config.yml:/opt/pentagi/config.yml \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/config.yml -agents simple,primary_agent,coder -verbose
# Generate a detailed report
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
-v $(pwd):/opt/pentagi/output \
vxcontrol/pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/output/report.md
```

The Docker image comes with built-in support for major providers (OpenAI, Anthropic, Gemini, Ollama) and pre-configured provider files for additional services (OpenRouter, DeepInfra, DeepSeek, Moonshot):
```bash
# Test with OpenRouter configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/openrouter.provider.yml
# Test with DeepInfra configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepinfra.provider.yml
# Test with DeepSeek configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepseek.provider.yml
# Test with Moonshot configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/moonshot.provider.yml
# Test with OpenAI configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -type openai
# Test with Anthropic configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -type anthropic
# Test with Gemini configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -type gemini
# Test with AWS Bedrock configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -type bedrock
# Test with Custom OpenAI configuration
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml
# Test with Ollama configuration (local inference)
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-llama318b.provider.yml
# Test with Ollama Qwen3 32B configuration (requires custom model creation)
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwen332b-fp16-tc.provider.yml
# Test with Ollama QwQ 32B configuration (requires custom model creation and 71.3GB VRAM)
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwq32b-fp16-tc.provider.yml
```

To use these configurations, your `.env` file only needs to contain:

```bash
LLM_SERVER_URL=https://openrouter.ai/api/v1 # or https://api.deepinfra.com/v1/openai or https://api.deepseek.com or https://api.openai.com/v1 or https://api.moonshot.ai/v1
LLM_SERVER_KEY=your_api_key
LLM_SERVER_MODEL= # Leave empty, as models are specified in the config
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/openrouter.provider.yml # or deepinfra.provider.yml or deepseek.provider.yml or custom-openai.provider.yml or moonshot.provider.yml
LLM_SERVER_PROVIDER= # Provider name for LiteLLM proxy (e.g., openrouter, deepseek, moonshot)
LLM_SERVER_LEGACY_REASONING=false # Controls reasoning format, for OpenAI must be true (default: false)
LLM_SERVER_PRESERVE_REASONING=false # Preserve reasoning content in multi-turn conversations (required by Moonshot, default: false)
# For OpenAI (official API)
OPEN_AI_KEY=your_openai_api_key # Your OpenAI API key
OPEN_AI_SERVER_URL=https://api.openai.com/v1 # OpenAI API endpoint
# For Anthropic (Claude models)
ANTHROPIC_API_KEY=your_anthropic_api_key # Your Anthropic API key
ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1 # Anthropic API endpoint
# For Gemini (Google AI)
GEMINI_API_KEY=your_gemini_api_key # Your Google AI API key
GEMINI_SERVER_URL=https://generativelanguage.googleapis.com # Google AI API endpoint
# For AWS Bedrock (enterprise foundation models)
BEDROCK_REGION=us-east-1 # AWS region for Bedrock service
BEDROCK_ACCESS_KEY_ID=your_aws_access_key # AWS access key ID
BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key # AWS secret access key
BEDROCK_SESSION_TOKEN=your_aws_session_token # AWS session token (alternative auth method)
BEDROCK_SERVER_URL= # Optional custom Bedrock endpoint
# For Ollama (local inference)
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0
OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml
OLLAMA_SERVER_PULL_MODELS_ENABLED=false
OLLAMA_SERVER_LOAD_MODELS_ENABLED=false
```
For OpenAI accounts with unverified organizations that don't have access to the latest reasoning models (o1, o3, o4-mini), you need to use a custom configuration.
To use OpenAI with unverified organization accounts, configure your .env file as follows:
```bash
LLM_SERVER_URL=https://api.openai.com/v1
LLM_SERVER_KEY=your_openai_api_key
LLM_SERVER_MODEL= # Leave empty, models are specified in config
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom-openai.provider.yml
LLM_SERVER_LEGACY_REASONING=true # Required for OpenAI reasoning format
```

This configuration uses the pre-built custom-openai.provider.yml file that maps all agent types to models available for unverified organizations, using o3-mini instead of models like o1, o3, and o4-mini.
You can test this configuration using:
```bash
# Test with custom OpenAI configuration for unverified accounts
docker run --rm \
-v $(pwd)/.env:/opt/pentagi/.env \
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml
```

Note
The LLM_SERVER_LEGACY_REASONING=true setting is crucial for OpenAI compatibility as it ensures reasoning parameters are sent in the format expected by OpenAI's API.
When using LiteLLM proxy to access various LLM providers, model names are prefixed with the provider name (e.g., moonshot/kimi-2.5 instead of kimi-2.5). To use the same provider configuration files with both direct API access and LiteLLM proxy, set the LLM_SERVER_PROVIDER variable:
```bash
# Direct access to Moonshot API
LLM_SERVER_URL=https://api.moonshot.ai/v1
LLM_SERVER_KEY=your_moonshot_api_key
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml
LLM_SERVER_PROVIDER= # Empty for direct access
# Access via LiteLLM proxy
LLM_SERVER_URL=http://litellm-proxy:4000
LLM_SERVER_KEY=your_litellm_api_key
LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml
LLM_SERVER_PROVIDER=moonshot # Provider prefix for LiteLLM
```

With LLM_SERVER_PROVIDER=moonshot, the system automatically prefixes all model names from the configuration file with moonshot/, making them compatible with LiteLLM's model naming convention.
Supported provider names for LiteLLM:
- `openai` - for OpenAI models via LiteLLM
- `anthropic` - for Anthropic/Claude models via LiteLLM
- `gemini` - for Google Gemini models via LiteLLM
- `openrouter` - for the OpenRouter aggregator
- `deepseek` - for DeepSeek models
- `deepinfra` - for DeepInfra hosting
- `moonshot` - for Moonshot AI (Kimi)

This approach allows you to reuse the same provider configuration files for both direct API access and LiteLLM proxy access.
If you already have a running PentAGI container and want to test the current configuration:
```bash
# Run ctester in an existing container using current environment variables
docker exec -it pentagi /opt/pentagi/bin/ctester -verbose
# Test specific agent types with deterministic ordering
docker exec -it pentagi /opt/pentagi/bin/ctester -agents simple,primary_agent,pentester -groups basic,knowledge -verbose
# Generate a report file inside the container
docker exec -it pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/data/agent-test-report.md
# Access the report from the host
docker cp pentagi:/opt/pentagi/data/agent-test-report.md ./
```

The utility accepts several options:
- `-env <path>` - Path to environment file (default: .env)
- `-type <provider>` - Provider type: custom, openai, anthropic, ollama, bedrock, gemini (default: custom)
- `-config <path>` - Path to custom provider config (default: from the LLM_SERVER_CONFIG_PATH env variable)
- `-tests <path>` - Path to custom tests YAML file (optional)
- `-report <path>` - Path to write the report file (optional)
- `-agents <list>` - Comma-separated list of agent types to test (default: all)
- `-groups <list>` - Comma-separated list of test groups to run (default: all)
- `-verbose` - Enable verbose output with detailed test results for each agent

Agents are tested in a fixed, deterministic order.

Note: The `json` test group is specifically designed for the `simple_json` agent type, while all other agents are tested with the `basic`, `advanced`, and `knowledge` groups. This specialization ensures optimal testing coverage for each agent's intended purpose.
Provider configuration defines which models to use for different agent types:
```yaml
simple:
model: "provider/model-name"
temperature: 0.7
top_p: 0.95
n: 1
max_tokens: 4000
simple_json:
model: "provider/model-name"
temperature: 0.7
top_p: 1.0
n: 1
max_tokens: 4000
json: true
# ... other agent types ...
```

This tool helps ensure your AI agents are using the most effective models for their specific tasks, improving reliability while optimizing costs.
PentAGI uses vector embeddings for semantic search, knowledge storage, and memory management. The system supports multiple embedding providers that can be configured according to your needs and preferences.
PentAGI supports the following embedding providers: OpenAI, Ollama, Mistral, Jina, Hugging Face, Google AI, and VoyageAI.
To configure the embedding provider, set the following environment variables in your .env file:
```bash
# Primary embedding configuration
EMBEDDING_PROVIDER=openai # Provider type (openai, ollama, mistral, jina, huggingface, googleai, voyageai)
EMBEDDING_MODEL=text-embedding-3-small # Model name to use
EMBEDDING_URL= # Optional custom API endpoint
EMBEDDING_KEY= # API key for the provider (if required)
EMBEDDING_BATCH_SIZE=100 # Number of documents to process in a batch
EMBEDDING_STRIP_NEW_LINES=true # Whether to remove new lines from text before embedding
# Advanced settings
PROXY_URL= # Optional proxy for all API calls
# SSL/TLS Certificate Configuration (for external communication with LLM backends and tool servers)
EXTERNAL_SSL_CA_PATH= # Path to custom CA certificate file (PEM format) inside the container
# Must point to /opt/pentagi/ssl/ directory (e.g., /opt/pentagi/ssl/ca-bundle.pem)
EXTERNAL_SSL_INSECURE=false # Skip certificate verification (use only for testing)
```

If you see this error: tls: failed to verify certificate: x509: certificate signed by unknown authority
Step 1: Get your CA certificate bundle in PEM format (can contain multiple certificates)
Step 2: Place the file in the SSL directory on your host machine:
```bash
# Default location (if PENTAGI_SSL_DIR is not set)
cp ca-bundle.pem ./pentagi-ssl/
# Or custom location (if using PENTAGI_SSL_DIR in docker-compose.yml)
cp ca-bundle.pem /path/to/your/ssl/dir/
```

Step 3: Set the path in the .env file (the path must be inside the container):
```bash
# The volume pentagi-ssl is mounted to /opt/pentagi/ssl inside the container
EXTERNAL_SSL_CA_PATH=/opt/pentagi/ssl/ca-bundle.pem
EXTERNAL_SSL_INSECURE=false
```

Step 4: Restart PentAGI:
```bash
docker compose restart pentagi
```

Notes:
- The pentagi-ssl volume is mounted to /opt/pentagi/ssl inside the container
- The host directory can be changed via the PENTAGI_SSL_DIR variable in docker-compose.yml
- Use EXTERNAL_SSL_INSECURE=true only for testing (not recommended for production)

Each provider has specific limitations and supported features:
- Ollama does not require an EMBEDDING_KEY as it uses local models
- Hosted providers require an EMBEDDING_KEY; support for a custom EMBEDDING_MODEL, EMBEDDING_URL, or HTTP client varies by provider

If EMBEDDING_URL and EMBEDDING_KEY are not specified, the system will attempt to use the corresponding LLM provider settings (e.g., OPEN_AI_KEY when EMBEDDING_PROVIDER=openai).
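For example, relying on the fallback just described, an OpenAI-backed embedding setup can be as small as this (the key is reused from OPEN_AI_KEY):

```bash
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
# EMBEDDING_KEY is left unset, so the provider falls back to OPEN_AI_KEY
OPEN_AI_KEY=your_openai_api_key
```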
It's crucial to use the same embedding provider and model consistently, because vectors produced by different providers or models live in different embedding spaces and cannot be compared with each other.
If you change your embedding provider, you should flush and reindex your entire knowledge base (see etester utility below).
PentAGI includes a specialized etester utility for testing, managing, and debugging embedding functionality. This tool is essential for diagnosing and resolving issues related to vector embeddings and knowledge storage.
```bash
# Test embedding provider and database connection
cd backend
go run cmd/etester/main.go test -verbose
# Show statistics about the embedding database
go run cmd/etester/main.go info
# Delete all documents from the embedding database (use with caution!)
go run cmd/etester/main.go flush
# Recalculate embeddings for all documents (after changing provider)
go run cmd/etester/main.go reindex
# Search for documents in the embedding database
go run cmd/etester/main.go search -query "How to install PostgreSQL" -limit 5
```

If you're running PentAGI in Docker, you can use etester from within the container:
```bash
# Test embedding provider
docker exec -it pentagi /opt/pentagi/bin/etester test
# Show detailed database information
docker exec -it pentagi /opt/pentagi/bin/etester info -verbose
```

The search command supports various filters to narrow down results:
```bash
# Filter by document type
docker exec -it pentagi /opt/pentagi/bin/etester search -query "Security vulnerability" -doc_type guide -threshold 0.8
# Filter by flow ID
docker exec -it pentagi /opt/pentagi/bin/etester search -query "Code examples" -doc_type code -flow_id 42
# All available search options
docker exec -it pentagi /opt/pentagi/bin/etester search -help
```

Available search parameters:
- `-query STRING`: Search query text (required)
- `-doc_type STRING`: Filter by document type (answer, memory, guide, code)
- `-flow_id NUMBER`: Filter by flow ID (positive number)
- `-answer_type STRING`: Filter by answer type (guide, vulnerability, code, tool, other)
- `-guide_type STRING`: Filter by guide type (install, configure, use, pentest, development, other)
- `-limit NUMBER`: Maximum number of results (default: 3)
- `-threshold NUMBER`: Similarity threshold (0.0-1.0, default: 0.7)

After changing the embedding provider or model, run flush or reindex to ensure consistency.

PentAGI includes a versatile utility called ftester for debugging, testing, and developing specific functions and AI agent behaviors. While ctester focuses on testing LLM model capabilities, ftester allows you to directly invoke individual system functions and AI agent components with precise control over execution context.
Run ftester with specific function and arguments directly from the command line:
```bash
# Basic usage with mock mode
cd backend
go run cmd/ftester/main.go [function_name] -[arg1] [value1] -[arg2] [value2]
# Example: Test terminal command in mock mode
go run cmd/ftester/main.go terminal -command "ls -la" -message "List files"
# Using a real flow context
go run cmd/ftester/main.go -flow 123 terminal -command "whoami" -message "Check user"
# Testing AI agent in specific task/subtask context
go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message "Find vulnerabilities"
```

Run ftester with just a function name, without its arguments, for a guided interactive experience:
```bash
# Start interactive mode
go run cmd/ftester/main.go [function_name]
# For example, to interactively fill browser tool arguments
go run cmd/ftester/main.go browser
```

The describe function provides detailed information about tasks and subtasks within a flow. This is particularly useful for diagnosing issues when PentAGI encounters problems or gets stuck.
```bash
# List all flows in the system
go run cmd/ftester/main.go describe
# Show all tasks and subtasks for a specific flow
go run cmd/ftester/main.go -flow 123 describe
# Show detailed information for a specific task
go run cmd/ftester/main.go -flow 123 -task 456 describe
# Show detailed information for a specific subtask
go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 describe
# Show verbose output with full descriptions and results
go run cmd/ftester/main.go -flow 123 describe -verbose
```

This function allows you to identify the exact point where a flow might be stuck and resume processing by directly invoking the appropriate agent function.
Each function has a help mode that shows available parameters:
```bash
# Get help for a specific function
go run cmd/ftester/main.go [function_name] -help
# Examples:
go run cmd/ftester/main.go terminal -help
go run cmd/ftester/main.go browser -help
go run cmd/ftester/main.go describe -help
```

You can also run ftester without arguments to see a list of all available functions:
```bash
go run cmd/ftester/main.go
```

The ftester utility uses color-coded output to make interpretation easier. JSON and Markdown responses are automatically formatted for readability.
When PentAGI gets stuck in a flow, use describe to identify the current task and subtask, then invoke the appropriate agent function directly to resume processing.

Verify that API keys and external services are configured correctly:
```bash
# Test Google search API configuration
go run cmd/ftester/main.go google -query "pentesting tools"
# Test browser access to external websites
go run cmd/ftester/main.go browser -url "https://example.com"
```

ftester is also useful when developing new prompt templates or agent behaviors, since you can invoke a single agent or function with full control over its execution context.
Ensure containers are properly configured:
```bash
go run cmd/ftester/main.go -flow 123 terminal -command "env | grep -i proxy" -message "Check proxy settings"
```

If you have PentAGI running in Docker, you can use ftester from within the container:
```bash
# Run ftester inside the running PentAGI container
docker exec -it pentagi /opt/pentagi/bin/ftester [arguments]
# Examples:
docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 describe
docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 terminal -command "ps aux" -message "List processes"
```

This is particularly useful for production deployments where you don't have a local development environment.
All function calls made through ftester are logged to the same observability backends as the main application. To access detailed logs, use:

- Langfuse Web UI (http://localhost:4000)
- Grafana Web UI (http://localhost:3000)

The main utility accepts several options:
- `-env <path>` - Path to environment file (optional, default: .env)
- `-provider <type>` - Provider type to use (default: custom, options: openai, anthropic, ollama, bedrock, gemini, custom)
- `-flow <id>` - Flow ID for testing (0 means using mocks, default: 0)
- `-task <id>` - Task ID for agent context (optional)
- `-subtask <id>` - Subtask ID for agent context (optional)

Function-specific arguments are passed after the function name using the `-name value` format.
```bash
docker build -t local/pentagi:latest .
```

Note
You can use docker buildx to build the image for different platforms, e.g. docker buildx build --platform linux/amd64 -t local/pentagi:latest .
You need to change the image name in the docker-compose.yml file to local/pentagi:latest and run docker compose up -d to start the server, or use the build option in the docker-compose.yml file.
This project is made possible thanks to prior research and developments in the field.
PentAGI Core: Licensed under MIT License
Copyright (c) 2025 PentAGI Development Team
VXControl Cloud SDK Integration: This repository integrates VXControl Cloud SDK under a special licensing exception that applies ONLY to the official PentAGI project.
(https://github.com/vxcontrol/pentagi)

If you fork this project or create derivative works, the VXControl SDK components are subject to AGPL-3.0 license terms: you must either comply with the AGPL-3.0 or obtain a separate commercial license.
For commercial use of VXControl Cloud SDK in proprietary applications, contact the VXControl team.