steipete / summarize
- Wednesday, February 18, 2026, 00:00:02
Point at any URL/YouTube/Podcast or file. Get the gist. CLI and Chrome Extension.
Fast summaries from URLs, files, and media. Works in the terminal, a Chrome Side Panel and Firefox Sidebar.
0.11.0 preview (unreleased): this README reflects the upcoming release.
Short inputs may be returned as-is without calling the LLM (use --force-summary to override).

One-click summarizer for the current tab. Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.
Chrome Web Store: Summarize Side Panel
YouTube slide screenshots (from the browser):
Install:

npm i -g @steipete/summarize
brew install steipete/tap/summarize
summarize daemon install --token <TOKEN>

Why a daemon/service?
The daemon listens on 127.0.0.1 for fast streaming and local media tools (yt-dlp, ffmpeg, OCR, transcription). If you only want the CLI, you can skip the daemon install entirely.
Notes:
The free model preset can be regenerated via summarize refresh-free (needs OPENROUTER_API_KEY). Add --set-default to set model=free.

More:
yt-dlp + ffmpeg for extraction; tesseract for OCR. Missing tools show an in-panel notice.

Build the extension:

pnpm -C apps/chrome-extension build
chrome://extensions → Developer mode → Load unpacked → apps/chrome-extension/.output/chrome-mv3

pnpm -C apps/chrome-extension build:firefox
about:debugging#/runtime/this-firefox → Load Temporary Add-on → apps/chrome-extension/.output/firefox-mv3/manifest.json

pnpm summarize daemon install --token <TOKEN> --dev

Requires Node 22+.
npx -y @steipete/summarize "https://example.com"

npm i -g @steipete/summarize

npm i @steipete/summarize-core
import { createLinkPreviewClient } from "@steipete/summarize-core/content";

brew install steipete/tap/summarize

Apple Silicon only (arm64).
CLI only: summarize ... (no daemon needed). Extension: summarize daemon install --token <TOKEN> so the Side Panel can stream results and use local tools.

summarize "https://example.com"

URLs or local paths:
summarize "/path/to/file.pdf" --model google/gemini-3-flash-preview
summarize "https://example.com/report.pdf" --model google/gemini-3-flash-preview
summarize "/path/to/audio.mp3"
summarize "/path/to/video.mp4"

Stdin (pipe content using -):
echo "content" | summarize -
pbpaste | summarize -
# binary stdin also works (PDF/image/audio/video bytes)
cat /path/to/file.pdf | summarize -

Notes:
The - argument tells summarize to read from standard input.

YouTube (supports youtube.com and youtu.be):
summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto

Podcast RSS (transcribes latest enclosure):
summarize "https://feeds.npr.org/500005/podcast.xml"

Apple Podcasts episode page:
summarize "https://podcasts.apple.com/us/podcast/2424-jelly-roll/id360084272?i=1000740717432"

Spotify episode page (best-effort; may fail for exclusives):
summarize "https://open.spotify.com/episode/5auotqWAXhhKyb9ymCuBJY"

--length controls how much output we ask for (guideline), not a hard cap.
summarize "https://example.com" --length long
summarize "https://example.com" --length 20k

Presets: short|medium|long|xl|xxl. Numeric values: 1500, 20k, 20000.

--max-output-tokens <count> (e.g. 2000, 2k)
Prefer --length unless you need a hard cap. Use --force-summary to always run the LLM. --length numeric values must be >= 50 chars; --max-output-tokens must be >= 16. Length presets are defined in packages/core/src/prompts/summary-lengths.ts.
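As an illustration of the accepted numeric forms, the "k" suffix multiplies by 1000 (the CLI parses this internally; the helper below is only a sketch, not summarize code):

```shell
# Sketch: expand a --length value like "20k" to a character count.
expand_length() {
  case "$1" in
    *k) echo $(( ${1%k} * 1000 )) ;;  # strip trailing "k", multiply
    *)  echo "$1" ;;                  # plain numbers pass through
  esac
}

expand_length 20k    # prints 20000
expand_length 1500   # prints 1500
```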
Best effort and provider-dependent. These usually work well:
- text/* and common structured text (.txt, .md, .json, .yaml, .xml, ...)
- application/pdf (provider support varies; Google is the most reliable here)
- image/jpeg, image/png, image/webp, image/gif
- audio/*, video/* (local MP3/WAV/M4A/OGG/FLAC/MP4/MOV/WEBM files are automatically transcribed, when supported by the model)

Notes:
Use gateway-style ids: <provider>/<model>.
Examples:
- openai/gpt-5-mini
- anthropic/claude-sonnet-4-5
- xai/grok-4-fast-non-reasoning
- google/gemini-3-flash-preview
- zai/glm-4.7
- openrouter/openai/gpt-5-mini (force OpenRouter)

Note: some models/providers do not support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).
summarize <input> [flags]

Use summarize --help or summarize help for the full help text.
- --model <provider/model>: which model to use (defaults to auto)
- --model auto: automatic model selection + fallback (default)
- --model <name>: use a config-defined model (see Configuration)
- --timeout <duration>: 30s, 2m, 5000ms (default 2m)
- --retries <count>: LLM retry attempts on timeout (default 1)
- --length short|medium|long|xl|xxl|s|m|l|<chars>
- --language, --lang <language>: output language (auto = match source)
- --max-output-tokens <count>: hard cap for LLM output tokens
- --cli [provider]: use a CLI provider (--model cli/<provider>). Supports claude, gemini, codex, agent. If omitted, uses auto selection with CLI enabled.
- --stream auto|on|off: stream LLM output (auto = TTY only; disabled in --json mode)
- --plain: keep raw output (no ANSI/OSC Markdown rendering)
- --no-color: disable ANSI colors
- --theme <name>: CLI theme (aurora, ember, moss, mono)
- --format md|text: website/file content format (default text)
- --markdown-mode off|auto|llm|readability: HTML -> Markdown mode (default readability)
- --preprocess off|auto|always: controls uvx markitdown usage (default auto)
- uvx: brew install uv (or https://astral.sh/uv/)
- --extract: print extracted content and exit (URLs only; stdin - is not supported)
- --extract-only
- --slides: extract slides for YouTube/direct video URLs and render them inline in the summary narrative (auto-renders inline in supported terminals)
- --slides-ocr: run OCR on extracted slides (requires tesseract)
- --slides-dir <dir>: base output dir for slide images (default ./slides)
- --slides-scene-threshold <value>: scene detection threshold (0.1-1.0)
- --slides-max <count>: maximum slides to extract (default 6)
- --slides-min-duration <seconds>: minimum seconds between slides
- --json: machine-readable output with diagnostics, prompt, metrics, and optional summary
- --verbose: debug/diagnostics on stderr
- --metrics off|on|detailed: metrics output (default on)

Summarize can use common coding CLIs as local model backends:
- codex -> --cli codex / --model cli/codex/<model>
- claude -> --cli claude / --model cli/claude/<model>
- gemini -> --cli gemini / --model cli/gemini/<model>
- agent (Cursor Agent CLI) -> --cli agent / --model cli/agent/<model>

Requirements:
- The CLI binaries must be on PATH (or set CODEX_PATH, CLAUDE_PATH, GEMINI_PATH, AGENT_PATH)
- Each CLI must be authenticated (codex login, claude auth, the gemini login flow, agent login or CURSOR_API_KEY)

Quick smoke test:
printf "Summarize CLI smoke input.\nOne short paragraph. Reply can be brief.\n" >/tmp/summarize-cli-smoke.txt
summarize --cli codex --plain --timeout 2m /tmp/summarize-cli-smoke.txt
summarize --cli claude --plain --timeout 2m /tmp/summarize-cli-smoke.txt
summarize --cli gemini --plain --timeout 2m /tmp/summarize-cli-smoke.txt
summarize --cli agent --plain --timeout 2m /tmp/summarize-cli-smoke.txt

Set explicit CLI allowlist/order:
{
"cli": { "enabled": ["codex", "claude", "gemini", "agent"] }
}

Configure implicit auto CLI fallback:
{
"cli": {
"autoFallback": {
"enabled": true,
"onlyWhenNoApiKeys": true,
"order": ["claude", "gemini", "codex", "agent"]
}
}
}

More details: docs/cli.md
--model auto builds candidate attempts from built-in rules (or your model.rules overrides).
CLI attempts are prepended when:
- cli.enabled is set (explicit allowlist/order), or
- cli.autoFallback is enabled.

Default fallback behavior: only when no API keys are configured; order claude, gemini, codex, agent; the last successful provider is remembered and prioritized (~/.summarize/cli-state.json).
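The "first available CLI in order" part of that behavior can be sketched in shell (illustrative only, not the actual implementation — the real logic also checks API keys and the remembered state file):

```shell
# Sketch: return the first candidate command that exists on PATH.
first_on_path() {
  for c in "$@"; do
    if command -v "$c" >/dev/null 2>&1; then
      echo "$c"
      return 0
    fi
  done
  return 1  # none of the candidates are installed
}

# Example with the default fallback order:
first_on_path claude gemini codex agent
```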
Set explicit CLI attempts:
{
"cli": { "enabled": ["gemini"] }
}

Disable implicit auto CLI fallback:
{
"cli": { "autoFallback": { "enabled": false } }
}

Note: explicit --model auto does not trigger implicit auto CLI fallback unless cli.enabled is set.
Non-YouTube URLs go through a fetch -> extract pipeline. When direct fetch/extraction is blocked or too thin,
--firecrawl auto can fall back to Firecrawl (if configured).
- --firecrawl off|auto|always (default auto)
- --extract --format md|text (default text; if --format is omitted, --extract defaults to md for non-YouTube URLs)
- --markdown-mode off|auto|llm|readability (default readability)
- auto: use an LLM converter when configured; may fall back to uvx markitdown
- llm: force LLM conversion (requires a configured model key)
- off: disable LLM conversion (still may return Firecrawl Markdown when configured)
- --format text.

--youtube auto tries best-effort web transcript endpoints first. When captions are not available, it falls back to:
- Apify (when APIFY_API_TOKEN is set): uses a scraping actor (faVsWy9VTSNVIhWpR)
- yt-dlp (when yt-dlp is available): downloads audio, then transcribes with local whisper.cpp when installed (preferred), otherwise falls back to OpenAI (OPENAI_API_KEY) or FAL (FAL_KEY)

Environment variables for yt-dlp mode:
- YT_DLP_PATH - optional path to yt-dlp binary (otherwise yt-dlp is resolved via PATH)
- SUMMARIZE_WHISPER_CPP_MODEL_PATH - optional override for the local whisper.cpp model file
- SUMMARIZE_WHISPER_CPP_BINARY - optional override for the local binary (default: whisper-cli)
- SUMMARIZE_DISABLE_LOCAL_WHISPER_CPP=1 - disable local whisper.cpp (force remote)
- OPENAI_API_KEY - OpenAI Whisper transcription
- OPENAI_WHISPER_BASE_URL - optional OpenAI-compatible Whisper endpoint override
- FAL_KEY - FAL AI Whisper fallback

Apify costs money but tends to be more reliable when captions exist.
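As a sketch, a shell profile wiring up the yt-dlp path might look like this (all paths below are placeholders, not defaults — adjust to where your tools and models actually live):

```shell
# Placeholder paths: point summarize at a specific yt-dlp binary
# and a local whisper.cpp model file.
export YT_DLP_PATH="$HOME/bin/yt-dlp"
export SUMMARIZE_WHISPER_CPP_MODEL_PATH="$HOME/models/ggml-base.en.bin"

# Uncomment to skip local whisper.cpp and force remote transcription:
# export SUMMARIZE_DISABLE_LOCAL_WHISPER_CPP=1
```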
Extract slide screenshots (scene detection via ffmpeg) and optional OCR:
summarize "https://www.youtube.com/watch?v=..." --slides
summarize "https://www.youtube.com/watch?v=..." --slides --slides-ocr

Outputs are written under ./slides/<sourceId>/ (or --slides-dir). OCR results are included in JSON output
(--json) and stored in slides.json inside the slide directory. When scene detection is too sparse, the
extractor also samples at a fixed interval to improve coverage.
When using --slides, supported terminals (kitty/iTerm/Konsole) render inline thumbnails automatically inside the
summary narrative (the model inserts [slide:N] markers). Timestamp links are clickable when the terminal supports
OSC-8 (YouTube/Vimeo/Loom/Dropbox). If inline images are unsupported, Summarize prints a note with the on-disk
slide directory.
Use --slides --extract to print the full timed transcript and insert slide images inline at matching timestamps.
Format the extracted transcript as Markdown (headings + paragraphs) via an LLM:
summarize "https://www.youtube.com/watch?v=..." --extract --format md --markdown-mode llm

Local audio/video files are transcribed first, then summarized. --video-mode transcript forces
direct media URLs (and embedded media) through Whisper first. Prefers local whisper.cpp when available; otherwise requires
OPENAI_API_KEY or FAL_KEY.
Summarize can use NVIDIA Parakeet/Canary ONNX models via a local CLI you provide. Auto selection (default) prefers ONNX when configured.
- Setup helper: summarize transcriber setup
- Install sherpa-onnx from upstream binaries/build (Homebrew may not have a formula)
- Set SUMMARIZE_ONNX_PARAKEET_CMD or SUMMARIZE_ONNX_CANARY_CMD (no flag needed)
- Select explicitly with --transcriber parakeet|canary|whisper|auto
- More details: docs/nvidia-onnx-transcription.md

Run: summarize <url>
Transcription: prefers local whisper.cpp when installed; otherwise uses OpenAI Whisper or FAL when keys are set.
--language/--lang controls the output language of the summary (and other LLM-generated text). Default is auto.
When the input is audio/video, the CLI needs a transcript first. The transcript comes from one of these paths:
- YouTube: youtubei / captionTracks when available.
- Podcast feeds: <podcast:transcript> (JSON/VTT) when the feed publishes it.
- Local whisper.cpp when installed + model available.
- Remote: OpenAI (OPENAI_API_KEY) or FAL (FAL_KEY).

For direct media URLs, use --video-mode transcript to force transcribe -> summarize:
summarize https://example.com/file.mp4 --video-mode transcript --lang en

Single config location:
~/.summarize/config.json

Supported keys today:
{
"model": { "id": "openai/gpt-5-mini" },
"env": { "OPENAI_API_KEY": "sk-..." },
"ui": { "theme": "ember" }
}

Shorthand (equivalent):
{
"model": "openai/gpt-5-mini"
}

Also supported:
- model: { "mode": "auto" } (automatic model selection + fallback; see docs/model-auto.md)
- model.rules (customize candidates / ordering)
- models (define presets selectable via --model <preset>)
- env (generic env var defaults; process env still wins)
- apiKeys (legacy shortcut, mapped to env names; prefer env for new configs)
- cache.media (media download cache: TTL 7 days, 2048 MB cap by default; --no-media-cache disables)
- media.videoMode: "auto"|"transcript"|"understand"
- slides.enabled / slides.max / slides.ocr / slides.dir (defaults for --slides)
- ui.theme: "aurora"|"ember"|"moss"|"mono"
- openai.useChatCompletions: true (force OpenAI-compatible chat completions)

Note: the config is parsed leniently (JSON5), but comments are not allowed. Unknown keys are ignored.
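A sketch combining several of these keys in one config (key shapes are inferred from the defaults listed above — verify against your installed version before relying on it):

```json
{
  "model": { "mode": "auto" },
  "env": { "OPENROUTER_API_KEY": "sk-or-..." },
  "media": { "videoMode": "auto" },
  "slides": { "enabled": true, "max": 6, "ocr": false, "dir": "./slides" },
  "ui": { "theme": "moss" },
  "cache": { "media": { "enabled": true, "ttlDays": 7, "maxMb": 2048 } }
}
```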
Media cache defaults:
{
"cache": {
"media": { "enabled": true, "ttlDays": 7, "maxMb": 2048, "verify": "size" }
}
}

Note: --no-cache bypasses summary caching only (LLM output). Extract/transcript caches still apply. Use --no-media-cache to skip media files.
Precedence:
1. --model
2. SUMMARIZE_MODEL
3. ~/.summarize/config.json
4. auto (default)

Theme precedence:
1. --theme
2. SUMMARIZE_THEME
3. ~/.summarize/config.json (ui.theme)
4. aurora (default)

Environment variable precedence:
1. Process environment
2. ~/.summarize/config.json (env)
3. ~/.summarize/config.json (apiKeys, legacy)

Set the key matching your chosen --model:
Optional fallback defaults can be stored in config:
- ~/.summarize/config.json -> "env": { "OPENAI_API_KEY": "sk-..." }
- "apiKeys" still works (mapped to env names)

OPENAI_API_KEY (for openai/...)
NVIDIA_API_KEY (for nvidia/...)
ANTHROPIC_API_KEY (for anthropic/...)
XAI_API_KEY (for xai/...)
Z_AI_API_KEY (for zai/...; supports ZAI_API_KEY alias)
GEMINI_API_KEY (for google/...)
GOOGLE_GENERATIVE_AI_API_KEY and GOOGLE_API_KEY as aliases

OpenAI-compatible chat completions toggle:
OPENAI_USE_CHAT_COMPLETIONS=1 (or set openai.useChatCompletions in config)

UI theme:
- SUMMARIZE_THEME=aurora|ember|moss|mono
- SUMMARIZE_TRUECOLOR=1 (force 24-bit ANSI)
- SUMMARIZE_NO_TRUECOLOR=1 (disable 24-bit ANSI)

OpenRouter (OpenAI-compatible):
- OPENROUTER_API_KEY=...
- --model openrouter/<author>/<slug>
- --model free (uses a default set of OpenRouter :free models)

Quick start: make free the default (keep auto available)
summarize refresh-free --set-default
summarize "https://example.com"
summarize "https://example.com" --model auto

Regenerates the free preset (models.free in ~/.summarize/config.json) by:
- fetching /models and filtering :free entries
- preferring smart (context_length / output cap) and fast models

If --model free stops working, run:
summarize refresh-free

Flags:
- --runs 2 (default): extra timing runs per selected model (total runs = 1 + runs)
- --smart 3 (default): how many smart-first picks (rest filled by fastest)
- --min-params 27b (default): ignore models with inferred size smaller than N billion parameters
- --max-age-days 180 (default): ignore models older than N days (set 0 to disable)
- --set-default: also sets "model": "free" in ~/.summarize/config.json

Example:
OPENROUTER_API_KEY=sk-or-... summarize "https://example.com" --model openrouter/meta-llama/llama-3.1-8b-instruct:free
OPENROUTER_API_KEY=sk-or-... summarize "https://example.com" --model openrouter/minimax/minimax-m2.5

If your OpenRouter account enforces an allowed-provider list, make sure at least one provider
is allowed for the selected model. When routing fails, summarize prints the exact providers to allow.
Legacy: OPENAI_BASE_URL=https://openrouter.ai/api/v1 (and either OPENAI_API_KEY or OPENROUTER_API_KEY) also works.
NVIDIA API Catalog (OpenAI-compatible; free credits):
- NVIDIA_API_KEY=...
- NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1
- Browse /v1/models for available models (examples: fast stepfun-ai/step-3.5-flash, strong but slower z-ai/glm5)

export NVIDIA_API_KEY="nvapi-..."
summarize "https://example.com" --model nvidia/stepfun-ai/step-3.5-flash

Z.AI (OpenAI-compatible):
- Z_AI_API_KEY=... (or ZAI_API_KEY=...)
- Z_AI_BASE_URL=...

Optional services:
- FIRECRAWL_API_KEY (website extraction fallback)
- YT_DLP_PATH (path to yt-dlp binary for audio extraction)
- FAL_KEY (FAL AI API key for audio transcription via Whisper)
- APIFY_API_TOKEN (YouTube transcript fallback)

The CLI uses the LiteLLM model catalog for model limits (like max output tokens):
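A shell sketch wiring up the optional services (all values are placeholders — substitute your real keys and paths):

```shell
# Placeholder values for the optional integrations above.
export FIRECRAWL_API_KEY="fc-..."
export YT_DLP_PATH="/opt/homebrew/bin/yt-dlp"
export FAL_KEY="..."
export APIFY_API_TOKEN="apify_api_..."
```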
https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

The catalog is cached under ~/.summarize/cache/.

Recommended (minimal deps):
- @steipete/summarize-core/content
- @steipete/summarize-core/prompts

Compatibility (pulls in CLI deps):
- @steipete/summarize/content
- @steipete/summarize/prompts

pnpm install
pnpm check

Daemon status: summarize daemon status
Daemon logs: ~/.summarize/logs/daemon.err.log

License: MIT