The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025. Dolphin (Document Image Parsing via Heterogeneous Anchor Prompting) is a novel multimodal document image parsing model following an analyze-then-parse paradigm. This repository contains the demo code and pre-trained models for Dolphin. 📑 Overview D…
Stay on top of trending topics on social media and the web with AI. Trend Finder 🔦: Stay on top of trending topics on social media, all in one place. Trend Finder collects and analyzes posts from key influencers, then sends a Slack or Discord notification when it detects new trends or product launches. This has been a complete game-changer for the Firecrawl marketing team by: saving time normally spent manually searching social channels; keeping you informed of relevant, real-time conversations; E…
"RAG-Anything: All-in-One RAG Framework" 🚀 RAG-Anything: All-in-One RAG Framework 🎉 News [2025.08.12]🎯📢 🔍 RAG-Anything now features VLM-Enhanced Query mode! When documents include images, the system seamlessly integrates them into VLM for advanced multimodal analysis, combining visual and textual context for deeper insights. [2025.07.05]🎯…
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge. Open Source AI Platform: Onyx is a feature-rich, self-hostable Chat UI that works with any LLM. It is easy to deploy and can run in a completely airgapped environment. Onyx comes loaded with advanced features like Agents, Web Search, RAG, MCP, Deep Research, Connectors to 40+ knowledge sources, and more. Tip: Run Onyx with one…
JavaScript/TypeScript-native, low-boilerplate, object-capability RPC system. Cap'n Web: A JavaScript-native RPC system. Cap'n Web is a spiritual sibling to Cap'n Proto (and is created by the same author), but designed to play nice in the web stack. That means: Like Cap'n Proto, it is an object-capability protocol. ("Cap'n" is short for "capabilities and".) We'll get into this more below, but it's incredibly powerful. Unlike Cap'n Proto, Cap'n We…
A payments protocol for the internet. Built on HTTP. x402 payments protocol: "1 line of code to accept digital dollars. No fee, 2 second settlement, $0.001 minimum payment."

    app.use(
      // How much you want to charge, and where you want the funds to land
      paymentMiddleware("0xYourAddress", { "/your-endpoint": "$0.01" })
    );
    // That's it! See examples/typescript/servers/express.ts for a complete example.

Instructions below for running on base-sepolia. Philosophy…
Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed. What to expect in v2. Early-access notice: Memvid v1 is still experimental. The file format and API may change until we lock in a stable release. Memvid v2 – what's next: Living-Memory Engine – keep adding new data and let LLMs remember it across sessions; Capsule Context – shareable .mv2 capsules, each with its own rules and expiry; Time-Travel Debugging – rewind…
An ASGI web server, for Python. 🦄 Documentation: https://uvicorn.dev Source Code: https://www.github.com/Kludex/uvicorn Uvicorn is an ASGI web server implementation for Python. Until recently Python has lacked a minimal low-level server/application interface for async frameworks. The ASGI specification fills this gap, and means we're now able to start building a common set of tooling usable across all async frameworks. Uvicorn supports HTTP/1.1 and…
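To make the server/application interface mentioned above concrete, here is a minimal sketch of an ASGI application: a single async callable that takes `scope`, `receive`, and `send`. The `call_app` helper and the hand-built `scope` dictionary are hypothetical, included only so the example can be exercised without a running server; they are not part of Uvicorn.

```python
import asyncio


async def app(scope, receive, send):
    """A minimal ASGI application that answers every HTTP request with plain text."""
    assert scope["type"] == "http"
    body = b"Hello, ASGI!"
    # First message: status line and headers.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    # Second message: the response body.
    await send({"type": "http.response.body", "body": body})


async def call_app(path="/"):
    """Hypothetical helper: drive the app in-process and collect what it sends."""
    sent = []

    async def receive():
        # An empty request body, as a server would deliver for a bare GET.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    # Hand-built scope for illustration; a real server fills this in.
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "headers": [],
    }
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_app()):
        print(message["type"])
```

Assuming the file is saved as `example.py`, running `uvicorn example:app` should serve the same callable over HTTP.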
Minecraft AI with LLMs+Mineflayer. Mindcraft 🧠⛏️ Crafting minds for Minecraft with LLMs and Mineflayer! Caution: Do not connect this bot to public servers with coding enabled. This project allows an LLM to write/execute code on your computer. The code is sandboxed, but still vulnerable to injection attacks. Code writing is disabled by default; you can enable it by setting allow_insecure_coding to true…