NevaMind-AI / memU
Memory infrastructure for LLMs and AI agents
MemU is an agentic memory framework for LLM and AI agent backends. It receives multimodal inputs (conversations, documents, images), extracts them into structured memory, and organizes them into a hierarchical file system that supports both embedding-based (RAG) and non-embedding (LLM) retrieval.
MemU is collaborating with four open-source projects to launch the 2026 New Year Challenge. Between January 8 and 18, contributors can submit PRs to memU and earn cash rewards, community recognition, and platform credits. Learn more & get involved
| Feature | Description |
|---|---|
| Hierarchical File System | Three-layer architecture: Resource → Item → Category with full traceability |
| Dual Retrieval Methods | RAG (embedding-based) for speed, LLM (non-embedding) for deep semantic understanding |
| Multimodal Support | Process conversations, documents, images, audio, and video |
| Self-Evolving Memory | Memory structure adapts and improves based on usage patterns |
MemU organizes memory using a three-layer architecture inspired by hierarchical storage systems:
| Layer | Description | Examples |
|---|---|---|
| Resource | Raw multimodal data warehouse | JSON conversations, text documents, images, videos |
| Item | Discrete extracted memory units | Individual preferences, skills, opinions, habits |
| Category | Aggregated textual memory with summaries | `preferences.md`, `work_life.md`, `relationships.md` |
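To make the traceability concrete, here is a minimal sketch of the three layers as Python data structures. The field names are illustrative only, not MemU's actual schema:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the Resource -> Item -> Category hierarchy.
# Field names are assumptions, not MemU's real storage schema.

@dataclass
class Resource:
    resource_id: str
    modality: str   # conversation | document | image | video | audio
    url: str        # location of the raw multimodal data

@dataclass
class Item:
    item_id: str
    resource_id: str  # back-pointer: every item traces to its source resource
    text: str         # e.g. "prefers dark roast coffee"

@dataclass
class Category:
    name: str         # e.g. "preferences" -> preferences.md
    summary: str      # aggregated textual memory
    item_ids: list[str] = field(default_factory=list)
```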
MemU processes diverse content types into unified memory:
| Modality | Input | Processing |
|---|---|---|
| `conversation` | JSON chat logs | Extract preferences, opinions, habits, relationships |
| `document` | Text files (.txt, .md) | Extract knowledge, skills, facts |
| `image` | PNG, JPG, etc. | Vision model extracts visual concepts and descriptions |
| `video` | Video files | Frame extraction + vision analysis |
| `audio` | Audio files | Transcription + text processing |
All modalities are unified into the same three-layer hierarchy, enabling cross-modal retrieval.
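Because the entry point is shared, switching modality is just a parameter change. A minimal sketch, assuming a configured `MemUService` instance (`service`, shown in the configuration section below) and illustrative file paths:

```python
# Sketch: route three different modalities through the same memorize() call.
# `service` is a configured MemUService; the file paths are illustrative.
for url, modality in [
    ("chat_log.json", "conversation"),
    ("design_doc.md", "document"),
    ("whiteboard.png", "image"),
]:
    await service.memorize(resource_url=url, modality=modality,
                           user={"user_id": "123"})
```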
Try MemU instantly without any setup:
memu.so - Hosted cloud service with full API access
For enterprise deployment and custom solutions, contact info@nevamind.ai
| Base URL | `https://api.memu.so` |
|---|---|
| Auth | `Authorization: Bearer YOUR_API_KEY` |
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/v3/memory/memorize` | Register a memorization task |
| `GET` | `/api/v3/memory/memorize/status/{task_id}` | Get task status |
| `POST` | `/api/v3/memory/categories` | List memory categories |
| `POST` | `/api/v3/memory/retrieve` | Retrieve memories (semantic search) |
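For illustration, here is a minimal sketch of driving the cloud API with Python's `requests`. The endpoint paths and auth header come from the tables above; the request body and the response fields (`task_id`, `status`) are assumptions modeled on the service API shown later, so check the API docs for the exact schema:

```python
import time

import requests

BASE_URL = "https://api.memu.so"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Register a memorization task. The body fields mirror the service-level
# memorize() parameters and are an assumption, not the documented schema.
resp = requests.post(
    f"{BASE_URL}/api/v3/memory/memorize",
    headers=HEADERS,
    json={"resource_url": "chat_log.json", "modality": "conversation"},
)
task_id = resp.json()["task_id"]  # assumed response field

# Poll the status endpoint until the task settles.
while True:
    status = requests.get(
        f"{BASE_URL}/api/v3/memory/memorize/status/{task_id}",
        headers=HEADERS,
    ).json()
    if status.get("status") in ("done", "failed"):  # assumed status values
        break
    time.sleep(2)
```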
```bash
pip install -e .
```

Requirements: Python 3.13+ and an OpenAI API key
Test with In-Memory Storage (no database required):
```bash
export OPENAI_API_KEY=your_api_key
cd tests
python test_inmemory.py
```

Test with PostgreSQL Storage (requires pgvector):
```bash
# Start PostgreSQL with pgvector
docker run -d \
  --name memu-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=memu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Run the test
export OPENAI_API_KEY=your_api_key
cd tests
python test_postgres.py
```

Both examples demonstrate the complete workflow, from memorization to retrieval.
See `tests/test_inmemory.py` and `tests/test_postgres.py` for the full source code.
MemU supports custom LLM and embedding providers beyond OpenAI. Configure them via `llm_profiles`:
```python
from memu import MemUService

service = MemUService(
    llm_profiles={
        # Default profile for LLM operations
        "default": {
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "your_api_key",
            "chat_model": "qwen3-max",
            "client_backend": "sdk"  # "sdk" or "http"
        },
        # Separate profile for embeddings
        "embedding": {
            "base_url": "https://api.voyageai.com/v1",
            "api_key": "your_voyage_api_key",
            "embed_model": "voyage-3.5-lite"
        }
    },
    # ... other configuration
)
```

The `memorize()` method processes input resources and extracts structured memory:
```python
result = await service.memorize(
    resource_url="path/to/file.json",  # File path or URL
    modality="conversation",           # conversation | document | image | video | audio
    user={"user_id": "123"}            # Optional: scope to a user
)

# Returns:
{
    "resource": {...},     # Stored resource metadata
    "items": [...],        # Extracted memory items
    "categories": [...]    # Updated category summaries
}
```
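`memorize()` and `retrieve()` are coroutines, so a standalone script needs an event loop. A minimal driver sketch, where `LLM_PROFILES` stands in for the configuration shown above:

```python
import asyncio

from memu import MemUService

async def main():
    # LLM_PROFILES: an llm_profiles dict as in the configuration section above.
    service = MemUService(llm_profiles=LLM_PROFILES)
    result = await service.memorize(
        resource_url="chat_log.json",  # illustrative path
        modality="conversation",
        user={"user_id": "123"},
    )
    print(f"Extracted {len(result['items'])} memory items")

asyncio.run(main())
```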
The `retrieve()` method retrieves relevant memory based on queries. MemU supports two retrieval strategies:

- **RAG (embedding-based)**: fast vector search using cosine similarity
- **LLM (non-embedding)**: deep semantic understanding through direct LLM reasoning
| Aspect | RAG | LLM |
|---|---|---|
| Speed | Fast | Slower |
| Cost | Low | Higher |
| Semantic depth | Medium | Deep |
| Tier 2 scope | All items | Only items in relevant categories |
| Output | With similarity scores | Ranked by LLM reasoning |
Both methods support scope filtering and share the same `retrieve()` interface:
```python
result = await service.retrieve(
    queries=[
        {"role": "user", "content": {"text": "What are their preferences?"}},
        {"role": "user", "content": {"text": "Tell me about work habits"}}
    ],
    where={"user_id": "123"}  # Optional: scope filter
)

# Returns:
{
    "categories": [...],       # Relevant categories (with scores for RAG)
    "items": [...],            # Relevant memory items
    "resources": [...],        # Related raw resources
    "next_step_query": "..."   # Rewritten query for follow-up (if applicable)
}
```

Scope Filtering: Use `where` to filter by user model fields:
- `where={"user_id": "123"}` - exact match
- `where={"agent_id__in": ["1", "2"]}` - match any in list
- Omit `where` to retrieve across all scopes

For complete API documentation, see SERVICE_API.md, which covers all methods, CRUD operations, pipeline configuration, and configuration types.
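The `next_step_query` field enables iterative retrieval: feed the rewritten query back in as a follow-up. A minimal sketch (treating the field as absent or empty when no rewrite applies, which is an assumption):

```python
# Sketch: two-step retrieval driven by the rewritten follow-up query.
result = await service.retrieve(
    queries=[{"role": "user", "content": {"text": "What are their preferences?"}}],
    where={"user_id": "123"},
)

follow_up = result.get("next_step_query")
if follow_up:  # assumed absent/empty when no follow-up is suggested
    result = await service.retrieve(
        queries=[{"role": "user", "content": {"text": follow_up}}],
        where={"user_id": "123"},
    )
```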
Extract and organize memory from multi-turn conversations:
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py
```

What it does: extracts memory items from a multi-turn conversation and organizes them into category files (`preferences.md`, `work_life.md`, etc.).

Best for: Personal AI assistants, customer support bots, social chatbots
Extract skills and lessons learned from agent execution logs:
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py
```

What it does: extracts skills and lessons learned from agent execution logs, evolving them across runs (`log_1.md` → `log_2.md` → `skill.md`).

Best for: DevOps teams, agent self-improvement, knowledge management
Process diverse content types into unified memory:
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_3_multimodal_memory.py
```

What it does: processes diverse content types into unified memory categories (`technical_documentation`, `visual_diagrams`, etc.).

Best for: Documentation systems, learning platforms, research tools
MemU achieves 92.09% average accuracy on the LoCoMo benchmark across all reasoning tasks.
View detailed experimental data: memU-experiment
| Repository | Description | Use Case |
|---|---|---|
| memU | Core algorithm engine | Embed AI memory into your product |
| memU-server | Backend service with CRUD, user system, RBAC | Self-host a memory backend |
| memU-ui | Visual dashboard | Ready-to-use memory console |
Star us on GitHub to get notified about new releases!