apple / embedding-atlas
- среда, 13 августа 2025 г. в 00:00:05
Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.
Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.
Features
🏷️ Automatic data clustering & labeling: Interactively visualize and navigate overall data structure.
🫧 Kernel density estimation & density contours: Easily explore and distinguish between dense regions of data and outliers.
🧊 Order-independent transparency: Ensure clear, accurate rendering of overlapping points.
🔍 Real-time search & nearest neighbors: Find similar data to a given query or existing data point.
🚀 WebGPU implementation (with WebGL 2 fallback): Fast, smooth performance (up to few million points) with modern rendering stack.
📊 Multi-coordinated views for metadata exploration: Interactively link and filter data across metadata columns.
Please visit https://apple.github.io/embedding-atlas for a demo and documentation.
To use Embedding Atlas with Python:
pip install embedding-atlas
embedding-atlas <your-dataset.parquet>
In addition to the command line tool, Embedding Atlas is also available as a Jupyter widget:
from embedding_atlas.widget import EmbeddingAtlasWidget
# Show the Embedding Atlas widget for your data frame:
EmbeddingAtlasWidget(df)
Finally, components from Embedding Atlas are also available in an npm package:
npm install embedding-atlas
import { EmbeddingAtlas, EmbeddingView, Table } from "embedding-atlas";
// or with React:
import { EmbeddingAtlas, EmbeddingView, Table } from "embedding-atlas/react";
// or Svelte:
import { EmbeddingAtlas, EmbeddingView, Table } from "embedding-atlas/svelte";
For more information, please visit https://apple.github.io/embedding-atlas/overview.html.
For the Embedding Atlas tool:
@misc{ren2025embedding,
title={Embedding Atlas: Low-Friction, Interactive Embedding Visualization},
author={Donghao Ren and Fred Hohman and Halden Lin and Dominik Moritz},
year={2025},
eprint={2505.06386},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2505.06386},
}
For the algorithm that automatically produces clusters and labels in the embedding view:
@misc{ren2025scalable,
title={A Scalable Approach to Clustering Embedding Projections},
author={Donghao Ren and Fred Hohman and Dominik Moritz},
year={2025},
eprint={2504.07285},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2504.07285},
}
This repo contains multiple sub-packages:
Frontend:
packages/component
: The EmbeddingView
and EmbeddingViewMosaic
components.
packages/table
: The Table
component.
packages/viewer
: The frontend application for visualizing embedding and other columns. It also provides the EmbeddingAtlas
component that can be embedded in other applications.
packages/density-clustering
: The density clustering algorithm, written in Rust.
packages/umap-wasm
: An implementation of UMAP algorithm in WebAssembly (with the umappp C++ library).
packages/embedding-atlas
: The embedding-atlas
package that get published. It imports all of the above and exposes their API in a single package.
Python:
packages/backend
: A Python package named embedding-atlas
that provides the embedding-atlas
command line tool.Documentation:
packages/docs
: The documentation website.For more information, please visit https://apple.github.io/embedding-atlas/develop.html.
This code is released under the MIT license
.