arkohut / pensieve
- Wednesday, 20 November 2024, 00:00:01
A passive recording project that gives you complete control over your data.
I changed the name to Pensieve because Memos was already taken.
Pensieve is a privacy-focused passive recording project. It can automatically record screen content, build intelligent indices, and provide a convenient web interface to retrieve historical records.
This project draws heavily from two other projects: one called Rewind and another called Windows Recall. However, unlike both of them, Pensieve allows you to have complete control over your data, avoiding the transfer of data to untrusted data centers.
pip install memos
Initialize the Pensieve configuration file and SQLite database:
memos init
Data will be stored in the ~/.memos directory.
memos enable
memos start
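These commands start Pensieve's background programs: the web service, the screenshot recorder, and the watcher that submits indexing requests.
Once the service is running, open your browser and visit http://localhost:8839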
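If the page does not load, a quick check is whether anything is accepting connections on that port. The following is a minimal sketch using only the Python standard library; the helper name `is_listening` is hypothetical and not part of Pensieve, and the default port is the 8839 mentioned above.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8839, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("web interface reachable:", is_listening())
```

A result of `False` only says that nothing answered on the port; it does not say why.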
On macOS, Pensieve needs screen recording permission. When the program starts, macOS will prompt for this permission; please allow it so that recording can proceed.
Pensieve uses embedding models to extract semantic information and build vector indices. Therefore, choosing an appropriate embedding model is crucial. Depending on the user's primary language, different embedding models should be selected.
Open the ~/.memos/config.yaml file with your preferred text editor and modify the embedding configuration:
embedding:
  use_local: true
  model: jinaai/jina-embeddings-v2-base-en # Model name used
  num_dim: 768 # Model dimensions
  use_modelscope: false # Whether to use ModelScope's model
memos stop
memos start
The first time you use the embedding model, Pensieve will automatically download and load the model.
If you switch the embedding model during use, meaning you have already indexed screenshots before, you need to rebuild the index:
memos reindex --force
The --force parameter indicates rebuilding the index table and deleting previously indexed screenshot data.
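To see why a model switch calls for a rebuild, it helps to picture the index as a collection of fixed-length vectors compared by similarity. The sketch below is a toy in-memory index, not Pensieve's storage layer: `ToyVectorIndex` and `cosine` are hypothetical names, and the 3-dimensional vectors stand in for real embeddings such as the 768-dimensional ones configured above. Vectors of a different length cannot be mixed into the same index.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """A tiny in-memory vector index (illustrative only)."""

    def __init__(self, num_dim: int):
        self.num_dim = num_dim
        self.items: list[tuple[str, list[float]]] = []

    def _check(self, vector: list[float]) -> None:
        if len(vector) != self.num_dim:
            raise ValueError(f"expected {self.num_dim} dimensions, got {len(vector)}")

    def add(self, key: str, vector: list[float]) -> None:
        self._check(vector)
        self.items.append((key, vector))

    def search(self, query: list[float], top_k: int = 3) -> list[str]:
        self._check(query)
        ranked = sorted(self.items, key=lambda item: cosine(query, item[1]), reverse=True)
        return [key for key, _ in ranked[:top_k]]


index = ToyVectorIndex(num_dim=3)
index.add("editor", [1.0, 0.0, 0.0])
index.add("browser", [0.0, 1.0, 0.0])
print(index.search([0.9, 0.1, 0.0]))  # ['editor', 'browser']
```

Even when two models happen to share a dimension, their vectors live in different spaces, so similarity scores across models are not meaningful. That is a general property of embedding models, not a statement about Pensieve's internals.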
By default, Pensieve only enables the OCR plugin to extract text from screenshots and build indices. However, this method significantly limits search effectiveness for images without text.
To achieve more comprehensive visual search capabilities, we need a multimodal image understanding service compatible with the OpenAI API. Ollama perfectly fits this role.
Before deciding to enable the VLM feature, please note the following:
Hardware Requirements
Performance and Power Consumption Impact
Visit the Ollama official documentation for detailed installation and configuration instructions.
Download and run the multimodal model minicpm-v using the following command:
ollama run minicpm-v "Describe what this service is"
This command will download and run the minicpm-v model. If the running speed is too slow, it is not recommended to use this feature.
Open the ~/.memos/config.yaml file with your preferred text editor and modify the vlm configuration:
vlm:
  endpoint: http://localhost:11434 # Ollama service address
  modelname: minicpm-v # Model name to use
  force_jpeg: true # Convert images to JPEG format to ensure compatibility
  prompt: Please describe the content of this image, including the layout and visual elements # Prompt sent to the model
Use the above configuration to overwrite the vlm configuration in the ~/.memos/config.yaml file.
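To check that the endpoint and model name above respond to an image before enabling the plugin, you can send one request by hand. The sketch below targets Ollama's native `/api/generate` route, which accepts base64-encoded images. It is an assumption that this is a useful stand-in: how Pensieve itself calls the service is not shown in this document, and the function names are hypothetical.

```python
import base64
import json
import urllib.request

DEFAULT_PROMPT = (
    "Please describe the content of this image, "
    "including the layout and visual elements"
)


def build_generate_request(
    image_bytes: bytes,
    model: str = "minicpm-v",
    prompt: str = DEFAULT_PROMPT,
    endpoint: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate with one attached image."""
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    return urllib.request.Request(
        f"{endpoint}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe_image(image_bytes: bytes) -> str:
    """Send the request to a running Ollama instance and return the model's text."""
    request = build_generate_request(image_bytes)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image(open("screenshot.jpg", "rb").read())` with any JPEG on disk (the file name is a placeholder) should return a description if Ollama is running and the model has been pulled.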
Also, modify the default_plugins configuration in the ~/.memos/plugins/vlm/config.yaml file:
default_plugins:
- builtin_ocr
- builtin_vlm
This adds the builtin_vlm plugin to the default plugin list.
memos stop
memos start
After restarting the Pensieve service, wait a moment to see the data extracted by VLM in the latest screenshots on the Pensieve web interface.
If you do not see the VLM results, you can:
- Run memos ps to check if the Pensieve process is running normally
- Look in the log file ~/.memos/logs/memos.log
- Check that Ollama has the model loaded (ollama ps)
Pensieve is a compute-intensive application. The indexing process requires the collaboration of OCR, VLM, and embedding models. To minimize the impact on the user's computer, Pensieve calculates the average processing time for each screenshot and adjusts the indexing frequency accordingly. Therefore, not all screenshots are indexed immediately by default.
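One way to picture this kind of throttling is a rolling average of processing times that determines how many screenshots to skip between indexing runs. The following is a minimal sketch under stated assumptions, not Pensieve's actual scheduler: the class name, the window size, and the rule "index every n-th screenshot, with n = ceil(average time / capture interval)" are all illustrative. The 5-second capture interval is the one this document mentions.

```python
import math
from collections import deque


class IndexingThrottle:
    """Derive an indexing stride from a rolling average of processing times."""

    def __init__(self, capture_interval: float = 5.0, window: int = 10):
        self.capture_interval = capture_interval
        self.samples: deque[float] = deque(maxlen=window)  # most recent timings only

    def record(self, seconds: float) -> None:
        """Store how long the last screenshot took to process."""
        self.samples.append(seconds)

    def average(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def stride(self) -> int:
        """Index every stride-th screenshot; 1 means index all of them."""
        return max(1, math.ceil(self.average() / self.capture_interval))


throttle = IndexingThrottle(capture_interval=5.0, window=2)
throttle.record(12.0)
throttle.record(18.0)
print(throttle.stride())  # 3: with a 15 s average, index one screenshot in three
```

With this rule, a machine that processes a screenshot faster than the capture interval indexes everything, while a slower machine falls back to sampling.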
If you want to index all screenshots, you can use the following command for full indexing:
memos scan
This command will scan and index all recorded screenshots. Note that depending on the number of screenshots and system configuration, this process may take some time and consume significant system resources. The index construction is idempotent, and running this command multiple times will not re-index already indexed data.
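The idempotence described here can be illustrated with a scan that remembers what it has already processed. This is a sketch of the general technique only; `scan`, `index_one`, and the in-memory set are hypothetical and do not reflect how Pensieve tracks indexed screenshots.

```python
from typing import Callable, Iterable


def scan(
    screenshots: Iterable[str],
    indexed: set[str],
    index_one: Callable[[str], None],
) -> list[str]:
    """Index every screenshot not yet in `indexed`; return the newly indexed paths."""
    newly_indexed = []
    for path in screenshots:
        if path in indexed:
            continue  # already indexed on an earlier run, skip it
        index_one(path)
        indexed.add(path)
        newly_indexed.append(path)
    return newly_indexed


already_indexed: set[str] = set()
paths = ["a.png", "b.png", "c.png"]
print(scan(paths, already_indexed, index_one=lambda p: None))  # ['a.png', 'b.png', 'c.png']
print(scan(paths, already_indexed, index_one=lambda p: None))  # []
```

The second call does no work because every path is already recorded, which is the property that makes repeated runs safe.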
During the development of Pensieve, I closely followed the progress of similar products, especially Rewind and Windows Recall. I greatly appreciate their product philosophy, but they do not do enough in terms of privacy protection, which is a concern for many users (or potential users). Recording the screen of a personal computer may expose extremely sensitive private data, such as bank accounts, passwords, chat records, etc. Therefore, ensuring that data storage and processing are completely controlled by the user to prevent data leakage is particularly important.
The advantages of Pensieve are:
- All data is stored locally in the ~/.memos directory.
- It is easy to remove: stop the program with memos stop && memos disable, then uninstall it with pip uninstall memos, and finally delete the ~/.memos directory to clean up all databases and screenshot data.
Of course, there is still room for improvement in terms of privacy, and contributions are welcome to make Pensieve better.
Pensieve records the screen every 5 seconds and saves the original screenshots in the ~/.memos/screenshots directory. Storage space usage mainly depends on the following factors:
Screenshot Data:
Screenshots are deduplicated. If the content of consecutive screenshots does not change much, only one screenshot will be retained. The deduplication mechanism can significantly reduce storage usage in scenarios where content does not change frequently (such as reading, document editing, etc.).
Database Space:
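The screenshot deduplication mentioned above can be sketched as follows. This is a simplified illustration, not Pensieve's algorithm: frames are modelled as flat lists of pixel values, and the similarity test (fraction of differing pixels against a threshold) and the 10% threshold are assumptions chosen for readability.

```python
def changed_fraction(previous: list[int], current: list[int]) -> float:
    """Fraction of pixels that differ; frames of different size count as fully changed."""
    if len(previous) != len(current):
        return 1.0
    if not current:
        return 0.0
    differing = sum(1 for a, b in zip(previous, current) if a != b)
    return differing / len(current)


def deduplicate(frames: list[list[int]], threshold: float = 0.1) -> list[list[int]]:
    """Keep a frame only if it differs enough from the last frame that was kept."""
    kept: list[list[int]] = []
    for frame in frames:
        if not kept or changed_fraction(kept[-1], frame) > threshold:
            kept.append(frame)
    return kept


frames = [
    [0] * 10,        # first frame is always kept
    [0] * 9 + [1],   # 10% changed: not above the threshold, dropped
    [1] * 10,        # fully changed: kept
    [1] * 10,        # identical to the previous kept frame: dropped
]
print(len(deduplicate(frames)))  # 2
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents slow drift from slipping through as a series of small changes.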
Pensieve requires two compute-intensive tasks by default:
OCR Task: Executed using the CPU, and optimized to select the OCR engine based on different operating systems to minimize CPU usage
Embedding Task: Intelligently selects the computing device
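As an illustration of what selecting a computing device can look like, the sketch below prefers a GPU backend when PyTorch reports one and otherwise falls back to the CPU. It assumes a PyTorch-based setup and an order of preference (CUDA, then Apple MPS, then CPU) that this document does not confirm; `pick_device` is a hypothetical helper.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch  # optional dependency; without it only the CPU is usable
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in newer PyTorch builds, so look it up defensively
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```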
To avoid affecting users' daily use, Pensieve has adopted the following optimization measures:
In fact, after Pensieve starts, it runs three programs:
- memos serve starts the web service
- memos record starts the screenshot recording program
- memos watch listens to the image events generated by memos record and dynamically submits indexing requests to the server based on actual processing speed
Therefore, if you are a developer or want to see the logs of the entire project running more clearly, you can use these three commands to run each part in the foreground instead of the memos enable && memos start command.
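If opening three terminals is inconvenient, a small launcher can start the three commands together and stop them on Ctrl+C. This is a convenience sketch, not part of Pensieve: `run_in_foreground` is a hypothetical helper, and it assumes the `memos` executable is on your PATH.

```python
import subprocess

# The three programs named above, one argument list per process
COMMANDS = (
    ("memos", "serve"),
    ("memos", "record"),
    ("memos", "watch"),
)


def run_in_foreground(commands=COMMANDS, popen=subprocess.Popen):
    """Start each command, wait for all of them, and terminate them on Ctrl+C."""
    processes = [popen(list(command)) for command in commands]
    try:
        for process in processes:
            process.wait()
    except KeyboardInterrupt:
        for process in processes:
            process.terminate()
        for process in processes:
            process.wait()
    return processes
```

Calling `run_in_foreground()` from a script prints the output of all three programs, interleaved, in the same terminal. Running each command in its own terminal keeps the logs separate, which may be easier to read when debugging a single component.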