facebookresearch / llm-transparency-tool
- Saturday, April 20, 2024, 00:00:10
The LLM Transparency Tool (LLM-TT) is an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. *Check out the demo at* https://huggingface.co/spaces/facebook/llm-transparency-tool-demo
# From the repository root directory
docker build -t llm_transparency_tool .
docker run --rm -p 7860:7860 llm_transparency_tool
# alternatively, install locally without Docker: download the repository
git clone git@github.com:facebookresearch/llm-transparency-tool.git
cd llm-transparency-tool
# install the necessary packages
conda env create --name llmtt -f env.yaml
# activate the new environment so the following commands run inside it
conda activate llmtt
# install the `llm_transparency_tool` package
pip install -e .
# now, we need to build the frontend
# don't worry, even `yarn` comes preinstalled by `env.yaml`
cd llm_transparency_tool/components/frontend
yarn install
yarn build
# launch the tool (run this from the repository root directory, not from the frontend directory)
streamlit run llm_transparency_tool/server/app.py -- config/local.json
Initially, the tool allows you to select from just a handful of models. Here are the options you can try for using your model in the tool, from least to most effort.
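The model is already supported by TransformerLens (see the TransformerLens documentation for the full list of supported models). In this case, the model can be added to the configuration JSON file.
The model is a tuned version of a model supported by TransformerLens. Add the official name of the model to the config, along with the location to read the weights from.
The model is not supported by TransformerLens. In this case, the UI wouldn't know how to create the proper hooks for the model. You'd need to implement your own version of the TransparentLlm class and alter the Streamlit app to use your implementation.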
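As a rough illustration of the first two options, the sketch below registers models in a configuration dictionary and prints it as JSON. The `models` key, the name-to-path mapping, the `add_model` helper, and the weights path are all assumptions made for this example, not the tool's confirmed schema; check `config/local.json` in the repository for the actual layout before editing it.

```python
import json
from typing import Optional


def add_model(config: dict, name: str, weights_path: Optional[str] = None) -> dict:
    """Return a copy of `config` with `name` registered under the assumed "models" key.

    weights_path=None stands for "load the default weights for this official name";
    a string points at separately stored (for example, fine-tuned) weights.
    """
    updated = dict(config)
    # Assumed schema: {"models": {official model name: weights path or null}}
    models = dict(updated.get("models", {}))
    models[name] = weights_path
    updated["models"] = models
    return updated


if __name__ == "__main__":
    config = {"models": {"gpt2": None}}
    # Option 1: a model that is already supported, so only its official name is needed.
    config = add_model(config, "distilgpt2")
    # Option 2: a tuned model, given as official name plus a (hypothetical) weights location.
    config = add_model(config, "facebook/opt-125m", "/path/to/tuned/weights")
    print(json.dumps(config, indent=4))
```

The helper copies the dictionary instead of mutating it, so the original configuration stays available for comparison until the result is written back to the JSON file passed to `streamlit run`.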
If you use the LLM Transparency Tool for your research, please consider citing:
@article{tufanov2024lm,
title={LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models},
author={Igor Tufanov and Karen Hambardzumyan and Javier Ferrando and Elena Voita},
year={2024},
journal={Arxiv},
url={https://arxiv.org/abs/2404.07004}
}
@article{ferrando2024information,
title={Information Flow Routes: Automatically Interpreting Language Models at Scale},
author={Javier Ferrando and Elena Voita},
year={2024},
journal={Arxiv},
url={https://arxiv.org/abs/2403.00824}
}
This code is made available under a CC BY-NC 4.0 license, as found in the LICENSE file. However, you may have other legal obligations that govern your use of other content, such as the terms of service for third-party models.