princeton-nlp / MeZO
- Friday, June 2, 2023, 00:00:08
This is the implementation for the paper "Fine-Tuning Language Models with Just Forward Passes".
We are still actively cleaning up the code base, and an improved README is coming soon!
To reproduce the RoBERTa-large experiments, please refer to the `medium_models` folder. For the OPT experiments, please refer to the `large_models` folder.
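The paper's core idea is a memory-efficient zeroth-order optimizer (MeZO) that estimates gradients from two forward passes at perturbed parameters instead of backpropagation. The sketch below is a simplified NumPy illustration of that two-forward-pass (SPSA-style) update on a flat parameter vector, not the repository's actual implementation; the function name `mezo_step` and the hyperparameter values are illustrative.

```python
import numpy as np

def mezo_step(theta, loss_fn, lr=0.02, eps=1e-3, seed=0):
    """One illustrative MeZO-style update: estimate the directional
    gradient from two forward passes, then step along the same
    random direction. No backward pass is needed."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(theta.shape)       # random perturbation direction
    loss_plus = loss_fn(theta + eps * z)       # forward pass 1
    loss_minus = loss_fn(theta - eps * z)      # forward pass 2
    grad_est = (loss_plus - loss_minus) / (2 * eps)  # projected gradient (scalar)
    return theta - lr * grad_est * z           # SGD step along z

# Toy usage: minimize ||theta||^2 using only forward passes.
theta = np.ones(4)
loss = lambda t: float(t @ t)
for step in range(200):
    theta = mezo_step(theta, loss, seed=step)  # fresh direction each step
```

In the actual large-model setting, the paper's trick is to regenerate `z` from a stored random seed rather than materializing it, so the optimizer's memory footprint stays close to that of inference.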
```bibtex
@article{malladi2023mezo,
   title={Fine-Tuning Language Models with Just Forward Passes},
   author={Malladi, Sadhika and Gao, Tianyu and Nichani, Eshaan and Damian, Alex and Lee, Jason D and Chen, Danqi and Arora, Sanjeev},
   journal={arXiv preprint arXiv:2305.17333},
   year={2023}
}
```