From bfcba5e28e0c48dccd78b7256d89533f07187a4d Mon Sep 17 00:00:00 2001
From: Sayak Paul
Date: Sat, 30 Sep 2023 15:07:48 +0200
Subject: [PATCH] Update README.md to include a note about the `trl` integration

---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index 4e65182..f77295b 100644
--- a/README.md
+++ b/README.md
@@ -53,3 +53,8 @@ If you want to run the LLaVA prompt-image alignment experiments, you need to ded
+## Training using 🤗 `trl`
+
+🤗 `trl` provides a [`DDPOTrainer` class](https://huggingface.co/docs/trl/ddpo_trainer) which lets you fine-tune Stable Diffusion on different reward functions using DDPO. The integration supports LoRA, too. You can check out the [supplementary blog post](https://huggingface.co/blog/trl-ddpo) for additional guidance. The DDPO integration was contributed by @metric-space to `trl`.
+
+