I’m also curious about this. @mralexis - did you ever work this out? It seems a similar question was asked here: "M2M model finetuning on multiple language pairs", which also had no reply.