Config does not match context length.

#19
by aberrio - opened

The config sets `max_position_embeddings` to 40960, while the README.md states the maximum context length is 32768 without YaRN.
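
For reference, here is a quick way to check which value the published config actually carries; the repo id below is just a placeholder for whichever checkpoint you are looking at:

```python
# Minimal sketch: read max_position_embeddings straight from the published config.json.
# "Qwen/Qwen3-8B" is a placeholder repo id -- swap in the checkpoint you are checking.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen3-8B")
print(config.max_position_embeddings)            # the config in question reports 40960
print(getattr(config, "rope_scaling", None))     # None unless YaRN (or similar) is enabled
```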

I'm wondering:

  • if the smaller models were also trained at this sequence length,
  • if this is a typo that only applies to the larger models, or
  • if this applies across the board and all the models were trained this way.

Any feedback is appreciated.