data available to train the models

  1. Llama is trained on 15T tokens
  2. public GitHub repos contain less than 1T tokens
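A quick back-of-envelope check of the gap implied by the two figures above (both taken directly from the notes; the GitHub number is an upper bound):

```python
# Rough shortfall between a 15T-token training budget and <1T tokens
# of public GitHub code. Figures come from the notes above.
llama_tokens = 15e12    # ~15 trillion tokens
github_tokens = 1e12    # upper bound on public GitHub code tokens

shortfall = llama_tokens / github_tokens
print(f"code data covers at most 1/{shortfall:.0f} of the budget")
```

So even if every public repo were usable, code alone covers at most about one fifteenth of a Llama-scale token budget, which is what motivates the mitigations below.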

mitigation

  1. synthetic data
  2. self play
  3. RL (reinforcement learning) approaches
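The mitigations above share one idea: generate new training data instead of scraping it. A minimal sketch of the self-play pattern, using a toy matching game and a placeholder random policy (all names here are hypothetical, not from any real library):

```python
import random

# Toy self-play loop: two copies of the same policy play a simple
# rock-paper-scissors game; every game yields (action, reward) records
# that could serve as extra training data for the policy.
ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def policy(_obs):
    # Placeholder: uniform random. A real model would act here.
    return random.choice(ACTIONS)

def self_play_games(n_games):
    records = []
    for _ in range(n_games):
        a, b = policy(None), policy(None)
        if BEATS[a] == b:
            reward_a, reward_b = 1, -1
        elif BEATS[b] == a:
            reward_a, reward_b = -1, 1
        else:
            reward_a = reward_b = 0
        # Both sides of each game become training examples.
        records.append({"action": a, "reward": reward_a})
        records.append({"action": b, "reward": reward_b})
    return records

data = self_play_games(100)
print(len(data))  # 200 records from 100 games
```

The point is that the dataset grows with compute, not with crawling: each extra game produces two labeled records, and rewards are zero-sum within a game.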
