Research Scientist - LLM Foundation Models | Brisbane, AU
Brisbane | Binance
...supervised fine-tuning (SFT), reward modelling, and reinforcement learning, while driving innovations in reasoning and decision-making. You will synthesise large-scale, high-quality datasets through rewriting, augmentation, and generation techniques to strengthen foundation models during pretraining, SFT, and RL stages. A key [...]
Category: IT & Telecommunications