
Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action


Large language models (LLMs) perform well on general tasks but struggle with specialized work that requires understanding proprietary data, internal processes, and industry-specific terminology. Supervised fine-tuning (SFT) adapts LLMs to these organizational contexts, and it can be implemented in two distinct ways. Parameter-Efficient Fine-Tuning (PEFT) updates only a small subset of model parameters, offering faster training and lower computational cost while still delivering reasonable performance gains. Full-rank SFT updates all model parameters and can therefore absorb more domain knowledge than PEFT.
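The scale of the difference between the two approaches can be illustrated with a quick parameter count. The sketch below (plain Python; the layer shapes and rank are purely illustrative, not the architecture of any Nova model) compares the trainable parameters of full-rank SFT against a LoRA-style PEFT adapter of rank r:

```python
# Compare trainable parameter counts: full-rank SFT vs. a LoRA-style
# PEFT adapter. All shapes below are illustrative, not any real model's.

def full_rank_params(shapes):
    """Full-rank SFT trains every weight: sum of rows * cols per matrix."""
    return sum(rows * cols for rows, cols in shapes)

def lora_params(shapes, rank):
    """A rank-r adapter freezes each weight matrix and trains two thin
    matrices beside it, adding rows * r + r * cols parameters instead."""
    return sum(rows * rank + rank * cols for rows, cols in shapes)

# Hypothetical model: four 4096x4096 projection matrices per layer,
# 32 layers -- the numbers are made up for illustration.
shapes = [(4096, 4096)] * 4 * 32

full = full_rank_params(shapes)
peft = lora_params(shapes, rank=16)
print(f"full-rank trainable params: {full:,}")       # ~2.1 billion
print(f"LoRA (r=16) trainable params: {peft:,}")     # ~16.8 million
print(f"fraction trained by PEFT: {peft / full:.4%}")
```

With these assumed shapes, the rank-16 adapter trains under 1% of the parameters that full-rank SFT touches, which is why PEFT is cheaper but also why it carries less new domain knowledge into the model.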

Full-rank SFT often faces a challenge: catastrophic forgetting. As models learn domain-specific patterns, they lose general capabilities including instruction-following, reasoning, and broad knowledge. Organizations must choose between domain expertise and general intelligence, which limits model utility across enterprise use cases.
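Data mixing, the technique named in the title, is one standard way to counter forgetting: general-purpose examples are blended back into the fine-tuning stream so the model keeps receiving gradient signal on broad capabilities. The sketch below is a minimal, generic illustration of that idea; the mixing ratio, batch size, and dataset contents are hypothetical and are not Nova Forge's actual recipe:

```python
import random

def mix_batches(domain_data, general_data, domain_frac, batch_size, rng):
    """Yield training batches in which each example is drawn from the
    domain corpus with probability domain_frac and from the general
    corpus otherwise, so general skills are rehearsed during SFT."""
    while True:
        yield [
            rng.choice(domain_data) if rng.random() < domain_frac
            else rng.choice(general_data)
            for _ in range(batch_size)
        ]

# Hypothetical corpora and a 70/30 domain/general mix.
domain = [f"domain-{i}" for i in range(100)]
general = [f"general-{i}" for i in range(100)]
rng = random.Random(0)

batches = mix_batches(domain, general, domain_frac=0.7, batch_size=8, rng=rng)
print(next(batches))  # a batch of 8 mixed examples
```

Tuning `domain_frac` is the core trade-off: a higher value speeds domain adaptation but erodes general abilities faster, while a lower value preserves them at the cost of slower specialization.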

Amazon Nova Forge addresses this problem. Nova Forge is a new service that you can use to build your own frontier models on top of Nova. Nova Forge customers can ...


Source: aws.amazon.com (machine-learning).