Evaluate generative AI models with an Amazon Nova rubric-based LLM judge on Amazon SageMaker AI (Part 2)

2 hours ago aws.amazon.com - machine-learning

In the post Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI, we introduced the Amazon Nova LLM-as-a-judge capability, a specialized evaluation model available through Amazon SageMaker AI that you can use to systematically measure the relative performance of generative AI systems.

SageMaker AI now offers a rubric-based large language model (LLM) judge powered by Amazon Nova. Instead of applying the same general rules to every task, it automatically creates specific evaluation criteria for each individual prompt. This gives generative AI developers and machine learning (ML) engineers precise, scenario-specific evaluation criteria for their LLMs and generative AI products, without manually crafting rule sets for every use case.

In this post, we explore the Amazon Nova rubric-based judge feature: what a rubric-based judge is, how the judge is trained, what metrics to consider, and how to calibrate the judge. We share notebook code of ...

