Rapidata emerges to shorten AI model development cycles from months to days with near real-time RLHF
Despite growing chatter about a future in which much human work is automated by AI, one irony of the current tech boom is how stubbornly reliant it remains on human beings, specifically in the process of training AI models with reinforcement learning from human feedback (RLHF).
At its simplest, RLHF is a tutoring system: after an AI model is trained on curated data, it still makes mistakes or sounds robotic. AI labs then hire human contractors en masse to rate and rank a new model's outputs during training, and the model learns from those ratings, adjusting its behavior toward higher-rated outputs. This process becomes all the more important as AI expands into multimedia outputs like video, audio, and imagery, where quality measures can be more nuanced and subjective.
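The "rate and rank" step the article describes is commonly formalized as a pairwise preference loss (the Bradley-Terry formulation used in reward modeling). The sketch below is illustrative only, not Rapidata's or any lab's actual implementation; the function name and scores are hypothetical:

```python
import math

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise preference loss for a reward model: the model is
    penalized when the output a human rater preferred does not
    score higher than the output they rejected."""
    margin = score_chosen - score_rejected
    # -log(sigmoid(margin)): near zero when chosen scores well above
    # rejected, and grows as the model disagrees with the rater
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A rater preferred output A over output B. If the reward model
# agrees (scores A higher), the loss is small; if it disagrees,
# the loss is large, pushing its scores toward the human ranking.
loss_agree = preference_loss(2.0, 0.5)
loss_disagree = preference_loss(0.5, 2.0)
```

During RLHF, a reward model trained on many such human comparisons then stands in for the raters, scoring the main model's outputs at scale.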
Historically, this tutoring process has been a massive logistical headache and PR nightmare for AI companies ...
Copyright of this story belongs solely to VentureBeat.

