
New training method boosts AI multimodal reasoning with smaller, smarter datasets


Researchers at MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the multimodal reasoning capabilities of language models.

The framework uses a two-stage process. It first refines a base model with a curated dataset in a supervised fine-tuning (SFT) stage. Then, a reinforcement learning (RL) stage guides the model to reason more effectively in tasks that involve both text and visual data. 
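The article does not include code, but the overall shape of a two-stage SFT-then-RL recipe can be illustrated with a small, self-contained sketch. Everything below is an illustrative assumption: the toy task, the tiny network, and the hyperparameters are placeholders, not the authors' OpenMMReasoner implementation or data.

```python
# Minimal sketch of a two-stage recipe: supervised fine-tuning (SFT) followed by
# reinforcement learning (RL) with a verifiable reward. Toy task and model are
# illustrative assumptions, not the OpenMMReasoner pipeline itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a multimodal example: a 4-dim "visual feature" vector whose
# largest component's index is the correct answer.
def make_batch(n=64):
    x = torch.randn(n, 4)   # stand-in for visual features
    y = x.argmax(dim=1)     # ground-truth answer
    return x, y

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

# Stage 1: SFT on curated (input, answer) pairs with a standard cross-entropy loss.
for _ in range(200):
    x, y = make_batch()
    loss = F.cross_entropy(policy(x), y)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: RL with a verifiable reward (1 if the sampled answer is correct,
# 0 otherwise), using a simple REINFORCE update with a mean-reward baseline.
for _ in range(200):
    x, y = make_batch()
    dist = torch.distributions.Categorical(logits=policy(x))
    action = dist.sample()
    reward = (action == y).float()
    advantage = reward - reward.mean()   # baseline to reduce gradient variance
    loss = -(dist.log_prob(action) * advantage).mean()
    opt.zero_grad(); loss.backward(); opt.step()

x, y = make_batch(1000)
print("accuracy after SFT + RL:", (policy(x).argmax(dim=1) == y).float().mean().item())
```

The sketch mirrors the described structure: the SFT stage imitates curated answers, and the RL stage then optimizes the same policy against a reward signal rather than fixed labels.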

Experiments show that models trained with OpenMMReasoner outperform other leading visual reasoning models, often while being trained on a smaller, higher-quality dataset. The framework and all its assets, including a trained 7B model, are fully open source, providing a reliable foundation for building applications that require traceability and robustness.

According to Kaichen Zhang, co-author of a research paper that outlines the new method, OpenMMReasoner offers significant benefits for businesses looking beyond large, closed systems. "A smaller open-source reasoning model has practical ...

