Microsoft Introduces 3 Foundational AI Models To Take on OpenAI, Anthropic

an hour ago extremetech.com

On Thursday, Microsoft introduced three new foundational AI models—MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2—focused on transcription, audio, and image generation, respectively. The tech giant positions them as in-house systems that will provide it with better control over cost, performance, and integration across its software and cloud services.

MAI-Transcribe-1 offers speech-to-text transcription in 25 languages. This could be used to create instant transcripts of Teams meetings or customer-facing phone calls. Microsoft describes MAI-Transcribe-1 as "lightning fast," meaning it should produce captions or transcripts with very low latency. The company also reports that the model has a lower word error rate than GPT-Transcribe, Gemini 3.1 Flash, and other transcription-focused AI models.

MAI-Voice-1 is a voice-generation model aimed at providing "voice experiences and voice agents" with nuance and emotional expression. It can reportedly produce 60 seconds of audio in just one second.

Finally, MAI-Image-2 targets marketing, design, and other professionals who ...

Copyright of this story solely belongs to extremetech.com.
