Microsoft AI Launches Three New Foundational Models to Compete in the LLM Market

TechCrunch

Key Points

  • Microsoft AI releases three foundational models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2.
  • MAI-Transcribe-1 supports 25 languages and is 2.5× faster than Azure Fast.
  • MAI-Voice-1 can generate a minute of audio in one second and allows custom voice creation.
  • MAI-Image-2 adds video‑generation capabilities and debuted on MAI Playground on March 19.
  • Models are available via Microsoft Foundry and MAI Playground.
  • Pricing is positioned lower than competing Google and OpenAI services.
  • Developed by the MAI Superintelligence team, which was formed in November 2025 and is led by Mustafa Suleyman.
  • Microsoft maintains its partnership with OpenAI while expanding its own AI stack.
  • The company uses both in‑house and external chip suppliers to support AI workloads.

Microsoft AI, the research arm of the tech giant, announced the rollout of three foundational multimodal models: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. The transcription model supports 25 languages and is reported to be 2.5 times faster than Azure Fast. The voice model can generate a minute of audio in one second and allows custom voice creation. The image model, which debuted on MAI Playground, expands Microsoft’s AI portfolio. Microsoft positions the models as cheaper than competing offerings from Google and OpenAI. The launch underscores Microsoft’s commitment to building its own AI stack while maintaining its partnership with OpenAI.

New Model Portfolio

Microsoft AI unveiled three new foundational AI models. The suite includes MAI-Transcribe-1, a speech‑to‑text system; MAI-Voice-1, an audio‑generation engine; and MAI-Image-2, an image model that adds video‑generation capabilities. All three models are now accessible through Microsoft Foundry, with the transcription and voice models also available in MAI Playground.

Performance and Capabilities

MAI-Transcribe-1 can transcribe speech in 25 different languages and is reported to be 2.5 times faster than Microsoft’s Azure Fast offering. MAI-Voice-1 enables users to produce 60 seconds of audio in a single second and supports the creation of custom voice profiles. MAI-Image-2, initially released on MAI Playground on March 19, adds video‑generation capabilities to Microsoft’s multimodal AI lineup.
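
The speed figures above reduce to simple arithmetic. The Python sketch below is illustrative only. It assumes both reported rates hold constant at any scale, reads "2.5 times faster" as taking 1/2.5 of the baseline time, and ignores network and queueing overhead. The function and constant names are hypothetical and are not part of any Microsoft API.

```python
# Back-of-the-envelope timing based on the figures quoted in the article.
# Assumption: the rates are constant and scale linearly with workload size.

# "A minute of audio in one second" means 60 seconds of audio per second of generation.
VOICE_AUDIO_SECONDS_PER_SECOND = 60.0

# Assumption: "2.5 times faster" means the job takes 1/2.5 of the baseline time.
TRANSCRIBE_SPEEDUP_VS_BASELINE = 2.5


def voice_generation_seconds(audio_seconds: float) -> float:
    """Estimated time to generate `audio_seconds` of speech with MAI-Voice-1."""
    return audio_seconds / VOICE_AUDIO_SECONDS_PER_SECOND


def transcription_seconds(baseline_seconds: float) -> float:
    """Estimated MAI-Transcribe-1 time for a job that takes
    `baseline_seconds` on the Azure Fast baseline."""
    return baseline_seconds / TRANSCRIBE_SPEEDUP_VS_BASELINE


if __name__ == "__main__":
    one_hour = 60 * 60
    print(f"1 hour of generated audio: ~{voice_generation_seconds(one_hour):.0f} s")
    print(f"A 100 s baseline transcription job: ~{transcription_seconds(100):.0f} s")
```

Under these assumptions, an hour of generated audio would take roughly a minute, and a transcription job that takes 100 seconds on the baseline would take roughly 40 seconds.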

Strategic Positioning

The launch signals Microsoft’s continued push to develop its own stack of multimodal AI models and to compete with rival AI labs, even as it remains tied to OpenAI. The models were developed by the MAI Superintelligence team, an AI research group formed and announced in November 2025 and led by Mustafa Suleyman, the CEO of Microsoft AI. Suleyman emphasized a “Humanist AI” approach that puts humans at the center and focuses on practical communication use cases.

Microsoft positions the new models as cost‑effective alternatives to offerings from Google and OpenAI, aiming to attract developers seeking affordable, high‑performance AI services.

Pricing and Availability

Microsoft positions the models' pricing below that of competing solutions. MAI-Transcribe-1 starts at $0.36 per hour, MAI-Voice-1 starts at $22 per 1 million characters, and MAI-Image-2 is priced at $5 per 1 million tokens of text input and $33 per 1 million tokens of image output.
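
To make the quoted rates concrete, the following Python sketch estimates costs for a given workload. It is a rough illustration, not a billing tool. It assumes the "starts at" prices apply linearly with no tiers, discounts, or minimum charges, and all function and constant names are hypothetical rather than taken from any Microsoft SDK.

```python
from decimal import Decimal

# Starting prices in USD as quoted in the article.
# Assumption: pricing is strictly linear (no tiers, minimums, or discounts).
TRANSCRIBE_USD_PER_HOUR = Decimal("0.36")
VOICE_USD_PER_MILLION_CHARS = Decimal("22")
IMAGE_USD_PER_MILLION_TEXT_INPUT_TOKENS = Decimal("5")
IMAGE_USD_PER_MILLION_IMAGE_OUTPUT_TOKENS = Decimal("33")

MILLION = Decimal(1_000_000)


def transcription_cost(audio_hours) -> Decimal:
    """Estimated MAI-Transcribe-1 cost for `audio_hours` of audio."""
    return TRANSCRIBE_USD_PER_HOUR * Decimal(str(audio_hours))


def voice_cost(characters: int) -> Decimal:
    """Estimated MAI-Voice-1 cost for `characters` of input text."""
    return VOICE_USD_PER_MILLION_CHARS * Decimal(characters) / MILLION


def image_cost(text_input_tokens: int, image_output_tokens: int) -> Decimal:
    """Estimated MAI-Image-2 cost from text-input and image-output token counts."""
    input_cost = IMAGE_USD_PER_MILLION_TEXT_INPUT_TOKENS * Decimal(text_input_tokens) / MILLION
    output_cost = IMAGE_USD_PER_MILLION_IMAGE_OUTPUT_TOKENS * Decimal(image_output_tokens) / MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    print(f"100 hours of transcription: ${transcription_cost(100):.2f}")
    print(f"250,000 characters of speech: ${voice_cost(250_000):.2f}")
    print(f"2M text tokens in, 1M image tokens out: ${image_cost(2_000_000, 1_000_000):.2f}")
```

At the quoted rates, 100 hours of transcription comes to $36, 250,000 characters of generated speech to $5.50, and an image workload of 2 million text input tokens plus 1 million image output tokens to $43. `Decimal` is used instead of floats so that currency amounts are not distorted by binary rounding.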

Despite the independent model release, Microsoft reaffirmed its ongoing partnership with OpenAI, noting that a recent renegotiation of that partnership enables the company to pursue superintelligence research while still collaborating with OpenAI.

Hardware and Ecosystem

Microsoft continues a dual strategy on hardware, producing its own chips while also sourcing components from external vendors, ensuring flexibility in supporting its AI services across its cloud and product ecosystem.

Generated with News Factory - Source: TechCrunch
