Microsoft unveils in-house image generation model in pivot away from OpenAI

Microsoft unveils MAI-Image-1, its first fully in-house text-to-image AI model, now available for public testing on LMArena

Microsoft's MAI-Image-1

Aashish Kumar Shrivastava New Delhi
3 min read Last Updated : Oct 14 2025 | 2:37 PM IST


Microsoft has announced MAI-Image-1, its first text-to-image model developed entirely in-house. The company says the model focuses on photorealistic output and on balancing speed with image quality. It is currently available for public testing on LMArena, and Microsoft plans to add it to Copilot and Bing Image Creator later. Those tools currently use OpenAI’s GPT-4o and DALL-E 3 image generation models.
 
This is the third fully homegrown AI model Microsoft has unveiled. In August, the company introduced two others designed to enhance Copilot and related AI experiences: MAI-Voice-1 and MAI-1-preview.

According to Microsoft's blog post, MAI-Image-1 ranks among the top 10 text-to-image models on LMArena. Other models in the top 10 include Google’s Gemini 2.5 Flash Image and OpenAI’s GPT-Image-1. MAI-Image-1 will also compete with other text-to-image models such as Google’s Imagen 3.

What MAI-Image-1 does and how it was developed

According to Microsoft, MAI-Image-1 was trained with an emphasis on data selection and evaluation tailored to real-world creative tasks. The company says it incorporated feedback from professionals in creative industries during evaluation and prioritised methods intended to reduce repetitive or oversimplified stylistic outputs.
 
Microsoft highlights the model’s photorealism, citing its handling of lighting effects (bounce light and reflections) and landscapes. It also positions MAI-Image-1 as relatively fast compared with some larger, slower models.
 
Microsoft states the model’s design goal is to provide visual diversity and practical utility for creators, enabling faster iteration and handoff to other tools for further refinement. It also says MAI-Image-1’s speed-quality balance is intended to help users get concepts on screen quickly.

Safety testing and deployment

The company frames safety and responsibility as priorities for MAI-Image-1. Microsoft stated that it has begun testing the model on LMArena to collect insights and feedback. The announcement specifically notes that further deployment is planned, with the model expected to appear in Microsoft products such as Copilot and Bing Image Creator “very soon.”

Other MAI models

  • MAI-Voice-1: Microsoft’s first expressive speech generation model, which the company says can produce a full minute of audio in under a second on a single GPU. It is optimised for both single- and multi-speaker scenarios and is claimed to deliver high-fidelity audio for voice-driven AI experiences.
  • MAI-1-preview: A mixture-of-experts foundation model trained on approximately 15,000 NVIDIA H100 GPUs. It is designed to follow instructions in text-based queries and is intended for text use cases in Copilot. The model is also being tested on LMArena, a community platform for model evaluation.


First Published: Oct 14 2025 | 2:37 PM IST

Business Standard
