French AI startup Mistral has unveiled its first multimodal model, Pixtral 12B, which can process both images and text.
The 12-billion-parameter model, based on Mistral’s existing text-based model Nemo 12B, is intended for tasks such as image captioning, object identification, and answering image-related queries.
At roughly 24GB in size, the model is available for free under the Apache 2.0 license, which permits anyone to use, modify, or commercialize it. Developers can download it from GitHub and Hugging Face, but there are no live web demos yet.
According to Mistral’s head of developer relations, Pixtral 12B will soon be integrated into the company’s chatbot, Le Chat, and its API platform, La Plateforme.
Multimodal models like Pixtral 12B could represent the next frontier for generative AI, following the path of tools such as OpenAI’s GPT-4 and Anthropic’s Claude. However, there are concerns about the data sources used to train these models. As TechCrunch highlighted, Mistral, like many AI companies, likely trained Pixtral 12B on vast amounts of publicly available web data, a practice that has led to lawsuits from copyright holders challenging the “fair use” argument often made by tech firms.
The release comes after Mistral secured $645 million in funding, boosting its valuation to $6 billion. With Microsoft as one of its supporters, Mistral is positioning itself as Europe’s answer to OpenAI.