French AI startup Mistral, which competes with companies like OpenAI and Anthropic, has released its first multimodal model, Pixtral 12B, integrating both language and image processing capabilities.
Mistral AI, a French artificial intelligence startup, has unveiled its first multimodal model, Pixtral 12B, which can process both images and text. With nearly 12 billion parameters and a size of approximately 24 GB, Pixtral 12B is designed to outperform models with fewer parameters in problem-solving tasks.
Pixtral 12B will rival OpenAI’s GPT-4o
Pixtral 12B can respond to queries related to an unlimited number of images of any size, provided via image URLs or base64-encoded images. The model is expected to perform tasks such as annotating photos and counting objects in images, similar to other multimodal models like Anthropic’s Claude family and OpenAI’s GPT-4.
Pixtral 12B is available for download via torrent links on GitHub and Hugging Face, platforms dedicated to artificial intelligence and machine learning development. Users can download, modify, and use the model under Mistral’s standard license.
Sophia Yang, Mistral’s head of developer relations, announced that Pixtral 12B will soon be available for testing on Mistral’s chatbot and API platforms, Le Chat and Le Platforme. However, at the time of release, no functional web demos for Pixtral 12B were available. The specific image data used by Mistral to develop the model has not yet been disclosed.
The launch of Pixtral 12B follows Mistral’s successful $645 million funding round led by General Catalyst, which valued the firm at $6 billion. Although Mistral is only a year old, it is seen as Europe’s answer to OpenAI.
You may also like this content
- GPT-4o, the brainchild of ChatGPT, has been Updated
- OpenAI’s AI Course for Educators Sparks Privacy and Security Concerns
- Microsoft Teams to Overcome Language Barriers with AI Translator Feature