Microsoft Introduces Phi-3 AI Models: Small Models Offer Big Potential for Privacy

The Rise of Compact AI: A New Era in Technology

by Faruk Imamovic

When ChatGPT was launched in November 2022, it was accessible only through the cloud due to the sheer size of the model behind it. Today, a similarly capable AI program runs effortlessly on a MacBook Air without generating any noticeable heat.

This transition illustrates the rapid progress in making AI models leaner and more efficient, demonstrating that increasing scale isn't the only way to enhance machine intelligence. The model now delivering ChatGPT-like capabilities on a laptop is called Phi-3-mini.

It belongs to a new family of smaller AI models recently introduced by researchers at Microsoft. Although compact enough to operate on a smartphone, it was tested on a laptop and accessed via an iPhone through an app called Enchanted, which provides a chat interface similar to the official ChatGPT app.

A New Benchmark in AI Efficiency

Microsoft's research team, in a paper detailing the Phi-3 family, claims that the model performs comparably to GPT-3.5, the OpenAI model that powered the first ChatGPT release. This claim is based on standard AI benchmarks that measure common sense and reasoning.

In hands-on use, Phi-3-mini appears comparably capable, handling tasks with impressive proficiency. At Microsoft's annual developer conference, Build, the company announced a new "multimodal" Phi-3 model capable of processing audio, video, and text inputs.

This announcement came shortly after OpenAI and Google showcased their advanced AI assistants, which are built on multimodal models accessible via the cloud. Microsoft’s new family of AI models indicates the potential for developing versatile AI applications that do not rely on the cloud, potentially opening up new use cases by making them more responsive or private.

Offline algorithms are a crucial component of Microsoft's Recall feature, which uses AI to make everything on a PC searchable.


The Art of Efficient AI Training

The Phi family also sheds light on the evolving nature of modern AI and its potential for improvement.

Sébastien Bubeck, a researcher at Microsoft, explains that the models were created to explore if selective training could enhance an AI system’s abilities without expanding its data set excessively. Large language models like OpenAI’s GPT-4 or Google’s Gemini are typically trained on vast amounts of text from various sources.

While this approach has raised legal concerns, increasing the data and computational power used for training has unlocked new capabilities. Bubeck, intrigued by the “intelligence” exhibited by language models, sought to determine if carefully curated training data could yield similar improvements.

In September, his team trained a model roughly one-seventeenth the size of GPT-3.5 on high-quality synthetic data generated by a larger AI model. This included specialized information from domains like programming. Surprisingly, the resulting model outperformed GPT-3.5 in coding tasks.

“We were able to beat GPT-3.5 at coding using this technique,” Bubeck said. “That was really surprising to us.” Further experiments by Bubeck’s team revealed that even smaller models could produce coherent output when trained on specific types of data, such as children’s stories.

This finding suggests that effectively training smaller AI models with the right material can make them function remarkably well.
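The idea described above, filtering a training corpus down to a high-quality subset rather than ingesting everything available, can be illustrated with a toy sketch. This is not Microsoft's actual pipeline: the `quality_score` heuristic below is a hypothetical stand-in for what would in practice be a learned classifier or a larger model acting as judge.

```python
def quality_score(text: str) -> float:
    """Toy heuristic: reward complete, varied sentences; penalize very
    short or highly repetitive snippets. A real curation pipeline would
    use a learned quality classifier or a larger LLM as the judge."""
    words = text.split()
    if len(words) < 5:
        return 0.0
    unique_ratio = len(set(words)) / len(words)  # repetition penalty
    ends_cleanly = 1.0 if text.rstrip().endswith((".", "!", "?")) else 0.5
    return unique_ratio * ends_cleanly

def curate(samples: list[str], threshold: float = 0.6) -> list[str]:
    """Keep only samples whose heuristic score clears the threshold."""
    return [s for s in samples if quality_score(s) >= threshold]

candidates = [
    "The function returns the sum of its two integer arguments.",
    "foo foo foo foo foo foo",  # repetitive -> filtered out
    "ok",                       # too short -> filtered out
    "Sorting the list first makes the binary search valid.",
]
kept = curate(candidates)
print(kept)  # only the two well-formed explanatory sentences survive
```

The point of the sketch is the shape of the process, not the heuristic itself: a small model trained only on the `kept` subset sees far fewer tokens, but each token carries more signal, which is the bet the Phi family makes.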

The Future of AI: Small, Smart, and Local

Bubeck believes these results indicate that future advancements in AI will involve more than just increasing model size.

Scaled-down models like Phi-3 are likely to play a significant role in the future of computing. Running AI models locally on devices like smartphones, laptops, or PCs can reduce latency and prevent outages that occur when queries are processed in the cloud.

Local processing also keeps data on the user’s device, paving the way for new AI applications deeply integrated into operating systems. Apple is expected to unveil its AI strategy at the upcoming WWDC conference. The company has previously highlighted that its custom hardware and software enable machine learning to occur locally on its devices.

Instead of competing with OpenAI and Google in building larger cloud-based AI models, Apple might focus on shrinking AI to fit seamlessly into its users’ daily lives.

Implications for Privacy and Usability

The ability to run sophisticated AI models locally on personal devices has significant implications for both privacy and usability.

When AI processes data directly on a user’s device, it minimizes the need to transmit potentially sensitive information over the internet, thereby enhancing privacy. This approach also reduces dependence on stable internet connections, making AI tools more reliable in areas with poor connectivity.

Moreover, locally run AI can enable more personalized and context-aware applications. For instance, an AI assistant integrated into a smartphone’s operating system could provide more relevant recommendations and insights based on the user’s habits and preferences without needing to send data to external servers.

This level of integration could lead to more intuitive and seamless user experiences.

Challenges and Future Directions

Despite the advantages, there are challenges to consider in the push towards more compact AI models. One significant hurdle is ensuring that these smaller models maintain the robustness and versatility of their larger counterparts.

Researchers must continue to develop innovative training techniques and data curation methods to maximize the efficiency and effectiveness of these models. Another challenge lies in the hardware limitations of personal devices.

While advancements in AI are impressive, the computational power and battery life of smartphones and laptops still pose constraints. However, ongoing improvements in hardware design and energy-efficient algorithms could mitigate these issues over time.