[Image: data flows entering a processor, illustrating AI embedded in the product]

June 21, 2024

Story

7 min read

How far are we from AI on the edge?

Generative AI on edge is revolutionising how we interact with technology. From basic functionalities to advanced applications, significant progress has been made, but more improvements are needed.


Imagine a future where your smartphone understands your voice commands instantly and also predicts your needs before you even articulate them. A future where your wearable health monitor not only tracks your vital signs in real-time but also alerts you to potential health issues before they become critical. This is the promise of AI on edge devices—bringing advanced artificial intelligence capabilities directly to the devices we use every day, without relying on constant connectivity to powerful cloud servers.

Recent developments from Apple have reignited interest in local AI. Apple recently announced the integration of personal AI, called Apple Intelligence, into some of its products, showcasing its commitment to advancing AI on edge. However, the solution is a hybrid one, combining on-device and server processing through a partnership with OpenAI. This approach highlights the current limitations of fully independent edge AI, even though edge AI is clearly a priority for Apple.

As technology advances, the concept of AI on edge is rapidly moving from the realm of science fiction to practical reality. But how far are we from fully realising this potential? In this article, we will explore the current state of AI on edge, the challenges we face in its implementation, the recent advancements driving us forward, and the future prospects that lie ahead. By giving a sense of the orders of magnitude involved, we aim to provide a clear picture of how close we are to a future where AI is seamlessly integrated into every device around us.

Understanding AI on edge

What is AI on edge?

AI on edge refers to the deployment of artificial intelligence algorithms and models on devices located at the edge of the network, close to the data source. Unlike traditional AI systems that rely heavily on cloud computing resources, edge AI processes data locally on devices such as smartphones, IoT gadgets, and autonomous vehicles. This localised processing allows for faster decision-making, improved privacy, cheaper inference and reduced dependence on constant internet connectivity.

Examples of edge AI devices

Edge devices span a wide range of applications and industries. Common examples include smartphones that use AI for voice assistants, camera enhancements, and personalised recommendations; smart home devices like smart speakers, thermostats, and security cameras that manage home automation tasks locally; autonomous vehicles that require real-time data processing for navigation, obstacle detection, and decision-making; and wearable health monitors that track metrics such as heart rate, sleep patterns, and physical activity, providing instant feedback and alerts.

Why is AI on edge important?

The significance of AI on edge lies in its ability to meet the growing demand for low-latency, privacy-preserving, and cost-effective AI applications. By processing data locally, edge AI reduces the time it takes to derive insights and take action, which is crucial for applications like autonomous driving and real-time health monitoring. Keeping data on-device also enhances privacy and security by minimising the transmission of sensitive information over the internet. Another reason is personalisation: AI models inherit biases from their training datasets, and people want models specialised on their own data rather than on someone else's behaviour, which edge AI makes possible. Most importantly, running large AI models in the cloud is prohibitively expensive. By leveraging edge computing with smaller models, these costs can be significantly reduced, making AI more accessible and sustainable for a wide range of applications.

Current state of AI on edge

Classic AI models can now be integrated into most edge devices, so let's focus on the over-hyped generative models on edge.

Evolution of generative AI models

In terms of generative AI models, we are seeing significant advancements but also notable limitations. The most capable large AI models, such as those used in natural language processing (e.g., GPT-4) or image recognition, typically have tens to hundreds of billions of parameters and require computational resources beyond the capabilities of current edge devices. However, there has been progress in developing smaller, more efficient models tailored for edge computing. Techniques like model quantization and pruning reduce the size and complexity of these models, allowing them to run on devices with limited resources at the cost of some precision. Additionally, there are specialised models designed specifically for edge use cases.
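To make these techniques a little more concrete, here is a minimal sketch of dynamic quantization and magnitude pruning applied to a toy PyTorch model. The tiny model and the 30% pruning ratio are illustrative assumptions, not settings used by any of the products mentioned in this article.

```python
# Minimal sketch: shrinking a model with pruning and dynamic quantization (PyTorch).
# The toy model and the 30% pruning ratio are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy stand-in for a much larger network.
model = nn.Sequential(
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 128),
)

# Pruning: zero out the 30% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the sparsity permanent

# Dynamic quantization: store Linear weights as int8 instead of float32,
# trading a little precision for a roughly 4x smaller footprint.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized)
```

The same idea scales up: the fewer bits and non-zero weights a model needs, the closer it gets to fitting within a phone's memory and power budget, at the cost of some accuracy.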

Apple's on-device model (around 3 billion parameters) is one of these specialised small language models, as is Phi-3 mini from Microsoft. However, these models do not yet have enough capability to power whole AI systems on their own; Apple partnering with OpenAI to redirect some requests to the cloud is an example of this.

Hardware capabilities

Edge devices such as smartphones are rapidly improving their capabilities, especially since the introduction of dedicated AI chips. Yet integrating generative models on smartphones remains very challenging, as the models must avoid consuming the full memory of the device.

Order of magnitude

To show how far we are from fully local generative AI on smartphones, we will estimate how many phones' worth of computing power would be needed to run generative AI models on the edge.

The objective is to run the models without freezing the smartphone, while preserving battery autonomy, using models with advanced capabilities, generating output quickly, and protecting model quality from overly drastic optimisations.

One should note that hardware speed and computing power depend strongly on the specific smartphone; in what follows we use an average smartphone as the reference.

Running the most capable open-source models, such as Llama 3 70B from Meta, in the background without disturbing normal smartphone usage would require the equivalent computing power of around 100 smartphones. For bigger models with better capabilities, the figure could go up to around 1,000 smartphones.
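To show where such a figure can come from, here is a hedged back-of-envelope calculation. The phone specifications (12 GB of RAM, roughly 2 TFLOP/s of sustained throughput) and the assumption that only about 10% of a phone's resources can be spent on a background model are illustrative choices, not measurements.

```python
# Back-of-envelope estimate of how many "average" smartphones it would take to
# run a 70B-parameter model in the background. Every phone figure below is an
# illustrative assumption, not a measurement.

PARAMS = 70e9                  # Llama 3 70B parameter count
BYTES_PER_PARAM = 2            # 16-bit weights
FLOPS_PER_TOKEN = 2 * PARAMS   # common rule of thumb for decoder inference
TARGET_TOKENS_PER_S = 10       # comfortable generation speed

PHONE_RAM_BYTES = 12e9         # assumed flagship phone RAM
PHONE_SUSTAINED_FLOPS = 2e12   # assumed sustained on-device throughput
BACKGROUND_FRACTION = 0.1      # assumed share usable without disturbing the user

memory_needed = PARAMS * BYTES_PER_PARAM
compute_needed = FLOPS_PER_TOKEN * TARGET_TOKENS_PER_S

phones_for_memory = memory_needed / (PHONE_RAM_BYTES * BACKGROUND_FRACTION)
phones_for_compute = compute_needed / (PHONE_SUSTAINED_FLOPS * BACKGROUND_FRACTION)

print(f"Weights alone: {memory_needed / 1e9:.0f} GB "
      f"-> ~{phones_for_memory:.0f} phones' worth of spare RAM")
print(f"Generation:    {compute_needed / 1e12:.1f} TFLOP/s "
      f"-> ~{phones_for_compute:.0f} phones' worth of spare compute")
```

Under these assumptions, memory for the weights is the binding constraint and lands around the hundred-phone mark; changing any single assumption shifts the result, but not its order of magnitude.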

This shows we are still far from running very capable generative AI models on average smartphones. Hardware can be improved, and model size and optimisation techniques can also be enhanced.

Challenges in bringing AI to edge

Hardware limitations

One of the primary challenges in bringing AI to edge devices is the inherent hardware limitations. Edge devices, like smartphones and IoT gadgets, are constrained by their size, power consumption and battery autonomy, and heat dissipation capabilities. Unlike large data centres that can use substantial power supplies and advanced cooling systems, edge devices must operate within much tighter constraints. This means that designing efficient, powerful AI hardware for edge devices requires significant innovation and optimisation. Disparities between hardware devices are another challenge to adoption.

Software and models

Developing AI models and software that can run efficiently on edge devices is another major challenge. Traditional AI models, such as large neural networks, can have billions of parameters and require substantial computational resources. To run these models on edge devices, they must be optimised for lower power consumption and reduced computational load. Techniques such as model quantization, pruning, and the development of lightweight AI models are crucial in this regard. For instance, models must be compressed from billions to millions of parameters while maintaining acceptable levels of accuracy and performance. These optimisations help ensure that AI can function effectively within the limited resources available on edge devices. To reach acceptable computing power requirements, architectures must improve drastically; optimisation alone will probably not be enough.
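As one illustration of what "optimised for the edge" can look like in practice, the sketch below traces a toy PyTorch model and packages it for the PyTorch Mobile (lite interpreter) runtime; the model, input shape, and file name are assumptions made for the example.

```python
# Minimal sketch: preparing a model for on-device deployment with PyTorch Mobile.
# The toy model and output file name are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile

# Stand-in for an already compressed, edge-sized model.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Trace to TorchScript so the model no longer needs the Python runtime,
# then apply mobile-oriented graph optimisations.
example_input = torch.randn(1, 256)
scripted = torch.jit.trace(model, example_input)
mobile_ready = optimize_for_mobile(scripted)

# Save in the lite-interpreter format consumed by PyTorch Mobile on phones.
mobile_ready._save_for_lite_interpreter("edge_model.ptl")
```

Tracing removes the dependency on the Python runtime, and the mobile optimisation pass applies device-oriented rewrites such as operator fusion and the removal of training-only layers.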

Data and connectivity

Data availability and connectivity present additional challenges for AI on edge devices. Edge devices often operate with limited or intermittent connectivity, which can restrict their ability to access and process large datasets. Ensuring that AI models can function effectively under these conditions requires innovative approaches to data management and processing. Techniques such as federated learning, which allows models to be trained across multiple devices without sharing raw data, can help address these issues by enabling edge devices to collaborate and learn from each other while maintaining data privacy and security.
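To make the idea concrete, here is a minimal sketch of federated averaging: each simulated device trains a copy of the shared model on its own private data, and only the resulting weights are sent back and averaged, never the raw data. The toy model, synthetic data, and single training round are assumptions for illustration.

```python
# Minimal federated-averaging sketch: devices train locally on private data,
# and only model weights (never raw data) are shared and averaged.
# The toy model and synthetic data are illustrative assumptions.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, targets, lr=0.01, epochs=1):
    """Train a copy of the global model on one device's private data."""
    model = copy.deepcopy(global_model)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        optimizer.step()
    return model.state_dict()  # only weights leave the device

def federated_average(state_dicts):
    """Average the weights reported by all devices."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
    return avg

global_model = nn.Linear(4, 1)

# Each device keeps its own data; only the updated weights are collected.
device_data = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]
updates = [local_update(global_model, x, y) for x, y in device_data]
global_model.load_state_dict(federated_average(updates))
```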

Conclusion

Generative AI on edge is a rapidly evolving field with immense potential to transform how we interact with technology. From the early days of basic AI functionalities to the current state of sophisticated applications on smartphones and IoT devices, we have made significant progress. However, we are still far from being able to run the most desirable models fully on the edge. Important improvements are still needed, and that promises exciting times!

Yann Bilien
