Qualcomm demonstrated further progress in running generative AI models on smartphones at the annual IEEE/CVF Conference on Computer Vision and Pattern Recognition.
In February, Stable Diffusion was able to generate an image in less than 15 seconds on a smartphone running Qualcomm’s latest Snapdragon 8 Gen 2 system-on-a-chip (see news below).
That time was a record when it was announced, but researchers beat it many times over shortly thereafter. The same fate may await the “world’s fastest” generation of an image with ControlNet on a smartphone, which Qualcomm has now presented: the company reports a start-to-finish time of 11.26 seconds using the 1.5-billion-parameter image-to-image model.
Generating AI images typically requires a fast computer. But with optimized hardware and software, a smartphone can do the job, as Qualcomm shows.
While OpenAI and Midjourney provide dedicated servers for their image generators – and charge their customers for using them – Stable Diffusion can also run on your own hardware.
To generate high-quality images in a reasonable amount of time, your PC needs a modern (and expensive) graphics card. Other devices with chips optimized for AI computation, such as Apple Silicon Macs or iPhones, can also do the job. Stable Diffusion clients have been available for these systems for some time.
Now, for the first time, Qualcomm is demonstrating Stable Diffusion image generation on an Android smartphone powered by one of its chips.
The Qualcomm Snapdragon 8 Gen 2 was introduced in late 2022 and is expected to power some high-priced Android smartphones from various manufacturers this year, such as the recently announced Samsung Galaxy S23.
According to Qualcomm, running Stable Diffusion on this chip was achieved through quantization, compilation and hardware acceleration.
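Qualcomm has not published the details of its port, but the first technique it names, quantization, is easy to illustrate. The toy sketch below (a generic NumPy illustration, not Qualcomm's implementation) maps float32 weights to INT8 and back, trading a small amount of precision for a 4x smaller memory footprint:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    scale = np.abs(w).max() / 127.0  # map the largest weight magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights for computation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller, and the round-trip error stays below one scale step
print(q.nbytes, w.nbytes)               # 16 64
print(np.abs(w - w_hat).max() < scale)  # True
```

In practice, mobile toolchains apply this idea per layer (or per channel) across billions of weights, which is what makes a multi-gigabyte model small and fast enough for a phone's AI accelerator.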
Image generation works offline
Supposed Stable Diffusion image generators are already floating around the Google Play Store. However, these are merely web front ends that rely on a server’s computing power and therefore require an Internet connection.
That is not necessary when Stable Diffusion runs directly on the smartphone – although the model does take up a few gigabytes of storage space.
An image with a resolution of 512 × 512 pixels and 20 inference steps takes less than 15 seconds on the Qualcomm chip, according to the press release. The app also offers features such as inpainting, image editing, style transfer and super-resolution. All of this works offline on the device.
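The 20 “inference steps” refer to the iterative denoising loop at the heart of diffusion models: generation starts from pure noise and is refined step by step. A heavily simplified, self-contained sketch of that loop (using a toy denoiser in place of Stable Diffusion’s actual U-Net, which predicts noise to subtract at each step) looks like this:

```python
import numpy as np

def toy_denoise_step(x, step, total_steps, target):
    """Stand-in for the denoising network: nudge the image toward a
    'clean' target. A real diffusion model instead predicts the noise
    to remove at each step, conditioned on the text prompt."""
    blend = 1.0 / (total_steps - step)  # later steps make smaller corrections
    return x + blend * (target - x)

rng = np.random.default_rng(42)
H = W = 512                            # the resolution from the demo
steps = 20                             # the step count from the demo

target = rng.uniform(size=(H, W, 3))   # pretend this is the 'clean' image
x = rng.normal(size=(H, W, 3))         # start from pure noise

for step in range(steps):
    x = toy_denoise_step(x, step, steps, target)

print(x.shape)                          # (512, 512, 3)
print(np.abs(x - target).max() < 1e-6)  # True
```

Each of the 20 steps requires a full forward pass through the model, which is why the step count trades image quality against generation time on constrained hardware.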
Stable Diffusion Android app is just a proof of concept for now
Whether Stable Diffusion will be widely available on Android smartphones remains to be seen. For one thing, Qualcomm’s video is only a proof of concept; a corresponding app does not yet exist and would probably require a new high-end smartphone.
For another, commercial alternatives like Midjourney and DALL-E 2 are still more convenient to use and deliver better results with less effort. Nevertheless, the technological progress of being able to perform these computationally intensive tasks on a smartphone is remarkable.
Qualcomm, the world’s second-largest maker of smartphone chips, has been exploring AI for several years. In 2018, the company announced a “neural processing unit” optimized for AI tasks in its latest smartphone chip. That same year, the company invested $100 million in AI startups.
In June 2022, Qualcomm paved the way for more diverse AI applications with its AI Stack, which combines multiple AI tools for development focused on mobile chips such as those found in smartphones, cars or headsets. The Stable Diffusion application demonstrated here was also implemented and optimized for smartphones using the AI Stack, according to Qualcomm.