Approx. 5 min read

Restoring Real Smartphone Image Detail, not Hallucinations, with Glass AI

Written by Vishal Vinod

Published on August 27, 2024

Today, smartphone cameras are not just ubiquitous tools for capturing everyday moments; they are gateways to a world of advanced photography powered by Artificial Intelligence (AI). Before AI, camera algorithms often struggled to consistently capture fine, clean details. Now, with AI-driven image enhancement and upsampling available directly on devices, anyone can enjoy photographs with crisp, clear resolution.

However, integrating AI into smartphone cameras often comes with a significant drawback: generative neural networks tend to hallucinate details. This has been a persistent issue, particularly with diffusion-based generative AI upsamplers and some Generative Adversarial Network (GAN) based approaches, which invent details that weren't present in the original image.

*Left: iPhone 15 Pro Max at 15x zoom; Right: Glass AI on same device/scene*

At Glass Imaging, we are reversing the trend towards hallucinations with a novel, revolutionary approach to AI smartphone photography, ensuring that images remain true to life without the artifacts associated with AI super resolution and upscaling. By focusing on extracting and restoring authentic detail and quality, our Glass AI software is setting a new standard in high definition mobile imaging.

Disadvantages of existing smartphone camera ISP pipelines, and their typical AI enhancements

Traditional smartphone camera software relies on an Image Signal Processing (ISP) pipeline (see simplified figure below), which combines tunable algorithms embedded in silicon with custom software blocks. A consequence of this approach is that large teams of image quality engineers are required to tune each of these blocks for each newly released phone, comparing the results across various settings to find the most visually pleasing results.

Simplified Image Signal Processing (ISP) pipeline with tunable blocks. The Glass AI pipeline eliminates all blocks marked in red.

Unfortunately, even with expert tuning at each stage of the pipeline, some quality can still be lost due to the interactions between these various algorithms. For example, an aggressive noise reduction block softens the image, and while this may be partially countered by a subsequent sharpening block, the combination leads to over-enhanced edges that may lack fine texture details.

Example of aggressive denoise and sharpening on smartphone
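
The interaction between denoising and sharpening can be reproduced on a one-dimensional toy signal. The sketch below is only an illustration under stated assumptions: the 3-tap binomial blur stands in for a noise reduction block, the unsharp mask stands in for a sharpening block, and all function names are hypothetical. It is not the algorithm of any particular phone's ISP.

```python
def blur(signal):
    """Stand-in denoiser: 3-tap binomial blur [0.25, 0.5, 0.25], edges replicated."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out


def sharpen(signal, amount=1.5):
    """Stand-in sharpener: unsharp mask, adds back amount * (signal - blurred)."""
    blurred = blur(signal)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


def make_scene(n=32, texture=0.05):
    """Step edge from 0.2 to 0.8 with a fine alternating texture on top."""
    return [
        (0.2 if i < n // 2 else 0.8) + (texture if i % 2 == 0 else -texture)
        for i in range(n)
    ]


def texture_amplitude(signal, start, stop):
    """Mean absolute difference between neighboring samples in a flat region."""
    diffs = [abs(signal[i + 1] - signal[i]) for i in range(start, stop)]
    return sum(diffs) / len(diffs)


scene = make_scene()
processed = sharpen(blur(scene))

print("texture before:", texture_amplitude(scene, 2, 10))
print("texture after: ", texture_amplitude(processed, 2, 10))
print("value range after:", min(processed), max(processed))
```

In this toy case the blur removes the fine texture entirely, so the sharpening step has nothing left to restore. All it can do is exaggerate the step edge, which now overshoots and undershoots the original value range.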

Modern AI-based image enhancement and upsampling can counter some of these issues by post-processing the captured image and finding a visually plausible interpretation of the degraded image, creating apparent detail to fill in what the traditional processing missed or distorted. However, these post-processed details often contain hallucinations that are not authentic to the original scene.

One way to counter these issues is to go back to the RAW information at the start of the imaging pipeline to devise an all-in-one approach that extracts as much detail as possible from the signal in the first place.

Resolving Real Detail: Making the Most of Available Information with Glass AI

Glass AI is an end-to-end trained AI camera ISP pipeline that subsumes many of the traditional image enhancement blocks (noise reduction, sharpening, multi-frame fusion) and extends them with new capabilities such as deconvolution for lens aberration and sensor crosstalk correction. Because several of these processes are combined in one learned algorithm, they can interact optimally to extract all of the available information from the RAW images on the sensor.

Glass AI ISP pipeline, many functions replaced with an end-to-end trainable neural network

One of the key strengths of Glass AI’s technology is its ability to resolve real detail, making the most of the available information from smartphone cameras. This approach is based on RAW burst image super-resolution and demosaicing, where a sequence of images is combined to enhance detail and clarity. In essence, real detail is spread, or “mixed up”, across the frames of the original image sequence, and Glass AI can recover it. Traditional ISP pipelines and even GenAI algorithms typically do not make use of all of the information in the RAW images.
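
To see how detail can be spread across a burst, consider a deliberately simplified one-dimensional sketch. It assumes noise-free point sampling and two frames whose sampling offsets are known exactly, one scene sample apart. Real burst pipelines have to estimate motion and cope with noise, and Glass AI is described as a learned network, so the plain interleaving here is only an illustration with hypothetical function names.

```python
def capture(scene, offset, step=2):
    """Point-sample every `step`-th scene value, starting at `offset`."""
    return scene[offset::step]


def fuse(frames):
    """Interleave frames whose sampling offsets are 0, 1, ..., len(frames) - 1."""
    fused = []
    for samples in zip(*frames):
        fused.extend(samples)
    return fused


def upsample_single(frame, factor=2):
    """Naive single-frame upsampling by repeating each sample."""
    return [value for value in frame for _ in range(factor)]


# Scene: bars that are three samples wide, finer than one frame can resolve.
scene = [1.0 if (i // 3) % 2 == 0 else 0.0 for i in range(24)]

frame_a = capture(scene, offset=0)  # first frame of the burst
frame_b = capture(scene, offset=1)  # second frame, shifted by one scene sample

single = upsample_single(frame_a)
fused = fuse([frame_a, frame_b])

errors_single = sum(1 for a, b in zip(single, scene) if a != b)
errors_fused = sum(1 for a, b in zip(fused, scene) if a != b)
print("errors from one frame:", errors_single)
print("errors from two fused frames:", errors_fused)
```

Neither frame alone contains the full bar pattern, but the two together do. No detail is invented: every value in the fused result was measured by one of the frames.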

In a controlled test on text clarity comparing the iPhone 15 Pro Max with and without Glass AI at 15x zoom, the Glass AI-enabled iPhone images display superior sharpness and readability. Text, which often appears smeared or distorted on other devices at high zoom levels, retains crisp edges and clear spacing in Glass AI's renderings. In another test between the two, we compared images of a busy amusement park in Santa Cruz, and the Glass AI-enabled iPhone 15 Pro Max consistently produced sharper, more detailed images, retaining textures and nuances that other cameras often lose.

And zooming in on some details:

The Promise of RAW

The technical imaging community has seen a recent surge in the use of diffusion models, which are designed to enhance image resolution by filling in missing details. However, these models often produce synthetic-looking images, lacking the nuanced textures of real scenes, or create meaningless character-like shapes in text.

Glass AI's algorithms take a different approach: they minimize reliance on artificial data generation and focus instead on enhancing the real data captured by the camera. This results in high-resolution images that faithfully represent the original scene, comparable to the quality of professional SLR cameras. Glass AI enhances smartphone cameras, enabling photographers to capture clear and crisp zoomed-in images even in low-light conditions. By making effective use of the RAW captures, the technology empowers photographers to capture the photographs they envision and bring their artistic vision to life with precision.

What sets Glass AI’s approach apart is its novel methodology for accurately characterizing lens aberrations and reliably learning the sensor's characteristics and noise behavior. These factors are crucial when training a network to demosaic and fuse multiple RAW images because they are interconnected and influence each other. For instance, failing to account for optical blur can hinder accurate demosaicing.
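
Demosaicing is sensitive to optics because a color sensor records only one color value per pixel, so the other two must be inferred from neighboring pixels. The sketch below assumes the common RGGB Bayer layout, which the post does not specify, and uses hypothetical function names. It shows only the sampling step, not a demosaicing algorithm.

```python
def bayer_channel(row, col):
    """Return which color an RGGB sensor records at (row, col)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def mosaic(rgb):
    """Keep one color sample per pixel from a full RGB image.

    `rgb` is a list of rows, each row a list of (r, g, b) tuples.
    """
    index = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[index[bayer_channel(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb)
    ]


def channel_fractions(height, width):
    """Fraction of pixels that record each color on an RGGB sensor."""
    counts = {"R": 0, "G": 0, "B": 0}
    for r in range(height):
        for c in range(width):
            counts[bayer_channel(r, c)] += 1
    total = height * width
    return {name: count / total for name, count in counts.items()}


# A flat 2x2 patch where every pixel is (R=10, G=20, B=30).
patch = [[(10, 20, 30)] * 2 for _ in range(2)]
print(mosaic(patch))
print(channel_fractions(4, 4))
```

Two of the three color values at every pixel are missing from the RAW data. Optical blur changes how neighboring samples relate to each other, which is why a network that fills in those values benefits from knowing the lens behavior.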

ISO-Conditioned Generation: Adapting to Light

Another standout feature of Glass AI's technology is ISO-conditioned generation. By adjusting the processing based on the ISO sensitivity, Glass AI ensures optimal image quality under varying lighting conditions, a feat that most smartphones struggle to match.
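
The post does not describe how the conditioning is implemented, so the following is only a sketch of why ISO is a useful input. It assumes a common Poisson-Gaussian approximation of sensor noise, in which shot-noise variance grows with the signal and the gain, and read-noise variance grows with the square of the gain. The constants and function names are hypothetical and do not come from any real sensor.

```python
import math

BASE_ISO = 100        # assumed reference ISO (gain = 1)
SHOT_COEFF = 0.0004   # hypothetical shot-noise coefficient at BASE_ISO
READ_STD = 0.002      # hypothetical read-noise standard deviation at BASE_ISO


def noise_std(signal, iso):
    """Approximate noise std for a normalized signal in [0, 1] at a given ISO."""
    gain = iso / BASE_ISO
    shot_variance = SHOT_COEFF * gain * signal
    read_variance = (READ_STD * gain) ** 2
    return math.sqrt(shot_variance + read_variance)


def conditioning_vector(iso):
    """Features a network could receive alongside the RAW frames (an assumption)."""
    gain = iso / BASE_ISO
    return [math.log2(gain), SHOT_COEFF * gain, (READ_STD * gain) ** 2]


for iso in (100, 800, 3200):
    print(iso, round(noise_std(0.5, iso), 4), conditioning_vector(iso))
```

Under this model the same pixel value is far less trustworthy at ISO 3200 than at ISO 100. A network that is told the ISO, or noise parameters derived from it, can adapt how strongly it smooths instead of applying one compromise setting to every capture.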

In low-light conditions, where high ISO settings are inevitable, Glass AI maintains a clear advantage. A comparison of images from smartphones such as the Xiaomi 13 Pro with and without Glass AI, as in the examples below, shows that Glass AI preserves more detail and shows fewer noise artifacts, making each photo appear more natural and less processed.

*Left: Xiaomi 13 Pro, 3.2x zoom at 3 lux; Right: Glass AI on same device/scene*

*Left: Xiaomi 13 Pro, 3.2x zoom at 5 lux; Right: Glass AI on same device/scene*

Not only Smartphones

No lens is perfect, and no image sensor is noise free. Our unique method of training neural networks to reverse optical aberrations and correct sensor noise therefore allows us to maximize detail preservation, resolution, and edge fidelity for any camera while suppressing noise and artifacts to form a sharp, clean image. Glass AI works with any device and is not limited to smartphone cameras.

Stay tuned for our upcoming blog post, where we will delve into how Glass AI is revolutionizing drone image quality.

Sneak preview showing a 50-megapixel image from a DJI Mini 4 Pro drone with and without Glass AI:

Left: DJI Mini 4 Pro at 50 Mpix; Right: Glass AI applied to 50 Mpix RAW
