
A non-local algorithm for image denoising

Published at the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, this paper introduces two main ideas:
  1. Method noise
  2. Non-local (NL) means algorithm to denoise images

Method noise

Method noise is defined as the difference between the original (possibly noisy) image and its denoised version. Some intuitions that can be drawn from analysing method noise are:

  1. Applied to a noise-free image, zero method noise is the ideal outcome: the filter had nothing to remove and altered nothing, so no image data was lost.
  2. If a denoising method performs well, its method noise should look like noise and contain as little structure from the original image as possible.

The authors then discuss the method noise of several denoising filters, derived from the properties of each filter. We will not go into detail for each filter, since those properties are well known; the paper simply explains them through the lens of method noise.
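
To make the definition concrete, here is a minimal sketch that computes method noise for a plain box (mean) filter. The box filter is my own stand-in rather than one of the filters analysed in the paper, and the names `box_filter` and `method_noise` are hypothetical helpers for this post.

```python
import numpy as np

def box_filter(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window, with reflected borders."""
    padded = np.pad(image, radius, mode="reflect")
    size = 2 * radius + 1
    out = np.zeros_like(image, dtype=float)
    # Sum the shifted copies of the image, then divide by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / size**2

def method_noise(image, denoise):
    """Method noise: the image minus its denoised version."""
    image = np.asarray(image, dtype=float)
    return image - denoise(image)

# A flat image is left untouched by the filter.
flat = np.full((8, 8), 100.0)

# A step edge gets blurred, so the edge shows up in the method noise.
edge = np.zeros((8, 8))
edge[:, 4:] = 100.0

print(np.abs(method_noise(flat, box_filter)).max())
print(np.abs(method_noise(edge, box_filter)).max())
```

The flat image gives zero method noise, while the step edge leaves a clear trace of itself in the difference image. That leaked structure is exactly what a good denoiser should avoid.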

NL-means idea

The denoised value at a point x of an image is the mean of the values of all points whose Gaussian neighborhood is similar to the neighborhood of x. This differs from local filtering and frequency-domain filtering because it uses everything the entire image has to offer, rather than looking only at neighboring pixels and noise characteristics.

NL-means algorithm

  1. Given a noisy image N, for each pixel i, calculate the weighted average of all the pixels in the image to obtain the denoised value for pixel i.
  2. The weight given to each pixel in the weighted average is directly proportional to the similarity with pixel i.
    1. All weights are between 0 and 1
    2. Sum of weights is equal to 1
  3. Similarity between two pixels i and j is measured based on the similarity between the gray level vectors of the square neighborhoods of the pixels.
    1. Similarity is measured as a decreasing function (Gaussian kernel) of the weighted Euclidean distance.
    2. Based on the similarity, the weights are assigned.
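
In formulas, the steps above read roughly as follows. The symbols are not defined in this post and follow the paper's notation as best I can reconstruct it: v is the noisy image, I is the set of all pixels, N_i is the square neighborhood centred on pixel i, a is the standard deviation of the Gaussian kernel used in the weighted Euclidean distance, h is a filtering parameter that controls how quickly the weights decay, and Z(i) is the normalising constant.

```latex
\begin{align*}
NL[v](i) &= \sum_{j \in I} w(i,j)\, v(j), \\
w(i,j) &= \frac{1}{Z(i)} \exp\!\left(-\frac{\lVert v(\mathcal{N}_i) - v(\mathcal{N}_j) \rVert_{2,a}^{2}}{h^{2}}\right), \\
Z(i) &= \sum_{j \in I} \exp\!\left(-\frac{\lVert v(\mathcal{N}_i) - v(\mathcal{N}_j) \rVert_{2,a}^{2}}{h^{2}}\right).
\end{align*}
```

Dividing by Z(i) is what guarantees that every weight lies between 0 and 1 and that the weights for a given pixel i sum to 1.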

For pixel p, the neighborhoods of points q1 and q2 are clearly similar to its own, so w(p,q1) and w(p,q2) are large. In contrast, q3 has a very different neighborhood, so w(p,q3) is small.

The figure above shows the weight distribution of other pixels with respect to the central pixel. White corresponds to weights close to 1 and black to weights close to 0.
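
The algorithm can be sketched in a few lines of NumPy. This is a naive illustration rather than the authors' implementation: it compares every pixel with every other pixel, so it is only practical for tiny images. The function name `nl_means`, its parameter names, the reflected border handling and the test image are all assumptions of mine.

```python
import numpy as np

def nl_means(noisy, patch_radius=1, h=10.0, a=1.0):
    """Naive NL-means over the whole image (O(N^2) in the pixel count)."""
    noisy = np.asarray(noisy, dtype=float)
    rows, cols = noisy.shape
    r = patch_radius
    size = 2 * r + 1

    # Gaussian kernel (std a) that weights the centre of a patch more heavily.
    ax = np.arange(-r, r + 1)
    kernel = np.exp(-(ax[:, None]**2 + ax[None, :]**2) / (2 * a**2))
    kernel /= kernel.sum()

    # One flattened square neighborhood per pixel, with reflected borders.
    padded = np.pad(noisy, r, mode="reflect")
    patches = np.stack([
        padded[y:y + size, x:x + size].ravel()
        for y in range(rows) for x in range(cols)
    ])
    k = kernel.ravel()
    values = noisy.ravel()

    out = np.empty(rows * cols)
    for i in range(rows * cols):
        # Gaussian-weighted squared Euclidean distance to every patch.
        d2 = ((patches - patches[i])**2 * k).sum(axis=1)
        w = np.exp(-d2 / h**2)   # decreasing function of the distance
        w /= w.sum()             # weights in [0, 1] that sum to 1
        out[i] = w @ values      # weighted average of all pixels
    return out.reshape(rows, cols)

# Hypothetical test image: a step edge plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 100.0
noisy = clean + rng.normal(0.0, 10.0, clean.shape)
denoised = nl_means(noisy, patch_radius=1, h=10.0)

print(np.abs(noisy - clean).mean(), np.abs(denoised - clean).mean())
```

Practical implementations restrict the comparison to a search window around each pixel to keep the cost manageable. The choice of h matters: too small and each pixel mostly averages with itself, too large and dissimilar pixels start to blur the result.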

Why does averaging work?

Averaging similar pixels gathered from all over the image reduces the noise. Image averaging relies on the assumption that the noise is random, with zero mean and independent from pixel to pixel. Under that assumption, the random fluctuations above and below the true image value cancel out as more pixels are averaged.
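
A quick numerical check of this intuition, assuming independent zero-mean Gaussian noise: averaging n noisy observations of the same value should shrink the noise standard deviation by a factor of about the square root of n. The helper `spread_of_mean` and the numbers used here are illustrative choices of mine.

```python
import numpy as np

def spread_of_mean(n, true_value=50.0, sigma=10.0, trials=10000, seed=0):
    """Empirical std of the mean of n noisy observations of one value."""
    rng = np.random.default_rng(seed)
    samples = true_value + rng.normal(0.0, sigma, size=(trials, n))
    return samples.mean(axis=1).std()

for n in (1, 4, 16, 64):
    # Compare the measured spread with the theoretical sigma / sqrt(n).
    print(n, round(spread_of_mean(n), 2), round(10.0 / np.sqrt(n), 2))
```

The measured spread should track the theoretical value closely. This is why NL-means benefits from finding many similar pixels: the more good matches it averages, the less noise survives.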

The paper further discusses the consistency of the NL-means algorithm and presents experimental results. I encourage you to go through the paper and take a look at the mathematical derivations and the experiments that follow them. (All pictures in this post were borrowed from the paper.)
