Patent 2949105 Summary

(12) Patent: (11) CA 2949105
(54) English Title: METHODS AND SYSTEMS FOR SUPPRESSING ATMOSPHERIC TURBULENCE IN IMAGES
(54) French Title: PROCEDES ET SYSTEMES DE SUPPRESSION DE TURBULENCES ATMOSPHERIQUES DANS DES IMAGES
Status: Granted and Issued
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06T 07/20 (2017.01)
  • H04N 19/176 (2014.01)
  • H04N 19/86 (2014.01)
(72) Inventors :
  • FOI, ALESSANDRO (Finland)
  • KATKOVNIK, VLADIMIR (Finland)
  • MOLCHANOV, PAVLO (Finland)
  • SANCHEZ-MONGE, ENRIQUE (Finland)
(73) Owners :
  • FLIR SYSTEMS, INC.
  • NOISELESS IMAGING OY LTD.
(71) Applicants :
  • FLIR SYSTEMS, INC. (United States of America)
  • NOISELESS IMAGING OY LTD. (Finland)
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate agent:
(45) Issued: 2020-03-24
(86) PCT Filing Date: 2015-05-22
(87) Open to Public Inspection: 2015-11-26
Examination requested: 2019-05-30
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2015/032302
(87) International Publication Number: WO 2015/179841
(85) National Entry: 2016-11-14

(30) Application Priority Data:
Application No. Country/Territory Date
14/720,086 (United States of America) 2015-05-22
62/002,731 (United States of America) 2014-05-23

Abstracts

English Abstract

Various techniques are disclosed to suppress distortion in images, such as distortion caused by atmospheric turbulence. For example, similar image blocks from a sequence of images may be identified and tracked along motion trajectories to construct spatiotemporal volumes. The motion trajectories are smoothed to estimate the true positions of the image blocks without random displacements due to the distortion, and the smoothed trajectories are used to aggregate the image blocks in their new estimated positions to reconstruct the sequence of images with the random displacements suppressed. Blurring that may remain within each image block of the spatiotemporal volumes may be suppressed by modifying the spatiotemporal volumes in a collaborative fashion. For example, a decorrelating transform may be applied to the spatiotemporal volumes to suppress the blurring in a transform domain, such as by alpha-rooting or other suitable operations on the coefficients of the spectral volumes.


French Abstract

La présente invention concerne diverses techniques pour supprimer des distorsions dans des images, telles que des distorsions causées par des turbulences atmosphériques. Par exemple, des blocs d'image similaires d'une séquence d'images peuvent être identifiés et suivis le long de trajectoires de mouvement pour construire des volumes spatiotemporels. Les trajectoires de mouvement sont lissées pour estimer les positions véritables des blocs d'image sans déplacements aléatoires dus aux distorsions, et les trajectoires lissées sont utilisées pour agréger les blocs d'image dans leurs nouvelles positions estimées afin de reconstruire la séquence d'images avec les déplacements aléatoires supprimés. Le flou qui peut rester à l'intérieur de chaque bloc d'image des volumes spatiotemporels peut être supprimé par modification des volumes spatiotemporels d'une manière collaborative. Par exemple, une transformée de décorrélation peut être appliquée aux volumes spatiotemporels afin de supprimer le flou dans un domaine de transformée, tel que par une opération sur des racines alpha ou d'autres opérations appropriées sur les coefficients des volumes spectraux.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS:
1. A method comprising:
receiving a plurality of video image frames;
extracting a plurality of image blocks from same or different spatial positions on the different received video image frames along motion trajectories, wherein the motion trajectories include random displacements due to distortion in the received video image frames;
constructing a plurality of spatiotemporal volumes by grouping the extracted image blocks according to the respective motion trajectories;
smoothing the motion trajectories to suppress the random displacements; and
aggregating the image blocks according to the smoothed trajectories to generate a plurality of processed video image frames, wherein at least some of the distortion is suppressed in the processed video image frames.
2. The method of claim 1, wherein the distortion is due to atmospheric turbulence.
3. The method of claim 1, further comprising:
modifying the spatiotemporal volumes to suppress blurring due to the distortion, wherein the aggregating of the image blocks comprises aggregating image blocks from the modified spatiotemporal volumes.
4. The method of claim 3, wherein the modifying of the spatiotemporal volumes further comprises:
applying a decorrelating transform to the spatiotemporal volumes to generate corresponding three-dimensional (3-D) spectra, wherein each 3-D spectrum comprises a plurality of spectral coefficients for a transform domain representation of a corresponding one of the spatiotemporal volumes;
modifying at least some of the spectral coefficients in each of the 3-D spectra to suppress the blurring due to the distortion; and
applying, to the 3-D spectra, an inverse transform of the decorrelating transform to generate the modified spatiotemporal volumes.

5. The method of claim 4, wherein the modifying of the at least some of the spectral coefficients comprises attenuating temporal-AC coefficients and amplifying temporal-DC coefficients of the 3-D spectra.
6. The method of claim 5, wherein the attenuating of the temporal-AC coefficients and the amplifying of the temporal-DC coefficients are by alpha-rooting the temporal-AC and temporal-DC coefficients.
7. The method of claim 1, wherein the smoothing of the motion trajectories is adaptive to a complexity of the motion trajectories and/or adaptive to a magnitude of the random displacements.
8. The method of claim 1, wherein the smoothing of the motion trajectories comprises determining, by regression, approximate positions of the image blocks without the random displacements.
9. The method of claim 1, wherein the extracting of the plurality of image blocks comprises identifying and tracking similar image blocks from the received video image frames.
10. The method of claim 1, wherein each of the image blocks is a fixed-size patch extracted from a corresponding one of the video image frames.
11. A system comprising:
a video interface configured to receive a plurality of video image frames;
a processor in communication with the video interface and configured to:
extract a plurality of image blocks from same or different spatial positions on the different received video image frames along motion trajectories, wherein the motion trajectories include random displacements due to distortion in the received video image frames,
construct a plurality of spatiotemporal volumes by grouping the extracted image blocks according to the respective motion trajectories,
smooth the motion trajectories to suppress the random displacements, and
aggregate the image blocks according to the smoothed trajectories to generate a plurality of processed video image frames, wherein at least some of the distortion is suppressed in the processed video image frames; and
a memory in communication with the processor and configured to store the processed video image frames.
12. The system of claim 11, wherein the distortion is due to atmospheric turbulence.
13. The system of claim 11, wherein the processor is further configured to:
modify the spatiotemporal volumes to suppress blurring due to the distortion; and
aggregate the image blocks from the modified spatiotemporal volumes.
14. The system of claim 13, wherein the processor is configured to modify the spatiotemporal volumes by:
applying a decorrelating transform to the spatiotemporal volumes to generate corresponding three-dimensional (3-D) spectra, wherein each 3-D spectrum comprises a plurality of spectral coefficients for a transform domain representation of a corresponding one of the spatiotemporal volumes;
modifying at least some of the spectral coefficients in each of the 3-D spectra to suppress the blurring due to the distortion; and
applying, to the 3-D spectra, an inverse transform of the decorrelating transform to generate the modified spatiotemporal volumes.
15. The system of claim 14, wherein the modifying of the at least some of the spectral coefficients comprises attenuating temporal-AC coefficients and amplifying temporal-DC coefficients of the 3-D spectra.
16. The system of claim 15, wherein the attenuating of the temporal-AC coefficients and the amplifying of the temporal-DC coefficients are by alpha-rooting the temporal-AC and temporal-DC coefficients.
17. The system of claim 11, wherein the processor is configured to smooth the motion trajectories by determining, using regression, approximate positions of the image blocks without the random displacements.
18. The system of claim 11, wherein the processor is configured to extract the plurality of image blocks by identifying and tracking similar image blocks from the received video image frames.
19. The system of claim 18, wherein the identifying and tracking of the similar image blocks are based on a multiscale motion estimation.
20. The system of claim 11, further comprising an infrared camera configured to capture thermal images of the scene, wherein the received video image frames comprise the captured thermal images.

Description

Note: Descriptions are shown in the official language in which they were submitted.


METHODS AND SYSTEMS FOR SUPPRESSING ATMOSPHERIC TURBULENCE IN IMAGES
Alessandro Foi, Vladimir Katkovnik, Pavlo Molchanov, and Enrique Sanchez-Monge
TECHNICAL FIELD
One or more embodiments of the invention relate generally to digital image processing and more particularly, for example, to noise and distortion suppression in images.
BACKGROUND
Noise is one of the main causes of degradation in images (e.g., video and still images) captured by image sensors. Conventional noise filtering techniques typically apply various averaging or smoothing operations to suppress noise, under the assumption that noise is random and unstructured such that it can be canceled out by averaging or smoothing.
However, the assumption of unstructured randomness of noise is not accurate. In fact, noise may include both a fixed pattern noise (FPN) component (e.g., due to column noise in readout circuitry, irregular pixel sizes, and/or other irregularities) and a random noise component. The FPN component may appear as a noisy pattern that is essentially constant through time, and as such it is not attenuated by averaging, but often becomes even more visible after conventional noise filtering. The FPN becomes more problematic for low-cost sensors, sensors with extremely narrow pixel pitch, or sensors operating in implementations with very low signal-to-noise ratios (SNRs) (e.g., in low-light imaging, thermal imaging, range imaging, or other imaging applications with low SNRs). Furthermore, for most imagers, both the FPN and random noise components are typically structured (e.g., colored noise), with different correlations present in the FPN and random noise components. Thus, conventional filtering techniques often produce images with prominent structured artifacts.
In addition to random and fixed pattern noise, images may contain distortion and degradation caused by atmospheric turbulence as light travels through the air from the source to an image sensor, which may be particularly noticeable in outdoor and/or long-distance images. For example, variations in the refractive index in turbulent air may cause image blur randomly varying in space and time, large-magnitude shifts (also referred to as "dancing") of image patches also randomly varying in space and time, and random geometrical distortion (also referred to as "random warping") of images.
Conventional techniques such as bispectrum imaging, lucky imaging, and temporal averaging have been developed to address at least some distortion caused by atmospheric turbulence. However, such conventional techniques require static scenes (e.g., a plurality of short-exposure frames of the same static scene) to work. While such conventional techniques may be adapted to work on scenes with motion or moving objects by applying the techniques on a sliding temporal window basis, this produces various undesirable results. For example, applying such conventional techniques to scenes with motion or moving objects typically leads to motion blur as well as ghosting effects. Further in this regard, high temporal frequency content is lost from scenes with motion when such conventional techniques are applied, in addition to losing high spatial frequency content.
SUMMARY
Various techniques are disclosed to suppress distortion in images (e.g., video or still images), such as distortion caused by atmospheric turbulence. For example, in various embodiments, similar image blocks from a sequence of images (e.g., a sequence of video frames) may be identified and tracked along motion trajectories to construct spatiotemporal volumes. The motion trajectories may contain random shifts/displacements (or other spatial low/mid frequency components of the distortion) that are caused, for example, by atmospheric turbulence, whereas the contents of the image blocks in the spatiotemporal volumes may be affected by blurring (or other higher spatial frequency components of the distortion) that remains within each image block. In various embodiments, the motion trajectories are smoothed to estimate the true positions of the image blocks without the random displacements, and the smoothed trajectories are used to aggregate the image blocks in their new estimated positions to reconstruct the sequence of images with the random displacements/shifts suppressed. In various embodiments, the blurring effect that may remain within each image block of the spatiotemporal volumes may be suppressed by modifying the spatiotemporal volumes in a collaborative fashion. For example, a decorrelating transform may be applied to the spatiotemporal volumes to suppress the blurring effects in a transform domain, such as by alpha-rooting or other suitable operations on the coefficients of the spectral volumes corresponding to the spatiotemporal volumes.
In one embodiment, a method includes: receiving a plurality of video image frames; extracting a plurality of image blocks from the received video image frames along motion trajectories, wherein the motion trajectories include random displacements due to distortion in the received video image frames; smoothing the motion trajectories to suppress the random displacements; and aggregating the image blocks according to the smoothed trajectories to generate a plurality of processed video image frames, wherein at least some of the distortion is suppressed in the processed video image frames.
In another embodiment, a system includes a video interface configured to receive a plurality of video image frames; a processor in communication with the video interface and configured to extract a plurality of image blocks from the received video image frames along motion trajectories, wherein the motion trajectories include random displacements due to distortion in the received video image frames, smooth the motion trajectories to suppress the random displacements, and aggregate the image blocks according to the smoothed trajectories to generate a plurality of processed video image frames, wherein at least some of the distortion is suppressed in the processed video image frames; and a memory in communication with the processor and configured to store the processed video image frames.
The distortion in the received video image frames may, for example, be due to atmospheric turbulence. In some embodiments, the method may further include, and the system may be further configured to perform, operations to construct spatiotemporal volumes using the extracted image blocks and to modify the spatiotemporal volumes to suppress blurring due to the distortion. For example, the blurring may be suppressed in a transform domain by applying a decorrelating transform to the spatiotemporal volumes to generate corresponding three-dimensional (3-D) spectra and modifying at least some of the spectral coefficients in each of the 3-D spectra, such as by alpha-rooting the spectral coefficients.
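As a rough illustration of alpha-rooting in a transform domain (a sketch only, not the specific claimed operation, which treats temporal-DC and temporal-AC coefficients separately), the following Python snippet applies a simple alpha-rooting rule to the 3-D DCT spectrum of a spatiotemporal volume. The SciPy transform, the alpha value, and the DC-relative normalization are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def alpha_root_volume(volume, alpha=0.8):
    """Illustrative alpha-rooting of a spatiotemporal volume's 3-D spectrum.

    volume: (T, H, W) stack of co-registered image blocks.
    Each coefficient c is rescaled as c * (|c| / |c_dc|)**(alpha - 1),
    which amplifies detail coefficients that are weak relative to the DC
    term for 0 < alpha < 1, while the DC coefficient itself is preserved.
    """
    spectrum = dctn(volume, norm="ortho")
    dc = np.abs(spectrum.flat[0]) + np.finfo(float).eps
    scale = (np.abs(spectrum) / dc + np.finfo(float).eps) ** (alpha - 1.0)
    scale.flat[0] = 1.0  # keep the DC coefficient unchanged
    return idctn(spectrum * scale, norm="ortho")
```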
A more complete understanding of embodiments of the invention will be afforded to those skilled in the art, as well as a realization of additional advantages thereof, by a consideration of the following detailed description of one or more embodiments. Reference will be made to the appended sheets of drawings that will first be described briefly.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates a block diagram of a video processing system in accordance with an embodiment of the disclosure.
Figs. 2A-2B illustrate examples of random noise in video images in accordance with an embodiment of the disclosure.
Fig. 2C illustrates an example of fixed pattern noise (FPN) in video images in accordance with an embodiment of the disclosure.
Figs. 3A and 3B illustrate graphs representing examples of power spectral densities of random noise and FPN components, respectively, in accordance with an embodiment of the disclosure.
Fig. 4 illustrates a flowchart of a process to suppress noise in video images in accordance with an embodiment of the disclosure.
Fig. 5 illustrates a flowchart of a process to construct and filter spatiotemporal volumes to suppress noise in video images in accordance with an embodiment of the disclosure.
Fig. 6 illustrates an example of a motion trajectory along which image blocks may be extracted to construct a spatiotemporal volume in accordance with an embodiment of the disclosure.
Fig. 7 illustrates a visual representation of filtering on a three-dimensional (3-D) spectrum of a spatiotemporal volume in accordance with an embodiment of the disclosure.
Fig. 8 illustrates various two-dimensional (2-D) transform representations of examples of power spectral densities of random noise and FPN components in accordance with an embodiment of the disclosure.
Fig. 9 illustrates an example of an input video image frame captured by an infrared imaging sensor in accordance with an embodiment of the disclosure.
Fig. 10A illustrates an example of a resulting video image frame filtered using a conventional technique.
Fig. 10B illustrates an example of a resulting video image frame filtered and enhanced using a conventional technique.
Fig. 11A illustrates an example of a resulting video image frame filtered in accordance with an embodiment of the disclosure.
Fig. 11B illustrates an example of a resulting video image frame filtered and enhanced in accordance with an embodiment of the disclosure.
Fig. 12 illustrates how light from a scene may be distorted due to turbulent air before reaching an imaging sensor in accordance with an embodiment of the disclosure.
Fig. 13 illustrates a flowchart of a process to suppress distortion in images in accordance with an embodiment of the disclosure.
Fig. 14 illustrates a graph of image block motion trajectories in accordance with an embodiment of the disclosure.
Fig. 15A illustrates an example of an input video image frame captured by an infrared imaging sensor in accordance with an embodiment of the disclosure.
Fig. 15B illustrates an example of a resulting video image frame obtained by suppressing distortion caused by atmospheric turbulence in the example image frame of Fig. 15A in accordance with an embodiment of the disclosure.
Embodiments of the invention and their advantages are best understood by referring to the detailed description that follows. It should be appreciated that like reference numerals are used to identify like elements illustrated in one or more of the figures.
DETAILED DESCRIPTION
Various embodiments of methods and systems disclosed herein may be used to model random noise and FPN to suppress both types of noise in images (e.g., video or still images). More specifically, in one or more embodiments, methods and systems may permit effective suppression of noise even in images that have a prominent FPN component, by modeling noise more accurately to comprise both random noise and FPN components, estimating one or more noise parameters, filtering images based on motion-adaptive parameters, and/or performing other operations described herein.
In one aspect of methods and systems disclosed herein, filtering may be performed on spatiotemporal volumes, any one of which may be constructed by grouping image blocks (e.g., a fixed-size portion or patch of a video image frame) extracted from a sequence of video image frames along a motion trajectory. Because different image blocks in such a spatiotemporal volume may belong to different spatial positions on a video image, FPN may be revealed as random noise in the volume, and thus may be modeled and filtered as such. If there is little or no motion, different image blocks may be aligned (e.g., belong to the same spatial positions on video image frames) and thus FPN may be preserved as such in the spatiotemporal volumes. In this regard, various embodiments of the disclosure may effectively suppress FPN by adaptively filtering the spatiotemporal volumes based not only on various noise parameters, but also on the relative motion captured in the spatiotemporal volumes, as further described herein.
In another aspect of methods and systems disclosed herein, one or more noise parameters associated with both FPN and random noise may be estimated using video images to be processed and/or other video images that may be used for purposes of estimating noise parameters, according to various embodiments of the disclosure. Similarly, in some embodiments, one or more parameters for processing video images to suppress distortion/degradation (e.g., distortion/degradation caused by atmospheric turbulence) may be estimated using video images to be processed and/or other video images (e.g., reference video images) that may be used for obtaining estimates.
Thus, in various embodiments, filtering operations may be adaptively performed on the spatiotemporal volumes based on the estimated noise parameters and the motion captured in the volumes (e.g., relative spatial alignment of image blocks from frame to frame). In some embodiments, such filtering operations may be efficiently performed by applying a three-dimensional (3-D) transform (e.g., a discrete cosine transform (DCT), discrete sine transform (DST), discrete wavelet transform (DWT), or other orthogonal transforms) to the spatiotemporal volumes to obtain 3-D spectra, modifying (e.g., adjusting, adaptively shrinking) coefficients of the 3-D spectra, and applying an inverse transform to obtain filtered spatiotemporal volumes. Image blocks from the filtered spatiotemporal volumes may be aggregated (e.g., combined or averaged using adaptive or non-adaptive weights) to construct filtered video image frames. Video image frames in some embodiments may be a set of discrete still images, which can be utilized to provide digital still images (e.g., as digital photographs captured by a digital camera).
Therefore, for example, various embodiments of methods and systems disclosed herein may be included in or implemented as various devices and systems such as infrared imaging devices, mobile digital cameras, video surveillance systems, video processing systems, or other systems or devices that may need to obtain acceptable quality video images from video images impaired by noise (e.g., captured by infrared image sensors or other sensors operating at a low signal-to-noise ratio regime). Furthermore, various techniques disclosed herein are not limited to providing noise suppression, but may further beneficially improve performance of various other video processing operations such as enhancement, restoration, deblurring, equalization, sharpening, super-resolution, and other operations that can be impaired by noise, as well as performance of high-level analytics such as object detection, object identification, target tracking, segmentation, scene tracking, and other analytics operations.
Fig. 1 shows a block diagram of a system 100 (e.g., an infrared camera) for capturing and/or processing video images in accordance with an embodiment of the disclosure. The system 100 comprises, in one implementation, a processing component 110, a memory component 120, an image capture component 130, a video interface component 134, a control component 140, a display component 150, a sensing component 160, and/or a network interface 180.
System 100 may represent an imaging device, such as a video camera, to capture and/or process images, such as video images of a scene 170. In one embodiment, system 100 may be implemented as an infrared camera configured to detect infrared radiation and provide representative data and information (e.g., infrared image data of a scene). For example, system 100 may represent an infrared camera that is directed to the near, middle, and/or far infrared spectrums. In some embodiments, image data captured and/or processed by system 100 may comprise non-uniform data (e.g., real image data that is not from a shutter or black body) of the scene 170, for processing, as set forth herein. System 100 may comprise a portable device and may be incorporated, for example, into a vehicle (e.g., an automobile or other type of land-based vehicle, an aircraft, or a spacecraft) or a non-mobile installation requiring infrared images to be stored and/or displayed.
In various embodiments, processing component 110 comprises a processor, such as one or more of a microprocessor, a single-core processor, a multi-core processor, a microcontroller, a logic device (e.g., a programmable logic device (PLD) configured to perform processing functions), a digital signal processing (DSP) device, etc. Processing component 110 may be adapted to interface and communicate with various other components of system 100 to perform method and processing steps and/or operations, as described herein. Processing component 110 may include a noise filtering module 112 configured to implement a noise suppression and/or removal operation such as discussed in reference to Figs. 2A-11B. In one aspect, processing component 110 may be configured to perform various other image processing algorithms including scaling and/or converting image data, either as part of or separate from the noise filtering operation.
It should be appreciated that noise filtering module 112 may be integrated in software and/or hardware as part of processing component 110, with code (e.g., software or configuration data) for noise filtering module 112 stored, for example, in memory component 120. Embodiments of the noise filtering operation as disclosed herein may be stored by a separate machine-readable medium 121 (e.g., a memory, such as a hard drive, a compact disk, a digital video disk, or a flash memory) to be executed by a computer (e.g., a logic or processor-based system) to perform various methods and operations disclosed herein. In one aspect, machine-readable medium 121 may be portable and/or located separate from system 100, with the stored noise filtering operation provided to system 100 by coupling the machine-readable medium to system 100 and/or by system 100 downloading (e.g., via a wired link and/or a wireless link) the noise filtering operation from machine-readable medium 121.
Memory component 120 comprises, in one embodiment, one or more memory devices configured to store data and information, including video image data and information. Memory component 120 may comprise one or more various types of memory devices including volatile and non-volatile memory devices, such as RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically-Erasable Read-Only Memory), flash memory, hard disk drive, and/or other types of memory. Processing component 110 may be configured to execute software stored in memory component 120 so as to perform method and process steps and/or operations described herein. Processing component 110 may be configured to store in memory component 120 video image data captured by image capture component 130 and/or received via video interface component 134. Processing component 110 may be configured to store processed (e.g., filtered) video image data in memory component 120.
Image capture component 130 may comprise, in various embodiments, one or more image sensors for capturing image data (e.g., still image data and/or video data) representative of an image, such as scene 170. In one embodiment, image capture component 130 may comprise one or more infrared sensors (e.g., any type of multi-pixel infrared detector, such as a focal plane array) for capturing thermal image data (e.g., thermal still image data and/or thermal video data) representative of an image, such as scene 170. In one embodiment, the infrared sensors of image capture component 130 may provide for representing (e.g., converting) the captured image data as digital data (e.g., via an analog-to-digital converter included as part of the infrared sensor or separate from the infrared sensor as part of the system 100). In another embodiment, digital conversion and/or other interfacing may be provided at video interface component 134.
In one aspect, video and/or still image data (e.g., thermal video data) may comprise non-uniform data (e.g., real image data) of an image, such as scene 170. Video and/or still image data may also comprise, in some embodiments, uniform data (e.g., image data of a shutter or a reference black body) that may be utilized, for example, as calibration video and/or calibration image data. Processing component 110 may be configured to process the captured image data (e.g., to provide processed image data), store the image data in the memory component 120, and/or retrieve stored image data from memory component 120. For example, processing component 110 may be adapted to process thermal image data stored in memory component 120 to provide processed (e.g., filtered) image data and information.
Video interface component 134 may include, in some embodiments, appropriate input ports, connectors, switches, and/or circuitry configured to interface with external devices (e.g., remote device 182 and/or other devices) to receive video data (e.g., video data 132) generated by or otherwise stored at the external devices. The received video data may be provided to processing component 110. In this regard, the received video data may be converted into signals or data suitable for processing by processing component 110. For example, in one embodiment, video interface component 134 may be configured to receive analog video data and convert it into suitable digital data to be provided to processing component 110. In one aspect of this embodiment, video interface component 134 may comprise various standard video ports, which may be connected to a video player, a video camera, or other devices capable of generating standard video signals, and may convert the received video signals into digital video/image data suitable for processing by processing component 110. In some embodiments, video interface component 134 may also be configured to interface with and receive image data from image capture component 130. In other embodiments, image capture component 130 may interface directly with processing component 110.
Control component 140 comprises, in one embodiment, a user input and/or interface device, such as a rotatable knob (e.g., potentiometer), push buttons, slide bar, keyboard, and/or other devices, that is adapted to generate a user input control signal. Processing component 110 may be adapted to sense control input signals from a user via control component 140 and respond to any sensed control input signals received therefrom. Processing component 110 may be adapted to interpret such a control input signal as a value, as generally understood by one skilled in the art. In one embodiment, control component 140 may comprise a control unit (e.g., a wired or wireless handheld control unit) having push buttons adapted to interface with a user and receive user input control values. In one implementation, the push buttons of the control unit may be used to control various functions of system 100, such as autofocus, menu enable and selection, field of view, brightness, contrast, noise filtering, image enhancement, and/or various other features.
Display component 150 comprises, in one embodiment, an image display device (e.g., a liquid crystal display (LCD)) or various other types of generally known video displays or monitors. Processing component 110 may be adapted to display image data and information on display component 150. Processing component 110 may be adapted to retrieve image data and information from memory component 120 and display any retrieved image data and information on display component 150. Display component 150 may comprise display circuitry, which may be utilized by the processing component 110 to display image data and information (e.g., filtered thermal images). Display component 150 may be adapted to receive image data and information directly from image capture component 130 via processing component 110 and/or video interface component 134, or the image data and information may be transferred from memory component 120 via processing component 110.
Sensing component 160 comprises, in one embodiment, one or more sensors of various types, depending on the application or implementation requirements, as would be understood by one skilled in the art. Sensors of sensing component 160 provide data and/or information to at least processing component 110. In one aspect, processing component 110 may be adapted to communicate with sensing component 160 (e.g., by receiving sensor information from sensing component 160) and with image capture component 130 (e.g., by receiving data and information from the image capture component 130 and providing and/or receiving command, control, and/or other information to and/or from one or more other components of the system 100).
In various implementations, sensing component 160 may provide information regarding environmental conditions, such as outside temperature, lighting conditions (e.g., day, night, dusk, and/or dawn), humidity level, specific weather conditions (e.g., sun, rain, and/or snow), distance (e.g., laser rangefinder or time-of-flight camera), and/or whether a tunnel or other type of enclosure has been entered or exited. Sensing component 160 may represent conventional sensors as generally known by one skilled in the art for monitoring various conditions (e.g., environmental conditions) that may have an effect (e.g., on the image appearance) on the data provided by image capture component 130.
In some implementations, sensing component 160 (e.g., one or more of sensors) may comprise devices that relay information to processing component 110 via wired and/or wireless communication. For example, sensing component 160 may be adapted to receive information from a satellite, through a local broadcast (e.g., radio frequency (RF)) transmission, through a mobile or cellular network, and/or through information beacons in an infrastructure (e.g., a transportation or highway information beacon infrastructure), or various other wired and/or wireless techniques.
In various embodiments, various components of system 100 may be combined and/or implemented or not, as desired or depending on the application or requirements. In one example, processing component 110 may be combined with memory component 120, the image capture component 130, video interface component 134, display component 150, network interface 180, and/or sensing component 160. In another example, processing component 110 may be combined with image capture component 130, with only certain functions of processing component 110 performed by circuitry (e.g., a processor, a microprocessor, a logic device, a microcontroller, etc.) within image capture component 130.
Furthermore, in some embodiments, various components of system 100 may be distributed and in communication with one another over a network 190. In this regard, system 100 may include network interface 180 configured to facilitate wired and/or wireless communication among various components of system 100 over network 190. In such embodiments, components may also be replicated if desired for particular applications of system 100. That is, components configured for same or similar operations may be distributed over a network. Further, all or part of any one of the various components may be implemented using appropriate components of a remote device 182 (e.g., a conventional digital video recorder (DVR), a computer configured for image processing, and/or other device) in communication with various components of system 100 via network interface 180 over network 190, if desired. Thus, for example, all or part of processor 110, all or part of memory component 120, and/or all or part of display component 150 may be implemented or replicated at remote device 182, and configured to perform filtering of video image data as further described herein. In another example, system 100 may comprise an image capture component located separately and remotely from processing component 110 and/or other components of system 100. It will be appreciated that many other combinations of distributed implementations of system 100 are possible, without departing from the scope and spirit of the disclosure.
Figs. 2A-2C show examples of random noise and FPN in video image data in accordance with an embodiment of the disclosure. More specifically, Figs. 2A-2B show examples of random noise extracted respectively from two consecutive video image frames, and Fig. 2C shows FPN that persists in a sequence of video image frames. In Figs. 2A-2C, FPN is substantially constant (e.g., does not vary or varies only slightly) over time (e.g., over consecutive video image frames), whereas random noise may vary randomly with respect to time.
Video image data captured by many image sensors exhibit both random noise and FPN. Whereas many conventional filtering techniques simply model noise present in still or video images as random and unstructured noise, systems and methods disclosed herein advantageously model both a random noise component and a FPN component in video image data to effectively suppress both types of noise therein. In various embodiments, noise that may appear as a result of sensor defects (e.g., response non-uniformity, dead pixels, hot pixels, or other defects) may also be modeled or otherwise considered as part of FPN. Moreover, noise exhibited in still or video images captured by many image sensors is not unstructured noise. Rather, both the random noise component and the FPN component may be correlated. That is, noise pixels in different spatial (e.g., different pixel coordinates) and temporal (e.g., in different frames) locations are not independent of one another, but rather are correlated with each other. Typical noise in video image data may therefore be referred to as "colored" noise, rather than "white" noise.
Such characteristics may be readily observed in power spectral density (PSD) graphs of example noise as shown in Figs. 3A-3B. More specifically, Fig. 3A shows a PSD graph of an example random noise component and Fig. 3B shows a PSD graph of an example FPN component, both of which are computed and presented with respect to a 32x32 Fourier transform and shown with a direct current (DC) term at the center. As generally known by one skilled in image processing, a PSD graph of white noise shows a substantially same constant value for all coefficients. In contrast, typical example noise in Figs. 3A-3B is characterized by clear and distinct non-uniform PSD graphs in both random noise and FPN components. For example, the PSD graph of random noise in Fig. 3A shows a larger horizontal correlation, which may typically be due to column noise in many types of image sensors. As may be appreciated, correlations of noise may be analyzed and expressed with respect to other transforms than the Fourier transform, for example, with respect to the discrete cosine transform (DCT), various types of wavelet transforms, or other suitable transforms.
In embodiments of systems and methods disclosed herein, such structured properties (or "coloredness") of typical noise may be modeled for both random noise and FPN components, thereby permitting effective suppression of noise in video image data through a more accurate model of typical noise therein.
In one embodiment, both random noise and FPN components may be modeled as colored Gaussian noise. Experiments performed in connection with the disclosure have revealed that Gaussian distributions may be taken as good approximations for both noise components. In other embodiments, other distributions, such as a Poisson distribution or a Rician distribution, may be used in place of Gaussian distributions.
One example of random noise and FPN components modeled as colored Gaussian noise may be described mathematically as follows. Let $x_i \in X_i \subset \mathbb{Z}$, $i = 1, 2$, be pixel spatial coordinates and $t \in T \subset \mathbb{Z}$ be a video image frame index (e.g., time index). Also, let $X = X_1 \times X_2$ and $V = X \times T$ denote, respectively, a spatial (e.g., directed to pixels within a video image frame) domain and a spatiotemporal (e.g., directed to a sequence of video image frames) domain. Then, in one example, noisy video data $z : V \to \mathbb{R}$ may be modeled as:

$z(x_1, x_2, t) = y(x_1, x_2, t) + \eta_{RND}(x_1, x_2, t) + \eta_{FPN}(x_1, x_2, t)$   (Equation 1)

wherein $y : V \to \mathbb{R}$ is an unknown noise-free video, and $\eta_{RND} : V \to \mathbb{R}$ and $\eta_{FPN} : V \to \mathbb{R}$ are realizations of the random and FPN components.

As discussed above, in one embodiment, these two noise components may be assumed and modeled as colored Gaussian noise,

$\eta_{RND} = k_{RND} \circledast \eta_{RND}^{white}$,   (Equation 2)

$\eta_{FPN} = k_{FPN} \circledast \eta_{FPN}^{white}$,   (Equation 3)

wherein $\eta_{RND}^{white}$ and $\eta_{FPN}^{white}$ are white noise factors following independent and identically distributed (i.i.d.) Gaussian distributions such that:

$\eta_{RND}^{white}(x_1, x_2, t) \sim N(0, \sigma_{RND}^2(t))$, i.i.d. w.r.t. $x_1, x_2$ and independent w.r.t. $t$,   (Equation 4)

$\eta_{FPN}^{white}(x_1, x_2, t) \sim N(0, \sigma_{FPN}^2(t))$, i.i.d. w.r.t. $x_1, x_2$ but not independent w.r.t. $t$,   (Equation 5)

wherein $\circledast$ denotes the convolution operator, and $k_{RND}$ and $k_{FPN}$ are equivalent convolution kernels determining the power spectral densities of $\eta_{RND}$ and $\eta_{FPN}$, respectively.
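A minimal sketch may help make the roles of the two components in Equations 1-3 concrete. The Python/NumPy/SciPy snippet below synthesizes noisy video from a clean one under the model: the random component is redrawn and colored for every frame, while the FPN realization is drawn once and held constant over the temporal window (consistent with Equations 5 and 8 below). The specific kernels, standard deviations, and function name are illustrative assumptions, not values given in the text.

```python
import numpy as np
from scipy.ndimage import convolve

def simulate_noisy_video(y, k_rnd, k_fpn, sigma_rnd=0.02, sigma_fpn=0.03, seed=0):
    """Generate noisy video z = y + eta_RND + eta_FPN per Equations 1-3 (sketch).

    y: (T, H, W) noise-free video; k_rnd, k_fpn: 2-D convolution kernels that
    color the white noise (their choice here is illustrative).
    """
    rng = np.random.default_rng(seed)
    t, h, w = y.shape
    # Random component: white Gaussian noise, independent per frame, then colored.
    eta_rnd = np.stack([convolve(rng.normal(0.0, sigma_rnd, (h, w)), k_rnd)
                        for _ in range(t)])
    # FPN component: one colored pattern, constant over the temporal window.
    eta_fpn = convolve(rng.normal(0.0, sigma_fpn, (h, w)), k_fpn)
    return y + eta_rnd + eta_fpn[None, :, :]
```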
In various embodiments, standard deviation values $\sigma_{RND}$ and $\sigma_{FPN}$ may be estimated from video image data as further described herein. Experiments performed in connection with the disclosure have revealed that the standard deviation values $\sigma_{RND}$ and $\sigma_{FPN}$, as well as the FPN $\eta_{FPN}$, typically vary slowly over time. As such, standard deviation values $\sigma_{RND}$ and $\sigma_{FPN}$ may be estimated only sporadically in some embodiments.

More specifically, it may be assumed that:

$\frac{\partial}{\partial t}\sigma_{RND}(t) \approx 0$,   (Equation 6)

$\frac{\partial}{\partial t}\sigma_{FPN}(t) \approx 0$,   (Equation 7)

$\frac{\partial}{\partial t}\eta_{FPN}(x_1, x_2, t) \approx 0$,   (Equation 8)

wherein the approximations of these partial derivatives with respect to $t$ are such that $\sigma_{RND}$, $\sigma_{FPN}$, and $\eta_{FPN}$ may be treated as constant with respect to $t$ within temporal windows that are used by operations (e.g., filtering operations) described herein.
In addition, PSDs of $\eta_{RND}$ and $\eta_{FPN}$ may be assumed to be fixed modulo normalization with respect to the corresponding $\sigma_{RND}^2$ and $\sigma_{FPN}^2$. That is, the PSDs do not need to be estimated during operations on video image data, but rather may be treated as built-in calibration parameters in some embodiments. As such, in some embodiments, PSDs of $\eta_{RND}$ and $\eta_{FPN}$ may be estimated offline using calibration video images or any other images that may be suitable for calibration purposes, and only need to be re-calibrated periodically or as needed.
In some embodiments, Equation 1 may be generalized to incorporate a signal-dependent noise model, by having $\sigma_{RND}$ and $\sigma_{FPN}$ as functions of both $y$ and $t$. Such functions may be reasonably considered as separable into independent factors as $\sigma_{RND}(y,t) = \sigma_{RND}^{y}(y) \times \sigma_{RND}^{t}(t)$ and $\sigma_{FPN}(y,t) = \sigma_{FPN}^{y}(y) \times \sigma_{FPN}^{t}(t)$.

In addition, while the noise can be further decomposed into a vertical and a horizontal component, such an anisotropy in noise may be embedded in the PSD representations of noise in various embodiments, as further described herein.
It may be noted that some "bad pixels" (e.g., stuck pixels that always show a fixed value or dead pixels that never detect light) may result in impulse noise of extremely low probability, and thus may not be adequately captured by Equation 1. However, various embodiments of the disclosure contemplate incorporating simple mean/median operations based on a look-up table or other inexpensive ad-hoc procedures to compensate for such cases.
Having described example noise models and associated noise parameters, such as the standard deviation $\sigma_{RND}$, the standard deviation $\sigma_{FPN}$, the PSD of $\eta_{RND}$, and the PSD of $\eta_{FPN}$, that may be utilized in various embodiments of systems and methods of the disclosure, a process 400 to suppress noise in video data in accordance with an embodiment of the disclosure will now be described in connection with Fig. 4. For example, process 400 may be performed by various embodiments of system 100. It should be appreciated that system 100 and various components thereof are identified only for purposes of example, and that any other suitable system may be utilized to perform all or part of process 400.
At operation 404, a plurality of video image frames (e.g., consecutive still images that may be composed to construct moving videos) may be received. For example, video image data (e.g., input video 401) captured or otherwise generated by image capture component 130 or external devices (e.g., generating video data 132) may be received at video interface component 134 and/or processing component 110. In some embodiments, video image data may be processed or otherwise managed to extract a plurality of video image frames therefrom, as needed or desired for particular applications or requirements. For example, video interface component 134 and/or processing component 110 may be configured to extract a plurality of video image frames, which may then be received at processor 110.
At operation 406, standard deviation $\sigma_{RND}$ of the random noise component and standard deviation $\sigma_{FPN}$ of the FPN component may be estimated using the video image frames. For example, standard deviation $\sigma_{RND}$ of the random noise component and standard deviation $\sigma_{FPN}$ of the FPN component may be computed, calculated, approximated, or otherwise estimated at processing component 110 of Fig. 1. As discussed above, such parameters may be estimated only sporadically, for example, after filtering or otherwise processing a certain number of video image frames. As such, standard deviation estimation operations may be used within a real-time image processing pipeline if desired. In one embodiment, standard deviation estimation operations may be embedded within, for example, noise filtering module 112. In another embodiment, standard deviation estimation operations may be implemented in a standalone module.
In various embodiments, standard deviation $\sigma_{RND}$ of the random noise component may be estimated by performing a temporal high-pass filtering of the video, and calculating a median of absolute deviations (MAD) of the temporal high-pass version of the video. For example, in one embodiment, temporal high-pass filtering may be performed by obtaining the differences between one video image frame and another video image frame delayed by one frame. MAD calculations may then be performed on the temporal high-pass version of the video to obtain a robust estimation of standard deviation $\sigma_{RND}$. In other embodiments, standard deviation $\sigma_{RND}$ may be estimated in a three-dimensional (3-D) transform domain (e.g., transformed by applying a decorrelating transform for filtering as further described herein), where coefficients representing the highest temporal frequency, or some frequency higher than a threshold value, may be used as samples for the MAD calculation. It is also contemplated that other known methods for temporal high-pass filtering of video image data and/or other known methods for estimating a standard deviation may be adapted to be used with process 400.
In various embodiments, standard deviation $\sigma_{FPN}$ may be obtained from the estimated standard deviation $\sigma_{RND}$ and an estimation of the total standard deviation of both FPN and random noise components. In one embodiment, standard deviation $\sigma_{FPN}$ may be computed as:

$\sigma_{FPN} = \sqrt{\sigma_{RND+FPN}^2 - \sigma_{RND}^2}$,   (Equation 11)

wherein $\sigma_{RND+FPN}$ is the total standard deviation of both FPN and random components. In other embodiments, standard deviation $\sigma_{FPN}$ may be computed using other statistical criteria (e.g., maximum-likelihood) for estimating standard deviation $\sigma_{FPN}$ given the total standard deviation $\sigma_{RND+FPN}$ and the standard deviation $\sigma_{RND}$.
deviation orRN D +FPN may be estimated by
performing a spatial high-pass filtering of the video, and calculating a MAD
of the spatial
high-pass version of the video. For example, in one embodiment, spatial high-
pass filtering
may be performed by obtaining the differences between a video image frame and
the video
image frame shifted by one pixel. MAD calculations may then be performed on
the spatial
high-pass version of the video to obtain a robust estimation of standard
deviation aRND f FPN
which in turn can be used to obtain a robust estimation of o-FpN as described
above. In other
embodiments, standard deviation cr,NDõ,,N may be estimated in a three-
dimensional
- 18 -

CA 02949105 2016-11-14
WO 2015/179841
PCMJS2015/032302
transform domain (e.g., transformed using a decorrelating transform for
filtering as further
described herein), where coefficients representing the highest spatial
frequency, or some
frequency higher than a threshold value, may be used as samples for MAD
calculation. It is
also contemplated that other known methods for spatial high-pass filtering of
video image
data and/or other known methods for estimating a standard deviation, may be
adapted to be
used with process 400.
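Combining the spatial high-pass MAD estimate with Equation 11 gives a correspondingly simple sketch, again an illustration under the same assumptions, using horizontal one-pixel differences as the spatial high-pass:

```python
import numpy as np

def estimate_sigma_fpn(video, sigma_rnd):
    """Estimate the FPN standard deviation via Equation 11 (sketch).

    A spatial high-pass version of the video (horizontal pixel differences)
    yields a MAD-based estimate of the total standard deviation of random
    noise plus FPN; sigma_FPN then follows as
    sqrt(max(sigma_total**2 - sigma_rnd**2, 0)).
    """
    diff = np.diff(video, axis=2) / np.sqrt(2.0)  # frame minus frame shifted by one pixel
    mad = np.median(np.abs(diff - np.median(diff)))
    sigma_total = 1.4826 * mad
    return np.sqrt(max(sigma_total ** 2 - sigma_rnd ** 2, 0.0))
```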
At operation 408, power spectral densities (PSDs) of the random noise component $\eta_{RND}$ and the FPN component $\eta_{FPN}$ may be estimated using calibration video 402 or any other video images that may be used for calibration purposes. As discussed above, PSDs of $\eta_{RND}$ and $\eta_{FPN}$ may be considered to be constant modulo normalization with respect to $\sigma_{RND}$ and $\sigma_{FPN}$. As such, in some embodiments, operation 408 may be performed offline and/or only periodically (e.g., when recalibration may be desired or needed). In some embodiments, calibration video 402 may provide substantially uniform video images (e.g., provided by capturing images of a closed shutter, a substantially uniform blackbody, a substantially uniform background, or other similar images) such that noise present in calibration video 402 may be more effectively distinguished from true images. In other embodiments, estimation of PSDs may be performed using any video that contains noise distributed and correlated as typical for an image sensor that captures video images to be filtered by process 400.
In some embodiments, PSDs of the random noise component $\eta_{RND}$ and the FPN component $\eta_{FPN}$ may be computed by performing an autocorrelation operation on calibration video 402. In other embodiments, other suitable techniques for computing PSDs may be adapted to be used for operation 408.
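One illustrative way to obtain such a PSD estimate from calibration frames (a sketch only, not necessarily the autocorrelation-based procedure used in practice) is to temporally average the calibration video to isolate the fixed pattern and then average periodograms over tiles. The 32x32 tile size mirrors the transform size used for Figs. 3A-3B but is otherwise an assumption.

```python
import numpy as np

def estimate_fpn_psd(calibration_video, block=32):
    """Estimate a 2-D FPN power spectral density from calibration frames (sketch).

    calibration_video: (T, H, W) frames of a uniform scene (e.g., closed
    shutter). Temporal averaging isolates the fixed pattern; the PSD is the
    averaged squared magnitude of the 2-D FFT over non-overlapping tiles.
    """
    fpn = calibration_video.mean(axis=0)
    fpn = fpn - fpn.mean()
    h, w = fpn.shape
    tiles = [fpn[i:i + block, j:j + block]
             for i in range(0, h - block + 1, block)
             for j in range(0, w - block + 1, block)]
    psd = np.mean([np.abs(np.fft.fft2(tile)) ** 2 for tile in tiles], axis=0)
    return np.fft.fftshift(psd)  # DC term at the center, as in Figs. 3A-3B
```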
In some embodiments, an actual pattern of FPN in the video image frames may be dynamically estimated, in addition to or in place of various statistical parameters associated with the FPN (e.g., a PSD of $\eta_{FPN}$ and a standard deviation $\sigma_{FPN}$ of the FPN estimated as described herein). For one or more embodiments, the dynamically estimated FPN pattern may be subtracted from the video image frames, and from the resulting video image frames a PSD of the residual FPN (e.g., FPN remaining in the video image frames after the dynamically estimated FPN pattern is subtracted) and/or other noise may be estimated online (e.g., using the received video image frames) as opposed to being estimated offline (e.g., using calibration video 402). Such online estimation of the PSD of the residual FPN or other noise may enable noise filtering that is robust against modeling imprecisions and inaccuracies, for example.
At operation 410, spatiotemporal volumes (e.g., containing image blocks extracted from different temporal positions, such as from different video image frames) may be constructed from image blocks (e.g., image patches such as fixed-size patches or portions of a video image frame) extracted from video image frames. In various aspects of process 400, filtering and/or other processing operations may be performed on the constructed spatiotemporal volumes.
In various embodiments, spatiotemporal volumes may be constructed by extracting and stacking together image blocks from a sequence of video image frames along a motion trajectory. For example, if 8x8 image blocks are utilized in an embodiment, the constructed spatiotemporal volume may have size 8x8xN, where N is a length of a trajectory (e.g., a number of video image frames) along which motion is tracked. In some embodiments, motion trajectories may be determined by concatenating motion vectors obtained by, for example, block-matching techniques or any other suitable motion or optical flow estimation techniques. Motion vectors may be either computed from the received video image frames, or, when input video 401 is a coded video, motion vectors embedded in the coded video may be utilized. In some embodiments, the motion vectors may be utilized to assess the quality of various dynamic (e.g., instantaneous or online) estimates associated with FPN described above.
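A bare-bones sketch of constructing one such spatiotemporal volume by exhaustive block matching against a reference block is shown below; it is an illustration only (practical implementations may use faster or multiscale motion estimation, as noted for Fig. 6 and claim 19), and the search radius and cost function are assumptions.

```python
import numpy as np

def build_spatiotemporal_volume(frames, start_yx, block=8, search=4):
    """Construct one spatiotemporal volume by simple block matching (sketch).

    frames: (T, H, W) video; start_yx: (row, col) of the reference block in
    frame 0. For each subsequent frame the best-matching block within a
    +/- search window (sum of squared differences) extends the trajectory.
    Returns the (T, block, block) volume and the list of block positions.
    """
    t, h, w = frames.shape
    y, x = start_yx
    ref = frames[0, y:y + block, x:x + block]
    volume, trajectory = [ref], [(y, x)]
    for k in range(1, t):
        best, best_cost = (y, x), np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny <= h - block and 0 <= nx <= w - block:
                    cand = frames[k, ny:ny + block, nx:nx + block]
                    cost = np.sum((cand - ref) ** 2)
                    if cost < best_cost:
                        best, best_cost = (ny, nx), cost
        y, x = best
        volume.append(frames[k, y:y + block, x:x + block])
        trajectory.append((y, x))
    return np.stack(volume), trajectory
```

The concatenated positions in `trajectory` correspond to the motion trajectory along which the blocks are stacked, and could be passed to a smoothing step such as the earlier regression sketch.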
Briefly referring to Figs. 5 and 6, examples of constructing spatiotemporal volumes are further described. Fig. 5 shows a process 500 to construct and filter a spatiotemporal volume 508 to suppress noise in an input video 501 in accordance with an embodiment of the disclosure. For example, process 500 may be performed as part of process 400 of Fig. 4, such as at operations 410-414. Fig. 6 shows an example of a motion trajectory along which image blocks are extracted to construct a spatiotemporal volume in accordance with an embodiment of the disclosure.
As described above, block-matching techniques may be used in some embodiments
to
construct spatiotemporal volumes. For example, at operation 506,
spatiotemporal volume
508 may be constructed using a block-matching technique. That is, a plurality
of video
- 20 -

CA 02949105 2016-11-14
WO 2015/179841
PCMJS2015/032302
image frames 502 may be examined to search for image blocks 504A-504D matching
(e.g.,
meeting a certain similarity criterion) a reference image block 503. Such
image blocks 503,
504A-504D may define a motion trajectory, and may be stacked together to
construct
spatiotemporal volume 508. Note that operations enclosed in the dashed line
(e.g., including
operations 506, 510-514) may be repeated for each reference image block to
construct and
filter a plurality of spatiotemporal volumes. In another example, image blocks
602A-602J in
Fig. 6 may be selected as defining a motion trajectory using various motion
estimation
techniques. As such, image blocks 602A-602J may be extracted and stacked
together to
form a spatiotemporal volume of length 10, for example.
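As an illustration of operations 410 and 506, the following sketch constructs one spatiotemporal volume by simple exhaustive block matching from frame to frame. The similarity criterion (sum of squared differences), the search radius, and the choice to re-match against the most recently matched block are assumptions made for this example, not requirements of the disclosure.

```python
import numpy as np

def match_block(frame, ref_block, center, search_radius=4):
    """Find the position in `frame` whose block best matches `ref_block`
    (sum of squared differences), searching around `center`."""
    bs = ref_block.shape[0]
    h, w = frame.shape
    best_pos, best_cost = center, np.inf
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            y, x = center[0] + dy, center[1] + dx
            if 0 <= y <= h - bs and 0 <= x <= w - bs:
                cand = frame[y:y + bs, x:x + bs]
                cost = np.sum((cand - ref_block) ** 2)
                if cost < best_cost:
                    best_cost, best_pos = cost, (y, x)
    return best_pos

def build_spatiotemporal_volume(frames, start_pos, block_size=8):
    """Track a reference block through `frames` and stack the matched blocks.

    Returns the volume (block_size x block_size x N) and the list of matched
    block positions, i.e. the motion trajectory.
    """
    y, x = start_pos
    ref = frames[0][y:y + block_size, x:x + block_size]
    volume, trajectory = [ref], [start_pos]
    pos = start_pos
    for frame in frames[1:]:
        pos = match_block(frame, ref, pos)
        blk = frame[pos[0]:pos[0] + block_size, pos[1]:pos[1] + block_size]
        volume.append(blk)
        trajectory.append(pos)
        ref = blk   # follow the most recently matched block (one simple choice)
    return np.stack(volume, axis=-1), trajectory
```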
As can be seen in Figs. 5 and 6, a spatiotemporal volume may comprise image
blocks
that may correspond to various different spatial positions on a video image.
In such a case,
FPN may appear substantially as random noise (e.g., not fixed to specific
pixel positions
because image block positions change), which may allow FPN to be modeled and
filtered as
such. If, however, there is little or no motion, all or a substantial portion
of FPN may be
preserved in the spatiotemporal volume, and as such, may be filtered based
substantially on
noise parameters associated with FPN. Thus, how much of FPN may be captured as
random
noise or preserved as FPN in spatiotemporal volumes may depend on the relative
alignment
of image blocks (e.g., how many of the image blocks in a spatiotemporal volume
are aligned
and how many of them are from other spatial locations).
Referring back to Fig. 4, at operation 412, the constructed spatiotemporal
volumes
may be filtered (e.g., to suppress noise or to perform other processing as
further described
herein with regard to operation 512). In various embodiments, the filtering
may be based at
least in part on one or more noise parameters. For example, in some
embodiments, the
filtering may be based at least in part on standard deviation σ_RND of a random noise component, standard deviation σ_FPN of a FPN component, PSD of a random noise
component, and/or PSD of a FPN component, any one of which may be computed,
calculated, approximated, or otherwise estimated at operations 406 and 408. In
some
embodiments, the filtering may be further adaptive to other characteristics of
the constructed
spatiotemporal volumes, as further described herein.
In some embodiments, filtering may be performed on 3-D transform domain
representations (which may also be referred to as 3-D spectra) of the
spatiotemporal volumes.
For example, referring again to Fig. 5, filtering operations may include
applying a three-
dimensional (3-D) transform to the spatiotemporal volumes to obtain 3-D
spectra (e.g., at
operation 510), modifying (e.g., adaptively shrinking) coefficients of the 3-D
spectra (e.g., at
operation 512), and applying an inverse transform to obtain filtered
spatiotemporal volumes
(e.g., at operation 514). It is also contemplated that other forms of
regularization such as
weighted averaging or diffusion may be performed in place of or in addition to
operations
510-514.
More specifically, at operation 510, a decorrelating 3-D transform may be
applied to
the spatiotemporal volumes. Such a decorrelating 3-D transform may include a
discrete
cosine transform (DCT), discrete sine transform (DST), discrete wavelet
transform (DWT),
discrete Fourier transform (DFT), or any other appropriate transform (e.g.,
separable,
orthogonal transforms) that typically decorrelate image signals. In one
embodiment, a DCT
may be utilized for the transform operation.
A decorrelating 3-D transform may be applied by a separable cascaded
composition
of lower dimensional transforms. For example, for spatial decorrelation, a 2-D
transform
(e.g., a separable DCT of size 8x8) may be applied to each of the image blocks (e.g., having a size of 8x8) stacked in the spatiotemporal volume, and for the temporal decorrelation, a 1-D transform of length N (e.g., a 1-D DCT of length matching the length of the
spatiotemporal volume) may be applied. As may be appreciated by one skilled in
the art, the
order of these two cascaded transforms may be reversed, leading to an
identical result.
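A minimal sketch of such a separable cascade is shown below, assuming SciPy's orthonormal DCT routines; the choice of DCT is only one of the transforms mentioned above, and, as noted, the spatial and temporal stages may be applied in either order.

```python
import numpy as np
from scipy.fft import dctn, idctn, dct, idct

def forward_3d_transform(volume):
    """Separable decorrelating 3-D transform of an 8x8xN spatiotemporal
    volume: a 2-D DCT on each stacked block followed by a 1-D DCT along
    the temporal axis."""
    spatial = dctn(volume, axes=(0, 1), norm='ortho')   # 2-D spatial DCT per block
    return dct(spatial, axis=2, norm='ortho')           # 1-D temporal DCT of length N

def inverse_3d_transform(spectrum):
    """Invert the cascade (the order of the two inverse stages may be swapped)."""
    temporal = idct(spectrum, axis=2, norm='ortho')
    return idctn(temporal, axes=(0, 1), norm='ortho')
```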
Referring also to Fig. 7, a resulting 3-D spectrum 702 may comprise a
plurality of
spectral coefficients (shown as small circles in Fig. 7) representing the
spatiotemporal
volume in the 3-D transform domain. 3-D spectrum 702 may also include a direct
current
(DC) plane 704 and an alternating current (AC) co-volume 706. DC plane 704 may
be
viewed as a collection of DC-terms, which may refer to transform domain
coefficients that
correspond to zero frequency and may represent an averaging of values. In this
regard, the
DC-terms in DC plane 704 may encode information about the FPN component. As
such, in
some embodiments, filtering operations may be adjusted based on which plane
(e.g., DC
plane or AC co-volume) the coefficients belong to, as further described
herein. AC co-
volume 706 may be viewed as other remaining coefficients, which typically
satisfy some type
of orthogonal relationship with the coefficients in DC-plane 704. It should be
noted that Fig.
7 is merely a visual presentation provided for purposes of explaining
filtering operations on a
3-D spectrum, and as such, the depiction of the location, size, and/or shape of 3-D spectrum 702, DC plane 704, and AC co-volume 706 should not be understood as limiting a
resulting 3-D
spectrum.
At operation 512 of Fig. 5, shrinking (or shrinkage) may be performed to modify
the
coefficients of the 3-D spectrum (e.g., 3-D spectrum 702), thereby obtaining a
shrunk 3-D
spectrum 708. Shrinking may include thresholding (e.g., hard thresholding,
soft thresholding,
or others), scaling, Wiener filtering, or other operations suitable for
regularizing signals in a
transform domain. In various embodiments, shrinking modifies the spectral
coefficients
based on corresponding coefficient standard deviations of noise that may be
embedded in
each spectral coefficient. Thus, for example, in one embodiment, shrinking may
be
performed by hard thresholding the spectral coefficients based on the
corresponding
coefficient standard deviations (e.g., setting a value to 0 if it does not
meet a threshold value).
In another example, shrinking may be performed in two or more stages, in which
thresholding may be performed in earlier stages to provide an estimate to
Wiener filtering
performed in later stages.
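A sketch of hard-threshold shrinkage under these assumptions is given below: each spectral coefficient is assumed to have an associated noise standard deviation (for example, computed per Equations 10 and 11 discussed below), and the threshold multiplier used here (2.7) is an illustrative choice rather than a value prescribed by the disclosure.

```python
import numpy as np

def hard_threshold(spectrum, coeff_std, threshold_factor=2.7):
    """Shrink 3-D spectrum coefficients by hard thresholding: zero any
    coefficient whose magnitude falls below a multiple of the standard
    deviation of the noise embedded in that coefficient.

    coeff_std is an array broadcastable to spectrum.shape holding the
    per-coefficient noise standard deviations.
    """
    shrunk = np.where(np.abs(spectrum) >= threshold_factor * coeff_std,
                      spectrum, 0.0)
    shrunk.flat[0] = spectrum.flat[0]   # keep the overall DC term unmodified
    return shrunk
```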
The coefficient standard deviation may be approximated, calculated, or
otherwise
obtained based on various parameters associated with a random noise component
and a FPN
component that may be present in video images. For example, in one embodiment,
the
coefficient standard deviation may be approximated based at least in part on
standard
deviation σ_RND of a random noise component and standard deviation σ_FPN of a
FPN
component.
In another embodiment, the coefficient standard deviation may be approximated
based further on a PSD of a random noise component and a PSD of a FPN
component, in
addition to standard deviation σ_RND and standard deviation σ_FPN. As described
above with
respect to modeling of noise in Figs. 2A-3B and equations 1-5, these PSDs may
encode
correlation or structure of the noise components. Thus, if computed with
respect to the 2-D
transform used for spatial decorrelation, these PSDs may additionally provide
variances of
the random noise component and the FPN component for each of the coefficients
in the 2-D
spectra prior to the application of the 1-D transform for temporal
decorrelation. Such
properties of the PSDs may be better visualized or understood through Fig. 8,
which shows
example graphical representations of PSDs of random noise and FPN components
computed
with respect to a 2-D transform used for spatial decorrelation.
In various embodiments, one or more of these and other noise parameters may be
based on estimated values (e.g., estimated online and/or offline as part of
process 400). For
example, the coefficient standard deviation may be approximated based on
standard deviation
σ_RND, standard deviation σ_FPN, a PSD of a random noise component, and/or a PSD of a FPN
component, all or some of which may be estimated values obtained through
operations 406
and 408 of Fig. 4 described above.
The coefficient standard deviations may be further adapted, refined, or
otherwise
adjusted based on the motion captured in the spatiotemporal volumes, in
addition to being
approximated or calculated based on noise parameters as discussed above. That
is, in
accordance with various embodiments of the disclosure, it has been observed
that the relative
alignment of image blocks grouped in spatiotemporal volumes affects how a FPN
component
is manifested in spectral coefficients. For example, in one extreme case in
which all image
blocks are aligned (e.g., when there is no motion), the FPN component may be
the same across
all image blocks. As such, the FPN component may simply accumulate through
averaging,
and thus constitute a substantial part of the content, rather than noise, of
the DC plane in the
3-D spectrum. In the other extreme case in which all image blocks are from
various different
spatial positions of video images, the FPN component may present different
patterns over the
different image blocks. As such, restricted to the spatiotemporal volume, the
FPN component
may appear as another random noise component.
Accordingly, in some embodiments, the coefficient standard deviations may not
only
be approximated based on the noise parameters, but they may also be adapted,
refined, or
otherwise adjusted based further on the size of the spatiotemporal volume, the
relative spatial
alignment of images blocks associated with the spatiotemporal volume, and/or
the position of
coefficients within the 3-D spectrum (e.g., whether the coefficients lie on
the DC plane or the
AC co-volume). In one embodiment, such an adaptive approximation of the
coefficient
standard deviations may be obtained using a formulation that encompasses the
two extreme
cases and at the same time offers a gradual transition for intermediate cases.
One example of such a formulation may be described formally as follows. For a spatiotemporal volume of temporal length N, let L_n, n = 1, ..., N, be the number of image blocks forming the spatiotemporal volume sharing the same original spatial position as the n-th block in the volume. Let L = max_{1≤n≤N} {L_n} (an alternative definition, which can be more practical depending on the specific filter implementation, may be L = L_1). The coefficient standard deviations may then be approximated, for the coefficients in the temporal DC plane and its complementary AC co-volume, as:

$$\sigma_{DC} = \sqrt{\sigma_{RND}^{2}\,\mathrm{psd}_{RND}^{2DT} + \frac{L^{2} + N - L}{N}\,\sigma_{FPN}^{2}\,\mathrm{psd}_{FPN}^{2DT}}, \quad \text{(Equation 10)}$$

$$\sigma_{AC} = \sqrt{\sigma_{RND}^{2}\,\mathrm{psd}_{RND}^{2DT} + \frac{N - L}{N}\,\sigma_{FPN}^{2}\,\mathrm{psd}_{FPN}^{2DT}}, \quad \text{(Equation 11)}$$

wherein σ_DC and σ_AC are the coefficient standard deviations for coefficients in the DC plane and in the AC co-volume, respectively, and wherein psd^{2DT}_RND and psd^{2DT}_FPN are the PSDs of the random noise and FPN components with respect to the 2-D spatial decorrelating transform. Thus, by modifying the spectral coefficients using σ_DC and σ_AC obtained from equations 10 and 11, an embodiment of the disclosure may perform adaptive shrinking that may permit near-optimal filtering of noise in video images. Note that the abovementioned extreme cases are obtained in equations 10 and 11 with L = N (no motion) or L = 0 (image blocks all from different spatial positions), respectively.
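Under the reconstruction of Equations 10 and 11 given above, the adaptive coefficient standard deviations might be computed as in the following sketch, where the relative spatial alignment is derived from the trajectory of block positions; the PSD layout (one value per coefficient of the 2-D spatial transform) and the data structures are assumptions.

```python
import numpy as np

def coefficient_stds(psd_rnd_2d, psd_fpn_2d, sigma_rnd, sigma_fpn, trajectory):
    """Approximate the coefficient standard deviations for the temporal-DC
    plane and the AC co-volume (Equations 10 and 11), adapted to how many
    blocks in the volume share the same original spatial position.

    psd_rnd_2d, psd_fpn_2d: PSDs with respect to the 2-D spatial transform
    (e.g., 8x8 arrays acting as per-coefficient variance shaping factors).
    trajectory: list of (row, col) block positions forming the volume.
    """
    N = len(trajectory)
    counts = {}
    for pos in trajectory:
        counts[pos] = counts.get(pos, 0) + 1
    L = max(counts.values())          # alternative choice: counts[trajectory[0]]

    sigma_dc = np.sqrt(sigma_rnd**2 * psd_rnd_2d +
                       ((L**2 + N - L) / N) * sigma_fpn**2 * psd_fpn_2d)
    sigma_ac = np.sqrt(sigma_rnd**2 * psd_rnd_2d +
                       ((N - L) / N) * sigma_fpn**2 * psd_fpn_2d)
    return sigma_dc, sigma_ac
```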
Further, at operation 512, other operations may also be performed on the
shrunk 3-D
spectra (e.g., shrunk 3-D spectrum 708) for further processing or
manipulation. For example,
in one embodiment, the spectral coefficients may be further modified using
collaborative a -
rooting or other techniques that sharpen and/or enhance the contrast in images
by boosting
appropriate ones of the spectral coefficients, In other examples, image
restoration,
deblurring, sharpening, equalization, super-resolution, or other operations
may be performed
to further modify the coefficients of the shrunk 3-D spectra. Whereas
inaccurately modeled
and/or sub-optimally suppressed noise often render enhancement and other
operations
ineffective, or worse, cause enhancement and other operations to degrade
rather than improve
images, near-optimal suppression of noise that may be achieved by embodiments
of the
disclosure may beneficially improve the efficacy of enhancement and other
operations, as
further illustrated herein.
At operation 514, the inverse of the decorrelating 3-D transform may be
applied to the
shrunk 3-D spectra to obtain filtered spatiotemporal volumes (e.g., a filtered
spatiotemporal
volume 714). As shown in Fig. 7, cascaded separable inverse 2-D and 1-D
transforms may
be applied in any order (e.g., with intermediate 2-D spectra 710 or
intermediate 1-D spectra
712) to obtain filtered spatiotemporal volume 714.
At operation 414/516, image blocks from the filtered spatiotemporal volumes
may be
aggregated using appropriate aggregation techniques to generate filtered video
image frames
(e.g., filtered video 416). For example, in various embodiments, aggregation
may include
weighted averaging of image blocks. In some embodiments, weights for averaging
may be
based in part on the coefficient standard deviation. In such embodiments, the
aggregating
operation may benefit from the adaptive approximation of the coefficient
standard deviations
described above for operation 512. It may be appreciated that other operations
associated
with processes 400 and 500 may also benefit from the adaptivity provided by
embodiments of
the disclosure, if such operations are based in part on the coefficient
standard deviations.
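A sketch of such an aggregation step follows, under simplifying assumptions: each trajectory entry is taken to carry the frame index together with the block position, and a single scalar weight per volume stands in for weights derived from the coefficient standard deviations.

```python
import numpy as np

def aggregate_blocks(filtered_volumes, trajectories, frame_shape, num_frames,
                     weights=None):
    """Aggregate filtered image blocks into output frames by weighted
    averaging of every block covering each pixel.

    trajectories: for each volume, a list of (frame_index, row, col) entries,
    one per block in the volume.
    weights: optional per-volume scalar weights.
    """
    if weights is None:
        weights = [1.0] * len(filtered_volumes)
    acc = np.zeros((num_frames,) + tuple(frame_shape))
    wgt = np.zeros_like(acc)
    for volume, trajectory, w in zip(filtered_volumes, trajectories, weights):
        bs = volume.shape[0]
        for n, (t, y, x) in enumerate(trajectory):
            acc[t, y:y + bs, x:x + bs] += w * volume[:, :, n]
            wgt[t, y:y + bs, x:x + bs] += w
    wgt[wgt == 0] = 1.0          # pixels not covered by any block
    return acc / wgt
```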
Referring now to Fig. 9-11B, examples of advantageous results that may be
obtained
by embodiments of the disclosure are illustrated and compared with results
obtained by
conventional techniques. Fig. 9 shows an example of an input video image frame
captured
by an infrared imaging sensor. The input video image frame of Fig. 9 exhibits
both
correlated random noise and correlated FPN. Fig. 10A shows an example of a
resulting video
image frame obtained by processing the input video image of Fig. 9 using a
conventional
noise filtering technique. More specifically, the conventional technique
utilized to obtain
Fig. 10A assumes a conventional additive white Gaussian noise (AWGN) model. That is, unlike various embodiments of the disclosure, there is no modeling of noise correlation/structure or modeling of separate FPN and random noise components. In Fig. 10A, this leads to ineffective noise suppression, with residual FPN and structured artifacts clearly visible in the resulting video image frame.
Furthermore, in an example in Fig. 10B of a resulting video image frame
obtained by
filtering and enhancing the input video image frame of Fig. 9 using
conventional techniques,
performing an enhancement (e.g., sharpening and/or contrast enhancement)
operation on the
conventionally filtered video image frame leads to a degradation, rather than
an improvement,
of the video image frame, with noise being exacerbated rather than being
attenuated.
In contrast, in an example in Fig. 11A of a resulting filtered video image
frame
obtained by filtering the input video image of Fig. 9 according to an
embodiment of the
disclosure, both FPN and random noise components are effectively suppressed
with no
structured artifacts in the resulting video image. Further, advantages of
accurate modeling
and filtering of noise may be appreciated even more in Fig. 11B, which shows
an example of
a resulting video image frame obtained by filtering and enhancing the input
video image of
Fig. 9 in accordance with an embodiment of the disclosure.
Therefore, some embodiments of methods and systems disclosed herein may permit
effective suppression of noise even in images that have a prominent FPN
component, by
modeling noise more accurately, estimating one or more noise parameters,
filtering images
based on motion-adaptive parameters, and/or performing other operations
described herein.
Some embodiments of methods and systems disclosed herein may also beneficially
suppress
residual FPN that may still remain after conventional FPN compensation
procedures, such as
a column noise compensation technique, FPN removal based on pre-calibrated or
dynamically estimated FPN masks, and/or other techniques, have been performed.
Thus, for
example, some embodiments of methods and systems disclosed herein may be
included in or
implemented as various devices and systems that capture and/or process video
or still images
impaired by noise (e.g., video or still images captured by infrared image
sensors or other
sensors operating at a low signal-to-noise ratio regime, and/or video or still
images processed
by conventional FPN compensation techniques) to beneficially improve image
quality.
Based on the framework of constructing and adaptively operating on motion-
based
spatiotemporal volumes, additional embodiments of the disclosure may
advantageously
reduce, remove, or otherwise suppress distortion and/or degradation in images
(e.g.,
distortion and/or degradation caused by atmospheric turbulence), in addition
to or in place of
suppressing random and fixed pattern noise in images. As briefly discussed
above, images
(e.g., still and video image frames) captured by an imaging system, such as
system 100, may
contain distortion and/or degradation such as those caused by atmospheric
turbulence as light
travels through the air from a scene to an imaging sensor of the imaging
system.
For example, as illustrated by Fig. 12 according to an embodiment of the
disclosure,
light 1210 (e.g., visible light, infrared light, ultraviolet light, or light
in other wavebands
detectable by image capture component 130) from scene 170 may travel through
turbulent air
1220 (e.g., occurring due to mixing of hot air pockets 1230 and cold air
pockets 1240 as
shown in Fig. 12, air flow disturbance around floating particles, or other
natural or man-made
phenomena) before it reaches imaging component 130 of imaging system 100
(e.g., a visible
light and/or infrared video camera). Thus, for example, variation in the
refractive index of
turbulent air 1220 causes the light wavefront 1250 to distort, leading to
degradation and/or
distortion in images captured by imaging system 100, which may particularly be
visible in
outdoor and/or long-distance image acquisition.
Such degradation and/or distortion appearing in captured images due to
atmospheric
turbulence may include, for example, image blurs that randomly vary in space
(e.g., at
different spatial pixel locations) and in time (e.g., from frame to frame),
large-magnitude
shifts/displacements (also referred to as "dancing") of image patches that
also randomly vary
in space and time (i.e., different shifts for different patches and in different
frames), and/or
random geometrical distortion (also referred to as "random warping") of
captured images of
objects. Such degradation and/or distortion may occur in addition to the
random and fixed
pattern noise discussed above and blurring due to camera optics.
In accordance with one or more embodiments of the disclosure, such degradation
and/or distortion appearing in captured images due to atmospheric turbulence
may be
mathematically modeled through randomly varying point-spread functions (PSFs),
treating
each point in an ideal image as being shifted and blurred with a PSF. In one
non-limiting
example for purposes of illustrating various techniques of the disclosure, an
observed noisy,
blurred, and turbulent video z (or a sequence of still images) may be modeled
as follows:
Let y: ℝ² × ℝ → ℝ be the true noise-free, blur-free, and turbulence-free video, x ∈ ℝ² be a spatial coordinate, and t ∈ ℝ be a temporal coordinate. An observed noisy, blurred, and turbulent video z can then be approximately expressed in the linear integral form

$$z(x,t) = \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} y(v-u,t)\, h_{atmo}\big(u - \xi(v,t)\big)\, du \right) h_{lens}(x-v)\, dv + \varepsilon(x,t), \qquad x \in \mathbb{R}^2,\ t \in \mathbb{R}, \quad \text{(Equation 12)}$$

where h_atmo: ℝ² → ℝ and h_lens: ℝ² → ℝ are a pair of atmospheric and optical PSFs, and ξ: ℝ² × ℝ → ℝ² and ε: ℝ² × ℝ → ℝ are random fields. In particular, ξ models the random displacements due to distorted light propagation caused by turbulent air 1220, while ε can model random as well as fixed pattern noise components of imaging system 100.
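For intuition only, the following toy discretization of Equation 12 generates a turbulent observation from a clean frame by blurring with an atmospheric PSF, resampling at v − ξ(v, t) with a spatially smooth random displacement field, blurring with a lens PSF, and adding noise. The Gaussian PSFs, the way ξ is synthesized, and all parameter values are assumptions made for this illustration and are not part of the disclosed method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def simulate_turbulent_frame(y, sigma_atmo=1.5, sigma_lens=1.0,
                             sigma_dancing=2.0, sigma_noise=0.01, rng=None):
    """Toy discretization of the observation model: blur with an atmospheric
    PSF, warp by a smooth random displacement field (the 'dancing' xi), blur
    with the lens PSF, and add noise."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = y.shape

    blurred = gaussian_filter(y, sigma_atmo)              # y * h_atmo

    # Smooth random displacement field xi(v, t), two spatial components.
    xi = rng.normal(0.0, sigma_dancing, size=(2, h, w))
    xi = gaussian_filter(xi, sigma=(0, 8, 8))              # make xi spatially smooth

    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    coords = np.stack([rows - xi[0], cols - xi[1]])        # sample at v - xi(v, t)
    warped = map_coordinates(blurred, coords, order=1, mode='reflect')

    observed = gaussian_filter(warped, sigma_lens)         # convolve with h_lens
    return observed + rng.normal(0.0, sigma_noise, size=(h, w))   # + epsilon
```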
If the randomness of ξ is ignored and ξ ≡ 0 is assumed, the combined effect of h_atmo, h_lens, and ε results in blurred noisy observations which can be filtered as discussed above with reference to Figs. 4 through 8. However, for a random ξ the blur is no longer convolutional. Indeed, ξ causes random displacement of the PSF h_atmo. In particular, if the PSFs and noise are ignored for purposes of discussing ξ (i.e., assume h_atmo = h_lens = δ_0 and ε ≡ 0), ξ can be simply seen as the displacement field that warps y onto z. Since such displacement changes randomly with time, it corresponds to "dancing" visible in turbulent video z. Various techniques discussed below in accordance with one or more embodiments of the disclosure may compensate for such "dancing" due to atmospheric turbulence, for example, by compensating for the randomness of ξ.
For example, Fig. 13 illustrates a process 1300 to suppress distortion and/or
degradation due to atmospheric turbulence in an input video 1301 by
constructing and
operating on spatiotemporal volumes, in accordance with an embodiment of the
disclosure.
For embodiments illustrated with reference to Fig. 13, input video 1301 is
assumed to contain
distortion, degradation, or other effects due to atmospheric turbulence as
discussed above
with reference to Fig. 12, and process 1300 includes additional or alternative
operations to
filter the positions (e.g., the coordinates within input video 1301) of the image blocks (e.g., blocks 1304A-1304E) inside a spatiotemporal volume (e.g., a spatiotemporal volume 1308) and/or to process the spatiotemporal volume (e.g., by performing alpha-rooting or other techniques on the coefficients of the 3-D spectrum obtained by transforming spatiotemporal volume 1308) to compensate for the randomness of ξ, as further discussed herein, but may
otherwise be similar to process 500.
Thus, at operation 1306 of process 1300, spatiotemporal volume 1308 may be
constructed by tracking similar blocks (e.g., patches) 1304A through 1304E in
a sequence of
video image frames 1302 of input video 1301, as discussed above for operation
410 of
process 400 and operation 506 of process 500. In this regard, motion is
defined through an
overcomplete block-wise tracking, in contrast to some conventional techniques
that define a
deformation field or an optical flow that would track the motion of each pixel
in time. That
is, according to embodiments of the disclosure, each pixel simultaneously
belongs to multiple
spatiotemporal volumes, each of which can follow different trajectories and is
subject to
separate trajectory smoothing and spectral filtering operations as further
described below. As
such, each pixel in an output video of process 1300 is obtained from combining
various
pixels from different original positions in input video 1301, and each pixel
in the input
turbulent video may follow multiple trajectories and contribute to multiple
pixels at different
positions in the output video.
In some embodiments, such tracking may be based on a multiscale motion
estimation.
As an example for such embodiments, matching and tracking of similar image
blocks may be
performed in a coarse-to-fine manner, such that the matching and tracking of
image blocks
may start at a coarse scale (e.g., large blocks/patches) and be repeated at finer and finer scales (e.g., smaller blocks/patches), with matches obtained for a coarser scale
being used as
prediction for a finer scale where the match results may be refined. In this
manner, matching
and tracking of similar image blocks may be performed even in the presence of
deformations
and blur (e.g., due to atmospheric turbulence) and heavy noise in input video
1301, thereby
effectively tracking moving objects (e.g., including parts of objects enclosed
in a block) as
well as stationary objects, for example.
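Such a coarse-to-fine matching step might look like the following sketch, which reuses the match_block helper from the earlier example; the number of scales, the interpolation used for downscaling, and the way a coarse match is propagated as a prediction to the next scale are all assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import zoom

def coarse_to_fine_match(prev_frame, next_frame, position, block_size=8,
                         num_scales=3, search_radius=4):
    """Coarse-to-fine block matching: estimate the displacement of the block
    at `position` on downscaled frames first, then refine it scale by scale."""
    y, x = position
    displacement = np.zeros(2)               # kept in full-resolution pixels
    for s in reversed(range(num_scales)):    # coarsest scale first
        factor = 1.0 / (2 ** s)
        prev_s = zoom(prev_frame, factor, order=1)
        next_s = zoom(next_frame, factor, order=1)
        ys, xs = int(y * factor), int(x * factor)
        ref = prev_s[ys:ys + block_size, xs:xs + block_size]
        if ref.shape != (block_size, block_size):
            continue                         # block falls off the edge at this scale
        predicted = (int(ys + displacement[0] * factor),
                     int(xs + displacement[1] * factor))
        matched = match_block(next_s, ref, predicted, search_radius)
        # Convert the match at this scale back to a full-resolution displacement.
        displacement = np.array([matched[0] - ys, matched[1] - xs]) / factor
    return (int(round(y + displacement[0])), int(round(x + displacement[1])))
```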
In addition to constructing spatiotemporal volumes by extracting and stacking
together the contents of image blocks along the tracked motion trajectory at
operation 1306,
process 1300 at operation 1309 includes extracting and filtering the positions
(e.g.,
coordinates) of the image blocks along the tracked motion to compensate for
shifts or
"dancing" (e.g., the randomness of due to atmospheric turbulence. Such
filtering of the
positions may also be referred to herein as "trajectory smoothing."
Referring also to Fig. 14, an example result of trajectory smoothing is
illustrated in
accordance with an embodiment of the disclosure. In Fig. 14, a trajectory 1402
(also referred
to as extracted trajectory 1402) is drawn to connect the positions of example
image blocks
(shown as shaded blocks in Fig. 14) that are extracted along the tracked
motion trajectory
from a sequence of input video image frames at operation 1306. As shown,
extracted
trajectory 1402 may appear jagged due to random displacements or shifts (e.g.,
modeled by
ξ in equation 12 above) caused by turbulent air 1220. According to embodiments
of the
disclosure, the positions (coordinate x) of such extracted image blocks may be
modeled as:
$$\tilde{x}(t) = x(t) + \xi(t), \quad \text{(Equation 13)}$$

where t is the temporal coordinate, x̃(t) is the observed position of a block extracted from turbulent input video 1301 at t, x(t) is the unknown position of the block in an ideal video without turbulence, and ξ(t) is the spatial displacement (e.g., dancing) of the block due to
atmospheric turbulence, which is treated as a zero-mean random variable (i.e.,
a position
noise).
According to various embodiments, the size of image blocks/patches may be
selected
(e.g., a block size of 8x8 pixels chosen in some embodiments) such that the
random
displacement ξ(t) may capture spatial low/medium-frequency components of the
effects
due to atmospheric turbulence. Thus, for example, positions x(t) recovered
from the
observed positions x̃(t) of the extracted image blocks may represent the
positions of those
image blocks with low/medium-frequency components of atmospheric turbulence
effects
suppressed. In the example of Fig. 14, the non-shaded blocks represent image
blocks that are
repositioned at such recovered positions x(t), with a trajectory 1406 (also
referred to as
smoothed trajectory 1406) drawn to connect the recovered positions x(t). As
shown in
Fig. 14, the example smoothed trajectory 1406 may appear smooth due to the
suppression of
low/medium-frequency components of random displacements or shifts in input
video 1301
caused by atmospheric turbulence.
In one or more embodiments, such smoothing (e.g., filtering or reconstruction)
of the
trajectory of the extracted image blocks at operation 1309 may comprise
performing a
regression analysis (e.g., to recover the unknown positions x(t) from the
observed
positions x̃(t) of the extracted image blocks from turbulent input video 1301),
which may be
an adaptive-order regression analysis in some embodiments. In such
embodiments, the
strength of the smoothing may be adaptive with respect to the amount of
turbulence and the
complexity (e.g., the order) of the underlying smooth trajectory.
Thus, for example, the adaptive-order regression according to some embodiments
allows approximation of not only fixed stationary positions and linear motions
with uniform
velocity, but also more complex accelerated motions. In other words, in
contrast to some
conventional trajectory-smoothing approaches based on a uniform-motion
approximation
within a short temporal interval, the model of the trajectories according to
some embodiments
of the disclosure is full rank and the smoothing is adaptive, thereby
permitting arbitrarily long
spatiotemporal volumes to be constructed and very complex motion patterns to
be captured
and smoothed. Trajectory smoothing according to one or more embodiments of the
disclosure can be used to suppress not only random dancing due to turbulence,
but also more
common problems such as camera shake or jitter.
In an implementation example according to some embodiments of the disclosure,
the
regression analysis may be performed on a complex number representation of the
image
block positions (e.g., coordinates), treating the spatiotemporal trajectories
(e.g., the observed
positions x̃(t) and the unknown positions x(t)) as curves in a complex plane.
In this implementation example, the observed positions x̃(t) of the extracted image blocks from turbulent input video 1301 may be represented as a complex variable:

$$\tilde{x}(t) = \tilde{x}_1(t) + i\,\tilde{x}_2(t) = x_1(t) + i\,x_2(t) + \xi_1(t) + i\,\xi_2(t). \quad \text{(Equation 14)}$$

Then, an adaptive-order regression analysis of the curves in a complex plane may be performed as follows, for example:

Let N_t be the temporal length (e.g., the length or the number of video frames along which the motion of an image block is tracked) of a spatiotemporal volume and let P_{N_t} = {p_k}, k = 0, ..., N_t − 1, be a basis composed of N_t complex monomials of the form

$$p_k = b^{k} \quad \text{(powers applied element-wise)},$$

where

$$b = \left[\tfrac{1}{N_t}, \tfrac{2}{N_t}, \ldots, \tfrac{N_t}{N_t}\right]^{T} \in \mathbb{C}^{N_t}$$

is a complex vector of length N_t.

Further, let P⊥_{N_t} = {p⊥_k}, k = 0, ..., N_t − 1, be the orthonormal basis constructed by Gram-Schmidt orthonormalization of P_{N_t}. Any trajectory x̃(t), t = 1, ..., N_t, of length N_t can be represented as a linear combination of {p⊥_k} with complex coefficients {a_k}, k = 0, ..., N_t − 1:

$$\tilde{x}(t) = \sum_{k=0}^{N_t-1} a_k\, p^{\perp}_k(t), \quad t = 1, \ldots, N_t,$$

where

$$a_k = \sum_{t=1}^{N_t} \tilde{x}(t)\, p^{\perp}_k(t), \quad k = 0, \ldots, N_t - 1. \quad \text{(Equation 15)}$$
Filtering of the positions x̃(t) of image blocks along the extracted trajectory (e.g., extracted trajectory 1402) may be performed on this representation of the positions x̃(t) to perform trajectory smoothing (e.g., to obtain a smoothed trajectory, such as smoothed trajectory 1406). In this regard, in one or more embodiments, an adaptive approximation of x̃(t) may be performed with respect to the orthonormal basis p⊥_k using only the most significant coefficients a_k. For example, the smoothed approximation x̂ of x̃(t) may be obtained as

$$\hat{x}(t) = \sum_{k=0}^{N_t-1} \hat{a}_k\, p^{\perp}_k(t), \quad t = 1, \ldots, N_t, \quad \text{(Equation 16)}$$

where the coefficients â_k are defined by shrinkage of the coefficients a_k. Thus, in one or more embodiments, hard thresholding may be performed on the coefficients a_k to obtain the coefficients â_k for the smoothed approximation x̂ of x̃(t). In one specific example, the shrinkage of the coefficients a_k by hard thresholding may be performed as follows:

$$\hat{a}_k = \begin{cases} a_0 & \text{if } k = 0, \\ a_k & \text{if } k > 0 \text{ and } |a_k| \geq \lambda_{traj}\,\sigma_{traj}\sqrt{2\ln(N_t)}, \\ 0 & \text{if } k > 0 \text{ and } |a_k| < \lambda_{traj}\,\sigma_{traj}\sqrt{2\ln(N_t)}, \end{cases} \quad \text{(Equation 17)}$$

where σ_traj is the standard deviation of the turbulence displacement ξ and λ_traj ≥ 0 is a (real-valued) smoothing parameter.
Thus, the hard thresholding, and in turn the strength of the trajectory
smoothing, is
adaptive with respect to the amount of turbulence (e.g., depending on σ_traj)
and the
complexity (e.g., depending on the order N_t for the term √(2 ln(N_t))) of the
underlying
smooth trajectory, which, as discussed above, allows capturing and smoothing
of more
complex motions unlike conventional techniques. In some embodiments, the
standard
deviation of the turbulence displacement σ_traj may be estimated online using
turbulent input
video 1301 (e.g., by calculating MAD as discussed above for operation 406 of
process 400),
determined offline (e.g., using reference data or videos), and/or otherwise
provided for
process 1300. In some embodiments, the smoothing parameter λ_traj may be
estimated
online using turbulent input video 1301, determined offline, and/or otherwise
provided for
process 1300.
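Under the reconstruction of Equations 14 through 17 given above, trajectory smoothing might be sketched as follows; the use of a QR factorization in place of explicit Gram-Schmidt orthonormalization and the default value of λ_traj are implementation assumptions.

```python
import numpy as np

def smooth_trajectory(observed_positions, sigma_traj, lambda_traj=1.0):
    """Adaptive-order trajectory smoothing: represent the observed block
    positions as a complex-valued curve, expand it on an orthonormal
    polynomial basis, hard-threshold the coefficients, and reconstruct
    the smoothed positions.

    observed_positions: list of (row, col) tuples along the trajectory.
    sigma_traj: standard deviation of the turbulent displacement.
    lambda_traj: non-negative smoothing parameter.
    """
    pos = np.asarray(observed_positions, dtype=float)
    x_tilde = pos[:, 0] + 1j * pos[:, 1]            # complex representation
    Nt = len(x_tilde)

    # Monomial basis p_k(t) = (t / Nt)**k, t = 1..Nt, orthonormalized via a
    # QR factorization (numerically equivalent to Gram-Schmidt).
    t = np.arange(1, Nt + 1) / Nt
    monomials = np.vander(t, Nt, increasing=True)   # columns are t**k
    basis, _ = np.linalg.qr(monomials)              # orthonormal columns

    coeffs = basis.T @ x_tilde                      # coefficients a_k

    # Hard thresholding of the coefficients; the order-zero term is always
    # kept so the mean position of the trajectory is preserved.
    threshold = lambda_traj * sigma_traj * np.sqrt(2.0 * np.log(Nt))
    keep = np.abs(coeffs) >= threshold
    keep[0] = True
    coeffs_hat = np.where(keep, coeffs, 0.0)

    x_hat = basis @ coeffs_hat                      # smoothed complex curve
    return np.column_stack([x_hat.real, x_hat.imag])
```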
Turning to operation 1310, a 3-D transform to the constructed spatiotemporal
volumes
may be performed to obtain corresponding 3-D spectra, in a similar manner as
discussed
above for operation 510 of process 500. At operation 1312, process 1300 may
process,
manipulate, modify, or otherwise operate on the coefficients of the 3-D spectra to
reduce or otherwise suppress at least some effects of atmospheric turbulence
(e.g., blurring
due to atmospheric turbulence). As discussed above for operations 1306 and
1309, spatial
low/mid-frequency components of the effects of atmospheric turbulence on input
video 1301
may be suppressed by the block-wise motion tracking and trajectory smoothing
according to
various embodiments of the disclosure. Thus, for example, some higher spatial
frequency
components of the atmospheric turbulence effects may remain within the content
of each
image block, which may be suppressed by operating on the coefficients of 3-D
spectra at
operation 1312.
In some embodiments, alpha-rooting (also referred to as a -rooting) of the 3-D
spectrum coefficients may be performed to suppress high-frequency components
of the
atmospheric turbulence effects (e.g., blurring) in input video 1301. For
example, softening
alpha-rooting may be performed on the spatial features that are different
among different
blocks in a same spatiotemporal volume. As discussed above with reference to
Fig. 7, spatial
features that are different among different blocks in a same spatiotemporal
volume may be
modified by modifying the temporal-AC coefficients (e.g., the coefficients of
AC co-volume
706), and thus the softening alpha-rooting may be performed on the temporal-AC
coefficients
to suppress high-frequency components of the atmospheric turbulence effects
according to
one or more embodiments.
In some embodiments, alpha-rooting of the 3-D spectrum coefficients at
operation
1312 may also include sharpening alpha-rooting of the spatial features common
to all image
blocks in a same spatiotemporal volume to sharpen those spatial features,
which may be
performed by operating on the temporal-DC coefficients (e.g., the coefficients
of DC plane
704). Thus, in such embodiments, contents of the image blocks may be
sharpened, while at
the same time suppressing the higher frequency components of atmospheric
turbulence
effects (e.g., blurring) in image blocks. As may be appreciated, performing
different types of
filtering, enhancements, or other modifications on different spectral
dimensions as in this
example is advantageously facilitated by the structure of the constructed
spatiotemporal
volumes and the corresponding 3-D spectra.
As a specific example according to one or more embodiments, 3-D spectrum
coefficients θ(i) may be modified by taking the alpha-root of their magnitude for some α > 0, thus modifying the differences both within and between the grouped blocks:

$$\theta_{\alpha}(i) = \begin{cases} \operatorname{sign}\big(\theta(i)\big)\,|\theta(0)|\left|\dfrac{\theta(i)}{\theta(0)}\right|^{1/\alpha} & \text{if } \theta(0) \neq 0, \\ \theta(i) & \text{if } \theta(0) = 0, \end{cases} \quad \text{(Equation 18)}$$

where θ_α(i) is the resulting alpha-rooted 3-D spectrum coefficient, θ(i) is the i-th coefficient of the 3-D spectrum, and θ(0) is the spatiotemporal DC component of the spectrum. In this representation of alpha-rooting, a value α > 1 results in an amplification of the differences, thus sharpening, whereas α < 1 results in an attenuation of the differences, thus softening. The neutral value α = 1 leaves the coefficients unchanged. By employing α > 1 for the temporal-DC coefficients (the coefficients in the temporal DC plane) and α < 1 for the temporal-AC coefficients (the coefficients in the temporal AC co-volume), spatial sharpening of the contents of the image blocks may be achieved while at the same time producing temporal softening (which can be considered as an adaptive nonlinear alternative to the conventional temporal averaging). Since block-wise tracking to construct the spatiotemporal volumes is overcomplete and of variable length as discussed above in connection with operation 1306, the alpha-rooting or other operation to modify, manipulate, or otherwise process the 3-D spectrum coefficients is also overcomplete and adaptive.
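A sketch of this alpha-rooting is given below, assuming the 3-D spectrum is stored as an array with the temporal axis last, so that the temporal-DC plane is the slice at temporal index 0; the particular α values are placeholders rather than values prescribed by the disclosure.

```python
import numpy as np

def alpha_root(spectrum, alpha_dc=1.5, alpha_ac=0.8):
    """Alpha-rooting of a real-valued 3-D spectrum: coefficients in the
    temporal-DC plane use alpha > 1 (sharpening), the remaining temporal-AC
    coefficients use alpha < 1 (softening)."""
    theta0 = spectrum[0, 0, 0]                  # spatiotemporal DC component
    if theta0 == 0:
        return spectrum.copy()                  # degenerate case: leave unchanged

    alpha = np.full(spectrum.shape, alpha_ac)
    alpha[:, :, 0] = alpha_dc                   # temporal-DC plane

    ratio = np.abs(spectrum / theta0)
    rooted = np.sign(spectrum) * np.abs(theta0) * ratio ** (1.0 / alpha)
    rooted[0, 0, 0] = theta0                    # the DC term itself is preserved
    return rooted
```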
In some embodiments, operation 1312 may also include adaptive shrinking of the
coefficients of the 3-D spectrum to suppress noise in a similar manner as
described for
operation 512 of process 500. In such embodiments, the 3-D spectrum coefficients θ(i) to be
alpha-rooted as discussed in the preceding paragraph may be coefficients that
have already
been shrunk by the adaptive shrinking according to one or more embodiments of
the
disclosure.
In some embodiments, operation 1312 may include performing known bispectrum
imaging (also referred to as speckle imaging) and/or lucky imaging techniques
appropriately
modified to work within the framework of process 1300, in addition to or in
place of the
alpha-rooting. In some embodiments, operation 1312 may also include further
deblurring of
the contents of the spatiotemporal volumes. For example, blind deblurring can
be applied
after the alpha-rooting to further sharpen the contents of the image blocks in
the
spatiotemporal volumes.
At operation 1314, the inverse of the 3-D transform may be applied to the 3-D
spectra
to obtain spatiotemporal volumes with higher-frequency contents of atmospheric
turbulence
effects suppressed (e.g., by alpha-rooting and/or otherwise operating on the 3-
D spectrum
coefficients at operation 1312), in a similar manner as described above for
operation 514 of
process 500. It is also contemplated for some embodiments that operation 1312
be performed
on spatiotemporal volumes (e.g., spatiotemporal volume 1308) instead of the 3-
D spectra
obtained from the spatiotemporal volumes. In such embodiments, applying a 3-D
transform
and its inverse at operations 1310 and 1314 may be omitted.
At operation 1316, image blocks from the spatiotemporal volumes after the
inverse 3-
D transform may be aggregated to generate output images (e.g., output video
416) in which
distortion and/or degradation in input images (e.g., input video 1301/401) due
to atmospheric
turbulence are reduced, removed, or otherwise suppressed. In contrast to
operation 414/516
of process 400/500 where the image blocks for the spatiotemporal volumes are
aggregated
according to their original observed positions (e.g., observed positions x̃(t)),
the image
blocks are aggregated at operation 1316 according to the filtered positions (e.g., approximated positions x̂(t)) obtained by trajectory smoothing at operation 1309.
Thus, in the output images the random displacement ξ(t) due to atmospheric
turbulence may be suppressed. More specifically, spatial low/medium-frequency
components may be suppressed by aggregating according to the filtered image
block
positions obtained by trajectory smoothing, and higher spatial frequency
components that
remain within image blocks may be suppressed by alpha-rooting or other
modifications of the
3-D spectral coefficients, as discussed above for operations 1309 and 1312.
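The final aggregation differs from the earlier aggregation sketch only in where the blocks are written back; a minimal variant, assuming one smoothed (row, col) pair and one frame index per block, is shown below.

```python
import numpy as np

def aggregate_at_smoothed_positions(volumes, smoothed_trajectories,
                                    frame_indices, frame_shape, num_frames):
    """Aggregate processed blocks at their smoothed (filtered) positions so
    that the low/medium-frequency 'dancing' is removed from the output."""
    acc = np.zeros((num_frames,) + tuple(frame_shape))
    wgt = np.zeros_like(acc)
    for vol, traj, frames in zip(volumes, smoothed_trajectories, frame_indices):
        bs = vol.shape[0]
        for n, ((yf, xf), t) in enumerate(zip(traj, frames)):
            # Round the smoothed coordinates and keep the block inside the frame.
            y = int(round(min(max(yf, 0), frame_shape[0] - bs)))
            x = int(round(min(max(xf, 0), frame_shape[1] - bs)))
            acc[t, y:y + bs, x:x + bs] += vol[:, :, n]
            wgt[t, y:y + bs, x:x + bs] += 1.0
    wgt[wgt == 0] = 1.0
    return acc / wgt
```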
Such complementary roles of trajectory smoothing and spectral coefficient
modification such as alpha-rooting may also be described as follows, for
example:
Referring again to Equation 12 above, consider a square block B_{x,t} ⊂ ℝ² × ℝ of fixed size N_b × N_b × 1 centered at position x in space and at instant t in time. The average displacement N_b^{-2} ∫_{B_{x,t}} ξ(v, t) dv naturally corresponds to the shift between y(B_{x,t}, t) and z(B_{x,t}, t). The trajectory smoothing identifies and compensates such shift by leveraging the fact that the optical flow in the true video y should consist of smooth trajectories, whereas, because of the randomness of ξ, the optical flow in z is random.

Due to the low-pass effect given by the integral N_b^{-2} ∫_{B_{x,t}} ξ(v, t) dv, the average displacement captures only a low spatial frequency band of the spectrum of ξ. This means that high spatial frequency bands of ξ are active within each block z(B_{x,t}, t). Appropriate alpha-rooting can be applied as discussed above to compensate for such high-frequency turbulence, in effect mimicking, within each spatiotemporal volume, a temporal averaging and spatial deblurring approach.
Therefore, according to one or more embodiments of the disclosure, process
1300
may be performed by an imaging system (e.g., system 100) to suppress
distortion,
degradation, or other effects caused by atmospheric turbulence (e.g., by
turbulent air 1220)
appearing in images (e.g., input video 1301) captured by the imaging system.
According to
embodiments of the disclosure, a sequence of images (e.g., a sequence of video
frames) may be
processed as a (possibly redundant and/or overlapping) collection of image
blocks (or image
patches). In this way, block-transform-based processing may be applied to
input images
corrupted by atmospheric turbulence, where each block (or patch) is tracked
along time
forming a spatiotemporal volume. After processing of the spatiotemporal volume
in the
transform domain, image blocks/patches may be moved to new estimated positions
to
suppress random displacement ("dancing") of the image blocks/patches. The
blurring effect
of atmospheric turbulence may be suppressed in the transform domain,
whereas the random
displacement may be suppressed by regression analysis on original observed
positions of
image blocks/patches in the spatiotemporal volumes.
According to various embodiments, an imaging system configured to perform
process
1300 may include an infrared camera (e.g., including a thermal infrared
camera), mobile
digital cameras, video surveillance systems, satellite imaging systems, or any
other device or
system that can benefit from atmospheric turbulence suppression in captured
images. For
example, systems and methods to suppress atmospheric turbulence in images
according to
various embodiments of the disclosure may be beneficial for obtaining
acceptable quality
video output from sensors aimed at imaging from long distances. Furthermore,
the
techniques to suppress atmospheric turbulence in images according to various
embodiments
of the disclosure may not only be beneficial for obtaining a quality
video/image output, but
also for effective operations of video/image processing operations such as
detection,
segmentation, target identification, target tracking, scene interpretation, or
other higher-level
operations that can be impaired by atmospheric turbulence effects in captured images.
An example result of methods and systems to suppress atmospheric turbulence in
images is illustrated by Figs. 15A-15B, in accordance with an embodiment of
the disclosure.
Fig. 15A shows an example raw image (e.g., a frame of input video 1301)
captured by a
thermal infrared camera, whereas Fig. 15B shows an example processed image
obtained by
suppressing the distortion, degradation, or other effects of atmospheric
turbulence in the
example raw image of Fig. 15A in accordance with an embodiment of the
disclosure.
Compared with the example raw image in Fig. 15A, the example processed image
obtained
according to the atmospheric turbulence suppression techniques of the
disclosure shows
much more details and overall improvement of image quality. For example, text
1502 on an
object in the processed image of Fig. 15B is in focus and can be read, which
is not possible in
the raw image of Fig. 15A.
Where applicable, various embodiments provided by the present disclosure can
be
implemented using hardware, software, or combinations of hardware and
software. Also
where applicable, the various hardware components and/or software components
set forth
herein can be combined into composite components comprising software,
hardware, and/or
both without departing from the spirit of the present disclosure. Where
applicable, the
various hardware components and/or software components set forth herein can be
separated
into sub-components comprising software, hardware, or both without departing
from the
spirit of the present disclosure. In addition, where applicable, it is
contemplated that software
components can be implemented as hardware components, and vice-versa.
Software in accordance with the present disclosure, such as non-transitory
instructions, program code, and/or data, can be stored on one or more non-
transitory machine
readable mediums. It is also contemplated that software identified herein can
be
implemented using one or more general purpose or specific purpose computers
and/or
computer systems, networked and/or otherwise. Where applicable, the ordering
of various
steps described herein can be changed, combined into composite steps, and/or
separated into
sub-steps to provide features described herein.
Embodiments described above illustrate but do not limit the invention. It
should also
be understood that numerous modifications and variations are possible in
accordance with the
principles of the invention.
Administrative Status


Event History

Description Date
Inactive: IPC expired 2024-01-01
Common Representative Appointed 2021-11-13
Grant by Issuance 2020-03-24
Inactive: Cover page published 2020-03-23
Inactive: Final fee received 2020-01-30
Pre-grant 2020-01-30
Common Representative Appointed 2019-10-30
Common Representative Appointed 2019-10-30
Notice of Allowance is Issued 2019-08-12
Letter Sent 2019-08-12
Notice of Allowance is Issued 2019-08-12
Inactive: QS passed 2019-08-08
Inactive: Approved for allowance (AFA) 2019-08-08
Amendment Received - Voluntary Amendment 2019-07-08
Examiner's Interview 2019-06-14
Letter Sent 2019-06-06
All Requirements for Examination Determined Compliant 2019-05-30
Advanced Examination Requested - PPH 2019-05-30
Advanced Examination Determined Compliant - PPH 2019-05-30
Amendment Received - Voluntary Amendment 2019-05-30
Request for Examination Received 2019-05-30
Request for Examination Requirements Determined Compliant 2019-05-30
Change of Address or Method of Correspondence Request Received 2018-01-10
Inactive: IPC deactivated 2017-09-16
Inactive: IPC assigned 2017-01-01
Inactive: Cover page published 2016-12-28
Inactive: IPC assigned 2016-12-12
Inactive: IPC assigned 2016-12-12
Inactive: First IPC assigned 2016-12-05
Inactive: IPC assigned 2016-12-05
Inactive: Notice - National entry - No RFE 2016-11-25
Inactive: IPC assigned 2016-11-23
Application Received - PCT 2016-11-23
National Entry Requirements Determined Compliant 2016-11-14
Application Published (Open to Public Inspection) 2015-11-26

Abandonment History

There is no abandonment history.

Maintenance Fee

The last payment was received on 2019-04-16

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type Anniversary Year Due Date Paid Date
Basic national fee - standard 2016-11-14
MF (application, 2nd anniv.) - standard 02 2017-05-23 2017-04-13
MF (application, 3rd anniv.) - standard 03 2018-05-22 2018-04-16
MF (application, 4th anniv.) - standard 04 2019-05-22 2019-04-16
Request for examination - standard 2019-05-30
Final fee - standard 2020-02-12 2020-01-30
MF (patent, 5th anniv.) - standard 2020-05-22 2020-04-24
MF (patent, 6th anniv.) - standard 2021-05-25 2021-04-21
MF (patent, 7th anniv.) - standard 2022-05-24 2022-04-25
MF (patent, 8th anniv.) - standard 2023-05-23 2023-04-20
MF (patent, 9th anniv.) - standard 2024-05-22 2024-04-24
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
FLIR SYSTEMS, INC.
NOISELESS IMAGING OY LTD.
Past Owners on Record
ALESSANDRO FOI
ENRIQUE SANCHEZ-MONGE
PAVLO MOLCHANOV
VLADIMIR KATKOVNIK
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD .



Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Description 2016-11-13 39 2,262
Drawings 2016-11-13 14 2,340
Representative drawing 2016-11-13 1 22
Abstract 2016-11-13 1 76
Claims 2016-11-13 5 152
Claims 2019-05-29 4 148
Description 2019-07-07 39 2,227
Representative drawing 2020-02-23 1 11
Maintenance fee payment 2024-04-23 47 1,968
Notice of National Entry 2016-11-24 1 193
Reminder of maintenance fee due 2017-01-23 1 113
Acknowledgement of Request for Examination 2019-06-05 1 175
Commissioner's Notice - Application Found Allowable 2019-08-11 1 163
International search report 2016-11-13 2 67
National entry request 2016-11-13 3 80
Request for examination 2019-05-29 2 46
PPH request 2019-05-29 11 287
PPH supporting documents 2019-05-29 2 173
Interview Record 2019-06-13 1 15
Amendment 2019-07-07 5 124
Final fee 2020-01-29 1 35