PIPELINE PROCESSOR FOR MEDICAL AND
BIOLOGICAL IMAGE ANALYSIS
FIELD OF THE INVENTION
The present invention relates to processor
architectures, and more particularly to a pipeline
processor for medical and biological image analysis.
BACKGROUND OF THE INVENTION
In the fields of medicine and biology, it is
commonplace for evaluation and diagnosis to be assisted by
the visual inspection of images. For example, x-ray
photographs can be used to assess bone or soft tissue, or
microscopic images of cells can be used to identify disease
and estimate prognosis. If these images are further
transformed into their digital representations, then
computing machines can be used to enhance the image
clarity, identify key components, or even complete an
automated evaluation. These images tend to be highly
complex and rich in information, and place high demands on
computing machines.
To permit the development of commercially viable
image analysis systems for medicine or biology, the designer
is faced with the task of performing large numbers of
calculations on these images in as short a time as
possible. This requirement leads to the use of computer
pipeline processing.
Fig. 1 shows the principal processing steps
according to classical image analysis. In classical image
analysis, the first step 1 involves digitizing a visual
field 10 into a spatially-discrete set of elements or
pixels 12 (shown individually as 12a and 12b in Fig. 1).
Each pixel 12 comprises a digital representation of the
integrated light intensity within that spatially-discrete
region of the visual scene 10. After the digitization
step, the image 12 is segmented in a processing step 2 that
aims to separate the regions or objects of interest 14a,
14b in the image from the background. Segmentation is
often a difficult and computationally intensive task. Next
in step 3, each distinct object 14 located in the
segmentation phase is uniquely labeled as object "y" (16a)
and object "x" (16b) so that the identity of the object 16
can be recovered. Next in step 4, a series of mathematical
measurements 18 are calculated for each object 16 which
encapsulate the visual appearance of each object as a set
of numerical quantities. In the art, this step is known as
feature extraction. The last operation in step 5 involves
classifying the extracted features 18 using any one of a
variety of known hierarchical classification algorithms to
arrive at a classification or identification of the
segmented objects.
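
By way of illustration, these five steps can be sketched in a few
lines of software. The following Python fragment is a minimal sketch
only: the global threshold, the two features and the size-based
classification rule are invented for the example and do not
correspond to the particular methods described later in this
specification.

    import numpy as np
    from scipy import ndimage

    def classical_analysis(image):
        # Step 2: segmentation by a simple global threshold (value assumed)
        mask = image > 128
        # Step 3: labelling - each connected region receives a unique number
        labels, num_objects = ndimage.label(mask)
        results = []
        for obj_id in range(1, num_objects + 1):
            region = labels == obj_id
            # Step 4: feature extraction - area and mean intensity as examples
            area = int(region.sum())
            mean_intensity = float(image[region].mean())
            # Step 5: a toy rule standing in for a hierarchical classifier
            kind = "small object" if area < 50 else "large object"
            results.append((obj_id, area, mean_intensity, kind))
        return results

    # Step 1: a synthetic 8-bit "digitized" field in place of a camera image
    field = (np.random.rand(64, 64) * 255).astype(np.uint8)
    print(classical_analysis(field))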
It will be appreciated by those skilled in the
art that, in classical image analysis, the task of
transforming the visual scene into its numerical
representation as objects with features (18a and 18b in
Fig. 1) typically requires the most computing effort.
Fortunately, the discrete nature of the digital images
produced in the image analysis procedure, coupled with a
judiciously designed series of operational tasks, lends the
procedure to the use of a high-speed pipeline architecture
for the overall analysis engine.
In pipeline processing, the computation is
divided into a number of computational operations, much
like an assembly line of computer operations. In the
context of image processing, if a complex set of image
processing tasks can be broken down into a set of smaller
tasks that can be performed serially, and if the raw data
for processing can be broken down as well, then a pipeline
architecture can be implemented. In the context of image
analysis, the raw data comprises the digitized image 12
(Fig. 1). For pipelining, the operations must be capable of
being performed serially, on one small portion of the
image at a time, and cannot require any more information
than is available when that portion of the image reaches
the task in question. In practical systems, it is possible to use
sets of delay elements so that any one pipeline operational
task may have access to rather more information. Generally
speaking, the pipeline processor is able to provide an
increase in processing speed that is proportional to the
length of the pipeline.
In pipeline processing, two approaches have
emerged: coarse-grained pipelining 20 and fine-grained
pipelining 30 as shown in Fig. 2. In coarse-grained
pipelining, the image processing task is broken into rather
large operational blocks 22a, 22b, where each of the blocks
22 comprises a complex set of operations. The coarse-
grained approach utilizes higher-level computational
structures to successfully perform the computing tasks.
Each of the higher-level computational structures 22 can
themselves be pipelined, if the operations at that level
lend themselves to pipelining. In the fine-grained
approach, the computational task is broken down into a
fundamental set of logical operation blocks: AND 32a, OR
32b, NAND 32c, NOR 32d, NOT 32e and XOR 32f. Fine-grained
pipeline architectures are the most difficult to design in
general but offer an approach to the theoretical maximum
rate of operation.
In the art, attempts have been made to capitalize
on the speed and power of computer pipeline operations as
applied to cytological image analysis. Johnston et al. in
published PCT Patent Application No. WO 93/16438 discloses
an apparatus and method for rapidly processing data
sequences. The Johnston system comprises a massively
parallel architecture with a distributed memory. The
results of each processing step are passed to the next
processing stage by a complex memory sharing scheme. In
addition, each discrete or atomic operation must process an
entire frame before any results are available for following
operations. Total processing time is on the order of the
time to process one image frame multiplied by the number of
atomic operations. Thus, the number of image storage and
processing elements represents the total processing time
divided by the image scan time. In order to reduce the
hardware requirements, it is necessary to use high speed
logic and memory.
Another problem in the art involves the
interpretation of the boundary regions of digital images.
The boundary regions present a special problem because the
objects can be cut off or distorted. In practice,
overlapping digitization procedures eliminate this
difficulty at the global level. Nevertheless, it is
necessary for the computer system to realize the proper
boundary of the image in order to restrict operations in
this special area. Looking at the Johnston system, the
system software would have to be recompiled with new
constants. In the alternative, image size variables would
need to be supplied to each processor, requiring these
variables to be checked after every pixel operation.
Pipeline processors are typically required in
applications where data must be processed at rates of
approximately 200 million bits of digital information per
second. When this information is to be processed for image
analysis, the number of operations required may easily
approach 50 billion per second. Accordingly, a practical
pipeline processor must be able to handle such data
volumes.
Accordingly, there remains a need in the art for
a pipeline architecture suitable for image analysis in the
medical and biological imaging fields.
SUMMARY OF THE INVENTION
The present invention provides a pipeline
processor architecture suitable for medical and biological
image analysis. The pipeline processor comprises fine-
grained processing pipelines wherein the computational
tasks are broken down into fundamental logic operations.
It is a feature of the pipeline processor
according to the invention that the outputs from each
processing stage directly connect to the inputs of the next
stage. In addition, each atomic operation requires only one
row and 3 clock cycles from input to output. Thus, the
total amount of storage for intermediate results is greatly
reduced. In addition, most of the storage elements can be
of the form of data shift registers, eliminating the need
for memory addressing and reducing the memory control
requirements. In addition, the required processing rate is
reduced to the number of pixels divided by the image scan
time. In practice this is approximately two-thirds of the
pixel scan rate, allowing the use of much lower speed, more
reliable, and cheaper hardware, and fewer elements as well.
According to another aspect, the pipeline
processor system utilizes a frame and line synchronization
scheme which allows the boundary of each image to be
detected by a simple logic operation. These
synchronization, or synch, signals are pipelined to each
processing stage. This allows image dimensions to be
determined within the limits of the available storage
during the hardware setup operation.
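
By way of illustration, the boundary-detection logic can be sketched
as follows; the representation of the stream as (pixel, frame
synchronization bit, line synchronization bit) tuples is an
assumption standing in for the 24 data bits plus 2 synchronization
bits carried on each bus.

    def boundary_flags(stream):
        # stream: iterable of (pixel, frame_sync, line_sync) tuples, one
        # per clock, mirroring the 24 data bits plus 2 sync bits per bus.
        prev_frame, prev_line = 0, 0
        for pixel, frame_sync, line_sync in stream:
            new_frame = frame_sync and not prev_frame  # rising edge: frame start
            new_line = line_sync and not prev_line     # rising edge: line start
            yield pixel, new_frame, new_line
            prev_frame, prev_line = frame_sync, line_sync

Because each stage derives the boundary from the pipelined
synchronization bits themselves, no image-size constant needs to be
compiled in or checked after every pixel.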
The pipeline processor also features a debug
buffer. The debug buffer provides a feedback path for
examining the results of the pipeline operations without
modifying normal operation of the pipeline. The debug
buffer is useful for performing system self-checks and
performance monitoring.
In a first aspect the present invention provides
a pipeline processor for processing images, said pipeline
processor comprising: an input stage for receiving an
image; a segmentation pipeline stage coupled to the output
of said input stage, said segmentation pipeline stage
including means for segmenting said image into selected
portions; a feature extraction pipeline stage coupled to
the output of said segmentation pipeline stage, said
feature extraction pipeline stage including means for
associating features with said selected portions; an
output stage for outputting information associated with the
processing of said image; and a controller for controlling
the operation of said pipeline stages.
In another aspect, the present invention provides
in a pipeline processor for processing images comprising an
input stage for receiving an image of a biological
specimen, a segmentation pipeline stage coupled to the
output of the input stage for segmenting said image into
selected portions, a feature extraction pipeline for
associating features with the selected portions, and an
output stage for outputting information associated with the
processing of said image, a hardware organization
comprising: a backplane having a plurality of slots adapted
for receiving cards carrying electronic circuitry; the
cards including a processor card for carrying a control
processor, an output card for carrying a memory circuit for
storing information processed by the pipeline stages and a
communication interface for transferring said information
to another computer, one or more module cards, wherein each
of said module cards includes means for receiving a
plurality of pipeline cards, each of said pipeline cards
comprising modules of the segmentation and feature
extraction pipeline stages; said backplane including bus
means for transferring information and control signals
between said cards plugged into the slots of said
backplane.
BRIEF DESCRIPTION OF THE DRAWINGS
Reference will now be made to the accompanying
drawings which show, by way of example, preferred
embodiments of the present invention, and in which:
Fig. 1 shows the processing steps according to
classical image analysis techniques;
Fig. 2 shows in block diagram two approaches to
pipeline processor architectures;
Fig. 3 shows in block diagram form a pipeline
processor according to the present invention;
Fig. 4 shows the pipeline processor according to
the invention in the context of a high-speed image analysis
system;
Fig. 5 shows the acquisition of spectral images
by the camera sub-system for the image analysis system of
Fig. 4;
Fig. 6 shows the hardware organization of the
pipeline processor according to the present invention;
Fig. 7 shows the data distribution on the custom
backplane of Fig. 6;
Fig. 8 shows a quad card for the hardware
organization of Fig. 6;
Fig. 9 shows the timing for data transfer on the
busses for the backplane;
Fig. 10 shows the input levelling stage of the
pipeline processor in more detail;
Fig. 11 shows a general filter stage for the
segmentation pipeline; and
Fig. 12 shows a general stage of the feature
extraction pipeline in the pipeline processor.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Reference is now made to the drawings and in
particular to Fig. 3, which shows in block diagram form a
pipeline processor 100 according to the invention. The
major sub-systems of the pipeline processor 100 are a
control processor 110, a segmentation pipeline sub-system
120, a feature extraction pipeline sub-system 130, and an
uplink transfer module 140. As described below with
reference to Fig. 6, the sub-systems for the pipeline
processor 100 are carried by a custom backplane 202 and
hardware arrangement 200 according to another aspect of the
invention.
The pipeline processor 100 forms the "front-end"
of a high-speed image processing system 50 as shown in Fig.
4. The pipeline processor 100 performs the pre-processing
of images for a final classification or analysis. The pre-
processing steps include conditioning the image and
segmentation of the image for feature extraction. When
these operations are complete, the transformation of the
digital image into segmented objects with features is
complete and subsequent pattern classification and analysis
can be rapidly concluded. As will be described, the
pipeline processor features a general purpose design that
allows the computational elements to be rapidly
reconfigured or modified.
The high-speed image processing system 50 shown
in Fig. 4 is for the automated evaluation and analysis of
Pap monolayer specimens. The image processing system 50
comprises a camera sub-system 52, a control computer 54, a
host computer 56, and a series of peripheral devices 58.
The control computer 54 provides the overall control for
the system 50 and includes a peripheral control interface
55 for controlling the peripheral devices 58. The
peripheral devices include a bar code reader 58a, a
focussing system 58b, a scanner 58c, and a slide loader
58d. The peripheral devices 58 do not form part of the
present invention and are described merely to provide an
overview of the image processing system 50. The focussing
system and the slide loader are the subject of pending
application Nos. CA96/00476 (filed July 18, 1996) and
CA96/00475 (filed July 18, 1996), respectively. The host
computer 56 includes a communication interface 57 for
receiving processed data from the uplink transfer module
140 in the pipeline processor 100. The principal function
of the host computer 56 is to classify the processed data
according to a classification algorithm. As shown, the
control 54 and host 56 computers are linked by a serial
RS232 communication link 59. The control computer 54 is
responsible for the overall direction of the pipeline
processor 100 and the image analysis system 50 (i.e. the
cytological instrument).
It will be appreciated that while the pipeline
processor 100 is described in the context of an image
analysis system 50 for detecting precursors to cervical
cancer in cytological specimens prepared by standard
monolayer techniques and stained in accordance with usual
laboratory procedures, the pipeline processor 100 according
to the invention provides an architecture which facilitates
rapid reconfiguration for other related medical or
biological image analysis.
The camera sub-system 52 comprises a light source
61 and an array of charge coupled devices 62 (CCD's). As
depicted in Fig. 5, the camera sub-system 52 generates a
series of three digital images I1, I2 and I3 from the slide
containing a Pap monolayer specimen S. The monolayer
specimen comprises cervical cells and related cytological
components which have been prepared according to the well-
known Pap protocol. In order to be viewed in the visible
spectrum range, these cells are stained according to the
Papanicolaou protocol, and each of the digital images I1,
I2, I3 corresponds to a narrow spectral band. The three
narrow spectral bands for the images I1, I2, I3 are chosen
so as to maximize the contrast among the various important
elements of the cervical cells as stained under the
Papanicolaou protocol. In the context of this application, the
pipeline processor 100 comprises three parallel pipelines,
one for each channel or spectral band. Each of these
pipeline channels can operate independently or may
contribute data to its neighbours under certain
circumstances.
Referring back to Fig. 4, in addition to the
control CPU 110, the segmentation 120 and feature
extraction 130 pipelines and the uplink transfer module
140, the pipeline processor 100 includes an input
conditioning module 150, a high-speed receiver module 152,
an analog control module 154, and a debug buffer module 155
as shown. The high-speed receiver 152 and analog control
154 modules interface the pipeline processor 100 to the
camera sub-system 52. The debug buffer module 155 is
coupled to the control bus for the control processor 110.
The debug buffer 155 provides a feedback path for examining
the results of pipeline operations in real-time without
modifying normal operation of the pipeline. This
information is useful for automatic detection and diagnosis
of hardware faults.
The pipeline processor 100 also includes a bi-
directional communication interface 156 for communicating
with the control computer 54. The pipeline processor 100
also includes a general purpose serial RS232 port 158 and
a general purpose parallel (i.e. printer) port 160. As
shown in Fig. 4, the uplink transfer module 140 includes a
communication interface 142 for transferring processed data
from the pipeline processor 100 to the host computer 56.
The receiver module 152 and the communication
modules 142, 156 are preferably implemented as fiber-optic
based links in order to provide a high speed communication
link with a very wide bandwidth. The receiver interface
module 152 is used to receive the output images from the
camera sub-system 52. The bi-directional communication
interface 156 is used for receiving control commands and
status requests from the control computer 54. The uplink
communication interface 142 is used to send segmentation
and feature extraction results generated by the pipeline
processor 100 to a classification module in the host
computer 56.
Reference is made to Fig. 6 which shows a
hardware organization 200 for a pipeline processor 100
according to the present invention. The hardware
organization 200 allows the computational elements to be
rapidly reconfigured or modified for different types of
image processing applications.
As shown in Fig. 6, the hardware organization 200
for the pipeline processor 100 comprises a backplane 202
and a set of printed-circuit cards which plug into the
backplane 202. The printed-circuit cards include a
processor and input card 204, uplink communications card
206, and the pipeline module cards 208 shown individually
as 208a, 208b, 208c and 208d. Data flow between the cards
plugged into the backplane 202 is via four principal
parallel data busses 203: B-bus 203a, S-bus 203b, L-bus
203c, and F-bus 203d. The backplane 202 also carries two
serially connected data busses: video-out bus 205a and
video-in bus 205b. The layout of the parallel 203 and
serial 205 busses on the backplane 202 is shown in more
detail in Fig. 7.
The pipeline modules 208 comprise "quad" module
cards that can accept up to four smaller pipeline modules
209. Advantageously, this arrangement allows for rapid
prototyping, reconfiguration and modification of the
pipeline processor 100. As shown in Fig. 8, the quad module
cards 208 include a circuit arrangement 300 for controlling
the direction of data flow between the backplane 202 and
the plug-in modules 209 on the card 208 without the need
for external jumpers. The circuit 300 comprises a field
programmable gate array (FPGA) 301 and a set of
transceivers 302 shown individually as 302a, 302b and 302c.
The first transceiver 302a couples the quad module card 208
to the F-bus 203d. The second transceiver 302b couples the
quad module card 208 to the S-bus 203b, and the third
transceiver 302c couples the card 208 to the L-bus 203c. In
response to control signals, the FPGA 301 sets the
appropriate direction of data flow into or out of the
backplane 202 and the plug-in modules 209.
Referring to Fig. 7, the six data busses 203a to
203d and 205a to 205b on the backplane 202 provide the
means for distributing information to and from the control
processor card 204, the uplink communications card 206, and
the quad module cards 208. On the backplane 202, the six
busses are arranged as 26-bit signal busses on a pair of
96-pin DIN connectors. In addition, the backplane 202
distributes power and a small set of global signals to all
of the cards 204 to 208. The backplane 202 also includes a
mechanism for identifying each card slot during a reset
through the use of a 4-bit slot identification.
The video-out 205a and video-in 205b busses
comprise respective 26-bit signal busses. The video busses
205a, 205b are connected in series through the cards
inserted in the backplane 202, for example, the control
processor card 204 and one of the quad module cards 208a as
shown in Fig. 7. The serial connection scheme means that
the control processor 204 and quad module 208 cards should
be mounted next to each other in the backplane 202 to avoid
breaking the serial link.
Referring to Fig. 7, the B-bus 203a, S-bus 203b,
L-bus 203c and F-bus 203d provide the data busses for the
backplane 202. The four data busses 203a to 203d are arranged
in parallel so that any card 204 to 208 plugged into the
backplane 202 is coupled to the busses for data transfer in
or out of the card.
The Levelled image or L-bus 203c is driven by the
input levelling or conditioning circuit 150 (Fig. 4). The
L-bus 203c provides normalized image data for each of the
modules 209 in the backplane 202. The L-bus 203c is
synchronized with the Segmentation data or S-bus 203b.
The Segmentation bus 203b is driven by a
segmentation output stage 121 in the segmentation pipeline
120 (Fig. 4), which is carried as a module 209 on a quad
module card 208. The segmentation output module 121 provides a set of binary
images of the segmentation maps generated in the
segmentation pipeline 120 as well as two label maps (one
each for cytoplasm and nuclei).
The Feature or F-bus 203d carries smoothed image
information from the input levelling module 150 during each
image frame. During this time, the frame and line
synchronization bus lines are in step with the segmentation
bus 203b. At the end of each image frame, the feature bus
203d is then used by each of the feature modules in the
feature extraction pipeline 130 in turn to send feature
information to the uplink communication module 140.
The operation of the data busses 203a-203d is
further described with reference to the timing diagram in
Fig. 9. The timing for transferring images from the CCD's
62 is shown by timing signal TCCD. The time to transfer an
image, i.e. I1, I2 or I3 (Fig. 5), is time t1. After the
first image is transferred to the image processing section
(i.e. the high-speed receiver module 152), the input
levelling module 150 performs the calculations for
levelling the image I1 (I2, I3) and puts the levelled image
onto the video-out bus 205a for the segmentation pipeline
120. The time taken for the input levelling module to
complete its operations is t2 and this time is less than
t1. This difference provides a window, denoted as time t3,
in the data streams where nothing needs to be transferred
on the data busses 203a to 203d. (The interval t3 is
approximately 15% of the duration of time t2.) The
segmentation bus 203b carries the segmentation results and
the feature bus 203d carries the smoothed image results.
When the segmentation pipeline 120 operation is complete,
the segmentation results together with the levelled and
smoothed images are simultaneously placed on the respective
S-bus 203b and L-bus 203c. During the timing window t3,
there is no data being transferred, and therefore the
results from the feature extraction pipeline 130 can be
transmitted to the next stage of operation without
interfering with the normal data flow. This amounts to a
multiplexing of the features on the F-bus 203d.
Each of the 26-bit busses 203a to 203d and 205a
to 205b comprises 24 data bits and 2 synchronization bits.
One synchronization bit is for the image frame and the
other is for the image line. In the case of the backplane
control B-bus 203a, the frame synchronization signal is
used as a data strobe signal and the line synchronization
signal is used as a data direction, i.e. read/not write,
signal. In the video feedback mode, the backplane control
bus B-bus 203a may be used to monitor the outputs of any
stage of the segmentation pipeline 120 while the latter is
in operation without interruption.
The pipeline control CPU 110 and the input
conditioning/levelling module 150 are for convenience
carried on the same printed circuit card 204. The control
CPU 110 is responsible for the control of the camera sub-
system 52, the segmentation 120 and feature extraction 130
pipelines and the uplink transfer module 140. The control
CPU 110 comprises a processor, boot memory, instruction and
data memory, a control port, a backplane interface, and a
watch-dog reset (not shown). The control CPU 110 receives
instructions (e.g. commands and initialization data) from
the control computer 54 via the bi-directional interface
156 (Fig. 4), and returns status information. Following
start-up, the control CPU 110 scans, initializes and tests
the various elements of the pipeline processor 100 and
returns the status information to the control computer 54.
Only after successful completion of start-up will the
control computer 54 instruct the pipeline processor 100 to
begin image capture by the camera 52.
The control CPU 110 is preferably implemented
using a highly integrated RISC-based microcontroller which
includes a variety of on-board support functions such as
those for ROM and DRAM, the serial port, the parallel
printer port, and a set of peripheral strobes. A suitable
device for the control CPU is the AMD 29200 RISC
microcontroller manufactured by Advanced Micro Devices,
Inc., Sunnyvale, California.
The fibre-optic bi-directional interface 156 to
the control computer 54, the analog control interface 154
for the camera subsystem 52, the image capture (CCD)
control and levelling circuitry, and the control bus
interface for the backplane 202 are preferably implemented
in a Field Programmable Gate Array (FPGA) which, in
addition, controls the other FPGA's in the system 50. The
watch-dog circuit is implemented as a single chip. The
implementation of the FPGA's and watch-dog circuit are
within the knowledge of one skilled in the art.
The control CPU 110 preferably integrates control
of the boot memory (not shown), the serial port 158, and
the parallel port 160. In addition, the control CPU 110
decodes six strobe signals for the off-chip peripherals. In
the preferred embodiment, the control CPU 110 is configured
to use an 8-bit wide ROM (Read Only Memory) as the boot
memory and is able to control this memory directly. A
single SIMM (Single In-line Memory Module) or dual-banked
SIMM's are used for instruction and data memory. The
serial port 158 is pin compatible with a 9-pin PC serial
port (additional serial control lines are connected to the
on-chip I/O lines). The control lines for the parallel port
160 are connected to a bi-directional latch (not shown) to
drive a PC compatible 25-pin parallel port. Additional
control lines may be handled by on-chip I/O lines. The
control port drives two dual control channels. Each of
these channels can be used serially, via the interface 154
(Fig. 4), to send data to an analog device such as the
camera sub-system 52.
Reference is next made to Fig. 10, which shows
the input conditioning module 150 in more detail. The input
conditioning module 150 is coupled between the high-speed
receiver module 152 and the segmentation pipeline 120. The
input conditioning circuit 150 first acquires the set of
three raw images I1, I2, I3 from the camera sub-system 52.
To ensure high-speed operation, the camera data is
transferred over a fibre-optical data link in the receiver
module 152 (Fig. 4) which has the additional advantage of
being relatively immune from electrical noise generated by
the environment of the automated system 50. The principal
function of the input conditioning module 150 is to
condition the set of the three images I1, I2, I3 before the
images are passed to the segmentation pipeline 120. The
conditioning involves correcting the images I1, I2, I3 for
local variations in illumination level across the visual
field and for variations in the illumination intensity.
The input conditioning module 150 comprises an input buffer
170, a levelling pipeline function 171, a background level
buffer 172, a background level detector 173 and a focus
calculator 174. The output of the levelling pipeline
function 171 is coupled to the input of the segmentation
pipeline 120. The output of the levelling pipeline 171 is
also coupled to the F-bus 203d through a delay buffer 175
and a smoothing function filter 176. The output of the
levelling pipeline 171 is also coupled to the Levelling bus
203c through the two delay buffers 175 and 177.
Each of the three images I1, I2, I3 in the set is
stored in the input data buffer 170 ready to be processed by
the levelling pipeline function 171 and the other
computational tasks required by the system 50.
A second set of fibre-optic links is used to
receive image information from the camera sub-system 52 via
the analog interface 154. While each image I1, I2, I3 is
received and stored, the histogram calculator 173 calculates
background level information and the focus calculator 174
calculates focus information. The background
level buffers 172 store a background correction for each of the
three images I1, I2, I3. Once calculated, the background
correction data does not change and is called for repeatedly
for each of the new images that enters the input
conditioning module 150. The control CPU 110 transmits the
focus information to the control computer 54. After each
image I1, I2, I3 has been stored in the input buffer 170,
the images are sent along with the background level values
stored in the background buffer 172 to the levelling (i.e.
normalization) pipeline function 171. The levelling pipeline
function 171 levels the images I1, I2, I3 to extract
cytoplasm and nuclear binary maps.
The Background illumination (flash) level detector
173 uses a histogram technique to find and measure the peak
in the intensity of each image I1, I2, I3. While the
intrinsic background can be easily corrected in the pixel
pipeline, the variations in the illumination levels of the
stroboscopic flash in the camera sub-system 52 need to be
assessed and corrected so that the maximum dynamic range can
be extracted from the levelled images. With the background
level information, the images can be optimally corrected for
variations in the stroboscopic flash level intensity. The
histogram peak detect interface captures the most frequently
occurring pixel input value for each image frame and each
image channel. This information is used to level (normalize)
the input images. In addition, this information is used in
a feedback path to control and stabilize the flash intensity
of the stroboscopic flash lamp via an analogue control line.
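
The peak-detect and gain idea can be sketched as follows; the 8-bit
pixel depth and the fixed target background value are assumptions
made for the example.

    import numpy as np

    def background_peak(frame):
        # Histogram of the frame: the most frequently occurring pixel
        # value is taken as the background (flash) level.
        counts = np.bincount(frame.ravel(), minlength=256)
        return int(counts.argmax())

    def flash_gain(frame, target_background=200):
        # Gain mapping the measured background peak onto a fixed target,
        # correcting frame-to-frame variation in strobe intensity.
        return target_background / max(background_peak(frame), 1)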
The Focus calculator 174 is used to calculate the
optimal focus position. While the optimal focus position is
not generally required by the image processing routine, the
focal position is useful at the opening phase of the
specimen's analysis when the focal positions are not yet
known. Thus, during this initial phase, the input
conditioning module 150 performs the tasks of receiving the
raw images I1, I2, I3, levelling these images and then
calculating the so-called focus number (based on a Laplacian
measure of image sharpness). This measure of focal
correctness is returned to the control computer 54 to allow
the optimal focal position to be discovered in a regular
algorithm of motion and measurements.
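
One such Laplacian focus number can be sketched as follows; since the
specification identifies only "a Laplacian measure of image
sharpness", the particular 3x3 kernel and the use of the response
variance are assumptions.

    import numpy as np
    from scipy import ndimage

    LAPLACIAN = np.array([[0,  1, 0],
                          [1, -4, 1],
                          [0,  1, 0]], dtype=float)

    def focus_number(image):
        # High-frequency content peaks near best focus, so the variance
        # of the Laplacian response serves as a figure of focal quality.
        response = ndimage.convolve(image.astype(float), LAPLACIAN)
        return float(response.var())

The control computer would then step the focus position and retain
the position that maximizes this number.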
The levelling pipeline function 171 comprises a
pipelined computational system that accepts single pixels
for each of the three image channels and performs the
levelling operations on them. In a first stage, the
levelling pipeline 171 uses the raw image and the background
correction data to correct the images for an intrinsic
inhomogeneity associated with the imaging system 50. This
is done by dividing the raw image pixel by the appropriate
background image pixel and can thus be implemented in a
single pixel pipeline architecture. This is implemented with
FPGA's (in conjunction with a look-up table for the divide
operations) at the logical or gate level and as such
comprises the first of the fine-grained pipelines in the
processor 100.
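
The divide-by-background levelling can be sketched as follows;
indexing a precomputed table by the (raw, background) pixel pair
mirrors the FPGA look-up table, while the 8-bit output scaling is an
assumption.

    import numpy as np

    # Precompute the divide table once, as the FPGA look-up table would be.
    raw = np.arange(256, dtype=float).reshape(256, 1)
    background = np.arange(256, dtype=float).reshape(1, 256)
    DIVIDE_LUT = np.clip(255.0 * raw / np.maximum(background, 1.0),
                         0, 255).astype(np.uint8)

    def level(raw_image, background_image):
        # One table access per pixel pair: a single-pixel pipeline step.
        return DIVIDE_LUT[raw_image, background_image]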
The levelled images, i.e. cytoplasm and nuclear
binary maps, from the levelling pipeline 171 are then sent
to the segmentation pipeline 120. In addition, frame
synchronization and line synchronization signals are sent to
the segmentation pipeline 120. The synchronization signals
are provided to simplify the detection of the edges of the
images I1, I2, I3 for special handling.
The first stage in the segmentation pipeline 120
is a Nuclear detect (NetCalc) function. This stage 122 (Fig.
4) utilizes a neural-network based procedure for deciding
whether an individual pixel is to be associated with a
nuclear region or a cytoplasmic region. The neural network
is implemented as a look-up table held in memory and
accessed by the decoding of an address made up of pixel
intensity values. This allows the neural network (or any
scheme for this type of decision) to be rapidly updated and
modified when needed and includes the possibility of a
real-time adjustment of the nuclear detection function based
on preliminary measurements of image quality and nature. The
implementation of a neural network is described in co-
pending application no. CA96/00619 filed on September 18,
1996 in the name of the common applicant.
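
The look-up-table form of the decision can be sketched as follows;
the reduced 6-bit channel precision and the stand-in decision rule
are assumptions made to keep the example small, in place of the
trained network of the cited application.

    import numpy as np

    BITS = 6                                  # assumed per-channel precision
    ADDRESSES = np.arange(1 << (3 * BITS), dtype=np.uint32)
    CH1 = (ADDRESSES >> (2 * BITS)) & ((1 << BITS) - 1)
    CH3 = ADDRESSES & ((1 << BITS) - 1)
    # One decision bit per address. A stand-in rule fills the table here;
    # the instrument would write the trained network's responses instead,
    # so updating the classifier amounts to rewriting this memory.
    TABLE = (CH1 < CH3 // 2).astype(np.uint8)

    def nuclear_detect(i1, i2, i3):
        # i1, i2, i3: channel intensities already quantized to BITS bits.
        addr = ((i1.astype(np.uint32) << (2 * BITS)) |
                (i2.astype(np.uint32) << BITS) |
                i3.astype(np.uint32))
        return TABLE[addr]                    # one memory read per pixel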
The next stage in the segmentation pipeline 120
comprises Sobel and cytoplasm threshold functions. The
Sobel function comprises a known algorithmic technique for
the detection of edges in grey-scale images. The Sobel
function is required by the segmentation pipeline 120 to
guide subsequent refinements of the segmentation. For
efficiency, the Sobel function is implemented to process 3x3
blocks of pixels. The Cytoplasm detect function uses a
threshold routine to distinguish, at a preliminary phase,
the cytoplasmic regions from the background debris based on
the integrated optical density.
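
The Sobel edge magnitude can be sketched as follows; scipy's
frame-at-a-time convolution stands in for the hardware's streamed
3x3 pixel blocks.

    import numpy as np
    from scipy import ndimage

    def sobel_magnitude(image):
        # Horizontal and vertical 3x3 Sobel gradients combined into an
        # edge-strength map that guides segmentation refinement.
        gx = ndimage.sobel(image.astype(float), axis=1)
        gy = ndimage.sobel(image.astype(float), axis=0)
        return np.hypot(gx, gy)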
The levelled images from the levelling pipeline
171 also pass through the delay buffer 175. The delay buffer
175 holds the levelled images until the feature
extraction pipeline 130 begins processing so that all of the
images generated by the various pipeline operations will be
present at the same time. The smoothing function filter 176
smooths the levelled images before they are outputted to the
Feature bus 203d. The smoothing function 176 utilizes a
standard image smoothing operation that requires blocks of
3x3 pixels implemented, again, in a wider pipeline. The
smoothing operations are based on two different weighted
averages of neighbouring pixels. As shown in Fig. 10,
another delay 177 is applied to the levelled images before
being outputted on the Levelling bus 203c. The total delay
along this path is set so that the images appearing on the
L-bus 203c and the F-bus 203d are synchronized with the
output of the segmentation pipeline 120.
The result of the operation of the input
conditioning module 150 is the output of binary images of
preliminary nuclear positions and preliminary cytoplasm
positions on the video-out bus 205a together with the
smoothed results of the Sobel operations on the Feature data
bus 203d. These three data streams are received by the next
stage of the segmentation pipeline carried in modules 209 on
the quad module card 208.
Referring back to Fig. 6, the quad module card 208
is designed to carry up to four pipeline boards 209 for
performing segmentation or feature extraction operations.
The quad module card 208 is configured to provide line
driving capabilities, time multiplexing of feature and
control data busses, and power distribution. As described
above in the discussion of the custom backplane, the backplane
carries four parallel busses 203a to 203d and two serial
busses 205a to 205b. The busses are driven by bus transceivers and the
necessary logic is implemented in a small FPGA as will be
within the understanding of those skilled in the art.
The quad module card 208 is central to the general
purpose design of the image processing system. The quad
module card 208 allows various modules that implement
segmentation and feature extraction operations in
configurable hardware (i.e. FPGA's) to be arranged thereby
providing flexibility in the operation of these configurable
elements so as to improve the accuracy of a segmentation
result or add additional features that may be required by
the subsequent classification algorithms.
Each one of these filters 122 consumes a block of
3x3 pixels (of either 8 bits or one bit) fed to it after the
preliminaries of levelling and input conditioning in general
have been performed as described above.
The order of operations within the segmentation
pipeline 120 comprises a generalized noise reduction
operation, followed by the labelling of the resultant
cytoplasmic regions. This is followed by another generalized
noise reduction operation, a subsequent nuclear region
labelling, and final noise reduction before the results are
presented to the S-bus 203b.
Reference is made to Fig. 11, which shows a filter
stage 122 for the segmentation pipeline 120 in greater
detail. It will be understood that the filter stage 122 does
not comprise a microprocessor unit implementing some variety
of software or a set of general-purpose adders. Such an
implementation would represent a coarse-grained pipeline
approach and would fail to properly exploit the power of
this type of computing architecture. Instead, the pipeline
processor 100 according to the present invention utilizes
a fine-grained approach and accordingly each filter unit 122
comprises a number of elementary logic elements which are
arranged in the form of the serial pipeline and perform
their logical functions rapidly so as to quickly pass to the
next block of data waiting in line.
As shown in Fig. 11, the filter stage 122
comprises a filter pipeline function module 180. The filter
pipeline module 180 has a pixel stream input 181 and an
output 182. The pixel stream is also coupled to an input 183
through a row delay buffer 184 and to another input 185
through an additional row delay buffer 186. The filter stage
122 includes a mask stream input 188, a frame
synchronization input 189, and a line synchronization input
190. The frame synchronization is applied to another input
191 through a delay buffer 192, and the line synchronization
190 is applied to another input 193 through a delay buffer
194.
In operation, the input pixel stream 181 is fed
directly into the filter pipeline function 180. There is a
latency amounting to two full image rows plus three pixels
of the third row before the pipeline 180 can begin
operations (corresponding to a 3x3 processed element). The
pixel stream 181 is delayed by the one row buffer 184 to
fill in the second line of the pixel block and that same
stream of pixels is further delayed by the row buffer 186 to
fill the final row of the 3x3 block. However, it will be
understood that the total in-out delay time (as opposed to
the latency) for the pipeline is only one row and three
clocks. The mask stream 188 held in memory is available for
the logical functions. The frame 189 and line 190
synchronization signals together with the delayed inputs
191, 193 complete the inputs to the filter pipeline function
180.
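
The row-delay scheme can be sketched in software as follows; the row
length is a parameter and the boundary handling driven by the
synchronization signals is omitted for brevity.

    from collections import deque

    def windows_3x3(pixel_stream, row_length):
        row1 = deque([0] * row_length)    # one-row delay (buffer 184)
        row2 = deque([0] * row_length)    # second one-row delay (buffer 186)
        taps = [deque([0, 0, 0], maxlen=3) for _ in range(3)]
        for pixel in pixel_stream:
            middle = row1.popleft(); row1.append(pixel)   # one row late
            top = row2.popleft(); row2.append(middle)     # two rows late
            taps[0].append(top)       # top row of the 3x3 block
            taps[1].append(middle)    # middle row
            taps[2].append(pixel)     # bottom (current) row
            yield [list(t) for t in taps]    # one 3x3 block per clock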
The noise reduction elements in the pipeline
comprise a combination of specialized erosion and dilation
operations. The principal result of these operations is the
alteration of the state of the centre pixel in the 3x3 block
based on the state of one or more of the neighbour pixels.
In the case of erosion, the pixel at the centre is turned
"off" under the right neighbourhood condition. For the
dilation operations, the pixel is turned "on".
The dilation function works on a binary pixel map
to fill in irregularities. A 3x3 matrix is examined to
determine if the central pixel should be turned on based on
the number of neighbours that are on (the "order"). If this
pixel is already on it is left on.
The erosion function reverses the action of the
dilation function by restoring the boundary of a block of
pixels to the original dimensions. A 3x3 matrix is examined
to determine if the central pixel should be turned off based
on the number of neighbours that are on (the "order"). If
this pixel is already off it is left off.
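
A frame-at-a-time sketch of the two order-based operations follows;
the hardware variants described below apply richer neighbourhood
conditions, and the convention that erosion keeps a pixel on when at
least "order" neighbours are on is an assumption.

    import numpy as np
    from scipy import ndimage

    NEIGHBOURS = np.array([[1, 1, 1],
                           [1, 0, 1],
                           [1, 1, 1]])

    def dilate_order(mask, order):
        # mask: boolean map; count each pixel's "on" neighbours in one pass.
        on = ndimage.convolve(mask.astype(int), NEIGHBOURS, mode="constant")
        return mask | (on >= order)      # already-on pixels stay on

    def erode_order(mask, order):
        on = ndimage.convolve(mask.astype(int), NEIGHBOURS, mode="constant")
        return mask & (on >= order)      # already-off pixels stay off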
The dilate special function works on a source
binary map and an edge binary map to fill in irregularities.
A 3x3 matrix is examined to determine if the central pixel
should be turned on. If this pixel is already on it is left
on. The central pixel of the edge map enables an alternate
rule for filling in the central pixel of the source map.
The dilation not join function works on a binary
pixel map to fill in irregularities while avoiding joining
adjacent objects. The 3x3 input matrix and the 4 previously
calculated result pixels are examined to determine if the
central pixel result should be turned on. If this pixel is
already on it is left on.
The dilate special not join function works
identically to the Dilate Not Join function with the
addition of a mask bit. The central pixel of the mask map
enables an alternate rule for filling in the central pixel
of the source map.
The dilate label not join function works on a
source label map, the result label map and an edge map to
fill in irregularities while avoiding joining adjacent
objects. A 3x3 matrix of the source and result map is
examined to determine if the central pixel should be turned
on based on which of the neighbours are on. If this pixel is
already non-zero or the edge map is zero its value is left
unchanged. The central pixel of the mask map enables an
alternate rule for filling in the central pixel of the
source map. In addition, the following operations are
implemented in the hardware as a part of the noise reduction
scheme in the segmentation pipeline 120:
Subadd2 Module - returns total of input bits as
0, 1 or 2 and greater.
Subadd3 Module - returns total of input bits as
0, 1, 2 or 3 and greater.
Subadd4 Module - returns total of input bits as
0, 1, 2, 3 or 4.
Subsum3 Module - returns sum of 2 input numbers
as 0, 1, 2 or 3 and more.
Subsum6 Module - returns sum of 2 input numbers
as 0, 1, 2, 3, 4, 5 or 6.
Subjoin Module - returns join sub-sum for 1 edge
and 1 corner.
Join Module - returns true if dilation operation
would join 2 regions.
Order1 Module - returns true if 1 or more nearest
neighbours are on.
Order2 Module - returns true if 2 or more nearest
neighbours are on.
Order3 Module - returns true if 3 or more nearest
neighbours are on.
Order4 Module - returns true if 4 or more nearest
neighbours are on.
Order5 Module - returns true if 5 or more nearest
neighbours are on.
Order6 Module - returns true if 6 or more nearest
neighbours are on.
Order7 Module - returns true if 7 or more nearest
neighbours are on.
Order8 Module - returns true if 8 nearest
neighbours are on.
After the noise reduction is complete, the
pipeline processor 100 proceeds to the detection of either
cytoplasmic or nuclear material in the image I. The
"detection" function is also implemented in the segmentation
pipeline 120 and can comprise both nuclear and cytoplasm
detection operations or only the nuclear detect operation.
This module in the segmentation pipeline 120 receives the
Sobel, NetCalc and BinCyt bit streams from the input
conditioning module 150 (as described above) over the video-
in bus 205b. The signals are processed in parallel, fine-
grained pipelines to produce the UnfilteredNuc, BinNuc,
NucPlus and BinCyt bit streams. These results from the
segmentation pipeline 120 are then used in various phases of
the feature extraction modules in the feature extraction
pipeline 130 to calculate the feature sets which are then
used in a subsequent image classification. These signals and
a binary cytoplasm image to be labelled are passed on the
video-out bus 205a to a labelling module. In the case where
only the nuclear detect function is implemented, the
UnfilteredNuc, BinNuc, NucPlus and an intermediate form of
BinCyt are passed to the cytoplasm detect module using a
different set of pins on the video-out bus 205a.
The Primary Labelling operation comprises an
operation in which a segmented region (either nuclear
material or cytoplasmic material) is given a unique number
within the image so that it may be later identified when the
classification is complete. This is done before the feature
extraction phase begins so that feature extraction can be
applied to the labelled and segmented objects, and also
because the location of disparate nuclei within any single
cytoplasmic region can be an important feature in itself
when attempting a classification of cytological material.
This function can be either implemented at the gate level in
a Field Programmable Gate Array, or alternatively
application specific integrated circuits (ASIC) can be used.
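
A raster-order sketch of primary labelling follows; the
4-connectivity and the small union-find table that records labels
later found to belong to the same region are assumptions about one
possible implementation of what the hardware realizes at the gate
level.

    def primary_label(mask):
        # mask: 2-D array of 0/1 pixels, visited in raster order.
        h, w = len(mask), len(mask[0])
        labels = [[0] * w for _ in range(h)]
        parent = [0]                      # union-find table; 0 = background

        def find(a):
            while parent[a] != a:
                a = parent[a]
            return a

        next_label = 1
        for y in range(h):
            for x in range(w):
                if not mask[y][x]:
                    continue
                west = labels[y][x - 1] if x > 0 else 0
                north = labels[y - 1][x] if y > 0 else 0
                if west and north:
                    labels[y][x] = find(west)
                    parent[find(north)] = find(west)   # regions touch: merge
                elif west or north:
                    labels[y][x] = west or north
                else:
                    parent.append(next_label)          # new region number
                    labels[y][x] = next_label
                    next_label += 1
        # Resolve merged labels to their final unique region numbers.
        return [[find(v) for v in row] for row in labels]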
With the completion of the segmentation of both
nuclear and cytoplasmic material in the image, and their
appropriate labelling, processing proceeds to the feature
extraction pipeline 130. The primary function of the
pipeline 130 is to extract mathematically-based or
logically-based features to be used in the classification of
each of the segmented objects in the image I. The feature
extraction pipeline 130 comprises a number of feature
extraction modules 132 in parallel, shown individually as
132a, ..., 132m in Fig. 4.
Reference is next made to Fig. 12, which shows the
feature extraction module 132 in more detail. The feature
extraction module 132 comprises a feature calculator 210 and
accumulator arrays 212 shown individually as 212a, 212b,
212c, 212d. One block of these accumulator arrays 212 is
assigned to each feature and within each accumulator block,
one accumulator is assigned to each label. In total, each
block may be expected to hold in excess of 21,000
accumulators.
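
The accumulator organization can be sketched as follows; two simple
features, area and integrated optical density, stand in for the real
feature calculators, and the dictionaries stand in for the per-label
accumulator blocks.

    from collections import defaultdict

    def accumulate_features(labelled_pixels):
        # labelled_pixels: iterable of (label, intensity) in raster order,
        # as delivered by the label maps and the levelled image busses.
        area = defaultdict(int)         # one accumulator block per feature,
        integrated = defaultdict(int)   # one accumulator per label within it
        for label, intensity in labelled_pixels:
            if label:                   # label 0 is background
                area[label] += 1
                integrated[label] += intensity
        return area, integrated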
In the context of the present image processing
application, the features that can be extracted fall into
five general categories: (1) morphological features; (2)
textural features; (3) colorimetric features; (4)
densitometric features; and (5) contextual features.
Morphological features describe the general shape and size
of the segmented objects. Textural features describe the
distribution and inter-relation of light and dark levels
within the segmented objects. Colorimetric features pertain
to the spectral properties of the segmented objects.
Densitometric features describe the light intensities within
the segmented objects. Contextual features establish the
physical relationship between and among the segmented
objects.
Referring back to Fig. 4, the uplink transfer
module 140 comprises an uplink buffer 141 and the uplink
communications interface 142. The uplink buffer 141 stores
the image data from the Levelled image bus 203c and the
Segmentation bus 203b. Each image is written to a separate
bank of memory in the buffer 141 as follows: all three
levelled images, cytoplasm labels, nuclear labels and all
the binary images. Once an image is stored, the image banks
can be transmitted on request.
Following the end of each image frame, the feature
information is input from the feature bus 203d on the frame
189 and line 190 synchronization signals. This data is
written into the buffer memory 141 in the same block as the
images. The feature memory start row and number of rows is
used to determine when the feature storage is complete. When
all the feature data from all the feature cards has been
stored, this data is automatically transmitted to the host
computer 56 via the fiber optic communication interface 142.
The image data is transmitted upon a request by the host
computer 56.
The present invention may be embodied in other
specific forms without departing from the spirit or
essential characteristics thereof. Therefore, the presently
discussed embodiments are considered to be illustrative and
not restrictive, the scope of the invention being indicated
by the appended claims rather than the foregoing
description, and all changes which come within the meaning
and range of equivalency of the claims are therefore
intended to be embraced therein.