Patent 2366373 Summary

(12) Patent Application: (11) CA 2366373
(54) English Title: TELECOMMUNICATIONS NETWORK DISTRIBUTED RESTORATION METHODS AND SYSTEMS
(54) French Title: PROCEDES ET SYSTEMES DE RETABLISSEMENT DE RESEAU DE TELECOMMUNICATION DISTRIBUE
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04Q 11/04 (2006.01)
  • H03J 3/14 (2006.01)
  • H04Q 3/00 (2006.01)
(72) Inventors:
  • BADT, SIG H., JR. (United States of America)
(73) Owners:
  • ALCATEL
(71) Applicants:
  • ALCATEL (France)
(74) Agent: ROBIC AGENCE PI S.E.C./ROBIC IP AGENCY LP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2001-01-05
(87) Open to Public Inspection: 2001-07-19
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2001/000449
(87) International Publication Number: WO 2001052591
(85) National Entry: 2001-09-05

(30) Application Priority Data:
Application No. Country/Territory Date
09/479,101 (United States of America) 2000-01-07

Abstracts

English Abstract


A network wherein a plurality of links connects a plurality of nodes such as
cross-connects in a communication circuit network with paths interconnecting
the nodes, and with there being spare capacity between a sufficient number of
nodes to accommodate at least some rerouting of traffic immediately upon
detection of a break in a traffic span in the network so as to restore circuit
continuity within a predetermined maximum time using an improved failure
detection, isolation, and recovery scheme.


French Abstract

L'invention concerne un réseau comprenant une pluralité de liaisons, qui permettent de relier une pluralité de noeuds, tels que des interconnexions, dans un réseau de circuits de communications pourvu de chemins interconnectant lesdits noeuds, une capacité de réserve existant entre un nombre suffisant de ces noeuds pour recevoir au moins une partie du réacheminement du trafic, dès qu'une coupure est détectée dans une liaison du trafic dans le réseau, de manière à rétablir la continuité du circuit dans un intervalle de temps prédéterminé, au moyen d'un plan de détection, d'isolation et de réparation des pannes amélioré.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS
1. A method for identifying a pair of neighboring nodes in a telecommunications network having at least one distributed restoration sub-network, comprising:
constructing a first C-bit keep alive message for a first node in a neighboring node pair connected by a link;
embedding the first C-bit keep alive message within the C-bit of a first DS3 signal;
determining a quality of service information for the link when looking from the first node to the neighboring node;
embedding the quality of service information within the C-bit keep alive message; and
transmitting the first DS3 signal from the first node to a second node in the neighboring node pair over the link, wherein the first C-bit keep alive message identifies the first node to the second node and the quality of service information for the link when looking from the first node to the second node.
2. The method according to Claim 1, further comprising:
receiving the first DS3 signal at the second node; and
processing the first DS3 signal to identify the first node from the first C-bit keep alive message to the second node and identifying the quality of service information of the link, when looking from the first node to the second node, from the first C-bit keep alive message.
3. The method according to Claim 1, further comprising:
constructing a second C-bit keep alive message for the second node of the neighboring node pair;
embedding the second C-bit keep alive message within the C-bit of a second DS3 signal;
determining a quality of service information for the link when looking from the second node to the first node;
embedding the quality of service information within the C-bit keep alive message; and
transmitting the second DS3 signal from the second node to the first node in the neighboring node pair over the link to identify the second node to the first node and the quality of service information for the link, when looking from the second node to the first node, from the second C-bit keep alive message.
4. The method of Claim 3, wherein the first node has a first identification designation and the second node has a second identification designation and the link quality of service information, and further wherein the first C-bit keep alive message contains the first identification designation and the second C-bit keep alive message contains the second identification designation and the link quality of service information.
5. The method according to Claim 4, wherein the first identification designation includes a first node identifier, a first node port number, a first node wide area network address, and a first node "I am custodial" indicator in the first C-bit keep alive message, and wherein the second identification designation includes a second node identifier, a second node port number, a second node wide area network address, and a second node "I am custodial" indicator in the second C-bit keep alive message.
6. The method of Claim 1, wherein the network comprises a plurality of neighboring node pairs, and further wherein a C-bit embedded keep alive message is sent between each pair of neighboring nodes connected by a link.
7. The method of Claim 1, wherein each node has an identification designation, and further wherein each C-bit embedded keep alive message includes the identification designation for the node from which the C-bit embedded keep alive message originates and the quality of service information of the link looking from one node to the next node.
8. The method according to Claim 7, wherein the link is a spare link.
9. The method according to Claim 3, wherein the link is a spare link.
10. The method according to Claim 2, wherein the link is a spare link.
11. The method according to Claim 1, wherein the link is a spare link.
12. The method of Claim 7, wherein each identification designation further comprises a node identifier, a port number, a wide area network address, and an "I am custodial" indicator for the node from which the C-bit embedded keep alive message originates.
13. The method of Claim 12, wherein each DS3 signal travels in-band.
14. The method according to Claim 12, wherein the quality of service information of the link includes at least one of errored seconds, severely errored seconds, and loss of signal seconds.

15. The method of Claim 8, wherein each identification designation further comprises a node identifier, a port number, a wide area network address, and an "I am custodial" indicator for the node from which the C-bit embedded keep alive message originates.
16. The method of Claim 15, wherein each DS3 signal travels in-band.
17. The method according to Claim 16, wherein the quality of service information of the link includes at least one of errored seconds, severely errored seconds, and loss of signal seconds.
18. The method according to Claim 17, wherein the errored seconds, severely errored seconds and loss of signal seconds and quality of service are defined by the DS3 standard.
19. A telecommunications network comprising a plurality of nodes interconnected by a plurality of links, and a distributed restoration sub-network, comprising:
a first node having a first unique identifier;
a second node having a second unique identifier;
a link connecting the first node to the second node; and
a DS3 signaling channel within the link, wherein the first node and second node are operable to send a DS3 signal having a keep alive message and a quality of service information of the link embedded within a C-bit to one another.
20. The network of Claim 19, wherein a DS3 signal from the first node contains the first unique identifier and a DS3 signal from the second node contains the second unique identifier.
21. The network of Claim 20, wherein the first node is operable to send a first node DS3 signal to the second node over the link to identify the first node to the second node, and wherein the second node is operable to send a second node DS3 signal to the first node over the link to identify the second node to the first node.
22. The network of Claim 19, wherein the network further comprises a plurality of neighboring node pairs, and further wherein each pair connected by a spare link is operable to send a plurality of DS3 signals with C-bit embedded keep alive messages to each other.
23. The network of Claim 22, wherein each node has a unique numerical identifier, and further wherein each keep alive message comprises the unique numerical identifier for the node from which the DS3 signal is generated.

24. The network of Claim 23, wherein each unique numerical identifier includes at least one of a node identifier, a port number, a wide area network address, and an "I am custodial" indicator for the node from which the DS3 signal is generated.
25. The network of Claim 19, wherein each node has a unique numerical identifier, and further wherein each keep alive message comprises the numerical identifier for the node from which the keep alive message originates.
26. The network of Claim 19, wherein the keep alive message includes at least one of a node identifier, a port number, a wide area network address, and an "I am custodial" indicator.
27. The network of Claim 19, wherein all DS3 signals travel in-band.
28. The network according to Claim 27, wherein the errored seconds, severely errored seconds and loss of signal seconds and quality of service are defined by the DS3 standard.

Description

Note: Descriptions are shown in the official language in which they were submitted.


DESCRIPTION
TELECOMMUNICATIONS NETWORK DISTRIBUTED
RESTORATION METHODS AND SYSTEMS
BACKGROUND OF THE INVENTION
TECHNICAL FIELD OF THE INVENTION
The present invention relates generally to telecommunications systems and
their methods of
operation and for dynamically restoring communications traffic through a
telecommunications
network, and more particularly to a messaging method by which the origin and
destination nodes of a
failed path can receive information on which spans or links remain usable in
the failed path.
This invention also relates to a distributed restoration algorithm (DRA)
network, and more
particularly to a method for isolating the location of a fault in the network
and the apparatus for
effecting the method, and more particularly to a method of monitoring the
topology of the spare links
in the network for rerouting traffic in the event that the traffic is
disrupted due to a failure in one of
the working links of the network.
This invention further relates to a distributed restoration method and system
for restoring
communications traffic flow in response to sensing a failure within spans of
the telecommunications
network, and even more specifically, to a telecommunications network having 1633-SX broadband digital cross-connect switches.
DISCUSSION OF THE RELATED ART
Whether caused by a backhoe, an ice storm or a pack of hungry rodents, losing
a span or
bundle of communication channels such as DS3 and SONET telephone channels
means losing
significant revenues. After the first 1.5 seconds of an outage, there is also
a significant risk that the
outage may disable one or more local offices in the network due to an excess
of carrier group alarms.
Several techniques are commonly used to restore telecommunications networks.
Several of these are well known. The first is called route diversity. Route diversity addresses the situation of two cables running between a source and a destination: one cable may take a northward
path, while the other takes a southward path. If the northward path fails,
traffic may be sent over the
southward path, or vice-versa. This is generally a very high quality
restoration mechanism because of
its speed. A problem with route diversity, however, is that, generally, it is
very expensive to employ.
The use of rings is another well-known technique that also provides for
network restoration.
This is particularly attractive when a large number of stations are connected
together. The stations
may be connected in a ring; thus, if any one connection of the ring fails,
traffic may be routed in a
direction other than the one including the failure, due to the circular nature
of the ring. Thus, a ring

may survive one cut and still be connected. A disadvantage with rings is that
the nodes of
telecommunication networks must be connected in a circular manner. Without
establishing the
circular configuration that a ring requires, this type of restoration is not
possible.
Another method of network restoration, mesh restoration, entails re-routing
traffic through the
network in any way possible. Thus, mesh restoration uses spare capacity in the
network to re-route
traffic over spare or under utilized connections. Mesh restoration generally
provides the lowest
quality of service in the sense that it generally requires a much longer time
than does route diversity
or ring restoration to restore communications. On the other hand, mesh
restoration has the attraction
of not requiring as much spare capacity as do route diversity or ring
restoration. In performing
network restoration using mesh restoration, two techniques are possible.
One is known as centralized restoration, the other is known as distributed
restoration. In
centralized mesh restoration, a central computer controls the entire process
and all of the associated
network elements. All of the network elements report to and are controlled by
the central computer.
The central computer ascertains the status of the network, calculates
alternative paths and sends
commands to the network elements to perform network restoration. In some ways,
centralized mesh
restoration is simpler than distributed mesh restoration. In distributed mesh
restoration, there is no
central computer controlling the entire process. Instead, the network elements, specifically the cross-connects, communicate among themselves sending messages back and forth to
determine the optimum
restoration path. Distributed mesh restoration, therefore, performs a level of
parallel processing by
which a single restoration program operates on many computers simultaneously.
Thus, while the computers associated with the network elements are
geographically
distributed, parallel processing still occurs. There is yet one set of
instructions that runs on many
machines that are working together to restore the network.
The telecommunications network is comprised of a plurality of nodes connected
together by
spans and links. These spans and links are for the most part fiber optical
cables. A path is defined in
the network as the connection between the two end nodes by which information
or traffic can
traverse. One of these end nodes is defined as an origin node while the other
is the destination node.
There could be a number of paths that connect the origin node to the
destination node. And if one of
those paths that carries the traffic is disrupted, another path can be used as
an alternate for rerouting
the traffic. Thus, if a path based approach for restoring disrupted traffic is
employed in a
telecommunications network and the network is provisioned with a distributed
restoration algorithm
(DRA), the origin and destination nodes are taken into consideration.
In the telecommunications network, it is likely that when a fault occurs, only
a single link or
span along a path is affected. Such a catastrophic event may be due, for example, to a cable cut. The
remainder of the path through the network on either side of the failure may
therefore remain intact
and could be used for circumventing the failed portion. Yet a path based DRA
provisioned network

nonetheless may disregard the intact portions and seek a completely different
alternate route for
restoring the traffic. This can be a problem because indiscriminate selection
of a candidate path to
restore one failed path can preclude the restoration of other simultaneously,
or subsequently, failed
paths as damage to a single span in a network can cause the failure of many
paths that may happen to
pass through the same span. Therefore, a DRA provisioned network must take
into account all such
end to end paths that need to be restored in order to ensure the highest
possible degree of restoration
following a failure.
There is therefore a need for an improved DRA provisioned path based
restoration scheme
that allows the intact portions of a failed path to be ascertained and made
available for forming
alternate restoration paths, to thereby improve the efficiency and the
completeness of the restoration.
In a telecommunications network provisioned with a distributed restoration
algorithm (DRA),
the network is capable of restoring traffic that has been disrupted due to a
fault or malfunction at a
given location thereof. In such DRA provisioned network, or portions thereof
which are known as
domains, the nodes, or digital cross-connect switches, of the network are each
equipped with the DRA
algorithm and the associated hardware that allow each node to seek out an
alternate route to reroute
traffic that has been disrupted due to a malfunction or failure at one of the
links or nodes of the
network. Each of the nodes is interconnected, by means of spans that include
working and spare
links, to at least one other node. Thus, ordinarily each node is connected to
an adjacent node by at
least one working link and one spare link. It is by means of these links that
messages, in addition to
traffic signals, are transmitted to and received by the nodes.
In a DRA network, when a failure occurs at one of the working links, the
traffic is rerouted by
means of the spare links. Thus, to operate effectively, it is required that
the spare links of the DRA
network be functional at all times, or at the very least, the network has a
preconceived notion of which
spare links are functional and which are not.
In addition to routing traffic, the links also provide to each node signals
that inform the node
of the operational status of the network. Thus, a signal is provided to each
node to inform the node
that traffic is being routed among the nodes effectively, or that there has
been a malfunction
somewhere in the network and that an alternate route or routes are required to
reroute the disrupted
traffic.
Conventionally, when everything is operating correctly, an idle signal, or
some other similar
signal, is propagated among the various nodes of the network to inform those
nodes that traffic is
being routed correctly. However, if a fault occurs somewhere in the network
that disrupts the flow of
traffic, an alarm is sent out from the fault location and propagated to the
nodes of the network. Such
alarm signal causes the equipment in the network downstream of the fault
location to go into alarm.
To suppress the alarm in the downstream equipment, a follow-up signal is sent.

This prior art method of sending out an alarm signal from the location of the
fault aids in the
fault isolation along a single link. Unfortunately, the standard that requires
the sending of an alarm
signal downstream from the fault also requires that the downstream nodes, upon
receipt of the alarm
signal, further propagate it downstream. As a consequence, since all nodes in
the network will
receive the alarm signal within a short period of time after the fault has
occurred, it becomes very
difficult, if not downright impossible, for the management of the network to
identify the custodial
nodes of a failed link, or the site where the fault occurred. This is due to
the fact that, in addition to
the custodial nodes, many other nodes in the network likewise are in receipt
of the alarm signal.
Therefore, a method is required by which the true custodial nodes of a failed link are made aware that they indeed are the custodial nodes. Put differently, a
method is required to
differentiate the alarm signal received by nodes other than the custodial
nodes from the alarm signal
received by the custodial nodes, in order to preserve the accepted practice of
sending an alarm signal
to downstream equipment.
Since in most instances a distributed restoration domain is a portion of an
overall
telecommunications network, or a number of different networks, it is therefore
also required that the
status of whatever signals are received by the nodes outside of the distributed restoration domain be maintained as if there had not been any differentiation between the time when those signals are received by the custodial nodes and when those signals are received subsequently by the nodes outside the domain.
There is also a need for the DRA network of the instant invention to always have an up-to-date map
of the functional spare links, i.e. the spare capacity, of the network, so
that traffic that is disrupted due
to a failure can be readily restored.
SUMMARY OF THE INVENTION
The present invention thus comprises the concept of connecting a plurality of
nodes such as
cross-connects in a communication circuit network with control channels
interconnecting all nodes,
and with there being spare capacity between a sufficient number of nodes to
accommodate at least
some rerouting of traffic as quickly as possible upon detection of a break in
a traffic span in the
network so as to restore circuit continuity within a predetermined maximum
time.
Furthermore, to enable the DRA provisioned network to utilize the intact
portions of a failed
path, the messaging technique of the present invention provides information to both the origin and destination nodes of a failed path on which spans or links remain intact
leading up to the point of
failure.
To achieve this end, the failure is first detected by the adjacent custodial
nodes bracketing the
fault. Each of these custodial nodes adjacent to the failure then would
initiate the propagation of a
"reuse" message to either the origin node or the destination node. This reuse
message has a variable

length route information field and an identifier identifying it as a reuse
message. As the reuse
message is propagated from node to node back to the origin or destination
node, each node through
which the reuse message passes would append its own unique node identification
(ID) to the route
information field. Thus, when an origin or destination node receives the reuse
message, it can read
from the route information field of the reuse message a description of the
intact portion of the path.
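For illustration only, a minimal Python sketch of such a reuse message follows; the field names, the string identifiers, and the node labels are assumptions made for the sketch, not the message's actual format.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ReuseMessage:
    """An identifier marking the message as a reuse message, plus a
    variable-length route information field that accumulates the IDs of the
    nodes along the intact portion of the failed path."""
    message_type: str = "REUSE"
    route_info: List[str] = field(default_factory=list)

    def append_node(self, node_id: str) -> None:
        # Each node the message passes through appends its own unique ID.
        self.route_info.append(node_id)

# A custodial node adjacent to the failure originates the message, and every
# intermediate node appends itself on the way back toward the origin node.
msg = ReuseMessage()
for node_id in ("custodial-B", "tandem-C", "tandem-D"):
    msg.append_node(node_id)
print(msg.route_info)   # the origin node reads the intact portion of the path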
By allowing the restoration logic to take into account the intact portions of
the original paths,
better restoration decision making use of the intact portions of the path can
take place. In one
embodiment, such restoration logic is only permitted to apply the intact
portions to the restoration of
the failed path that originally included the portions. In other words, such
intact portions are restricted
for use in restoring the failed path. Such restriction considerably simplifies
the restoration process
and avoids the possibility of unforeseen problems.
The present invention also involves modifying the functionality of each node
of the
distributed restoration domain, so that, when in receipt of a failed signal
(or an AIS signal that
suppresses the alarm of downstream equipment), each node of the domain would
cause the
propagation of a distinct non-alarm signal (or non-AIS signal) that would
accomplish the same thing
as the original failed signal. Consequently, only the custodial nodes of a
failed link, or those nodes
that bracket a malfunctioned site, are in receipt of the true alarm or AIS
signal.
Since adjacent nodes are connected by links, or spans, the kinds of signals
that traverse
among the nodes in a network have different formats, depending on the type of
connection. In the
case of a Digital Service 3 (DS3) facility, each of the nodes of the
distributed restoration domain is
provisioned with a converter, so that when in receipt of an AIS signal, the
converter would convert
the AIS signal into an idle signal (or a modified AIS signal), and propagate
the idle signal to nodes
downstream thereof.
To achieve this conversion of an AIS signal, a modification of at least one of
the C-bits of the
idle signal takes place. Indications of directly adjacent failures such as
loss of signal (LOS), loss of
frame (LOF) and loss of pointer (LOP) also result in the propagation of a
modified idle signal to
nodes downstream of where the fault occurred.
At the perimeter of the distributed restoration domain, each of the
access/egress nodes that
communicatively interconnects the domain to the rest of the network, or other
networks, is
provisioned such that any incoming modified idle signal is reconverted, or
replaced, by a standard
AIS signal so that the equipment outside of the distributed restoration domain
continues to receive the
standard compliant signal.
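The conversion and reconversion just described might be sketched, for illustration only, as follows; the frame representation, the choice of which C-bit is modified, and the signal-type strings are invented for the sketch and are not taken from the DS3 standard or from the patent.

from dataclasses import dataclass
from typing import List

@dataclass
class Ds3Frame:
    """Greatly simplified stand-in for a DS3 frame."""
    signal_type: str        # e.g. "AIS", "IDLE", "TRAFFIC"
    c_bits: List[int]

ALARM_INDICATIONS = {"AIS", "LOS", "LOF", "LOP"}

def convert_inside_domain(frame: Ds3Frame) -> Ds3Frame:
    """At a node inside the distributed restoration domain, replace an incoming
    alarm indication with a modified idle signal, so that only the custodial
    nodes bracketing the fault ever see the true alarm."""
    if frame.signal_type in ALARM_INDICATIONS:
        modified = Ds3Frame("IDLE", list(frame.c_bits))
        modified.c_bits[0] = 1          # illustrative "modified idle" marker
        return modified
    return frame

def reconvert_at_access_egress(frame: Ds3Frame) -> Ds3Frame:
    """At the perimeter of the domain, restore a standard AIS so equipment
    outside the domain keeps receiving a standards-compliant signal."""
    if frame.signal_type == "IDLE" and frame.c_bits and frame.c_bits[0] == 1:
        return Ds3Frame("AIS", [0] * len(frame.c_bits))
    return frame

print(convert_inside_domain(Ds3Frame("AIS", [0, 0, 0])).signal_type)   # IDLE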
For those networks that are interconnected by optical fibers where the SONET
Synchronous
Transport Signal (STS-n) such as the STS 3 standard is used, each node of the
distributed restoration
domain is provisioned so that, in receipt of an incoming STS N AIS signal,
such AIS signal is
replaced by a STS N Incoming Signal Failure (ISF) signal. In the preferred
embodiment, the ZS bit

of the STS-3 signal is changed. This modification serves the same purpose as
the C-bit modification
to the DS3 signal.
To provide an up-to-date map of the functional spare links of the network, a
topology of the
network connected by the functional spare links is made available to the
custodial nodes that bracket a
malfunctioned link as soon as the failure is detected. The custodial node that
is designated as the
sender or origin node then uses the topology of the spare links to quickly
reroute the traffic through
the functional spare links.
To ensure that the spare links are functional, prior to the DRA process,
special messages,
referred to in this invention as keep alive messages, are continuously
exchanged on the spare links
between adjacent nodes. Each of these keep alive messages has a number of
fields which allow it to
identify the port of the node from which it is transmitted, the identification
of the node, the incoming
IP address and the outgoing IP address of the node, as well as a special field
that identifies the keep
alive message as coming from a custodial node when there is a detected
failure. These keep alive messages may be transmitted over the C-bit channels as idle signals.
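A minimal sketch of how such a keep alive message might be represented; the field names and types are assumptions for illustration, not the message's actual layout on the C-bit channel.

from dataclasses import dataclass

@dataclass
class KeepAliveMessage:
    """Fields described for the keep alive messages exchanged on spare links."""
    node_id: str           # identification of the transmitting node
    port: int              # port of the node from which the message is sent
    incoming_ip: str       # incoming IP address of the node
    outgoing_ip: str       # outgoing IP address of the node
    i_am_custodial: bool   # set when the sender has detected a failure and is
                           # therefore a custodial node

# A routine keep alive sent before any failure has been detected.
example = KeepAliveMessage("node-11", 103, "10.0.0.11", "10.0.0.12", False)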
So long as a spare link is operating properly, the keep alive messages that
traverse
therethrough will contain data that informs the network, possibly by way of
the operation support
system, of the various pairs of spare ports to which a spare link connects a
pair of adjacent nodes.
This information is collected by the network and constantly updated so that at
any moment, the
network has a view of the entire topology of the network as to what spare
links are available. This
data can be stored in a database at the operation support system of the
network, so that it may be
provided to the origin node as soon as a failure is detected.
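A sketch of the spare-capacity map described above, assuming a simple in-memory dictionary in place of the operation support system's database; the structure and names are illustrative only.

from typing import Dict, Set, Tuple

# Spare-link topology: for each ordered pair of adjacent nodes, the set of
# (sending port, receiving port) pairs on which keep alive messages are
# currently being received.
SpareTopology = Dict[Tuple[str, str], Set[Tuple[int, int]]]

def record_keep_alive(topology: SpareTopology, sender: str, sender_port: int,
                      receiver: str, receiver_port: int) -> None:
    """Update the map from one keep alive message heard on a spare link."""
    topology.setdefault((sender, receiver), set()).add((sender_port, receiver_port))

topology: SpareTopology = {}
record_keep_alive(topology, "node-11", 103, "node-3", 5)
# The map can be handed to the origin node as soon as a failure is detected,
# so traffic can be rerouted over known-functional spare links.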
Additionally, information from so-called keep alive messages utilizing the C-
bit as a carrier
can be utilized to assign a "Quality of Service" (QoS) to each link in the
network. The QoS can be
utilized to assign priorities with respect to how and what data will be re-
routed first or if the data will
be re-routed at all during a restoration. The QoS can include, among others,
quality of performance
parameters such as errored seconds or severely errored seconds. Once the link has been assigned a value, an algorithm is executed using the data to determine the priority of the data, thereby facilitating the restoration process.
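A sketch of how the quality-of-performance counters might be collapsed into a per-link QoS value and used to order the restoration; the weights and the sort rule are invented for illustration and are not specified by the text.

def link_qos(errored_seconds: int, severely_errored_seconds: int,
             loss_of_signal_seconds: int) -> float:
    """Collapse the counters carried in the C-bit keep alive messages into a
    single link quality score (higher is better); the weights are arbitrary."""
    penalty = (errored_seconds
               + 10 * severely_errored_seconds
               + 100 * loss_of_signal_seconds)
    return 1.0 / (1.0 + penalty)

def restoration_order(paths):
    """Given (path_id, qos) pairs, re-route the highest-quality traffic first."""
    return [path_id for path_id, qos in sorted(paths, key=lambda p: p[1], reverse=True)]

print(restoration_order([("path-A", link_qos(2, 0, 0)),
                         ("path-B", link_qos(50, 5, 1))]))   # ['path-A', 'path-B']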
One aspect of the present invention can be characterized as providing a
special message that
is exchanged continuously between adjacent nodes before the occurrence of the
failure in order to
continually collect data relating to the available spare links of the network.
Another aspect of the present invention can be characterized as providing an
improved
communication failure detection, isolation and recovery scheme or algorithm.
A further aspect of the present invention can be characterized as providing a
method of
communicating the description of the intact portions of a failed path to the
origin and destination
nodes of a failed path.

An additional aspect of the present invention includes providing a method of
determining the
portions of a failed path that remain usable for carrying traffic.
A still further aspect of the present invention can be characterized as
identifying the reusable
links or spans that connect nodes of a failed path so that these links or
spans can be used by an
alternate path for restoring the disrupted traffic.
Another aspect of the present invention can be characterized as providing a
method of
mapping a topology of the spare capacity of a DRA network so that traffic may
be routed through the
functional spare links when a failure occurs at the network.
Additional aspects and advantages of the invention will be set forth in part
in the description
which follows, and in part will be obvious from the description, or may be
learned by practice of the
invention. The aspects and advantages of the invention will be realized and
attained by means of the
elements and combinations particularly pointed out in the appended claims.
To achieve these and other advantages, and in accordance with the purpose of
the present
invention, as embodied and broadly described, the present invention can be
characterized according to
one aspect as a method for identifying a pair of neighboring nodes in a
telecommunications network
having at least one distributed restoration sub-network, including
constructing a first C-bit keep alive
message for a first node in a neighboring node pair connected by a link,
embedding the first C-bit
keep alive message within the C-bit of a first DS3 signal, determining a
quality of service information
for the link when looking from the first node to the neighboring node,
embedding the quality of
service information within the C-bit keep alive message, and transmitting the
first DS3 signal from
the first node to a second node in the neighboring node pair over the link,
wherein the first C-bit keep
alive message identifies the first node to the second node and the quality of
service information for
the link when looking from the first node to the second node.
The present invention can be characterized according to another aspect of the
present
invention as a telecommunications network including a plurality of nodes
interconnected by a
plurality of links, and a distributed restoration sub-network, including a
first node having a first
unique identifier, a second node having a second unique identifier, a link
connecting the first node to
the second node, and a DS3 signaling channel within the link, wherein the
first node and second node
are operable to send a DS3 signal having a keep alive message and a quality of
service information of
the link embedded within a C-bit to one another.
It is to be understood that both the foregoing general description and the
following detailed
description are exemplary and explanatory only and are not restrictive of the
invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of
this
specification, illustrate several embodiments of the present invention and,
together with the
description, serve to explain the principles of the invention.
Other objects and advantages will be apparent from a reading of the
specification and
appended claims in conjunction with the drawings wherein:
FIGURE 1 conceptually illustrates a simplified telecommunications restoration
network to
provide certain definitions applicable to the present invention;
FIGURE 2 illustrates a restoration subnetwork for illustrating concepts
applicable to the
present invention;
FIGURE 3 conceptually shows a failure within a restoration subnetwork;
FIGURE 4 illustrates two origins/destination nodes pairs for demonstrating the
applicable
scope of the present invention;
FIGUREs 5A and 5B illustrate the loose synchronization features of the present
invention;
FIGURE 6 shows the failure notification message flow applicable to the present
invention;
FIGURE 7 illustrates the flow of keep-alive messages according to the present
invention;
FIGURE 8 illustrates the flow of path verification messages according to the
teachings of the
present invention;
FIGURE 9 shows a time diagram applicable to the failure notification and fault
isolation
process of the present invention;
FIGURES 10 and 11 illustrate the AIS signal flow within the restoration
subnetwork of the
present invention;
FIGURE 12 describes more completely the failure notification message flow
within the
restoration subnetwork according to the present invention;
FIGURE 13 illustrates the beginning of an iteration of the restoration process
of the present
invention;
FIGURE 14 provides a timed diagram applicable to the explore, return, max flow
and connect
phases of the first iteration of the restoration process of the present
invention;
FIGURE 15 provides a timed diagram associated with the explore phase of the
process of the
present invention;
FIGURE 16 illustrates the possible configuration of multiple
origins/destination node pairs
from a given origin node;
FIGURE 17 depicts two steps of the explore phase of the first iteration of the
restoration
process;
FIGURE 18 provides a timed diagram applicable to the return phase of the
restoration process
of the present invention;

FIGURE 19 shows steps associated with the return phase of the present process;
FIGURES 20, 21 and 22 illustrate the link allocation according to the return phase of the present invention;
FIGURE 23 illustrates a typical return message for receipt by the origin node
of a restoration
subnetwork;
FIGURE 24 provides a timed diagram for depicting the modified map derived from
the return
messages received at the origin node;
FIGURE 25 illustrates the part of the restoration subnetwork map within the origin node that has been allocated to the origin node/destination node pair;
FIGURE 26 shows the max flow output for the max flow phase of the present
process;
FIGURE 27 illustrates an optimal routing applicable to the max flow output of
the present
invention;
FIGURE 28 provides a timed diagram for showing the sequence of the connect
phase for the
first iteration of the process of the present invention;
FIGURE 29 illustrates the connect messages for providing the alternate path
routes between
an origin node and destination node of a restoration subnetwork;
FIGUREs 30 and 31 show how the present invention deals with hybrid restoration
subnetworks;
FIGUREs 32 and 33 illustrate the explore phase and return phase, respectively,
applicable to
hybrid networks;
FIGURE 34 shows the time diagram including an extra iteration for processing
hybrid
networks according to the teachings of the present invention;
FIGURES 35 and 36 illustrate a lower quality spare according to the teachings
of the present
invention;
FIGURE 37 illustrates the use of an "I am custodial node" flag of the present
invention;
FIGUREs 38 through 42 describe the restricted re-use features of the present
invention;
FIGURE 43 describes the path inhibit feature of the present invention;
FIGURE 44 further describes the path inhibit feature of the present invention;
FIGURE 45 is a path of a telecommunications network for illustrating the
instant invention;
FIGURE 46 illustrates the messages that are sent from the custodial nodes of
the failed path
of FIGURE 1;
FIGURE 47 is another view of the failed path of FIGURE 45 in which messages
are sent
from the intermediate nodes to their respective downstream nodes;
FIGURE 48 is the failed path of FIGURE 45 showing a message reaching the
destination
node, the interconnections between the origin and destination nodes, and the
use of a spare link for
bypassing the fault; and

FIGURE 49 is an illustration of the reuse message of the present invention.
FIGURE 50 is an illustration of a telecommunications network of the instant
invention;
FIGURE 51 is a block diagram illustrating two adjacent cross-connect switches
and the
physical interconnection therebetween; and
FIGURE 52 is an illustration of the structure of an exemplar keep alive
message of the
present invention.
FIGURE 53 shows a plurality of nodes of a distributed restoration domain
through which an
alarm signal generated as a result of a malfunction at a link interconnecting
two of the nodes is shown
to be propagated to downstream nodes;
FIGURE 54 illustrates the same nodes as shown in the FIGURE 53 DS3
environment, but in
this instance those nodes of the distributed restoration domain each are
provisioned to convert an
incoming alarm signal into a non-alarm signal, and the access/egress nodes of
the distributed
restoration domain are further provisioned to reconvert any received modified
alarm signal back into
an alarm signal;
FIGURE 55 shows the frame structure of a DS3 signal for illustrating the
conversion of an
alarm signal into a non-alarm signal in a DS3 format;
FIGURE 56 is an illustration that is similar to the FIGURE 54 embodiment,
except that the
FIGURE 56 embodiment illustrates a SONET network;
FIGURE 57 shows the format of a STS-3 frame for explaining the conversion of
an alarm
signal into a non-alarm signal in a SONET network;
FIGURE 58 is a simplified block diagram illustrating the flow of a signal into
a node of a
distributed restoration domain;
FIGURE 59 is a block representation of a node in the distributed restoration
domain of the
instant invention provisioned to convert an alarm signal into a non-alarm
signal, and in the case of an
access/egress node, to reconvert a non-alarm signal back into an alarm signal;
FIGURE 60 is a graph demonstrating the statuses of the different signals into
and out of the
nodes of a distributed restoration domain; and
FIGURE 61 is a block representation depicting a communications network with
multiple
interconnect nodes connected via links and ports of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
FIGURE 1 shows telecommunications network portion 10, that includes node 12
that may
communicate with node 14 and node 16, for example. Connecting between node 12
and 14 may be a
set of links such as links 18 through 26, as well as for example, links 28
through 30 between node 12
and node 16. Node 14 and node 16 may also communicate between one another
through links 32
through 36, for example, which collectively may be thought of as a span 38.

The following description uses certain terms to describe the concepts of the
present invention.
The 1633SX is a cross-connect switch and is here called a "node." Between nodes are links, which may be a DS3 or an STS-1, which is essentially the same thing as a DS3 but which conforms to a different standard. A link could be an STS-3, which is three STS-1s multiplexed together to form a single signal. A link may also be an STS-12, which is twelve STS-1s multiplexed together, or a link could be an STS-12C, in which twelve STS-1s are actually locked together to form one large channel. A link, however, actually is one unit of capacity for the purposes of
the present invention.
Thus, for purposes of the following description, a link is a unit of capacity
connecting between one
node and another. A span is to be understood as all of the links between two
adjacent nodes.
Adjacent nodes or neighbor nodes are connected by a bundle, which itself is
made up of links.
For purposes of the present description, links may be classified as working,
spare, fail, or
recovered. A working link is a link that currently carries traffic. Spare
links are operable links that
are not currently being used. A spare link may be used whenever the network
desires to use the link.
A failed link is a link that was working, but has failed. A recovered link is
a link that, as will be
described more completely below, has been recovered.
FIGURE 2 illustrates the conceptual example of restoration subnetwork 40 that
may include
origin node 42 that through tandem nodes 44 and 46 connects to destination
node 48. In restoration
subnetwork 40, a path such as paths 50, 52, 54, and 56 includes connections to
nodes 42 through 48,
for example, as well as links between these nodes. As restoration subnetwork
40 depicts, each of the
paths enters restoration subnetwork 40 from outside restoration subnetwork 40
at origin node 42.
With the present embodiment, each of nodes 42 through 48 includes an
associated node
identifier. Origin node 42 possesses a lower node identifier value, while
destination node 48
possesses a higher node identifier value. In the restoration process of the
present invention, the nodes
compare node identification numbers.
The present invention establishes restoration subnetwork 40 that may be part
of an entire
telecommunications network 10. Within restoration subnetwork 40, there may be
numerous paths 50.
A path 50 includes a number of links 18 strung together and crossconnected
through the nodes 44.
The path 50 does not start within restoration subnetwork 40, but may start at
a customer premise or
someplace else. In fact, a path 50 may originate outside a given
telecommunications network 10. The
point at which the path 50 enters the restoration subnetwork 40, however, is
origin node 42. The
point on origin node 42 at which path 50 comes into restoration subnetwork 40
is access/egress port
58.
In a restoration subnetwork, the failure may occur between two tandem nodes.
The two
tandem nodes on each side of the failure are designated as "custodial" nodes.
If a single failure occurs
in the network, there can be two custodial nodes. In the network, therefore,
there can be many
origin/destination nodes. There will be two origin nodes and two destination
nodes. An origin node

together with an associated destination node may be deemed an
origin/destination pair. One failure
may cause many origin/destination pairs.
FIGURE 3 illustrates the concept of custodial nodes applicable to the present
invention.
Referring again to restoration subnetwork 40, custodial nodes 62 and 64 are
the tandem nodes
positioned on each side of failed span 66. Custodial nodes 62 and 64 have
bound the failed link and
communicate this failure, as will be described below. FIGURE 4 illustrates the
aspect of the present
invention for handling more than one origin-destination node pair in the event
of a span failure.
Referring to FIGURE 4, restoration subnetwork 40 may include, for example,
origin node 42 that
connects through custodial nodes 62 and 64 to destination node 48. Within the
same restoration
subnetwork, there may be more than one origin node, such as origin node 72. In
fact, origin node 72
may connect through custodial node 62 and custodial node 64 to destination
node 74. As in FIGURE
3, FIGURE 4 shows failure 66 that establishes custodial nodes 62 and 64.
The present invention has application for each origin/destination pair in a
given restoration
subnetwork. The following discussion, however, describes the operation of the
present invention for
one origin/destination pair. Obtaining an understanding of how the present
invention handles a single
origin/destination pair makes clear how the algorithm may be extended in the
event of several
origin/destination pairs occurring at the same time. An important
consideration for the present
invention, however, is that a single cut may produce numerous
origin/destination pairs.
FIGURES 5A and 5B illustrate the concept of loose synchronization according to
the present
invention. "Loose synchronization" allows operation of the present method and
system as though all
steps were synchronized according to a centralized clock. Known restoration
algorithms suffer from
race conditions during restoration that make operation of the restoration
process unpredictable. The
restoration configuration that results in a given network, because of race
conditions, depends on
which messages arrive first. The present invention eliminates race conditions
and provides a reliable
result for each given failure. This provides the ability to predict how the
restored network will be
configured, resulting in a much simpler restoration process.
Referring to FIGURE 5A, restoration subnetwork 40 includes origin node 42, which connects to
tandem nodes 44 and 46. Data may flow from origin node 42 to tandem node 46,
along data path 76,
for example. Origin node 42 may connect to tandem node 44 via path 78.
However, path 80 may
directly connect origin node 42 with destination node 48. Path 82 connects
between tandem node 44
and tandem node 46. Moreover, path 84 connects between tandem node 46 and
destination node 48.
As FIGURE 5A depicts, data may flow along path 76 from origin node 42 to
tandem node 46, and
from destination node 48 to origin node 42. Moreover, data may be communicated
between tandem
node 44 and tandem node 46. Destination node 48 may direct data to origin node
42 along data path
80, as well as to tandem node 46 using path 84.

These data flows will all take place in a single step. At the end of a step,
each of the nodes in
restoration subnetwork 40 sends a "step complete" message to its neighboring
node. Continuing with
the example of FIGURE 5A, in FIGURE 5B there are numerous step complete
messages that occur
within restoration subnetwork 40. In particular, step complete message
exchanges occur between
origin node 42 and tandem node 44 on data path 78, between origin node 42 and
tandem node 46 on
data path 76, and between origin node 42 and destination node 48 on data path
80. Moreover, tandem
node 46 exchanges "step complete" messages with tandem node 44 on data path
82, and between
tandem node 46 and destination node 48 on data path 84.
In the following discussion, the term "hop count" is part of the message that
travels from one
node to its neighbor. Each time a message flows from one node to its neighbor,
a "hop" occurs.
Therefore, the hop count determines how many hops the message has taken within
the restoration
subnetwork.
The restoration algorithm of the present invention may be partitioned into
steps. Loose
synchronization assures that in each step a node processes the message it
receives from its neighbors
in that step. Loose synchronization also makes the node send a step complete
message to every
neighbor. If a node has nothing to do in a given step, all it does is send a
step complete message.
When a node receives a step complete message from all of its neighbors, it
increments a step counter
associated with the node and goes to the next step.
Once a node receives step complete messages from every neighbor, it goes to
the next step in
the restoration process. In looking at the messages that may go over a link,
it is possible to see a
number of messages going over the link. The last message, however, will be a
step complete
message. Thus, during the step, numerous data messages are exchanged between
nodes. At the end
of the step, all the nodes send step complete messages to their neighbors to
indicate that all of the
appropriate data messages have been sent and it is appropriate to go to the
next step. As a result of
the continual data, step complete, data, step complete, message traffic, a
basic synchronization occurs.
In practice, although the operation is not as synchronized as it may appear in
the associated
FIGUREs, synchronization occurs. During the operation of the present
invention, messages travel
through the restoration subnetwork at different times. However, loose
synchronization prevents data
messages from flowing through the restoration subnetwork until all step
complete messages have
been received at the nodes. It is possible for one node to be at step 3, while
another node is at step 4.
In fact, at some places within the restoration subnetwork, there may be even
further step differences
between nodes. This helps minimize the effects of slower nodes on the steps
occurring within the
restoration subnetwork.
The steps in the process of the present invention may be thought of most
easily by
considering them to be numbered. The process, therefore, starts at step 1 and
proceeds to step 2.
There are predetermined activities that occur at each step and each node
possesses its own step

counter. However, there is no master clock that controls the entire
restoration subnetwork. In other
words, the network restoration process of the present invention may be
considered as a distributive
restoration process. With this configuration, no node is any different from
any other node. They all
perform the same process independently, but in loose synchronization.
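The bookkeeping implied by loose synchronization can be sketched as follows; the class, the message format, and the send callback are assumptions made for the sketch, and the restoration work performed inside each step is omitted.

class LooselySynchronizedNode:
    """Per-node step counter: the node advances only after a step complete
    message has arrived from every neighbor."""

    def __init__(self, node_id, neighbors):
        self.node_id = node_id
        self.neighbors = set(neighbors)
        self.step = 1
        self._complete_from = set()

    def end_of_step(self, send):
        # Every node tells each neighbor it has finished the current step,
        # even if it had nothing to do in that step.
        for neighbor in self.neighbors:
            send(neighbor, {"type": "STEP_COMPLETE", "from": self.node_id,
                            "step": self.step})

    def on_step_complete(self, neighbor_id):
        self._complete_from.add(neighbor_id)
        if self._complete_from == self.neighbors:
            self._complete_from.clear()
            self.step += 1            # all neighbors reported; go to next step

node = LooselySynchronizedNode("42", ["44", "46", "48"])
node.end_of_step(lambda dst, msg: None)       # stand-in transport
for neighbor in ("44", "46", "48"):
    node.on_step_complete(neighbor)
print(node.step)                              # 2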
FIGURE 6 shows the typical form of a failure notification message through
restoration
subnetwork 40. If, for example, origin node 42 desires to start a restoration
event, it first sends failure
notification messages to tandem node 44 via data path 78, to tandem node 46
via data path 76, and
destination node 48 via data path 80. As FIGURE 6 further shows, tandem node
44 sends failure
notification message to tandem node 46 on path 82, as does destination node 48
to tandem node 46 on
path, 84.
The process of the present invention, therefore, begins with a failure
notification message.
The failure notification message is broadcast throughout the restoration
subnetwork to begin the
restoration process from one node to all other nodes. Once a node receives a
failure message, it sends
the failure notification message to its neighboring node, which further sends
the message to its
neighboring nodes. Eventually the failure notification message reaches every
node in the restoration
subnetwork. Note that if there are multiple failures in a network, it is
possible to have multiple failure
notification messages flooding throughout the restoration subnetwork
simultaneously.
The first failure notification message initiates the restoration algorithm of
the present
invention.
Moreover, broadcasting the failure notification message is asynchronous in the
sense that as
soon as the node receives the failure notification message, it broadcasts the
message to its neighbors
without regard to any timing signals. It is the failure notification message
that begins the loose
synchronization process to begin the restoration process of the present
invention at each node within
the restoration subnetwork. Once a node begins the restoration process, a
series of events occurs.
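A sketch of the flooding behaviour described above, assuming a simple adjacency map; forwarding each notification only once per node is an implementation assumption used to keep the sketch finite, since the text only states that each node sends the message on to its neighbors.

from collections import deque

def flood_failure_notification(adjacency, origin):
    """Broadcast a failure notification from 'origin' until every node of the
    restoration subnetwork has received it.  'adjacency' maps each node to the
    set of its neighbors."""
    reached = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in reached:       # forward, but only once per node
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

adjacency = {"42": {"44", "46", "48"}, "44": {"42", "46"},
             "46": {"42", "44", "48"}, "48": {"42", "46"}}
print(sorted(flood_failure_notification(adjacency, "42")))   # ['42', '44', '46', '48']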
Note, however, that before the restoration process of the present invention
occurs, numerous
events are already occurring in the restoration subnetwork. One such event is
the transmission and
receipt of keep alive messages that neighboring nodes exchange between
themselves.
FIGURE 7 illustrates the communication of keep-alive messages that the
restoration process
of the present invention communicates on spare links, for the purpose of
identifying neighboring
nodes. Referring to FIGURE 7, configuration 90 shows the connection via spare
link 92 between
node 94 and node 96. Suppose, for example, that node 94 has the numerical
designation fll", and port
designation 11103". Suppose further that node 96 has the numerical designation
3 and the port
designation 5. On spare link 92, node 94 sends keep-alive message 98 to node
96, identifying its node
number "11" and port number "103". Also, from node 96, keep-alive message 100
flows to node 94,
identifying the keep-alive message as coming from the node having the
numerical value "3", and its
port having the numerical value "5".

The present invention employs keep-alive signaling using the C-bit of the DS3 format. In restoration subnetwork 40, the available spare links carry DS3 signals, wherein the C-bits convey special keep-alive messages. In particular, each keep-alive message contains the node identifier and port number that is sending the message, the WAN address of the node, and an "I am custodial node" indicator to be used for assessing spare quality.
An important aspect of the present invention relates to signaling channels, which come into play when cross-connect nodes communicate with one another. There are two kinds of
communications the
cross-connects can perform. One is called in-band, the other out-of-band.
With in-band
communication, a signal travels over the same physical piece of media as the
working traffic. The
communication travels over the same physical media as the path or the same
physical media as the
link. With out-of-band signals, there is freedom to deliver the signals
between cross-connects in any
way possible. Out-of-band signals generally require a much higher data rate.
In FIGURE 7, for example, in-band messages are piggybacked on links. Out-of-band message
traffic may flow along any other possible path between two nodes. With the
present invention,
certain messages must flow in-band. These include the keep-alive message,
the path verification
message, and the signal fail message. There are some signaling channels
available to the restoration
process of the present invention, depending on the type of link involved. This
includes SONET links
and asynchronous links, such as DS3 links.
A distinguishing feature between SONET links and DS3 links is that each
employs a different
framing standard to which unique and applicable equipment must conform. It is
not physically
possible to have the same port serve as a SONET port and as a DS3 port at the
same time. In SONET
signal channeling, there is a feature called tandem path overhead, which is a
signaling channel that is
part of the signal that is multiplexed together. It is possible to separate
this signal portion from the
SONET signaling channel. Because of the tandem path overhead, sometimes called
the ZS byte, there
is the ability within the SONET channel to send messages.
On DS3 links, there are two possible signaling channels. There is the C-bit
and the X-bit.
The C-bit channel cannot be used on working paths, but can only be used on
spare or recovered links.
This is because the DS3 standard provides the option of using the C-bit or not
using the C-bit. If the C-
bit format signal is used, then it is possible to use the C-bit for signaling.
However, in this instance,
working traffic does not use that format. Accordingly, the C-bit is not
available for signaling on the
working channels. It can be used only on spare links and on recovered links.
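The channel-availability rules in the two preceding paragraphs can be summarized in a small sketch; the strings used to classify links and the returned channel names are assumptions for illustration.

def available_signaling_channels(link_type: str, link_state: str):
    """Return the in-band signaling channels the restoration process may use,
    following the constraints described above."""
    if link_type == "SONET":
        return ["tandem path overhead"]            # part of the SONET overhead
    if link_type == "DS3":
        channels = ["X-bit"]                       # e.g. path verification
        if link_state in ("spare", "recovered"):   # the C-bit format is not
            channels.append("C-bit")               # used by working traffic
        return channels
    return []

print(available_signaling_channels("DS3", "working"))   # ['X-bit']
print(available_signaling_channels("DS3", "spare"))     # ['X-bit', 'C-bit']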
FIGURE 8 illustrates in restoration subnetwork 40 the flow of path
verification messages
from origin node 42 through tandem nodes 44 and 46 to destination node 48.
Path verification
message 102 flows from origin node 42 through tandem nodes 44 and 46 to
destination node 48. In
particular, suppose origin node 42 has the label 18, and that working path
52 enters port 58. Path
verification message 102, therefore, contains the labels 18 and 53, and
carries this information
through tandem nodes 44 and 46 to destination node 48. Destination node 48
includes the label 15
and egress port 106 having the label 29. Path verification message 104 flows
through tandem node 46
and 44 to origin node 42 for the purpose of identifying destination node 48 as
the destination node for
working path 52.
A path verification message is embedded in a DS3 signal using the X-bits which
are normally
used for very low speed single-bit alarm signaling. In the present invention,
the X-bit state is
overridden with short bursts of data to communicate signal identity to
receptive equipment
downstream. The bursts are of such short duration that other equipment relying
upon traditional use
of the X-bit for alarm signaling will not be disturbed.
The present invention also provides for confining path verification signals
within a network.
In a DRA-controlled network, path verification messages are embedded in
traffic-bearing signals
entering the network and removed from signals leaving the network. Inside of
the network,
propagation of such signals is bounded based upon the DRA-enablement status of
each port. The path
verification messages identify the originating node and the destination node.
The path verification
messages occur on working links that are actually carrying traffic. The path
verification message
originates at origin node 42 in the restoration subnetwork and passes through
tandem nodes until the
traffic reaches destination node 48. Tandem nodes 44 and 46 between the origin
node 42 and
destination node 48, for example, can read the path verification message but
they cannot modify it. At
destination node 48, the path verification message is stripped from the
working traffic to prevent its
being transmitted from the restoration subnetwork.
The present invention uses the X-bit to carry path verification message 104.
One signal format
that the present invention may use is the DS3 signal format. While it is
possible to easily provide a
path verification message on SONET traffic, the DS3 traffic standard does not
readily permit using
path verification message 104. The present invention overcomes this limitation
by adding to the DS3
signal, without interrupting the traffic on this signal and without causing
alarms throughout the
network, path verification message 104 on the DS3 frame X-bit.
The DS3 standard specifies that the signal is provided in frames. Each frame
has a special bit
in it called the X-bit. In fact, there are two X-bits, X-1 and X-2. The
original purpose of the X-bit,
however, was not to carry path verification message 104. The present invention
provides in the X-bit
the path verification message. This avoids alarms and equipment problems that
would occur if path
verification message 104 were placed elsewhere. An important aspect of using
the X-bit for path-
verification message 104 with the present embodiment relates to the format of
the signal. The present
embodiment sends path verification message 104 at a very low data rate, for
example, on the order of
five bits per second. By sending path verification message 104 on the X-bit
very slowly, the
possibility of causing an alarm in the network is significantly reduced. Path
verification message 104
is sent as a short burst, followed by a long waiting period, followed by a
short burst, followed by a
long waiting period, etc. This method of "sneaking" path verification message
104 past the alarms
permits using path verification message 104 in the DS3 architecture systems.
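The burst-and-wait pacing described above can be pictured with the short sketch below. The overall very low rate follows the text; the burst length, waiting period, and the shape of the hardware interface are assumptions made only for illustration.

```python
import time

def send_path_verification_on_x_bit(message_bits, set_x_bit, burst_len=8, wait_seconds=10.0):
    """Send the path verification message as short bursts on the DS3 X-bit, separated by long
    waiting periods, so equipment watching the X-bit for alarm signaling is not disturbed.
    The burst length and waiting period here are illustrative; the text only specifies a very
    low overall rate, on the order of five bits per second."""
    for i in range(0, len(message_bits), burst_len):
        for bit in message_bits[i:i + burst_len]:
            set_x_bit(bit)             # briefly override the X-bit state for this frame
        time.sleep(wait_seconds)       # long quiet period between bursts keeps the average rate low

# Example usage with a stand-in for the hardware register write:
# send_path_verification_on_x_bit([1, 0, 1, 1], set_x_bit=lambda b: None, wait_seconds=0.01)
```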
FIGURE 9 shows conceptually a timeline for the restoration process that the
present
invention performs. With time moving downward, time region 108 depicts the
network status prior to
a failure happening at point 110. At the point that a failure happens, the
failure notification and fault
isolation events occur in time span 112. Upon completion of this step, the first iteration of the present process occurs, as indicated by space 114. This includes explore phase 116 having, for example, three steps 118, 120 and 122. Return phase 124 occurs next and may
include at least two
steps 126 and 128. These steps are discussed more completely below.
Once a failure occurs, the process of the present invention includes failure
notification and
fault isolation phase 112. Failure notification starts the process by sending
failure notification
messages throughout the restoration subnetwork. Fault isolation entails
determining which nodes are
the custodial nodes. One reason that it is important to know the custodial
nodes is that there are
spares on the same span as the failed link. The present invention avoids using
those spares, because
they are also highly likely to fail. Fault isolation, therefore, provides a
way to identify which nodes
are the custodial nodes and identifies the location of the fault along the
path.
FIGURE 10 illustrates the flow of AIS signals 130 through restoration
subnetwork 40. In the
event of failure 66 between custodial nodes 62 and 64, the AIS message 130
travels through custodial
node 62 to origin node 42 and out of restoration subnetwork 40. Also, AIS message
130 travels through
custodial node 64 and tandem node 46, to destination node 48 before leaving
restoration subnetwork
40. This is the normal way of communicating AIS messages 130. Thus, normally
every link on a
failed path sees the same AIS signal.
FIGURE 11, on the other hand, illustrates the conversion of AIS signal 130 to
"signal fail"
signals 132 and 134. SF message 132 goes to origin node 42, at which point it is reconverted to AIS message 130. Next, SF signal 134 passes through tandem node 46 en route to
destination node 48, which
reconverts SF message 134 to AIS message 130.
FIGUREs 10 and 11, therefore, illustrate how the DS3 standard specifies operations within the restoration subnetwork. The example is a DS3 path including origin node 42 and destination node 48, with one or more tandem nodes 44, 46. Custodial nodes 62 and 64 are on each side of the link failure 66.
AIS signal 130 is a DS3 standard signal that indicates that there is an alarm
downstream. Moreover,
- AIS signal 130 could actually be several different signals. AIS signal 130
propagates downstream so
that every node sees exactly the same signal.
With AIS signal 130, there is no way to determine which is a custodial node
62, 64 and which
is the tandem node 44, 46. This is because the incoming signal looks the same
to each receiving
node. The present embodiment takes this into consideration by converting AIS
signal 130 to a signal
fail or SF signal 132. When tandem node 46 sees SF signal 134, it propagates
it through until it
reaches destination node 48 which converts SF signal 134 back to AIS signal
130.
Another signal that may propagate through the restoration subnetwork 40 is the
ISF signal.
The ISF signal is for a signal that comes into the restoration subnetwork and
stands for incoming
signal fail. An ISF signal occurs if a bad signal comes into the network. If it comes in as an AIS signal, there is the need to distinguish that, as well. In the SONET standard
there is already an ISF
signal. The present invention adds the SF signal, as previously mentioned. In
the DS3 standard, the
SF signal already exists. The present invention adds the ISF signal to the DS3
standard.
Consequently, for operation of the present invention in the DS3 standard
environment, there is the
addition of the ISF signal. For operation in the SONET standard environment,
the present invention
adds the SF signal. Therefore, for each of the standards, the present
invention adds a new signal.
To distinguish whether an incoming non-traffic signal received by a node has
been asserted
due to an alarm within a DRA-controlled network, a modified DS3 idle signal is
propagated
downstream in place of the usual Alarm Indication Signal (AIS). This alarm-
produced idle signal
differs from a normal idle signal by an embedded messaging in the C-bit
maintenance channel to
convey the presence of a failure within the realm of a particular network. The
replacement of AIS
with idle is done to aid fault isolation by squelching downstream alarms. Upon
leaving the network,
such signals may be converted back into AIS signals to maintain operational compatibility with equipment outside the network. A comparable technique is performed in a SONET network, where STS-N AIS signals are replaced with an ISF signal and the Z5 byte conveys the alarm information.
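The boundary behavior just described can be summarized as a small mapping, sketched below. The signal names AIS, SF, and ISF come from the text; the boolean flags and function shape are assumptions, and this is a simplified reading rather than the exact per-standard rules.

```python
def convert_alarm_signal(signal, at_network_boundary, leaving_network):
    """Illustrative alarm-signal mapping at a DRA-controlled network (simplified).
    Inside the network, AIS is replaced so that custodial nodes can be told apart from
    tandem nodes; at the exit boundary, the standard AIS is restored for outside equipment."""
    if signal == "AIS":
        if at_network_boundary and not leaving_network:
            return "ISF"      # a bad signal entering the network is marked incoming signal fail
        return "SF"           # within the network, AIS is converted to signal fail
    if signal in ("SF", "ISF") and at_network_boundary and leaving_network:
        return "AIS"          # convert back on exit to stay compatible with outside equipment
    return signal             # tandem nodes propagate the signal unchanged

print(convert_alarm_signal("AIS", at_network_boundary=False, leaving_network=False))  # SF
print(convert_alarm_signal("SF", at_network_boundary=True, leaving_network=True))     # AIS
```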
Another aspect of the present invention is the ability to manage
unidirectional failures. In a
distributed restoration environment, failures that occur along one direction
of a bi-directional link are
handled by first verifying that the alarm signal persists for a period of time
and then propagating an
idle signal back along the remaining working direction. This alarm-produced
idle signal differs from
a normal idle signal by embedded messaging in the C-bit maintenance channel to
convey the presence
of a far end receive failure. In this manner, custodial nodes are promptly
identified and restorative
switching is simplified by treating unidirectional failures as if they were bi-
directional failures.
FIGURE 12 illustrates the broadcast of failure notification messages from
custodial nodes 62
and 64. As FIGURE 12 depicts, custodial node 62 sends a failure notification
to origin node 42, as
well as to tandem node 136. Tandem node 136 further broadcasts the failure
notification message to
tandem nodes 138 and 140. In addition, custodial node 64 transmits a failure
notification message to
tandem node 46, which further transmits the failure notification message to
destination node 48.
Also, custodial node 64 broadcasts the failure notification message to tandem
node 140.
FIGURE 13 illustrates the time diagram for the first iteration following fault
isolation. In
particular, FIGURE 13 shows the time diagram for explore phase 116 and return
phase 124 of
iteration 1. FIGURE 14 further illustrates the time diagram for the completion
of iteration 1 and a
portion of iteration 2. As FIGURE 14 indicates, iteration 1 includes explore
phase 116, return phase
124, max flow phase 142 and connect phase 144. Max flow phase 142 includes a
single step 146.
Note that connect phase 144 of iteration 1, shown by region 148, includes six
steps, 150 through 160,
and occurs simultaneously with explore phase 162 of iteration 2. Note further
that return phase 164 of
iteration 2 also includes six steps 166 through 176.
Each iteration involves explore, return, maxflow, and connect phases. The
restored traffic
addressed by the connect message and the remaining unrestored traffic conveyed by
the explore message
are disjoint sets. Hence, there is no conflict in concurrently propagating or
combining these
messaging steps in a synchronous DRA process. In conjunction with failure
queuing, this practice
leads to a restoration process that is both reliably coordinated and
expeditious.
The iterations become longer in duration and include more steps in subsequent
iterations.
This is because with subsequent iterations, alternate paths are sought. A path
has a certain length in
terms of hops. A path may be three hops or four hops, for example. In the
first iteration, for example,
a hop count may be set at three. This means that alternate paths that are
less than or equal to three
hops are sought. The next iteration may seek alternate paths that are less
than or equal to six hops.
Setting a hop count limit per iteration increases the efficiency of the
process of the present
invention. With the system of the present invention, the number of iterations
and the number of hop
counts for each iteration is configurable. However, these may also be preset,
depending on the degree
of flexibility that a given implementation requires. Realize, however, that
with increased
configurability, increased complexity results. This increased complexity may,
in some instances,
generate the possibility for inappropriate or problematic configurations.
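A minimal sketch of a configurable hop-count schedule is shown below. The values three and six follow the example in the text; the third value and the data structure itself are assumptions for illustration.

```python
# Hop-count limit per iteration; this may be configurable or preset for a given deployment.
HOP_LIMIT_PER_ITERATION = [3, 6, 9]   # iteration 1 seeks paths of <= 3 hops, iteration 2 <= 6, ...

def hop_limit(iteration_number):
    """Return the hop-count limit for a 1-based iteration number, reusing the last entry
    if more iterations run than the table lists."""
    index = min(iteration_number, len(HOP_LIMIT_PER_ITERATION)) - 1
    return HOP_LIMIT_PER_ITERATION[index]

print(hop_limit(1), hop_limit(2))   # 3 6
```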
FIGURE 15, to promote a more detailed discussion of the explore phase, shows explore phase 116, which is the initial part of the first iteration 114. FIGURE 16
shows restoration network
portion 170 to express the idea that a single origin node 42 may have more
than one destination node.
In particular, destination node 180 may be a destination node for origin node
42 through custodial
nodes 62 and 66. Also, as before, destination node 48 is a destination node
for origin node 42. This
occurs because two working paths, 182 and 184, flow through restoration
subnetwork portion 170,
both beginning at origin node 42. During the explore phase, messages begin at
the origin nodes and
move outward through the restoration subnetwork. Each explore message is
stored and forwarded in
a loosely synchronized manner. Accordingly, if a node receives the message in step 1, it forwards it in step 2; a neighboring node that receives the explore message in step 1 transmits the explore message to its own neighbors in step 2. Because the present invention employs loose synchronization, it does not matter how fast the message is transmitted from one neighbor to another; it will be forwarded at the next step regardless.
If the explore phase is three steps long, it may flood out three hops and no
more. The
following discussion pertains to a single origin-destination pair, but there
may be other
origin/destination pairs performing similar or identical functions at the same time within restoration subnetwork 40. If two nodes send the explore message to a neighboring node, only the first message received by the neighboring node is transmitted by the neighboring node. The message that is received second by the neighboring node is recognized, but not forwarded. Accordingly, the first node to reach a neighboring node with an explore message is generally the closest node to the
neighboring node. When an explore message reaches the destination node, it
stops. This step
determines the amount of spare capacity existing in the restoration subnetwork
between the origin
node and the destination node.
Because of loose synchronization, the first message that reaches origin node 42 and destination node 48 will have traveled the shortest path. There are no race conditions within the present invention's operation. In the explore message, the distance between the origin node and destination node is included. This distance, measured in hops, is always equal to or less than the number of steps allowed for the given explore phase. For example, if a destination node is five hops from the origin node by the shortest path, an explore phase with a three hop count limit will never generate a return message. On the other hand, an explore phase with a six hop count limit will return the five hop count information in the return message.
In the explore message there is an identification of the origin-destination
pair to identify
which node sent the explore message and the destination node that is to
receive the explore message.
There is also a request for capacity. The message may say, for example, that there is the need for thirteen DS3s, because thirteen DS3s failed. In practice, there may be not just DS3s, but also STS-1s, STS-12Cs, etc. The point, however, is that a certain amount of capacity is requested. At each
node that the explore message passes through, the request for capacity is
noted. The explore phase is
over once the predetermined number of steps has been completed. Thus, for
example, if the explore
phase is to last three steps, at step 4, the explore phase is over. This
provides a well-defined end for
the explore phase.
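The flooding behavior of the explore phase can be pictured with the sketch below: loosely synchronized store-and-forward flooding in which only the first copy of the message is forwarded, and the flood stops at the destination or at the hop limit. The graph representation and function names are assumptions for illustration, not the patented message format.

```python
def explore_flood(adjacency, origin, destination, hop_limit):
    """Loosely synchronized explore flooding (illustrative). In each step, nodes forward the
    first explore message they have seen to their neighbors; duplicates are noted but not
    forwarded, and the destination node does not forward the message further."""
    frontier = {origin}
    first_seen_step = {origin: 0}                 # step at which each node first saw the message
    for step in range(1, hop_limit + 1):
        next_frontier = set()
        for node in frontier:
            if node == destination:
                continue                          # the explore message stops at the destination
            for neighbor in adjacency.get(node, ()):
                if neighbor not in first_seen_step:   # only the first message received is forwarded
                    first_seen_step[neighbor] = step
                    next_frontier.add(neighbor)
        frontier = next_frontier
    return first_seen_step.get(destination)      # hop distance if reached within the limit, else None

# Example: a five-hop shortest path is not found with a three-hop limit, but is with six.
net = {"O": ["A"], "A": ["B"], "B": ["C"], "C": ["D"], "D": ["Dst"]}
print(explore_flood(net, "O", "Dst", 3), explore_flood(net, "O", "Dst", 6))   # None 5
```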
FIGURE 17 illustrates restoration subnetwork 40 for a single origin-destination pair, including origin node 42 and destination node 48. In restoration subnetwork 40, origin node 42, at the beginning of the explore phase, takes step 1 to send an explore message to tandem node 44, tandem node 46 and tandem node 186. At step 2, tandem node 44 sends an explore message to tandem node 46, tandem node 46 sends explore messages to tandem node 188 and to destination node 48, and tandem node 186 sends explore messages to tandem node 46 and to destination node 48. Note that
explore messages at step 2 from tandem node 44 to tandem node 46 and from
tandem node 186 to
tandem node 46 are not forwarded by tandem node 46.

FIGURE 18 illustrates the time diagram for the next phase in the restoration process of the present invention, return phase 124, which during the first iteration includes three steps, 126, 128 and 129.
FIGURE 19 illustrates the return phase of the present invention during the first iteration. Beginning at destination node 48, at step 4, a return message flows on path 192 to tandem node 46, and on path 190 to tandem node 186. At step 5, the return message flows on path 76 to origin node 42. Also, from tandem node 186, a return message flows to origin node 42.
During the return phase, a return message flows over the same path traversed
by its
corresponding explore message, but in the opposite direction. Messages come from
the destination node
and flow to the origin node. In addition, the return phase messages are
loosely synchronized as
previously described. The return phase messages contain information relating
to the number of spare
links available for connecting the origin node to the destination node.
In the return phase, information relating to the available capacity goes to
the origin node.
Beginning at destination node 48, and continuing through each tandem node 44,
46, 186 en route to
origin node 42, the return message becomes increasingly longer. The return
message, therefore,
contains information on how much capacity is available on each span en route
to the origin node. The
result of the return message received is the ability to establish at the
origin node a map of the
restoration network showing where the spare capacity is that is useable for
the restoration.
FIGURE 20 illustrates tandem node 44, which connects to tandem node 46 through
span 38.
Note that span 38 includes six links 32, 34, 36, 196, 198 and 200. FIGURES 21
and 22 illustrate the
allocation of links between the tandem nodes 44, 46 according to the preferred
embodiment of the
present invention. Referring first to FIGURE 21, suppose that in a previous
explore phase, span 38
between tandem nodes 44 and 46 carries the first explore message (5,3) declaring the need for four links for node 46, such as scenario 202 depicts. Scenario 204 further shows a message (11,2) requesting eight links flowing from tandem node 44, port 2.
FIGURE 22 illustrates how the present embodiment allocates the six links of
span 38. In
particular, in response to the explore messages from scenarios 202 and 204 of
FIGURE 21, each of
tandem nodes 44 and 46 knows to allocate three links for each origin
destination pair. Thus, between
tandem nodes 44 and 46, three links, for example links 32, 34 and 36 are
allocated to the (5,3) origin
destination pair. Links 196, 198 and 200, for example, may be allocated to the
origin/destination pair
(11,2).
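One simple way to compute the kind of even split shown in FIGURE 22 is sketched below. The even, round-robin rule reproduces the three-and-three outcome of the example; the actual allocation policy of the invention may differ, and the data structures are assumptions.

```python
def allocate_span_spares(spare_count, requests):
    """Split a span's spare links among competing origin/destination requests
    (illustrative even, round-robin split, capped by each pair's request)."""
    allocation = {pair: 0 for pair in requests}
    remaining = dict(requests)
    while spare_count > 0 and any(remaining.values()):
        for pair in sorted(remaining):
            if spare_count == 0:
                break
            if remaining[pair] > 0:
                allocation[pair] += 1
                remaining[pair] -= 1
                spare_count -= 1
    return allocation

# Example from FIGUREs 21 and 22: six spares; pair (5,3) asks for four links, pair (11,2) for eight.
print(allocate_span_spares(6, {(5, 3): 4, (11, 2): 8}))   # each pair is allocated three links
```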
FIGURE 23 illustrates the results of the return phase of the present
invention. Restoration
subnetwork 40 includes origin node 42, tandem nodes 208, 210 and 212, as well
as tandem node 44,
for example. As FIGURE 23 depicts, return messages carry back with them a map
of the route they
followed and how much capacity they were allocated on each span. Origin node
42 collects all the
return messages. Thus, in this example, on the route between origin node 42 and tandem node 44, four links were
allocated between origin node 42 and node 208. Tandem node 208 was allocated ten links to tandem node 210, tandem node 210 was allocated three links with tandem node 17, and tandem node 17 was allocated seven links with tandem node 44.
The next phase in the first iteration of the process of the present invention
is the maxflow
phase. The maxflow is a one-step phase and, as FIGURE 24 depicts, for example,
is the seventh step
of the first iteration. All of the work in the maxflow phase for the present
embodiment occurs at
origin node 42. At the start of the maxflow phase, each origin node has a
model of part of the
network. This is the part that has been allocated to the respective
origin/destination pair by the
tandem nodes.
FIGURE 25 illustrates that within origin node 42 is restoration subnetwork
model 214, which
shows what part of restoration subnetwork 40 has been allocated to the origin
node 42-destination
node 48 pair. In particular, model 214 shows that eight links have been
allocated between origin node
42 and tandem node 46, and that eleven links have been allocated between
tandem node 46 and
destination node 48. Model 214 further shows that a possible three links may
be allocated between
tandem node 46 and tandem node 186.
As FIGURE 26 depicts, therefore, in the maxflow phase 142 of the present
embodiment,
origin node 42 calculates alternate paths through restoration subnetwork 40.
This is done using a
maxflow algorithm. The maxflow output of FIGURE 26, therefore, is a flow
matrix indicating the
desired flow of traffic between origin node 42 and destination node 48. Note
that the maxflow output
uses neither tandem node 44 nor tandem node 188.
FIGURE 27 illustrates a breadth-first search that maxflow phase 142 uses to
find routes
through the maxflow phase output. In the example in FIGURE 27, the first route
allocates two units,
first from origin node 42, then to tandem node 186, then to tandem node 46,
and finally to destination
node 48. A second route allocates three units, first from origin node 42 to
tandem node 186, and
finally to destination node 48. A third route allocates eight units, first
from origin node 42 to tandem
node 46. From tandem node 46, these eight units go to destination node 48.
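The maxflow output and the breadth-first route decomposition just described can be pictured with the sketch below. It repeatedly finds the shortest available route over the allocated model and pushes as many units as fit; this simplified search does not cancel previously pushed flow, which a full max-flow computation would also allow, and the node labels and capacities only loosely follow the FIGURE 25 example.

```python
from collections import deque

def breadth_first_routes(capacity, source, sink):
    """Find routes through the allocated spare-link model by repeated breadth-first search
    (illustrative stand-in for the maxflow phase output and its route decomposition)."""
    routes = []
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in capacity.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return routes
        path, v = [sink], sink
        while parent[v] is not None:
            v = parent[v]
            path.append(v)
        path.reverse()
        units = min(capacity[u][w] for u, w in zip(path, path[1:]))   # bottleneck along the route
        for u, w in zip(path, path[1:]):
            capacity[u][w] -= units
        routes.append((units, path))

# Allocations loosely following FIGURE 25: origin 42, tandems 46 and 186, destination 48.
model = {42: {46: 8, 186: 5}, 46: {48: 11}, 186: {46: 3, 48: 3}}
for units, path in breadth_first_routes(model, 42, 48):
    print(units, "units over", path)
```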
The last phase in the first iteration in the process of the present embodiment
includes connect
phase 144. For the example herein described, the connect phase includes steps 8
through 13 of the first
iteration, here having reference numerals 150, 152, 154, 156, 220 and 222,
respectively.
The connect phase is loosely synchronized, as previously described, such that
each connect
message moves one hop in one step. Connect phase 144 overlaps explore phase 162 of the next iteration, except in the instance of the last iteration.
Connect phase 144 distributes
information about what connections need to be made from, for example, origin
node 42 through
tandem nodes 46 and 186, to reach destination node 48.
In connect phase 144, messages flow along the same routes as identified during
maxflow
phase 142. Thus, as FIGURE 29 suggests, a first message, M1, flows from origin
node 42 through
tandem node 186, through tandem node 46 and finally to destination node 48,
indicating the
connection for two units. Similarly, a second message, M2, flows from origin
node 42 through
tandem node 186 and then directly to destination node 48, for connecting a
three-unit flow path.
Finally, a third connect message, M3, emanates from origin node 42 through tandem node 46, and then to destination node 48 for allocating eight units. Connect phase 144 is synchronized so that in each step a message travels one hop.
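A connect message, in essence, tells each tandem node on a route which cross-connection to make and for how many units. The sketch below derives such messages from the routes of the previous example; the dictionary structure is an assumption, not the actual connect message format.

```python
def connect_messages(routes):
    """Turn the routes found in the maxflow phase into connect messages listing, for each tandem
    node on the route, the cross-connection to make and the number of units (illustrative)."""
    messages = []
    for units, path in routes:
        cross_connects = [{"node": node, "connect_from": prev, "connect_to": nxt}
                          for prev, node, nxt in zip(path, path[1:], path[2:])]
        messages.append({"units": units, "route": path, "cross_connects": cross_connects})
    return messages

# The three routes of FIGURE 27: two, three, and eight units.
for m in connect_messages([(2, [42, 186, 46, 48]), (3, [42, 186, 48]), (8, [42, 46, 48])]):
    print(m["units"], "units:", m["cross_connects"])
```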
For implementing the process of the present invention in an existing or
operational network,
numerous extensions are required. These extensions take into consideration the
existence of hybrid
networks, wherein some nodes have both SONET and DS3 connections. Moreover,
the present
invention provides different priorities for working paths and different
qualities for spare links. Fault
isolation presents a particular challenge in operating or existing environments that the present invention addresses. Restricted re-use and spare links connected into paths are additional features that the present invention provides. Inhibit functions such as path-inhibit and node-inhibit are additional features of the present invention. The present invention also provides
features that interface with
existing restoration processes and systems, such as coordination with an
existing restoration algorithm
and process or similar system. To ensure the proper operation of the present
invention, the present
embodiment provides an exerciser function for exercising or simulating a
restoration process, without
making the actual connections for subnetwork restoration. Other features of
the present
implementation further include a drop-dead timer, and an emergency shutdown
feature to control or
limit restoration subnetwork malfunctions. Additionally, the present invention
handles real life
situations such as glass-throughs and staggered cuts that exist in
communications networks. Still
further features of the present embodiment include a hold-off trigger, as well
as mechanisms for hop
count and software revision checking, and a step timer to ensure proper
operation.
FIGURES 30 through 33 illustrate how the present embodiment addresses the
hybrid
networks. A hybrid network is a combination of asynchronous and SONET links.
Restrictions in the
way that the present invention handles hybrid networks include that all
working paths must either be
SONET paths with other than DS3 loading, or DS3 over asynchronous and SONET
working paths
with DS3 access/egress ports. Otherwise, sending path verification messages within the restoration subnetwork 40, for example, may not be practical. Referring to FIGUREs 30 and 31, restoration subnetwork 40 may include SONET origin A/E port 42, which connects through SONET tandem port 44, through SONET tandem port 46 and finally to SONET destination A/E port 48. In FIGURE 31, origin A/E port 42 is a DS3 port, with tandem port 44 being a SONET node, and tandem port 46 being a DS3 port, for example. Port 106 of destination node 48 is a DS3 port. In a
hybrid network, during
the explore phase, origin node 42 requests different types of capacity. In the
return phase, tandem
nodes 44, 46 allocate different types of capacity.

An important aspect of connect phase 144 is properly communicating in the
connect message
the type of traffic that needs to be connected. This includes, as mentioned
before, routing DS3s, STS-1s, OC-3s, and OC-12Cs, for example. There is the need to keep track of all of
the implementation
details for the different types of traffic. For this purpose, the present
invention provides different
priorities of working paths and different qualities of spare links. With the
present embodiment of the
invention, working traffic is prioritized between high priority and low
priority working traffic.
SONET traffic includes other rules to address as well. For instance, a SONET
path may
include an OC-3 port, which is basically three STS-1 ports, with an STS-1
representing the SONET
equivalent of a DS3 port. Thus, an OC-3 node can carry the same traffic as can three STS-1s. An OC-3 node can also carry the same traffic as three DS3s or any combination of three STS-1s and DS3s. In addition, an OC-3 node may carry the same traffic as an STS-3. So, an OC-3 port can carry the same traffic as three DS3s, three STS-1s, or one STS-3. Then, an OC-12 may carry an OC-12C. It may also carry the same traffic as up to four OC-3 ports, up to twelve STS-1 ports, or up to twelve DS3 ports. With all of the possible combinations, it is important to make sure that the largest capacity channels are routed over the largest capacity spares first.
An important aspect of the present invention, therefore, is its ability to
service hybrid
networks. A hybrid network is a network that includes both SONET and
asynchronous links, such as
DS3 links. The present invention provides restoration of restoration
subnetwork 40 that may include
both types of links. The SONET standard provides that SONET traffic is
backward compatible to
DS3 traffic. Thus, a SONET link may include a DS3 signal inside it. A
restoration subnetwork that
includes both SONET and DS3 can flow DS3 signals, provided that both the
origin A/E port 42 and
the destination A/E port 48 are DS3 ports. If this were not the case, there
would be no way to send
path verification messages 104 within restoration subnetwork 40.
As with pure networks, with hybrid networks, explore messages request capacity
for network
restoration. These messages specify what kind of capacity is necessary.
It is important to
determine whether DS3 capacity or SONET capacity is needed. Moreover, because
there are different
types of SONET links, there is the need to identify the different formats of SONET that are
needed. In the return phase, tandem nodes allocate capacity to origin-
destination pairs. Accordingly,
they must be aware of the type of spares that are available in the span. There
are DS3 spares and
SONET spares. Capacity may be allocated knowing which type of spares are
available. There is the
need, therefore, in performing the explore and return phases, to add
extensions that allow for different
kinds of capacity. The explore message of the present invention, therefore,
contains a request for
capacity and specifies how many DS3s and how many SONET links are necessary.
There could be the
need for an STS-1, an STS-3C, or an STS-12C, for example. Moreover, in the
return phase it is
necessary to include in the return message the information that there is more
than one kind of capacity
in the network. When traffic is routed through the network, these rules must be observed. For
instance, a DS3 failed working link can be carried by a SONET link, but
not vice versa. In other
words, a DS3 cannot carry a SONET failed working path.
FIGURES 32 and 33 illustrate this feature. For example, referring to FIGURE
32, origin node
42 may generate an explore message to tandem node 44 requesting five DS3s, three STS-1s, two STS-3(c)s, and one STS-12(c). As FIGURE 33 depicts, from the return phase, origin node 42 receives a return message from tandem node 44, informing origin node 42 that it received five DS3s, one STS-1, one STS-3(c), and no STS-12s.
For a hybrid restoration subnetwork 40, in the maxflow phase the present invention first routes OC-12C failed working capacity over OC-12 spare links. Then, the max flow phase routes OC-3C failed working capacity over OC-12 and OC-3 spare links. Next, the present embodiment routes STS-1 failed working links over OC-12, OC-3 and STS-1 spare links. Finally, the max flow phase routes DS3 failed working links over OC-12, OC-3, STS-1, and DS3 spare links. In the connect phase, the restoration subnetwork of the present invention responds to a hybrid network in a manner such that tandem nodes get instructions to cross-connect more than one kind of traffic.
FIGURE 34 relates to the property of the present invention of assigning different priorities for working paths, and different qualities for spare links. The present embodiment of the invention includes 32 levels of priority for working paths; priority configurations occur at origin node 42, for example. Moreover, the preferred embodiment provides four levels of quality for spare links, such as the following. A SONET 1-for-N protected spare link on a span that has no failed links has the highest quality. The next highest quality is a SONET 1-for-N protect port on a span that has no failed links. The next highest quality is a SONET 1-for-N protected port on a span that has a failed link. The lowest quality is a SONET 1-for-N protect port on a span that has a failed link.
With this configuration, different priorities relate to working paths, and different qualities to spare links. For some uses of the present process, it is possible to simplify the different levels of priority and the different levels of quality into simply high and low. For example, high priority working links may be those having priorities 1 through 16, while low priority working links are those having priorities 17 through 32. High quality spares may be, for example, quality 1 spares, while low quality spares may be those having qualities 2 through 4.
With the varying priority and quality assignments, the present invention may provide a method for restoring traffic through the restoration subnetwork. For example, the present invention may first try to restore high-priority failed working links on high-quality spare links, and do this as fast as possible. Next, restoring high-priority failed working links on low-quality spares may occur. Restoring low-priority failed working paths on low-quality spare links occurs next. Finally, low-priority failed working paths are restored on high-quality spare links.

To achieve this functionality, the present invention adds an extra iteration
at the end of
normal iterations. The extra iteration has the same number of steps as the
iteration before it. Its
function, however, is to address the priorities for working paths and
qualities for spare links.
Referring to FIGURE 34, during normal iterations, the present invention will restore high-priority working paths over high-quality spare links. During the extra iteration, the invention restores high-priority working paths over low-quality spare links, then low-priority working paths over low-quality spare links, and finally low-priority working paths over high-quality spare links. This involves running the max flow algorithm additional times.
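The four-pass ordering just described can be sketched as below. The priority and quality thresholds (priorities 1 through 16 as high, quality 1 as high) follow the text; the data structures and the simple one-for-one pairing are assumptions made only to show the ordering.

```python
def restoration_passes(failed_paths, spares):
    """Pair failed working paths with spare links in the stated order: high priority on high
    quality, then high priority on low quality, then low priority on low quality, and finally
    low priority on high quality (illustrative pairing only)."""
    def is_high_priority(path):
        return path["priority"] <= 16          # priorities 1-16 are treated as high
    def is_high_quality(spare):
        return spare["quality"] == 1           # quality 1 spares are treated as high quality

    passes = [(True, True), (True, False), (False, False), (False, True)]
    assignments = []
    remaining_spares = list(spares)
    for want_high_priority, want_high_quality in passes:
        for path in [p for p in failed_paths if is_high_priority(p) == want_high_priority]:
            if path.get("restored"):
                continue
            for spare in remaining_spares:
                if is_high_quality(spare) == want_high_quality:
                    assignments.append((path["name"], spare["name"]))
                    path["restored"] = True
                    remaining_spares.remove(spare)
                    break
    return assignments

paths = [{"name": "W1", "priority": 3}, {"name": "W2", "priority": 20}]
spares = [{"name": "S1", "quality": 1}, {"name": "S2", "quality": 3}]
print(restoration_passes(paths, spares))   # W1 restored on S1, W2 on S2
```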
The network restoration process of the present invention, including the
explore, return, and
connect messaging phases, may be repeated more than once in response to a
single failure episode
with progressively greater hop count limits. The first set of iterations is confined to restoring only high priority traffic. Subsequent or extra iterations may be used to seek to restore whatever remains of lesser priority traffic. This approach gives high priority traffic a preference in terms of path length.
FIGURES 35-37 provide illustrations for describing in more detail how the
present invention
handles fault isolation. Referring to FIGURE 35, between tandem nodes 44 and 46 appears spare link 92. Between custodial nodes 62 and 64 are working link 18 having failure 66 and spare link 196. If a spare link, such as spare link 196, is on a span, such as span 38, that has a failed working link, that spare link has a lower quality than does a spare link, such as spare link 92, on a span that has no failed links. In FIGURE 35, spare link 92 between tandem nodes 44 and 46 is part of a span that includes no failed link. In this example, therefore, spare link 92 has a higher quality than does spare link 196.
Within each node, a particular order is prescribed for sorting lists of spare
ports and lists of
paths to restore. This accomplishes both consistent mapping and preferential
assignment of highest
priority to highest quality restoration paths. Specifically, spare ports are
sorted first by type (i.e.,
bandwidth for STS-12, STS-3), then by quality and thirdly by port label
numbers. Paths to be
restored are sorted primarily by type and secondarily by an assigned priority
value. The quality of a given restoration path is limited by the lowest quality link along the path.
In addition to these sorting orders, a process is performed upon these lists
in multiple passes
to assign traffic to spare ports while making best use of high capacity, high-
quality resources. This
includes, for example, stuffing high priority STS-1's onto any STS-12's that
are left after all other
STS-12 and STS-3 traffic has been assigned.
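The sort keys just described can be expressed directly, as in the sketch below. The bandwidth ranking and the dictionary field names are assumptions for illustration; the sorting order itself follows the text.

```python
BANDWIDTH_RANK = {"STS-12": 0, "STS-3": 1, "STS-1": 2, "DS3": 3}   # largest capacity sorts first

def sort_spare_ports(spare_ports):
    """Sort spare ports first by type (bandwidth), then by quality, then by port label."""
    return sorted(spare_ports, key=lambda p: (BANDWIDTH_RANK[p["type"]], p["quality"], p["label"]))

def sort_paths_to_restore(paths):
    """Sort paths to restore primarily by type and secondarily by assigned priority."""
    return sorted(paths, key=lambda p: (BANDWIDTH_RANK[p["type"]], p["priority"]))

spares = [{"type": "DS3", "quality": 2, "label": 7}, {"type": "STS-12", "quality": 1, "label": 3}]
print(sort_spare_ports(spares))   # the STS-12 spare sorts ahead of the DS3 spare
```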
Rules determine the proper way of handling different priorities of working
paths and different
qualities of spares in performing the restoration process. In our embodiment
of the invention, there
may be, for example, 32 priority levels. The working traffic priority may
depend on business-related
issues, such as who is the customer, how much money did the customer pay for
communications
service, what is the nature of the traffic. Higher priority working channels
are more expensive than
are lower priority channels. For example, working paths are assigned priorities according to these types of
considerations. Pre-determined configuration information of this type may be
stored in the origin
node of the restoration subnetwork. Thus, for every path in the origin node
priority information is
stored. Functionally there is no difference between a high priority working path and a lower priority working path; however, higher priority working paths will have their traffic restored first and lower priority working paths will be restored later.
The present embodiment includes four qualities of spare links. Spare link quality has to do with two factors: whether a link is protected or unprotected by other protection schemes, and whether the link is on a span that contains a failed link. In light of the priorities of failed working paths and the quality of spare links, the present invention uses certain rules. The first rule is to attempt to restore the higher priority failed working paths on the highest quality spare links. The next rule is to restore high priority failed working paths on both high quality and low quality spares. The third rule is to restore low priority failed working paths on low quality spares. The last rule is to restore low priority working paths over high and low quality spares.
The present invention also makes it possible for a node to know when it is a
custodial node. Because
there are no keep-alive messages on working links, however, the custodial node
does not know on
what span the failed link resides. Thus, referring to FIGURE 36, custodial
node 64 knows that
custodial node 62 is on the other end of spare link 196. The difficulty
arises, however, in the ability
for custodial nodes 62 and 64 to know that working link 18 having failure 66
and spare link 196 are
on the same span, because neither custodial node 62 nor custodial node 64
knows on what span is
working link 18.
FIGURE 37 illustrates how the present embodiment overcomes this limitation.
Custodial
node 64, for example, sends a "I am custodial node", flag in the keep alive
messages that it sends on
spare links, such as to non-custodial tandem node 46. Also, custodial node 64
and custodial node 62
both send "I am custodial node" flags on spare 196, to each other. In the
event that the receiving non-
custodial node, such as tandem node 46, is not itself a custodial node, then
it may ignore the "I am
custodial node", flag. Otherwise, the receiving node determines that the
failure is on the link between
itself and the custodial node from which the receiving custodial node receives
the "I am custodial
node" flag.
There may be some limitations associated with this procedure, such as it may
be fooled by
"glass throughs" or spans that have no spares. However, the worst thing that
could happen is that
alternate path traffic may be placed on a span that has a failed link, i.e., a
lower quality spare.
The present embodiment provides this functionality by the use of an "I am
custodial node"
flag that "piggybacks" the keep alive message. Recalling that a custodial node
is a node on either side
of a failed link, when the custodial node is identified, the "I am custodial
node" flag is set. If the flag
appears on a spare link, that means that the neighboring link is the custodial
node. This means that
the node is adjacent to a failure. If the node receiving the flag is also a
custodial node, then the spare
is on the span that contains the failed link. If a custodial node sends the flag on a spare link but does not receive the flag back from the node at the other end, the spare link is not in a failed span.
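The inference just described reduces to a simple rule, sketched below. The function name and argument names are assumptions; the logic follows the text: only a node that is itself custodial and also receives the flag on a spare concludes that the spare shares a span with the failed link.

```python
def spare_is_on_failed_span(i_am_custodial, flag_received_on_spare):
    """Decide whether a spare link lies on a failed span, using the "I am custodial node" flag
    carried in keep-alive messages (illustrative)."""
    return i_am_custodial and flag_received_on_spare

# Custodial nodes 62 and 64 exchange the flag on spare 196, so each marks that spare as lower
# quality; non-custodial tandem node 46 receives the flag but ignores it.
print(spare_is_on_failed_span(i_am_custodial=True, flag_received_on_spare=True))    # True
print(spare_is_on_failed_span(i_am_custodial=False, flag_received_on_spare=True))   # False
```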
FIGURES 38-42 illustrate the restricted re-use feature of the present invention. A recovered link relates to the feature of restricted re-use. Given a path with a failure in it, a recovered link may
exist between two nodes. The
recovered link is a good link but is on a path that has failed. FIGURE 38
shows restoration
subnetwork 40, in which origin node 42 connects on link 18 through custodial nodes 62 and 64 to destination node 48. Failure 66 exists between custodial nodes 62
and 64. The restricted
re-use feature of the present invention involves what occurs with recovered
links, such as recovered
link 224.
With the present invention, there are at least three possible modes of re-use.
One mode of re-
use is simply no re-use. This prevents the use of recovered links to carry
alternate path traffic.
Another possible re-use mode is unrestricted re-use, which permits recovered links to carry alternate
path traffic in any possible way. Still another re-use mode, and one that the
present embodiment
provides, is restricted re-use. Restricted re-use permits use of recovered
links to carry alternate path
traffic, but only the traffic they carried before the failure.
FIGURE 39 illustrates the restricted re-use concept that the present invention
employs. Link
18 enters origin node 42 and continues through tandem node 226 on links 228 and 230, through custodial node 64, and over the recovered link to destination node 48.
Restricted re-use includes modifications to the explore and return phases of
the present
invention wherein the process determines where recovered links are in the
network. The process finds
the recovered links and sends this information to the origin node. The origin
node collects
information about where the recovered links are in the network to develop a
map of the recovered
links in the restoration subnetwork. The tandem nodes send information
directly to the origin node
via the wide area network about where the re-use links are.
FIGURES 40 through 42 illustrate how the present embodiment achieves restricted
re-use.
Referring to restoration subnetwork portion 40 in FIGURE 40, origin node 42
connects through
tandem node 44 via link 78, to tandem node 46 via link 82, to tandem node 186
via link 84, and to
destination node 48 via link 190. Note that between tandem node 46 and tandem
node 186 appears
failure 66.
To implement restricted re-use in the present embodiment, during the explore
and return
phases the origin node 42 will acquire a map of recovered links. Thus, as
FIGURE 40 shows within
origin node 42, recovered links 232, 234, and 236 are stored in origin node
42. This map is created
by sending in-band re-use messages, during the explore phase, along
recovered links from
the custodial nodes to the origin and destination nodes, such as origin node
42 and destination node
48. Thus, as FIGURE 41 illustrates, in the explore phase, reuse messages
emanate from tandem node
46 to tandem node 44 and from there to origin node 42. From tandem node 186,
the re-use message
goes to destination node 48.
In the return phase, such as FIGURE 42 depicts, the destination node sends the
information
that it has acquired through re-use messages to the origin node by
piggybacking it on return messages.
Thus, as shown in FIGURE 42, destination node 48 sends on link 192 a return
plus re-use message to
tandem node 46. In response, tandem node 46 sends a return plus re-use message
on link 76 to origin
node 42.
With the restricted re-use feature and in the max flow phase, origin node 42
knows about
recovered links and "pure" spare links. When the origin node runs the max flow
algorithm, the
recovered links are thrown in with the pure spare links. When the breadth-
first-search is performed,
the present invention does not mix recovered links from different failed
working paths on the same
alternate path.
Another feature of the present invention relates to spare links connected into
paths. In the
event of spare links being connected into paths, often these paths may have
idle signals on them or a
test signal. If a spare link has a test signal on it, it is not possible to
distinguish it from a working
path. In this instance, the present invention avoids using spare links with
"working" signals on them
In the max flow phase, the origin node has discovered what may be thought of as pure spare links. The origin node also receives information about recovered links, which the present invention limits to restricted re-use. In running the max flow algorithm during the max flow phase of the present process, the pure spare and recovered links are used to generate a restoration map of the restoration subnetwork, at first irrespective of whether the links are pure spares or recovered links.
Another aspect of the present invention is the path inhibit function. FIGURES
43 and 44
illustrate the path inhibit features of the present invention. For a variety
of reasons, it may be
desirable to temporarily disable network restoration protection for a single
port on a given node. It
may be desirable, later, to turn restoration protection back on again without
turning off the entire
node. All that is desired is to turn off one port and then be able to turn it
back on again. This may be
desirable when maintenance to a particular port is desired. When such
maintenance occurs, it is
desirable not to have the restoration process of the present invention
automatically initiate. The
present invention provides a way to turn off subnetwork restoration on a
particular port. Thus, as
FIGURE 43 shows, origin node 42 includes path 240 to tandem node 44. Note that no link appears between nodes 42 and 44. This signifies that the restoration process of the present invention is inhibited along path 240 between origin node 42 and tandem node 44. Working path 242, on the other hand, exists between origin node 42 and tandem node 46. Link 76 indicates that the restoration process of the present invention is not inhibited along this path if it is subsequently restored.

During the path inhibit function, the process of the present invention
inhibits restoration on a
path by blocking the restoration process at the beginning of the explore
phase. The origin node either
does not send out an explore message at all or sends out an explore message
that does not request
capacity to restore the inhibited path. This is an instruction that goes to the origin node. Thus, during path inhibit, the process of the present invention informs origin node 42, for example, to inhibit restoration on a path by sending it a message via the associated wide area network.
Referring to FIGURE 44, therefore, tandem node 46 sends a path inhibit message
to origin
node 42. Tandem node 46 receives, for example, a TL1 command telling it to temporarily inhibit the restoration process on a port. It sends a message to origin node 42 for that path via the wide area network, as arrow 246 depicts.
Tandem node 46 sends inhibit path message 246 with knowledge of the Internet
protocol
address of its source node because it is part of the path verification
message. There may be some
protocol involved in performing this function. Its purpose would be to cover the situation wherein one node fails while the path is inhibited.
Another feature of the present invention is that it permits the inhibiting
of a node. With the
node inhibit function, it is possible to temporarily inhibit the restoration
process of the present
invention on a given node. This may be done, for example, by a TL1 command. A
node continues to
send its step-complete messages in this condition. Moreover, the exerciser
function operates with the
node in this condition.
To support the traditional field engineering use of node port test access and path loopback
capabilities, the restoration process must be locally disabled so that any
test signals and alarm
conditions may be asserted without triggering restoration processing.
According to this technique as
applied to a given path, a port that is commanded into a test access,
loopback, or DRA-disabled mode
shall notify the origin node of the path to suppress DRA protection along the
path. Additional
provisions include automatic timeout of the disabled mode and automatic loopback detection/restoration algorithm suppression when a port receives an in-band
signal bearing its own
local node ID.
Direct node-node communications are accomplished through a dedicated Wide Area
Network. This approach bypasses the use of existing in-band and out-of-band call processing signaling and network control links for a significant advantage in speed
and simplicity. In addition,
the WAN approach offers robustness by diversity.
A triggering mechanism for the distributed restoration process applies a
validation timer to each
of a collection of alarm inputs, keeps a count of the number of validated
alarms at any point in time,
and generates a trigger output whenever the count exceeds a preset threshold
value. This approach
reduces false or premature DRA triggering and gives automatic protect
switching a chance to restore
individual link failures. It also allows for localized tuning of trigger
sensitivity based on quantity and
coincidence of multiple alarms.
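The validation-and-threshold behavior just described can be sketched as below. The timing values and class layout are assumptions for illustration; the technique shown is the counting of validated alarms against a preset threshold.

```python
class AlarmTrigger:
    """Trigger the distributed restoration process only after alarms persist past a validation
    time and the count of validated alarms exceeds a preset threshold (illustrative)."""
    def __init__(self, validation_seconds=2.5, threshold=1):
        self.validation_seconds = validation_seconds
        self.threshold = threshold
        self.alarm_start = {}                      # alarm input id -> time it was first asserted

    def update(self, alarm_id, asserted, now):
        if asserted:
            self.alarm_start.setdefault(alarm_id, now)
        else:
            self.alarm_start.pop(alarm_id, None)   # cleared alarms no longer count

    def should_trigger(self, now):
        validated = sum(1 for start in self.alarm_start.values()
                        if now - start >= self.validation_seconds)
        return validated > self.threshold

trigger = AlarmTrigger(validation_seconds=2.5, threshold=1)
trigger.update("port-11", True, now=0.0)
trigger.update("port-12", True, now=0.5)
print(trigger.should_trigger(now=1.0), trigger.should_trigger(now=3.5))   # False, then True
```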
The preferred embodiment provides a Step Completion Timer in Synchronous DRA.
For each
DRA process initiated within a network node, logic is provided for
automatically terminating the local
DRA process whenever step completion messages are not received within a
certain period of time as
monitored by a failsafe timer. Other causes for ending the process are loss of
keep alive signals
through an Inter-node WAN link, normal completion of final DRA iteration,
consumption of all
available spare ports, or an operation support system override command.
Another aspect of the present invention is a method for Handling Staggered
Failure Events in
DRA. In a protected subnetwork, an initial link failure, or a set of nearly
simultaneous failures,
triggers a sequence of DRA processing phases involving message flow through the
network. Other
cuts that occur during messaging may similarly start restoration processing
and create confusion and
unmanageable contentions for spare resources. The present technique offers an
improvement over
known methods. In particular, during explore and return messaging phases, any
subsequent cuts that
occur are "queued" until the next Explore phase. Furthermore, in a multiple
iteration approach,
Explore messaging for new cuts is withheld while a final
Explore/Return/Connect iteration occurs in
response to a previous cut. These late-breaking, held-over cuts effectively result in a new, separate invocation of the DRA process.
The present invention includes failure notification messages that include
information about
the software revision and hop count table contents that are presumed to be
equivalent among all
nodes. Any nodes that receive such messages and find that the local software
revision or hop count
table contents disagree with those of the incoming failure notification
message shall render
themselves ineligible to perform further DRA processing. However, a node that
notices a mismatch
and disables DRA locally will still continue to propagate subsequent failure
notification messages.
The present invention provides a way to audit restoration process data within nodes, which includes asserting and verifying the contents of data tables within all of the
nodes in a restoration-
protected network. In particular, such data may contain provisioned values
such as node id, WAN
addresses, hop count sequence table, and defect threshold. The method includes
having the
operations support system disable the restoration process at the nodes, write and verify provisionable data contents at each node, then re-enable the restoration process when all nodes
have correct data
tables.
In a data transport network that uses a distributed restoration approach, a
failure simulation
can be executed within the network without disrupting normal traffic. This
process includes an initial
broadcast of a description of the failure scenario, modified DRA messages that
indicate they are
"exercise only" messages, and logic within the nodes that allows the exercise
to be aborted if a real
failure event occurs during the simulation.

Another aspect of the present invention is the ability to coordinate with
other restoration
processes such as, for example, the RTR restoration system. With the present
invention, this becomes
a challenge because the port that is protected by the restoration process of
the present invention is
often also protected by other network restoration algorithms.
Another aspect of the present invention is the exerciser function. The
exerciser function for
the restoration process of the present invention has two purposes. One is a
sanity check to make sure
that the restoration process is operating properly. The other is an exercise
for capacity planning to
determine what the restoration process would do in the event of a link
failure. With the present
invention, the exerciser function operates the same software as does the
restoration process during
subnetwork restoration, but with one exception. During the exerciser function,
connections are not
made. Thus, when it comes time to make a connection, the connection is just
not made.
With the exerciser function, essentially the same reports occur as would occur
in the event of
a link failure. Unfortunately, because of restrictions on in-band signaling,
there are some messages that
may not be exchanged during exercise that would be exchanged during a real
event. For that reason,
during the exercise function it is necessary to provide the information that
is in these untransmittable
messages. Providing this information, however, permits the desired exerciser function.
Another aspect of the present invention is a drop-dead timer and emergency shut down. The drop-dead timer and emergency shut down protect against bugs or defects in the software. If the restoration process of the present invention malfunctions due to a software problem, and the process hangs, it is necessary to free the restoration subnetwork. The drop-dead timer and emergency shut down provide these features. The drop-dead timer is actuated in the event that a certain maximum allowed amount of time in the restoration process elapses. By establishing a maximum operational time, the restoration process can operate for 30 seconds, for example, but no more. If the 30-second limit is reached, the restoration process turns off.
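A drop-dead timer of this kind can be sketched very simply, as below. The 30-second figure follows the example in the text; the class layout is an assumption for illustration.

```python
import time

class DropDeadTimer:
    """Shut the restoration process down if it runs longer than a maximum allowed time
    (illustrative sketch)."""
    def __init__(self, max_seconds=30.0):
        self.max_seconds = max_seconds
        self.started_at = None

    def start(self):
        self.started_at = time.monotonic()

    def expired(self):
        return self.started_at is not None and time.monotonic() - self.started_at > self.max_seconds

timer = DropDeadTimer(max_seconds=30.0)
timer.start()
if timer.expired():
    print("drop-dead timer expired: turning the restoration process off")
```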
An emergency shut down is similar to a drop-dead timer, but is manually
initiated. For
example, with the present invention, it is possible to enter a TL1 command to
shut down the
restoration process. The emergency shut down feature, therefore, provides
another degree of
protection to complement the drop-dead timer.
Out-of-band signaling permits messages to be delivered over any communication
channel that
is available. For this purpose, the present invention uses a restoration
process wide area network. For
purposes of the present invention, several messages get sent out of band.
These include the explore
message, the return message, the connect message, the step complete message,
as well as a message
known as the exercise message which has to do with an exerciser feature of the
present invention.
The wide area network of the present invention operates under the TCP/IP
protocol, but other
protocols and other wide area networks may be employed. In order to use the
wide area network in
practicing the present invention, there is the need to obtain access to
the network. For the
present invention, access to the wide area network is through two local area
network Ethernet ports.
The two Ethernet ports permit communication with the wide area network. In the
present
embodiment of the invention, the Ethernet is half duplex, in the sense that
the restoration subnetwork
sends data in one direction on one Ethernet while information flows to the
restoration subnetwork in
the other direction on the other Ethernet port. The wide area network of the
present invention
includes a backbone which provides the high bandwidth portion of the wide area
network. The
backbone includes the same network that the restoration subnetwork protects.
Thus, a failure in the restoration subnetwork could potentially cut the wide area network, which may make the wide area network more fragile.
Accordingly, there may be more attractive wide area networks to use with the
present
invention. For example, it may be possible to use spare capacity as the wide
area network. In other
words, there may be spare capacity in the network which could be used to build
the wide area network
itself. This may provide the necessary signal flows for the above-mentioned types of messages. With
the present invention, making connections through the wide area network is
done automatically.
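The following minimal sketch, offered for illustration only, shows how an out-of-band restoration message such as an explore, return, connect, step complete or exercise message might be framed and sent over a TCP/IP wide area network; the host, port and message layout are assumptions, not part of the disclosure.

```python
import json
import socket

# A minimal sketch, assuming a plain TCP connection over the restoration
# wide area network; the addresses and the message framing are illustrative.

def send_out_of_band(message_type, payload, host="10.0.0.1", port=5000):
    """Send one out-of-band restoration message (explore, return, connect,
    step complete or exercise) over the TCP/IP wide area network."""
    frame = json.dumps({"type": message_type, "payload": payload}).encode()
    with socket.create_connection((host, port)) as sock:
        sock.sendall(frame)

# Example usage (requires a listening peer on the WAN):
# send_out_of_band("explore", {"origin": "N2", "failed_span": 306})
```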
For the cross-connects of the present invention, there is a control system that includes a number of computers within the cross-connect switch. The cross-connect may include possibly hundreds of computers. These computers are connected in a hierarchy of three levels in the present embodiment. The computers that perform processor-intensive operations appear at the bottom layer, or layer 3. Another layer of computers may control, for example, a shelf of cards. These computers occupy layer 2. The layer 1 computers control the layer 2 computers.
The computers at layer 1 perform the instructions of the restoration process
of the present
invention. This computer may be centralized in the specific shelf where all
layer 1 computers are in
one place together with the computer executing the restoration process
instructions. Because the
computer performing the restoration process of the present invention is a
layer 1 computer, it is not
possible for the computer itself to send in-band messages. If there is the
desire to send an in-band
message, that message is sent via a layer 3 computer. This is because the
layer 3 computer controls
the local card that includes the cable to which it connects. Accordingly, in-
band messages are
generally sent and received by layer 2 and/or layer 3 computers, and are not
sent by layer 1
computers, such as the one operating the restoration instructions for the
process of the present
invention.
Fault isolation also occurs at layer 2 and layer 3 computers within the cross-
connects. This is
because fault isolation involves changing the signals in the optical fibers.
This must be done by
machines at lower layers. Moreover, a port, which could be a DS3 port or a SONET port, has a state, and the lower layer processors keep track of the port state. In essence,
therefore, there is a division of
labor between layer 2 and 3 computers and the layer 1 computer performing the
instructions for the
restoration process of the present invention.

With reference to FIGURE 45, an end to end path in a telecommunications
network
provisioned with a distributed restoration algorithm is shown to include an
origin node and a
destination node, represented by O and D, respectively. Interconnecting the
origin node and
destination node are a number of nodes, for example the intermediate nodes N1-N5. Each of these
nodes in fact is a digital cross-connect switch such as the 1633-SX switch
made by the Alcatel
Company. Each of these switches has a number of ports to which are connected a
number of links for
interconnecting each switch to other switches of the network. For ease of
explanation and illustration,
as shown in FIGURE 45, each adjacent pair of nodes is connected by a span or
link such as 302, 304,
306, 308, 310 and 312. As is well known, each span can have a number of links
and each adjacent
pair of nodes may in fact have a number of interconnected spans. Also for ease
of discussion, no
other nodes of the telecommunications network are shown to be connected to the
path of FIGURE 45.
As illustrated in FIGURE 45, a fault has occurred in span 306. Such fault may
be for example
a cut in which one or more links of the span have been cut, or in the worst
case scenario, the whole
span has been cut. As is well known, when a fault occurs, the nodes bracketing
or sandwiching the
fault 314 are the first nodes to receive an alarm signal, which then is
propagated by those nodes to
nodes downstream thereof. Thus, as soon as fault 314 occurs at span 306,
custodial nodes N2 and N3
each receive an alarm. Nodes N2 and N3 would in turn propagate the received
alarm to nodes
downstream thereof, such as for example node N1 and the origin node for custodial node N2, and nodes N4, N5 and the destination node for custodial node N3.
With fault 314 at span 306, the communicative path of FIGURE 45 becomes non-
functioning.
This is despite the fact that there is only one fault, namely fault 314 for
the whole path. Putting it
differently, there is only one span, namely span 306, that has malfunctioned.
Yet data from the origin
node can no longer be routed to the destination node in the path shown in
FIGURE 1. This is so in
spite the fact that span or links 302, 304, 308, 310 and 312 each remain
operational.
One objective of the present invention, as noted above, is to be able to
utilize the functioning
spans or links of a failed path so that there may be a better utilization of
the available resources of the
telecommunications network.
To achieve this end, the network of the instant invention is provisioned with
the ability, in its
distributed restoration algorithm, for the custodial nodes of a fault to send
out a message that informs
nodes downstream thereof of any portions of the failed path that remain intact
and functional.
The first step of the inventive scheme is illustrated in FIGURE 46. As shown,
a "reuse"
message is sent from each of nodes N2 and N3 to their respective adjacent
nodes N1 and N4. The
reuse message propagated from N2 is designated 316 while the reuse message
propagated from N3 is
designated 318. In particular, as shown in FIGURE 49, the reuse message is
shown to include an
identifier field 320 to which an identifier, represented by R, has been added
to designate the message
as being a reuse message. The message of FIGURE 49 further includes a variable
length route information field 322 to which the identification (ID) of each node can be added. Other fields of the FIGURE 49 message not germane to the discussion of the instant invention are left blank and are not
shown in the messages shown in FIGURES 46-48.
Returning to FIGURE 46, it can be seen that reuse message 316 has in its route information field the node ID of node N2. On the other hand, reuse message 318 has in its route information field the node ID of node N3.
Upon receipt of reuse message 316, node N1 would append to the route information field its own node ID, before propagating the reuse message onward to the origin node, by way of link 302. Similarly, upon receipt of the reuse message 318, node N4 would append its own node ID to the route information field of reuse message 318, before propagating the reuse message 318 to node N5 by way of link 310. See FIGURE 47. Thus, as the reuse message, be it 316 or 318, gets
propagated from one
node to other nodes downstream thereof, additional node IDs are appended to
the message, until the
message gets to the end node of the path, for example the origin node shown in
FIGURE 47. At that
point, the origin node reads from the route information field of reuse message
316 to find out what
intact portions there are of the failed path. As shown in FIGURE 47, the origin node can readily ascertain from reuse message 316 that nodes N2 and N1 are the nodes that have forwarded the reuse message. Therefore, the span or links interconnecting those nodes, as well as the span or link that interconnects node N1 to the origin node, are operating properly. Thus, as far as origin node O is concerned, the fault that causes the path to fail occurs somewhere beyond node N2, and therefore the elements before node N2 remain usable and can be reserved for use by an alternate route for rerouting the traffic that had been disrupted by fault 314.
Also shown in FIGURE 47 is the reuse message 318, as sent by custodial node
N3. As
shown, reuse message 318 has been propagated by node N4 to node N5. As seen by
node N5, reuse
message 318 has in its route information field node N3 and node N4. Therefore, node N5 knows that the path connecting it to node N3 remains good. Node N5 then appends its own node ID to the route information field of reuse message 318 before propagating it, via link 312, to the destination node.
As shown in FIGURE 48, the destination node is now in receipt of reuse message
318. From
the route information field of reuse message 318, the destination node can
ascertain that fault 314
occurs beyond node N3 and that links 308, 310 and 312 interconnecting nodes N3, N4, N5 remain usable. For the purpose of the instant invention, the conveying of information between the intermediate nodes, such as between nodes N3, N4, N5 and the destination node, is done by means of in-band messaging between those nodes. Similarly, the propagation of the reuse message between nodes N2, N1 and the origin node is done by in-band messaging.
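For illustration only, the following Python sketch models the reuse message handling described above: an identifier field carrying "R" and a variable-length route information field to which each node appends its own ID before propagating the message onward. The field names and data structures are hypothetical.

```python
from dataclasses import dataclass, field

# A minimal sketch of the reuse-message handling; the patent text specifies an
# identifier field carrying "R" and a variable-length route information field.

@dataclass
class ReuseMessage:
    identifier: str = "R"                            # marks the message as a reuse message
    route_info: list = field(default_factory=list)   # node IDs appended hop by hop

def forward_reuse(message, own_node_id):
    """Each node appends its own node ID before propagating the message onward."""
    message.route_info.append(own_node_id)
    return message

# Custodial node N3 originates the message; N4 and N5 append their IDs.
msg = ReuseMessage(route_info=["N3"])
forward_reuse(msg, "N4")
forward_reuse(msg, "N5")

# The end node reads the route information field to learn which portion of the
# failed path is still intact (here: N3-N4-N5 up to the destination node).
print(msg.route_info)   # ['N3', 'N4', 'N5']
```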
Once in receipt of reuse message 318, the destination node repackages that reuse message into another "reuse" message 320 and transmits that message to the origin node by means of a wide area network (WAN) messaging connection 322. Upon receipt of the reuse message 320, the origin node becomes aware of all intact portions of the failed path shown in FIGURE 48. It
can then formulate an
alternate path by using, for example, the spare link 324 that interconnects
node N1 to N5. Thus, for
the exemplar embodiment shown in FIGURE 48, the alternate restoration path is
able to use links 302
and 312 of the failed path for rerouting the traffic from the origin node to
the destination node. Of
course, other alternate route(s) interconnecting the origin node to the
destination node could also be
used.
In order to enhance the restoration of the failed path once fault 314 is
fixed, the useful links of
the failed path of FIGURE 48 are, for the most part, restricted for routing
information from the origin
node to the destination node. In other words, in addition to links 302 and
312, intact links 304, 308
and 310 are reserved for the use of the origin node and the destination node
for routing data
therebetween.
Even though FIGURE 48 illustrates that a different reuse message 320 is sent
by the
destination node to the origin node, it should be appreciated that a reuse
message can also be sent
from the origin node to the destination node, via connection 322, to inform the destination node of links or spans of
the failed path that remain functional. In fact, both end nodes of the failed
path can inform each other
of intact portions of the failed path, if needed.
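A minimal worked sketch, assuming the intact segments reported by the reuse messages of FIGURE 48 and the spare link 324 between nodes N1 and N5, shows how an end node could join the usable portions of the failed path into an alternate route; the helper function and data layout are hypothetical.

```python
# Worked sketch of the alternate-route formulation of FIGURE 48, assuming the
# origin already holds the two intact segments and a list of spare links.

intact_from_origin = ["O", "N1", "N2"]             # learned from reuse message 316
intact_to_destination = ["N3", "N4", "N5", "D"]    # learned from reuse message 320
spare_links = [("N1", "N5")]                       # e.g. spare link 324

def build_alternate_route(prefix, suffix, spares):
    """Join the intact prefix and suffix of the failed path over a spare link."""
    for a, b in spares:
        if a in prefix and b in suffix:
            return prefix[:prefix.index(a) + 1] + suffix[suffix.index(b):]
    return None

print(build_alternate_route(intact_from_origin, intact_to_destination, spare_links))
# ['O', 'N1', 'N5', 'D']  -- reuses links 302 and 312 plus the spare link 324
```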
The exemplar telecommunications network of the instant invention, as shown in
FIGURE 50,
comprises a number of nodes 6302-6324 each connected to adjacent nodes by at least one working link and one spare link. For example, node 6302 is connected to node 6304 by means of a working link 2-4W and a spare link 2-4S. Similarly, node 6304 is connected to node 6306 by a
working link 4-6W and
a spare link 4-6S. For the sake of simplicity, only the specific links
connecting nodes 6302-6304,
6304-6306 and 6302-6310 are appropriately numbered in FIGURE 50. But it should
be noted that the
working and spare links connecting adjacent nodes can be similarly designated.
For the telecommunications network of FIGURE 50, it is assumed that all of the
nodes of the
network are provisioned with a distributed restoration algorithm (DRA), even
though in practice
oftentimes only one or more portions of the telecommunications network are
provisioned for
distributed restoration. In those instances, those portions of the network are
referenced as dynamic
transmission network restoration (DTNR) domains.
Also shown in FIGURE 50 is an operation support system (OSS) 6326. OSS 6326 is
where
the network management monitors the overall operation of the network. In other
words, it is at OSS
6326 that an overall view, or map, of the layout of each node within the
network is provided. OSS
6326 has a central processor 6328 and a memory 6330 into which data retrieved
from the various
nodes are stored. Memory 6330 may include both a working memory and a database
store. An
interface unit, not shown, is also provided in OSS 6326 for interfacing with
the various nodes. As
shown in FIGURE 50, for the sake of simplicity, only nodes 6302, 6304, 6306,
and 6308 are shown to

be connected to OSS 6326. Given the interconnections between OSS 6326 and the
nodes of the
network, the goings-on within each of the nodes of the network are monitored by OSS 6326.
Each of nodes 6302-6324 of the network comprises a digital cross-connect
switch such as the
1633-SX broadband cross-connect switch made by the Alcatel Network System
company. Two such adjacently connected switches are shown in FIGURE 51. The FIGURE 51 switches may represent any two adjacent switches shown in the FIGURE 50 network such as for
example nodes
6304 and 6306 thereof. As shown, each of the switches has a number of
access/egress ports 6332,
6334 that are shown to be multiplexed to a line terminating equipment (LTE)
6336, 6338. LTEs 6336
and 6338 are SONET equipment having a detector residing therein for detecting
any failure of the links between the various digital cross-connect switches. Again, for the sake of simplicity, such LTE is not shown to be sandwiched between nodes 6304 and 6306, as detection
circuits for interpreting
whether a communication failure has occurred may also be incorporated within
the respective
working cards 6340a, 6340b of node 6304 and 6342a and 6342b of node 6306.
As shown in FIGURE 51, each of the digital cross-connect switches has two working links 6344a and 6344b communicatively connecting node 6304 and node 6306, by means of the respective working interface cards 6340a, 6340b and 6342a, 6342b. Also shown connecting node 6304 and node 6306 are a pair of spare links 6346a and 6346b, which are connected to the spare link interface cards 6348a, 6348b and 6350a, 6350b of node 6304 and node 6306, respectively. For the FIGURE 51 embodiment, assume that each of working links 6344a, 6344b and spare links 6346a, 6346b is a part of a logical span 6352. Further note that even though only four links are shown to connect node 6304 to node 6306, in actuality, adjacent nodes may be connected by more or fewer links. Likewise, even though only four links are shown to be a part of span 6352, in actuality, a span that connects two adjacent nodes may in fact have a greater number of links. For the instant discussion, assume that working links 6344a and 6344b correspond to the working link 4-6W of FIGURE 50 while the spare links 6346a and 6346b of FIGURE 51 correspond to the spare link 4-6S of FIGURE 50. For the purpose of this aspect of the instant invention, each of the links shown in FIGURE 51 is presumed to be a conventional optical carrier OC-12 fiber or is a link embedded within a higher order (i.e., OC-48 or OC-192) fiber.
Focusing onto node 6304 for the time being, note that each of the interfacing cards, or boards,
of that digital cross-connect switch such as 6340a, 6340b, 6348a and 6348b are
connected to a
number of STS-1 ports 6352 for transmission to SONET LTE 6336. Although not
shown, an
intelligence such as a processor residing in each of the digital cross-connect
switches controls the
routing and operation of the various interfacing boards and ports. Also not
shown but present in each
of the digital cross-connect switches is a database storage for storing a map
which identifies the
various sender nodes, chooser nodes and addresses, which will be discussed later. The working boards 6342a, 6342b and the spare boards 6350a, 6350b are likewise connected to the access/egress ports 6354 in node 6306. Further shown in FIGURE 51 are non-DRA links between adjacent nodes 6304 and 6306.
For the instant invention, the access/egress ports such as 6332 and 6334 send
their respective
port numbers through the matrix in each of the digital cross-connects to its
adjacent nodes. Thus, for
the exemplar interconnected adjacent nodes 6304 and 6306, ports 6352a and
6352b of node 6304 are
connected to ports 6354a and 6354c of node 6306 by means of working link 6344a.
Similarly, ports
6352e and 6352f are interconnected to ports 6354e and 6354f of node 6306 by
way of spare links
6346a and 6346b, respectively. Thus, if node 6304 were to transmit a signal
using spare link 6346a to
node 6306, it will be transmitting such a message from its port 6352e to spare
card 6348a, and then
onto spare link 6346a, so that the message is received at spare card 6350a of
node 6306 and then
routed to the receiving port 6354e of node 6306. Thus, as long as each of the
working links and spare
links interconnecting a pair of adjacent nodes, such as for example nodes 6304 and 6306, are operational, when a message is sent between those nodes, the information relating to the respective transmit and receiving ports can be collected by the OSS 6326 (FIGURE 50) so that a record can be kept of the various ports that interconnect any two adjacent nodes.
For the instant invention, the inventors have seized upon the idea that a
topology, or map, of
the available spare capacity of the network, in the form of the available
spare links that interconnect

the nodes, can be generated from stored data that is representative of the
different port numbers of the
various nodes to which spare links are connected. In other words, if a message
transmitted by one
node to its adjacent node is able to provide OSS 6326 a number of parameters
which include for
example the ID of the transmit node, the respective IP (Internet Protocol) addresses of the transmit and receiving ports of the node, and the port number of the node from which the message is transmitted, the OSS can ascertain, from similar messages that are being exchanged
between adjacent nodes
on spare links connecting those adjacent nodes, an overall picture of the
spare capacity of the
network.
Simply put, if each of the digital cross-connect switches in the DRA provisioned network knows the port number and the node to which it is connected by its spare links, then that node knows how to reroute traffic if it detects a failure in one of its working links.
And by collecting the
information relating to each of the nodes of the network, the OSS 6326 is able
to obtain an overall
view of all of the available spare links that interconnect the various nodes.
As a consequence, when a
failure occurs at a given working link, OSS 6326 can send to the custodial
nodes of the failed link a
map of the spare capacity of the network, so that whichever custodial node is designated as the sender or
origin node can then use that map of the spare capacity of the network to
begin the restoration process
by finding an alternate route for rerouting the disrupted traffic.
The structure of the special message to be used for continuously monitoring
the available
spare capacity of the network is shown in FIGURE 52. For the instant
invention, this message is
referred to as a keep alive message. As shown, this keep alive message has a
number of fields. Field 6356 is an 8 bit message field. For the FIGURE 52 message, the 8 bits of data
can be configured to
represent the keep alive message so that each node in receipt of the message
will recognize that it is a
keep alive message for updating the availability status of the spare link from
which the message is
received. OSS 6326, on the other hand, upon receipt of a keep alive message,
would group it with all
the other keep alive messages received from the different nodes for mapping
the spare capacity of the
network.
The next field of the message of FIGURE 52 is field 6358, which is an 8 bit
field that
contains the software revision number of the DRA being used in the network.
The next field is 6360,
which is an 8 bit field containing the node identifier of the transmitting
node. Field 6362 is a 16 bit
field that contains the port number of the transmitting node from which the
keep alive message is
sent.
The next field of the message is field 6364. This is a 32 bit field that contains the IP address of the DS3 port on the node that is used for half duplex incoming messages. The IP address of the DS3 port of the node that is used for half duplex outgoing messages is contained in the 32 bit field 6366.
Field 6368 is a 1 bit field that, when set, indicates to the receiving node
that the message is
sent from a custodial node for a failure. In other words, when there is a
failure, the custodial node of
the failed link will send out a keep alive message that informs nodes downstream thereof that the keep alive message is being sent from a custodial node since a failure has occurred, and a restoration process will proceed.
The last field of the keep alive message is field 6370. It has 7 bits and is
reserved for future
usage.
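For illustration only, the keep alive message of FIGURE 52 could be packed as follows, using the bit widths given above (an 8 bit message field, an 8 bit DRA revision, an 8 bit node identifier, a 16 bit port number, two 32 bit IP addresses, a 1 bit custodial flag and 7 reserved bits); the wire ordering and the message type code point are assumptions, not part of the disclosure.

```python
import struct
from ipaddress import IPv4Address

KEEP_ALIVE = 0x01   # illustrative code point for the keep alive message type

def pack_keep_alive(dra_revision, node_id, port_number,
                    incoming_ip, outgoing_ip, custodial=False):
    # 1-bit custodial flag in the top bit, 7 bits reserved for future usage.
    flags = 0x80 if custodial else 0x00
    return struct.pack(">BBBHIIB",
                       KEEP_ALIVE,                    # 8 bit message field
                       dra_revision,                  # 8 bit DRA software revision
                       node_id,                       # 8 bit node identifier
                       port_number,                   # 16 bit transmitting port number
                       int(IPv4Address(incoming_ip)), # 32 bit incoming-message IP
                       int(IPv4Address(outgoing_ip)), # 32 bit outgoing-message IP
                       flags)

frame = pack_keep_alive(dra_revision=3, node_id=100, port_number=6352,
                        incoming_ip="10.1.2.3", outgoing_ip="10.1.2.4")
print(len(frame))   # 14 bytes = 112 bits
```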
In operation, before any failure is detected, keep alive messages such as that
shown in
FIGURE 52 are exchanged on the spare links between adjacent nodes
continuously. By the exchange of these keep alive messages, the network is able to keep tabs on the various available and functional spare links and also identify the port number of each node from which each spare link outputs a keep alive message, as well as the port number of the adjacent node to which the spare link is connected and at which the keep alive message is received. By collecting the data that
is contained in each of
the keep alive messages, a record is kept of the various nodes, the port
numbers, the incoming and
outgoing IP addresses of the various spare links that are available in the
network. And from these
collected data, a topology of the available spare capacity of the network can
be generated, by either
the OSS 6326, or by each of the nodes, which can have the collected
information downloaded thereto
for storage. In any event, a map of the available spare links of the network
is available, so that when a
failure does occur, the custodial nodes of the failure could retrieve the up-
to-date map of the spare
capacity of the network, and based on that, be able to find the most
efficient alternate route for
rerouting the disrupted traffic.
Given that the instant invention relates to a distributed restoration process,
it should be noted
that an OSS is not necessary for storing the topology of the spare capacity of
the network, as each of
the digital cross-connect switches of the network knows the port numbers and the nodes to which it is connected by its spare links. Thus, when a failure occurs, each of the nodes will continue to send
the keep alive message, as the origin node that is responsible for restoration
can build the entire
topology of the available spare links by retrieving the different keep alive
messages from the various
nodes. Putting it differently, an origin node, in attempting to determine the
available spare links, only
needs to take the sum of all of the keep alive messages since each node that
has at least one spare link
will send a keep alive message to the origin node. And, by retrieving the
ID of the node and the port
numbers of the node to which spare links are connected, the spare capacity of
the network can be
ascertained. As a consequence, the map of the spare link topology becomes available in a distributed manner to the origin node in the DRA provisioned network of the instant invention.
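As an illustrative sketch only, the following shows how an origin node, or OSS 6326, could assemble a spare capacity map from collected keep alive reports; the dictionary layout and field names are hypothetical.

```python
from collections import defaultdict

# A minimal sketch: each report records which (node, port) pair a keep alive
# was heard from and on which local port it was received, yielding one spare
# link per matching pair of reports.

def build_spare_topology(keep_alives):
    """keep_alives: iterable of dicts with 'from_node', 'from_port',
    'received_node', 'received_port' describing each keep alive heard."""
    topology = defaultdict(set)
    for ka in keep_alives:
        link = ((ka["from_node"], ka["from_port"]),
                (ka["received_node"], ka["received_port"]))
        topology[ka["from_node"]].add(link)
        topology[ka["received_node"]].add(link)
    return topology

reports = [
    {"from_node": 6304, "from_port": "6352e", "received_node": 6306, "received_port": "6354e"},
    {"from_node": 6304, "from_port": "6352f", "received_node": 6306, "received_port": "6354f"},
]
spare_map = build_spare_topology(reports)
print(sorted(spare_map[6304]))   # the spare links known to node 6304
```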
As previously stated, in the DRA network the C-bit is used to exchange keep
alive (KA)
messages on spare links, although it may be employed on links carrying a
payload. A link has two
ports, one on each end of the link.

KA messages give each port information about the other port. A node has one or more ports to which the links between nodes are connected. In the conventional configuration, the node does not know, nor does it need to know, to what other nodes the links connected to it lead.
For example, as
shown in FIGURE 61 (See also FIGURE 51), node 100 has ports 10, 15 and 20 and
node 200 has
ports 30, 35 and 40. Node 100 through port 20 is connected to port 30 of node
200 via a link. A link
of this nature may be represented by the following shorthand notation: (node 100, port 20) to (node
200, port 30). In a conventional configuration, node 100 and node 200 do not
know that they are
connected to each other.
In one aspect of the present invention, the KA messages, as previously stated, give each port
and in turn each node information about the other port. The KA messages are
exchanged during
normal operation. The information contained within the KA messages and
specifically carried by or
embedded in the C-bit, although other segments within the DS3 signal may carry
the information, is
utilized during restoration of a failed link or span.
Generally in telecommunications, so-called performance monitoring is done, wherein one port or node looks at its incoming signal, monitors the quality of that signal, and then reports to a higher authority what the quality of that signal is. The corresponding port at the
other end of the link does
the same type of monitoring. The information in the KA message can be expanded
to contain far-end
"Quality of Service" (QoS) information.
As stated, the QoS information is a measure of the quality of the signal being
received at each
port of the link. This information can include, but is not limited to,
received errored seconds, received
severely errored seconds, and received Loss of Signal (LoS). The QoS
information can be reported
over any predetermined time period. The present aspect of the invention
contemplates the time
intervals to be one of the last 15 minutes, last hour, and last day. The KA messages, and hence the QoS information, could also be sent continuously.
The additional information can be used to assign a quality value to the link,
usually the spare
link. The better the QoS, the higher the quality of the link. Since both ports
have the same
information, both ports assign the link the same quality value. A quality
value is associated with the
transmission of data from one port to another in both directions and it may be
the case that the quality
value is better in one direction than the other.
For instance, in FIGURE 61 the quality value may be equal to 3 for the
transmission of
information from (node 100, port 20) to (node 200, port 30), but exhibit a quality value equal to 5 for the information transmitted from (node 200, port 30) to (node 100, port 20), assuming a scale from 1 to 10 is used with 1 being the best and 10 the worst. Therefore, the link between these ports could be assigned an averaged quality value of 4. Alternately, the (node 100, port 20) to (node 200, port 30) link could be assigned a value of 5 if it is decided that the worse of the two QoS values should be assigned to the link.

With this additional information, the assignment of data to a specific link may be modified
based on several criteria. For example, during a restoration event the
distributed restoration algorithm
could determine that the data that has the highest priority should be placed
on the link with the best
QoS and the data with the second highest priority should be placed on the link
with the second best
QoS.
For instance, the banking or stock brokerage information in the payload may be
given the
highest priority whereas system information, called overhead, may be given the
lowest priority.
Depending on the type of disruption that has occurred, the system information may be most critical and would then receive the highest priority. The combinations are numerous and will not be explored further at this time.
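For illustration only, the following sketch turns the two directional QoS figures into a single link quality value (by averaging, or by taking the worse of the two) and then places the highest-priority data on the best available link, following the 1-to-10 scale of the example above; the data structures and link names are hypothetical.

```python
def link_quality(qos_a_to_b, qos_b_to_a, rule="average"):
    # 1 is the best quality, 10 the worst, as in the example above.
    if rule == "average":
        return (qos_a_to_b + qos_b_to_a) / 2    # e.g. 3 and 5 give 4
    return max(qos_a_to_b, qos_b_to_a)          # worse of the two, e.g. 5

def assign_traffic(priorities, links):
    """Place the highest-priority data on the link with the best (lowest) quality."""
    ranked_links = sorted(links, key=lambda l: l["quality"])
    ranked_data = sorted(priorities, key=lambda d: d["priority"])
    return list(zip((d["name"] for d in ranked_data),
                    (l["name"] for l in ranked_links)))

links = [{"name": "spare 4-6S", "quality": link_quality(3, 5)},
         {"name": "spare 2-4S", "quality": link_quality(2, 2)}]
data = [{"name": "brokerage payload", "priority": 1},
        {"name": "system overhead", "priority": 2}]
print(assign_traffic(data, links))
# [('brokerage payload', 'spare 2-4S'), ('system overhead', 'spare 4-6S')]
```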
Now turning back to when a fault occurs, with reference to prior art FIGURE
53, a Digital
Service 3 (DS3) path that connects node 7301 to node 7306 of a distributed
restoration domain of a
telecommunications network is shown. For the sake of simplicity, no other
nodes of the network, or
the domain, are shown.
As is well known, in a DRA network, when a fault occurs at a link
interconnecting two
adjacent nodes, an alarm is generated and sent to each of the adjacent nodes.
Such a fault, or a
malfunctioned link, is shown to have occurred as a failure between nodes 7303
and 7304 in FIGURE
53. This failure may be due to, for example, a loss of signal (LOS), a loss of
frame (LOF), or a loss
of pointer (LOP) in the signal traversing between nodes 7303 and 7304. For the
discussion of the
instant invention, assume such an alarm signal is an alarm indication signal
(AIS).
In the prior art, each of the nodes of a distributed restoration network, or
domain, is
provisioned to follow the standard set forth in the Bellcore document TR-NWT-
00170 which
mandates that each node downstream of the custodial nodes, such as nodes 7303
and 7304, upon
receipt of the AIS signal, in turn should propagate the signal to nodes
downstream thereof. Thus, in
the illustrated FIGURE 53, upon receipt of the AIS signal, node 7303
propagates the AIS signal to
node 7302, which in turn propagates it to node 7301, which in turn propagates
it along the DS3 path
to nodes downstream thereof. The same flow of the AIS signal received by node
7304 occurs with
respect to node 7304, node 7305 and node 7306. For the FIGURE 53 embodiment,
assume node 7301
and node 7306 are access/egress ports each communicatively interconnecting the
distributed
restoration domain to other parts of the telecommunications network, or other
networks in the case
where the distributed restoration domain is not any part of any network.
The problem with the prior art distributed restoration domain is that since
most, if not all, of
the nodes of the domain will eventually receive the AIS signal, it is quite difficult, if not impossible, for the nodes to determine which are the true custodial nodes, i.e. the nodes
that bracket or sandwich
the fault. Thus, even though the management of the network recognizes readily
that a fault has

occurred at a certain path, it nonetheless could not isolate the precise
location where the fault
occurred.
An aspect of the present invention is applicable to networks incorporating
digital cross-
connect systems generally, and particularly to networks incorporating
broadband 1633-SX digital
cross-connect switches. FIGURE 54 illustrates one embodiment of the present
invention in which a
fault, or a malfunctioned link, could be readily isolated. For the FIGURE 54
illustration, note that
each of the nodes 7301-7306 is connected to an operations support system (OSS)
7310, which
monitors the operational status of each of the nodes. Similar to the scenario
shown in FIGURE 53, a
fault is presupposed to have occurred between node 7303 and node 7304. At the
time of the link
failure, node 7303 and node 7304, each of which is a digital cross-connect
switch such as that
shown in FIGURE 59, detects either a LOS defect, a LOF defect, or an AIS
defect signal, each
defined by the American National Standard Institute (ANSI) standard T1.231.
For ease of discussion,
assume that an AIS signal is detected. When a switch, such as node 7303 or
node 7304, detects an
AIS signal, normally it would propagate or pass the AIS signal downstream to
the next switch along
the path, such as node 7302 or node 7305, respectively.
In the embodiment of the present invention shown in FIGURE 54, each of the
switches or
nodes is an intelligent network element provisioned with the appropriate
hardware to convert or
modify a received AIS signal into a modified AIS signal, before propagating
such non-alarm signal to
nodes downstream thereof. For the FIGURE 54 embodiment, such non-alarm signal
is a DS3 idle
signal. Thus, with respect to node 7303 and node 7304, note that when each of
those nodes receives
the AIS signal, it converts the received AIS signal into an idle signal and
propagates the idle signal to
nodes) downstream from its output port. For the present invention embodiment,
the DS3 idle signal
contains an embedded message on the C-bit maintenance channel that identifies
the presence of a
fault within the distributed restoration domain.
Upon receipt of an idle signal converted from an AIS signal, each of the
downstream nodes
either transmits or propagates the idle signal to nodes downstream therefrom.
Thus, node 7302
passes the idle signal received at its input port, via its output port, to
node 7301. Similarly, node 7305,
on receiving an idle signal from node 7304 at its input port, would transmit
the idle signal from its
output port to node 7306. This process is repeated ad infinitum until the idle
signal reaches the
access/egress nodes that interconnect the distributed restoration domain to
the rest of the network.
Thus, for the exemplar embodiment of FIGURE 54, given that only node 7303 and
node 7304
are in receipt of an AIS signal, the management of the distributed restoration
domain, per monitoring
of the domain by OSS 7310, can readily ascertain that the fault occurred
between node 7303 and node
7304, and that traffic should be rerouted away from the malfunctioned link
connecting nodes 7303
and 7304.

As for the network outside of the distributed restoration domain, since such
network is not
capable of distributedly restoring disrupted traffic and is also in most
instances not controlled by the
management of the domain, the idle signal has to be reconverted back into the
AIS signal so that, as
far as the equipment positioned along the paths of the outside network is
concerned, an alarm has
occurred somewhere within the distributed restoration domain and that
appropriate action needs to be
taken. To achieve this end, at each of the access/egress nodes of the domain,
there is further
provisioned the functionality of reconverting an idle signal received at its
input port into an AIS
signal to be sent via its output port to the nodes downstream thereof in the
network outside of the
distributed restoration domain. With the conversion of the idle signal back to
the AIS signal at the
access/egress nodes, customers or equipment outside of the distributed
restoration domain continue to
receive standards compliant AIS signals.
The process by which an alarm signal is converted into a non-alarm signal,
i.e. the AIS signal
into an idle signal, is explained herein with reference to FIGURE 55, which
shows a DS3 frame
structure, in accordance with the format promulgated under ANSI Standard T1.107-95, for example.
In particular, a DS3 signal is partitioned into M-frames of 4760 bits each.
The M-frames each are
divided into 7 M-subframes each having 680 bits. Each M-subframe in turn is
further divided into 8
blocks of 85 bits with 84 of the 85 bits available for payload and one bit
used for frame overhead.
Thus, there are 56 frame overhead bits in an M-frame. These are divided into a number of different channels: an M-frame alignment channel (M1, M2, and M3), an M-subframe alignment channel (F1,
F2, F3 and F4), a P-bit channel (P1 and P2), an X-bit channel (X1 and X2), and
a C-bit channel (C1,
C2 and C3).
The M-frame alignment channel signal is used to locate all 7 M-subframes. The M-subframe alignment channel signal is used to identify all frame overhead bit positions.
The P-bit channel is
used for performance monitoring, with bits P1 and P2 being set to 11 or 00.
The C-bit channel bit
(C1, C2 and C3) positions are reserved for application-specific uses.
According to the ANSI T1.107-
95 Standard, the C-bit channel can be employed to denote the presence or
absence of stuffed bits.
Thus, if the 3 C-bits (C1, C2, and C3) in the M-subframes are set to 1,
stuffing occurs. If those C-bits
are set to 0, there is no stuffing. Also, a majority vote of the three
stuffing bits in each of the M-
subframes is used for the determination. Additional description of the various
bits, the M-subframes
and M-frame of a DS3 signal can be gleaned from the aforenoted T1.107-95
Standard.
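The DS3 framing arithmetic quoted above can be checked with a few constants; this sketch is merely a restatement of the figures given in the text.

```python
# Check of the DS3 framing figures above (ANSI T1.107): an M-frame of 4760 bits,
# 7 M-subframes of 680 bits, 8 blocks of 85 bits per M-subframe, with 84
# payload bits and 1 frame overhead bit per block.

BITS_PER_BLOCK = 85
BLOCKS_PER_SUBFRAME = 8
SUBFRAMES_PER_MFRAME = 7

bits_per_subframe = BITS_PER_BLOCK * BLOCKS_PER_SUBFRAME       # 680
bits_per_mframe = bits_per_subframe * SUBFRAMES_PER_MFRAME     # 4760
overhead_bits = BLOCKS_PER_SUBFRAME * SUBFRAMES_PER_MFRAME     # 56, one per block

assert bits_per_subframe == 680
assert bits_per_mframe == 4760
assert overhead_bits == 56
print(bits_per_subframe, bits_per_mframe, overhead_bits)
```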
To convert an AIS signal into an idle signal with an embedded message, the
inventors seize
upon the fact that the 3 C-bits in M-subframe 5 are typically used and set to
1 but are allowed by
ANSI T1.107 for use as a datalink. Thus, by changing at least one of those C-
bits in M-subframe 5,
the digital cross-connect switch, i.e. the node, can transmit an embedded
message within an otherwise
standard idle signal. For example, when an AIS signal is detected at the node,
due to a fault or
malfunction having occurred at a link adjacent to the node, according to the
present invention, the node converts the AIS signal to an idle signal by blocking the received AIS so that it does not pass through the node, and instead transmitting a DS3 idle signal as defined by ANSI T1.107-95 in its place.
At the same time, the node begins transmitting an embedded message by changing
the 3 C-bits in M-
subframe 5 of the idle signal. Thus, what is output from the node is an idle
signal with a changed C-
bit.
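By way of illustration only, the conversion step can be sketched as follows: on detecting AIS, the node suppresses the alarm and outputs an idle signal whose three C-bits in M-subframe 5 have been altered to carry the embedded message. The signal representation and bit pattern here are purely illustrative.

```python
# A minimal sketch of the conversion step; the dict representation of a signal
# and the particular C-bit pattern are illustrative assumptions only.

def convert_ais_to_idle(incoming):
    """incoming: dict with a 'type' key and an optional C-bit pattern."""
    if incoming["type"] != "AIS":
        return incoming                      # pass non-alarm signals through unchanged
    return {
        "type": "IDLE",
        # Changing at least one C-bit of M-subframe 5 flags the embedded message.
        "c_bits_subframe5": (0, 1, 1),       # illustrative pattern; normally (1, 1, 1)
    }

print(convert_ais_to_idle({"type": "AIS"}))
```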
To the nodes downstream of the custodial nodes, for example nodes 7302 and
7305, the
detected incoming idle signal, even though it contains all the conventional attributes of a standard idle signal, nonetheless has an embedded message or change to it due to the
change of the state of at
least one of the C-bits, so that those nodes are put on notice that that idle
signal contains a message
not found in a standard signal. And when this idle signal with the changed
C-bit is propagated to an
access/egress port such as node 7301, in sensing this unconventional idle
signal, node 7301 will
reconvert the idle signal back into an AIS signal for propagation to the nodes
outside the distributed
restoration domain. This is in contrast to the access/egress node having
received a conventional AIS
or idle signal, in which case the same AIS or idle signal is propagated to
nodes downstream thereof
outside the distributed restoration domain.
Thus, with the present invention, given that any node within the distributed
restoration
domain, in receipt of an AIS signal, would convert the same into an idle
signal with a changed C-
bit(s), there are at most only two nodes within the distributed restoration
domain that would detect an
AIS signal. These two nodes obviously must be adjacent nodes that are
connected by the
malfunctioned link from where the alarm signal originated. With that in
mind and with the
continuous monitoring of all of the nodes of the distributed restoration
domain by OSS 7310, the
isolation of a fault in the distributed restoration domain is easily
accomplished.
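A minimal sketch of the resulting fault isolation rule, offered for illustration only: once the conversion scheme is in place, the custodial nodes are simply the only two nodes in the domain still reporting a detected AIS. The report format is hypothetical.

```python
# Isolation rule: with AIS converted to idle-plus-C-bit everywhere else, only
# the two custodial nodes bracketing the fault still detect an AIS signal.

def locate_fault(node_reports):
    """node_reports: mapping of node ID -> detected signal ('AIS', 'IDLE+C', ...)."""
    custodial = [node for node, signal in node_reports.items() if signal == "AIS"]
    if len(custodial) == 2:
        return tuple(custodial)      # the malfunctioned link lies between these two
    return None

reports = {7301: "IDLE+C", 7302: "IDLE+C", 7303: "AIS",
           7304: "AIS", 7305: "IDLE+C", 7306: "IDLE+C"}
print(locate_fault(reports))   # (7303, 7304)
```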
If a fault occurs in the network outside of the distributed restoration
domain, the AIS signal
generated as a result of the fault would enter the distributed restoration
domain at any one of its
access/egress nodes. These nodes that interconnect the domain to the
outside network are provisioned
such that any incoming alarm signal is converted to a non-alarm signal to be
propagated to the other
nodes within the distributed restoration domain. As before, in the case of an
AIS signal, the
access/egress node would convert the AIS signal into an idle signal with an
embedded message of at
least one changed C-bit so that this converted idle signal is routed
throughout the distributed
restoration domain, until it reaches another access/egress node that
interconnects the distributed
restoration domain to the outside network at another end of the domain, at which time the second
access/egress node, upon sensing the idle plus changed C-bit signal, would
reconvert that signal back
into an AIS signal and propagate it to nodes downstream thereof outside of the
distributed restoration
domain. With this conversion and reconversion of an AIS signal into the
distributed restoration
domain, the management of the domain becomes aware that a fault has
occurred in the
telecommunications network, but outside of its domain.

FIGURE 56 illustrates another aspect of the present invention in which the
nodes of a
distributed restoration domain are interconnected by optical fibers. The same
discussion with respect
to the FIGURE 55 embodiment is equally applicable herein but for the fact that
a different type of
signal, namely a SONET STS-n type signal, is transmitted among the nodes of
the domain and the
telecommunications network to which the domain is interconnected. For this
type of STS-n signal, in
the event of a link failure, the custodial cross-connect switches on either
side of the malfunctioned
link would detect one of the following conditions: loss of signal (LOS), loss
of frame (LOF), an alarm
indication signal-line (AIS-L), a path loss of pointer (LOP-P) and a path AIS
(AIS-P). All of these
various defect signals are referenced in the ANSI T1.231 Standard.
As with the asynchronous scenario discussed above with reference to FIGURES 54
and 55,
the STS-path AIS signal will propagate throughout the network when a SONET
link fails, and the
same process of conversion and reconversion as discussed above takes place in
the FIGURE 56
embodiment. But instead of the format shown in FIGURE 55, a STS-n frame
format, such as the
STS-3 frame shown in FIGURE 57, is used. And instead of the C-bit, for the STS-
n type format, the
inventors found that the bits that can be manipulated are the Z5 bits in the
payload section of the STS-
n format, for example the STS-3 format of FIGURE 57. As was done with the DS3
format shown in
FIGURE 55, by changing the state of one of the Z5 bits, the custodial nodes
can convert the detected
AIS signal into an idle signal with a changed Z5 bit. When this changed idle
signal reaches the
access/egress node of the distributed restoration domain, it in turn is
reconverted back into an AIS
signal and propagated to nodes outside the domain.
FIGURE 58 is a logical diagram illustrating the various layers of processing
that take place in
a digital cross-connect switch. For example, an AIS signal is fed to the input
port, and specifically a
port card 7312a in layer 3, or level 3, that performs the basic lower-level
functions such as detecting
an incoming signal and whether that signal is an alarm or not. The signal is
then provided to a shelf
processor 7314a in layer 2 that performs other processing functions. In the
event that an idle signal
with a C-bit message, i.e. one of the C-bits in one of the subframes having
been changed, comes onto
port card 7312b, the contents of the C-bit message are dropped at the port card
level and the message is
sent to the shelf processor 7314b at level 2 and then routed to a DRA
processor 7316 so that a
decision can be made as to which port the message is to be output. The C-bit
message is also routed
to the administrative processor 7318 which, together with DRA processor 7316,
accesses the
appropriate database (not shown) to obtain information on the particular port
and the card in the
cross-connect switch to which the signal is to be ultimately routed, so that
the signal is routed to the
appropriate input port in the cross-connect switch downstream of the node. The
reverse operations
occur with an input of an AIS signal in the FIGURE 58 logical diagram.
An exemplar node of the instant invention is illustrated in FIGURE 59. As
shown, a number
of transceiver units 7320 are connected to a digital cross-connect switch 7322
by way of

demultiplexers 7324 and multiplexers 7326. The transceiver units 7320 each are
connected to an
alarm processor 7328, a restoration signal sensor 7330 and a restoration
signal generator 7332. The
alarm processor 7328, restoration signal sensor 7330 and restoration signal
generator 7332 are
connected to the processor of the node 7334. The operation of the cross-
connect switch 7322 is of
course controlled by node processor 7334.
Within each transceiver unit 7320 there is a signal transceiver detector 7336
which is in
communication with a frame receive unit 7338 and a frame transmitter unit
7340. Frame receive unit
7338 in turn is connected to a converter 7342, while frame transmitter unit 7340 is
connected to a converter
7344. Since each of the transceiver units 7320 contains the same components
and operates in the
same way, only the operation of one of the transceiver units 7320 is discussed
hereinbelow.
Upon receipt of a signal, signal transceiver detector 7336 determines whether the signal is an alarm signal or another type of signal. If the input signal is indeed an alarm signal, signal transceiver detector 7336 first signals alarm processor 7328 via interface 7336 and then routes the signal to frame
receive unit 7338. There the signal is parsed and forwarded to converter 7342
so that the alarm signal
is converted into the non-alarm signal, with the state of the appropriate C-bit(s) being changed, if it is
a DS3 system. A signal is also forwarded to the restoration signal sensor
7330, to be further
transmitted to the node processor 7334.
The non-alarm signal is then forwarded to demultiplexer 7324 and then to the digital cross-connect switch 7322. With the appropriate determination from node processor
7334, the appropriate
port in the node through which the non-alarm signal is to be output to the
downstream nodes is
selected so that the non-alarm signal is provided to multiplexer 7326 and then
to converter 7344, if
further conversion is required. The non-alarm signal is then provided to frame
transmit unit 7340 and
the signal transceiver detector 7336 and provided to the appropriate port for
output to downstream
nodes.
Note that the node shown in FIGURE 59 is also provisioned as an access/egress
node so that
in the event that a signal input thereto is outside the distributed
restoration domain, that signal, if
indeed it is an AIS signal, is converted into an idle signal with a changed C-
bit message and
propagated to the nodes downstream thereof within the distributed restoration
domain until it reaches
an access/egress node at the other end of the domain, at which time the idle
signal with the changed
C-bit message is reconverted back into an AIS signal and propagated to the
nodes outside the
distributed restoration domain. For an in depth discussion of a node and its
various units for receiving
and transmitting signals therefrom in a SONET environment, the reader is
directed to U.S. patent
5,495,471, the disclosure of which is incorporated by reference herein.
FIGURE 60 is a diagram illustrating the respective statuses of the various
signals both within
the distributed restoration domain of a DS3 system and outside of the domain.
Specifically, note that
outside of the distributed restoration domain, the types of ports that are
being used are both non-DRA
and that, insofar as the management of the distributed restoration domain is
concerned, no attention
needs to be paid to the input and output signals. See the first row of blocks
designated 7350. At the
DRA access port for an access/egress node, with an input signal being an AIS signal, the signal being output becomes an idle signal with a changed C-bit message. See row 7352a. If
the signal to the
DRA access input port is an idle signal, however, an idle signal is output
from the node. See row
7352b. For the same access node, there should not be any idle plus C-bit
message inputting from
outside of the distributed restoration domain. Accordingly, there is no output
signal from the access
node shown in row 7352c of FIGURE 60.
Focus now to the egress side of an access/egress node of the distributed
restoration domain, as
shown in rows 7354 of FIGURE 60. There, upon receipt of an AIS signal from within the domain, that AIS signal needs to be provided to the nodes outside of the
domain, as indicated at row
7354a. This of course means that the node within the distributed restoration
domain that is adjacent
to the access/egress node likewise would receive an AIS signal, and that the
fault should be isolated to
the link that connects the access/egress node to its adjacent node within the
domain. At row 7354b,
note that, similar to row 7354a, if an idle signal is provided to the access/egress node from within the
domain, an idle signal is output from the access/egress node to the nodes
outside the domain. At row
7354c, note that if an idle signal with a changed C-bit message is received at
the access/egress node
from within the restoration domain, the access/egress node will reconvert this
previously converted
alarm signal back into an AIS signal, and will propagate the AIS signal to
nodes downstream thereof
outside of the distributed restoration domain.
Row 7356 of FIGURE 60 illustrates the input and output signals of a node other
than an
access/egress node within the distributed restoration domain. As shown in row
7356a, if the DRA
provisioned node were to receive an AIS signal, then by way of the processing
discussed earlier, this
AIS signal is converted into an idle signal with a changed C-bit message. On
the other hand, if the
node were to receive an idle signal, then nothing is to be done, as the same
idle signal is output from
the node and propagated to nodes downstream thereof. See row 7356b. Similarly,
per row 7356c,
note that if an idle signal with a changed C-bit message is received by the
node, the same idle signal
with the changed C-bit message is sent to nodes downstream thereof. Thus, the
only time a node
within the distributed restoration domain performs a conversion process is
when it receives an alarm
signal, such as the AIS signal. Putting it differently, the only time that it
converts an AIS signal into
an idle signal is when a link connected thereto becomes defective. Thus, by
locating the adjacent pair
of nodes within the distributed restoration domain, each of which has detected an alarm signal, a fault within the domain is easily located.
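For illustration only, the per-port behaviour summarized in FIGURE 60 for a DS3 system can be expressed as a small lookup keyed by port role and input signal, where "IDLE+C" denotes an idle signal carrying the changed C-bit message; the role names are hypothetical.

```python
# Lookup of output signal by (port role, input signal), following rows 7352,
# 7354 and 7356 of FIGURE 60 as described above.

BEHAVIOUR = {
    # DRA access port (traffic entering the domain)
    ("access", "AIS"):      "IDLE+C",   # convert the alarm on entry
    ("access", "IDLE"):     "IDLE",
    # DRA egress port (traffic leaving the domain)
    ("egress", "AIS"):      "AIS",
    ("egress", "IDLE"):     "IDLE",
    ("egress", "IDLE+C"):   "AIS",      # reconvert before leaving the domain
    # any other DRA-provisioned node inside the domain
    ("internal", "AIS"):    "IDLE+C",   # conversion occurs only at the custodial nodes
    ("internal", "IDLE"):   "IDLE",
    ("internal", "IDLE+C"): "IDLE+C",
}

def output_signal(port_role, input_signal):
    return BEHAVIOUR.get((port_role, input_signal))

print(output_signal("internal", "AIS"))   # IDLE+C
print(output_signal("egress", "IDLE+C"))  # AIS
```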
Even though the discussion so far deals with the interconnection of nodes in a
distributed
restoration domain by means of links, it should be appreciated that the
instant invention is equally
applicable to a distributed restoration domain whose nodes are communicatively
interconnected

without links. For example, in the case of a restoration domain that operates
by using microwave
transmission, no physical links are used. Instead, the various nodes are
interconnected by microwave
transmission, which also has its own distinct format. The packet message for a
microwave
transmission also includes particular bits that could be altered using the
same principle of the instant
invention so that, if a given domain of a wireless network is provisioned for distributed restoration, a fault can be quickly located or isolated in the event that a malfunction has occurred: the same altering of some unused bits in the microwave message can change the status of the signal that is being transmitted without affecting the operation of the network.
In the case that in place of a link, it is a node that malfunctions, the
present invention is
equally applicable insofar as that malfunctioning node also generates an alarm
signal that is to be
received by nodes adjacent thereto so that those adjacent nodes would convert
the alarm signal into a
non-alarm signal with an embedded message which is then propagated to nodes
further downstream
thereof in the distributed restoration domain. Thus, the site at which the
fault occurred, be it a link or
a node, could be isolated nonetheless.
While preferred embodiments of the present invention have been disclosed
for purposes of
explanation, numerous changes, modifications, variations, substitutions, and
equivalents, in whole or
in part, should now be apparent to those skilled in the art to which the
invention pertains.
Accordingly, it is intended that the invention be limited only by the spirit
and scope of the hereto
appended claims.
Inasmuch as the present invention is subject to many variations, modifications
and changes in
detail, it is intended that all matter described throughout this specification
and shown in the
accompanying drawings be interpreted as illustrative only and not in a
limiting sense. Accordingly, it
is intended that the invention be limited only by the spirit and scope of the
appended claims.

Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refers to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Event History, Maintenance Fee and Payment History, should be consulted.

Event History

Description Date
Application Not Reinstated by Deadline 2004-01-05
Time Limit for Reversal Expired 2004-01-05
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2003-01-06
Inactive: Notice - National entry - No RFE 2002-02-21
Letter Sent 2002-02-21
Inactive: Cover page published 2002-02-18
Inactive: First IPC assigned 2002-02-13
Application Received - PCT 2002-01-31
Application Published (Open to Public Inspection) 2001-07-19

Abandonment History

Abandonment Date Reason Reinstatement Date
2003-01-06

Fee History

Fee Type Anniversary Year Due Date Paid Date
Registration of a document 2001-09-05
Basic national fee - standard 2001-09-05
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ALCATEL
Past Owners on Record
SIG H., JR. BADT
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD.



Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Representative drawing 2001-09-05 1 9
Representative drawing 2002-02-18 1 10
Description 2001-09-05 49 3,567
Abstract 2001-09-05 1 57
Drawings 2001-09-05 27 537
Claims 2001-09-05 4 174
Cover Page 2002-02-18 1 40
Notice of National Entry 2002-02-21 1 193
Courtesy - Certificate of registration (related document(s)) 2002-02-21 1 113
Reminder of maintenance fee due 2002-09-09 1 109
Courtesy - Abandonment Letter (Maintenance Fee) 2003-02-03 1 176
PCT 2001-09-05 2 58
PCT 2001-09-05 1 137