Note: Descriptions are presented in the official language in which they were submitted.
COMPUTER DATA STORAGE METHOD
BACKGROUND OF THE INVENTION
The disclosed invention is directed generally to
computer data storage, and is directed more particularly
to techniques for storing computer data on storage media
such as magnetic tape.
Magnetic tape systems have been utilized for many
years in the mainframe computer environment as "secondary"
data storage by which computer data on "primary" data
storage, such as magnetic disk systems, could be periodically
backed up or transported. More recently, magnetic
tape systems have become part of the microprocessor based
computer environment which includes microcomputers and
"personal computers," also for purposes such as data
transportation and back-up of data on primary data storage
such as fixed disk drives.
Like other computer input/output devices, a magnetic
tape system is connected to a computer by an appropriate
interface which is controlled by appropriate driver
software. The operation of such a tape system and inter-
face is known in the art as, for example, disclosed in
U.S. Patent No. 4,414,593, assigned to common assignee
herein and incorporated by reference.
Typically, the information or data actually stored
on tape includes the computer data being stored as well as
data management information that will allow the data
stored on tape to be restored on a primary computer
storage device. The data management information can
include, for example, computer operating system file and
subdirectory attributes, directory information, and error
checking information. Over the years, various techniques
have been developed to logically format the computer data
and data management information to be written to tape,
including, for example, the QIC (the acronym for "Quarter
Inch tape Committee") series of formatting standards.
Important considerations with known tape logical
formatting techniques include the difficulty in transport-
ing tapes between computers of different operating systems
such as IBM PC DOS, MSDOS, UNIX, OS/2, and Apple Computer
operating environments.
Other considerations with known tape logical format-
ting techniques include the inclusion of explicit direc-
tory tree information in the form of full path names in
the data written to tape.
Further considerations with known tape logical
formatting techniques include the procedure of writing
directory information upstream of the computer data. This
procedure can result in computer data being stored on a
tape that is different from the tape containing the
directory information, which makes selective restoration
cumbersome.
Another consideration with known tape logical
formatting techniques is the inability to retrieve stored
information that is downstream of an extensively corrupted
tape area.
SUMMARY OF THE INVENTION
It would therefore be an advantage to provide a data
storage logical formatting technique which allows data
storage and retrieval by different computer operating
systems.
It would also be an advantage to provide a data
storage logical formatting technique which stores directory
information implicitly to conserve storage.
It would also be an advantage to provide a data
storage logical formatting technique which physically
locates directory information adjacent the data referenced
thereby.
The foregoing and other advantages are provided by
the invention in a technique for logically formatting
selected computer files and subdirectories arranged in
directory tree structure for storage on a storage device
such as a magnetic tape system. The method includes the
steps of serially accessing the selected files and sub-
directories and providing for each selected computer file
a file information header having file information that is
common to a plurality of computer operating systems and a
file data header which includes computer file data. For
each selected subdirectory, a subdirectory information
header is provided which includes subdirectory data and
subdirectory information common to a plurality of computer
operating systems. A directory table is provided for
storing directory information of the file data headers and
the subdirectory headers.
In accordance with the invention, the directory
table is provided pursuant to the steps of (A) changing
the current directory to be the root directory, (B)
accessing a selected entry in the current directory, (C)
if the accessed entry is a file, providing a directory
table file information header for the file, (D) if the
accessed entry is a subdirectory, providing a directory
table subdirectory information header for the subdirectory
and changing the current directory to the
accessed subdirectory, (E) if all entries in the current
directory have been accessed, providing a subdirectory end
marker and assigning the next higher level directory to be
the current directory, and (F) repeating steps (B) through
(E) until all selected entries in the root directory have
been accessed.
A further aspect of the invention includes the
provision for each selected file of zero or more operating
system attributes headers that are specific to a particu-
lar operating system. Similarly, zero or more operating
system attributes headers can be provided for each selected
subdirectory.
Another aspect of the invention is directed to
storing the selected file and subdirectory headers com-
prising a file set in file set data areas and storing the
directory table headers for such file set in directory
table areas that are interleaved with the file set data
areas, with each directory table area having the directory
table headers for the file and subdirectory headers in the
adjacent file set data area.
Another aspect of the invention is directed to
formatting information on data storage media to permit
partial data recovery in the event of catastrophic data
corruption. The method for formatting includes the steps
of organizing the data storage media into segments of
predetermined storage capacity; dividing each segment into
(a) a segment data area for storing data, (b) a segment
overhead area for storing segment management information,
and (c) a segment error correction area for storing the
results of error correction code calculations on the data
and overhead areas; organizing data stored in data segment
areas into a sequence of headers, each header including an
identifying tag and data identified by such tag; and
providing in each segment overhead area a header link that
identifies the location of the first header in such
segment, if any.
BRIEF DESCRIPTION OF THE DRAWINGS
The advantages and features of the disclosed inven-
tion will readily be appreciated by persons skilled in the
art from the following detailed description when read in
conjunction with the drawing wherein:
FIG. 1 is a conceptual diagram depicting the major
computer system components with which the disclosed data
storage technique is utilized.
FIG. 2 is a schematic diagram illustrating the
logical organization of a Tape Header tape file in accor-
dance with the invention.
FIG. 3 is a schematic diagram illustrating the
logical organization of a Volume Table tape file in
accordance with the invention.
FIGS. 4A through 4D are schematic diagrams illus-
trating the logical organization of a first File Set tape
file in accordance with the invention.
FIG. 5 is a simplified diagram of a directory tree
structure useful in understanding the manner in which
directory information is organized in accordance with the
invention.
FIG. 6 is a schematic diagram illustrating the
logical organization of the directory table provided by
the invention for the directory tree of FIG. 5.
FIG. 7 is a schematic diagram illustrating the
logical organization of a second File Set tape file in
accordance with the invention.
FIG. 8 is a conceptual illustration depicting the
organization of tape storage into segments and the con-
tents of such segments.
FIGS. 9 through 11 are flow diagrams that set forth
illustrative examples of back-up and restore processes in
accordance with the invention.
DETAILED DESCRIPTION OF THE DISCLOSURE
In the following detailed description and in the
several figures of the drawing, like elements are iden-
tified with like reference numerals.
The following disclosure is directed to storing
computer data on a medium such as magnetic tape, and for
ease of understanding will be primarily in the context of
utilizing a tape system for backup and restoration pur-
poses for which tape systems are commonly utilized.
However, the backup and restoration context is intended to
be illustrative and not limiting, and it should be appre-
ciated that the disclosed techniques have applications
beyond tape backup and restoration, including, for ex-
ample, the transportation of computer data between dif-
ferent computers which might utilize different operating
systems. Also, the disclosed and claimed techniques can
be utilized with data storage devices such as disk cart-
ridge systems and optical read/write disk systems.
Further for ease of understanding, the computer data
files shall generally be called disk files to clearly
distinguish them from tape file organization of the data
written to tape, although such computer data files might
not physically originate from disk storage devices.
Referring now to FIG. 1, shown therein is the
overall organization of the data storage logical format-
ting provided by the invention and the sources of the
information included in the formatted data. The informa-
tion to be stored includes selected computer disk files
111 which can include, for example, spreadsheet data
files, database files, text files or program files. The
information to be stored further includes system informa-
tion 113 as to the particular operating system or operat-
ing environment in which the data storage takes place, as
well as information about the files being saved.
Generally, the selected disk files 111 are serially
processed for storage. For example, a selected file is
accessed and its attributes are determined. The file
attributes are utilized to generate file management
information which together with the data contents of the
selected disk file are logically formatted, physically
formatted and then written to magnetic tape 117 via a tape
interface 119. The logically formatted data includes
appropriate file management data so that the data written
to tape can be later accessed and restored as disk files.
It should be appreciated that although the system informa-
tion 113 is depicted as a discrete bundle of information
for conceptual purposes, much of the system information
113 is included in the implementation of the tape interface
119 and in the application program, as is well known
in the art. It should also be appreciated that implemen-
tation of the tape interface involves applications soft-
ware for controlling the tape interface and also for
interfacing with the user, as is well known in the art.
Restoration is achieved by utilizing the tape
interface 119 to read the magnetic tape and create the disk
files in accordance with the file management information
that was written to tape together with the disk file data.
The particular nature of the file management information
will be discussed in further detail with the particulars
of the logical formatting techniques of the invention.
Referring generally to FIGS. 2, 3, 4A through 4D,
and 7, the information written to tape is organized into
three (3) different types of tape files.
A Tape Header tape file as depicted in FIG. 2
includes general information about the tape cartridge or
reel and the location of the latest version of the Volume
Table for the tape or series of related tapes.
A Volume Table tape file as depicted in FIG. 3
includes information about all File Set tape files on a
tape or series of related tapes.
A File Set tape file includes either disk file/subdirectory
data (FIG. 4) or disk image data (FIG. 7) for a
logical group of disk files and subdirectories that are
being copied to tape, for example, the contents of a fixed
disk drive. As is well known in the art, storing file/subdirectory
data is a file-by-file operation which
utilizes the file directory structure, as distinguished
from image data storage which disregards file and file
directory structures.
The information in a tape file generally includes a
sequence of headers and markers. A header includes a
fixed length tag (e.g., 4 bytes) that defines the nature
and length of the associated header data that follows.
That is, a header includes a header tag and the header
data identified by the header tag. A marker is a fixed
length tag (e.g., 2 bytes) that is used to insert nulls
into the written data stream or to delimit organizational
units in the written data stream. Particular illustrative
examples of headers and markers will be discussed further
herein after first discussing the logical organization of
the three types of tape files.
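By way of a minimal sketch only, the header and marker stream described above can be modeled in Python as follows. The sketch assumes a hypothetical tag layout that the disclosure does not fix at this point: a 4-byte header tag made of a 2-byte type code and a 2-byte data length, a 2-byte marker tag, little-endian byte order, and a high bit that distinguishes markers from headers. All type codes and function names are illustrative assumptions.

```python
import struct

# Hypothetical type codes; the disclosure assigns no numeric values here.
MARKER_BIT = 0x8000      # assumption: the high bit flags a marker tag
FILE_INFO = 0x0001       # example header type
NULL_MARKER = 0x8000     # example padding marker
TAPE_FILE_END = 0x8001   # example delimiting marker


def pack_header(type_code, data):
    """Build a header: 4-byte tag (type, data length) followed by the data."""
    return struct.pack("<HH", type_code, len(data)) + data


def pack_marker(type_code):
    """Build a marker: a bare 2-byte tag with no associated data."""
    return struct.pack("<H", type_code)


def parse_stream(buf):
    """Yield ('header', code, data) or ('marker', code, b'') in stream order."""
    pos = 0
    while pos < len(buf):
        (code,) = struct.unpack_from("<H", buf, pos)
        if code & MARKER_BIT:
            yield ("marker", code, b"")
            pos += 2
        else:
            (length,) = struct.unpack_from("<H", buf, pos + 2)
            yield ("header", code, buf[pos + 4:pos + 4 + length])
            pos += 4 + length
```

Because each header tag carries the length of its own data, a reader can step from one header to the next without interpreting header contents it does not recognize.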
The Tape Header Tape File
Referring in particular to FIG. 2, schematically
depicted therein is the logical organization of a Tape
Header tape file, with the information being sequential
from left to right. A Tape Header tape file begins with a
Tape Header Signature header 121 that includes data
uniquely identifying the tape file type, for example an
ASCII string. Next in sequence is a Tape Format Informa-
tion header 123 which defines the contents of the tape and
the location of the latest version of the Volume Table
tape file. A Tape File End marker 125 follows to indicate
the end of the Tape Header tape file, which in turn can be
followed by one or more null markers 127 that pad buffer
space to a convenient boundary.
The Volume Table Tape File
Referring now to FIG. 3, schematically depicted
therein is the logical organization of a Volume Table tape
file. A Volume Table Signature Header 129 indicates the
beginning of a Volume Table tape file, and includes data
that uniquely identifies the tape file type, such as an
ASCII string. The data stream that follows includes for
each File Set a File Set Information header 131 that
includes information about the File Set including, for
example, operating system information, file count, subdi-
rectory count, and so forth. The end of the Volume Table
tape file is identified by a Tape File End marker 125,
which can be followed by one or more null markers 127 to
pad buffer space to a convenient boundary.
The File Set Tape File For File/Subdirectory Data
Referring now to FIG. 4A, schematically depicted
therein is the logical organization of the three (3) types
of areas of a File Set tape file for file/subdirectory
data: a File Set Information area 133, a File Data area
135, and a Directory Table area 137. Depending on factors
such as the particular tape device, file count, and
subdirectory count, there can be multiple occurrences of
File Data areas 135 and Directory Table areas 137 in a
File Set tape file. Each Directory Table area 137 iden-
tifies the contents of the immediately preceding File Data
area 135.
Briefly, a File Set Information area 133 includes a
File Set signature header and a duplicate of the associ-
ated File Set Information header included in the Volume
Table tape file (FIG. 3). A File Data area 135 includes
file/subdirectory names, attributes, and file data (i.e.,
file contents) from the backup operation. A Directory
Table area 137 includes only file/subdirectory names and
possibly attributes from the backup operation. The data
in the Directory Table area 137 is utilized for direct
access to files and subdirectories in the File Data area
135.
The File Set Tape File Information Area
Referring to FIG. 4B, a File Set Information area
more particularly includes a File Set signature header 139
which indicates the beginning of a File Set tape file, and
includes a unique identifier that identifies the tape file
type. Next in sequence is a File Set Information header
131 which is a duplicate of the corresponding header in
the Volume Table tape file (FIG. 3).
The File Set Tape File Data Area
Referring to FIG. 4C, a File Data area begins with a
File Data Area Start marker 143 which is followed by
file/subdirectory data. The particular structure of the
file/subdirectory data in a File Data area depends on
whether the data being stored is disk file data or subdi-
rectory data.
File Headers
For each disk file, the following headers are
written in a File Data area.
A File Information header 145 includes information
about files that is common to all computer operating
systems. In particular, the File Information header 145
includes sufficient information to allow the transpor-
tation of files among different operating systems at least
to the level of providing the capability of reading the
contents of the file but not necessarily being able to
restore the file with attributes that are specific to
particular computer operating systems.
Depending on the operating system environment in
which backup takes place, zero or more Operating System
Attributes headers 147 follow the File Information header
145. The Operating System Attributes headers 147 include
operating system specific information about files, sub-
directories, and objects.
Next in sequence are zero or more Data Fork headers
149, followed by zero or more Resource Fork headers 151.
For most operating systems, the Data Fork header 149
includes the file data from the backup operation. The
Resource Fork header 151 is provided for those operating
systems that organize file data into two types, such as
the Apple Computer Macintosh operating environment. If
there is no Resource Fork header data, the data length in
the Resource Fork header tag is set to zero. By way of
example, the length of the data in each of the Data Fork
headers 149 and the Resource Fork headers 151 can be
limited to a predetermined maximum, and a given file can
have multiple Data Fork headers 149 and Resource Fork
headers 151.
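The division of file contents among multiple Data Fork headers can be sketched as below. This is an illustration under stated assumptions rather than the disclosed implementation: the function name and the particular maximum length are hypothetical, and the sketch treats an empty file as producing zero Data Fork headers, which is one reading of "zero or more".

```python
def split_into_forks(data, max_len):
    """Return the data chunks, one per Data Fork header, each at most max_len bytes.

    max_len stands in for the predetermined maximum header data length;
    its actual value is not given in the text.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]
```

On restoration, concatenating the chunks in the order they were written yields the original file contents.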
While only Data Fork and Resource Fork headers have
been specified, it is contemplated that other fork headers
can be utilized if appropriate for a particular computer
operating system.
Subdirectory Headers
Further as to the File Set Data area, the following
headers are provided for each subdirectory.
A Subdirectory Information header 153 includes
information about subdirectories that is common to all
computer operating systems. In particular, the Subdirec-
tory Information header 153 includes sufficient informa-
tion to allow the transportation of subdirectories among
different operating systems at least to the level of
providing the capability of restoring the overall subdi-
rectory structure but not necessarily with attributes that
are specific to particular operating systems.
Depending on the operating system environment in
which backup takes place, zero or more Operating System
Attributes headers 147 follow the Subdirectory Information
header 153. The Operating System Attributes headers 147
include operating system specific information about files,
subdirectories, and objects.
It should be noted that although Operating System
Attributes headers 147 can be utilized with disk file data
and subdirectory data, they can include different informa-
tion including, for example, user access limitations at
the subdirectory level in network operating environments.
A Subdirectory End marker 157 is written to tape
after all selected files and subdirectories in a given
directory have been processed and stored. As discussed
more fully relative to the Directory Table area 137 of
the File Set tape file, the use of Subdirectory End
markers 157 together with directory processing procedures
produce implicit directory tree information without using
full path names.
File Set Tape File Directory Table Area
Referring now to FIG. 4D, a Directory Table area in
a File Set tape file begins with a Directory Table Area
Start marker 159. The data that follows includes at least
one header for each disk file or subdirectory stored in
the associated preceding File Data area.
For each disk file, the headers in a Directory Table
area are the File Information header 145 and any Operating
System Attributes headers 147 as utilized in the File Data
area of the tape file for the file. For each subdirecto-
ry, the headers in a Directory Table area are the Subdi-
rectory Information header 153 and any Operating System
Attributes headers 147 as utilized in the File Data area
of the tape file for the subdirectory. A Subdirectory End
marker 157 is entered into the Directory Table area after
all the selected files and subdirectories in a directory
have been processed and stored.
In the backup process, the headers for a Directory
Table area are buffered in memory until all files and
subdirectories selected for the File Set have been backed
up, or until it reaches a predetermined size, at which
time a Directory Table area is written to tape. If more
files and/or subdirectories for the File Set are to be
backed up after a Directory Table area is written to tape,
a new File Data area is started, and memory is utilized
for buffering another Directory Table area which is
associated with the new File Data area. Thus, a File Set
tape file can include interleaved File Data and Directory
Table areas, with each Directory Table area referring to
the immediately preceding File Data area. The addresses
of all the Directory Table Start Markers are written in
the File Set Information header, which allows a complete
directory to be assembled and produced.
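The buffering and interleaving just described can be sketched with a toy model in which the tape is a Python list of records. The class name, the record strings, and the use of an entry count as the "predetermined size" are assumptions made for illustration; a real implementation would write headers and markers to the tape device.

```python
class FileSetWriter:
    """Toy model of one File Set tape file; the tape is a list of records."""

    def __init__(self, max_table_entries):
        self.max_table_entries = max_table_entries  # hypothetical buffer limit
        self.tape = []             # records in the order written
        self.table_buffer = []     # Directory Table entries held in memory
        self.table_addresses = []  # positions of Directory Table Area Start markers
        self.data_area_open = False

    def add(self, name):
        """Write one entry to the File Data area and buffer its table entry."""
        if not self.data_area_open:
            self.tape.append("FILE_DATA_START")
            self.data_area_open = True
        self.tape.append("data:" + name)
        self.table_buffer.append(name)
        if len(self.table_buffer) >= self.max_table_entries:
            self.flush_table()

    def flush_table(self):
        """Write the buffered Directory Table area after its File Data area."""
        if not self.table_buffer:
            return
        self.table_addresses.append(len(self.tape))
        self.tape.append("DIR_TABLE_START")
        self.tape.extend("entry:" + n for n in self.table_buffer)
        self.table_buffer = []
        self.data_area_open = False
```

Calling `flush_table()` once more after the last entry models the final Directory Table area, and `table_addresses` corresponds to the start-marker addresses that are recorded in the File Set Information header.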
Directory Table Area Organization
Pursuant to a predetermined backup sequence in
accordance with the invention, the Directory Table area is
organized in a manner that provides directory tree infor-
mation without storing full path names. In particular,
the backup sequence involves serially processing the
selected entries of the current directory, where the
current directory is changed when a selected subdirectory
entry is processed or when all selected entries in the
current subdirectory have been processed. In particular,
if the entry being processed in the current directory is a
file, the appropriate headers are generated and stored.
If the entry being processed is a subdirectory, the
appropriate headers are generated and stored, and that
subdirectory becomes the current directory in which
processing continues. When all selected entries in a
subdirectory have been processed, a Subdirectory End
marker is written to tape, and the current directory is
changed to the parent, if any, of the subdirectory that
was the current directory. Thus, if all selected entries
in a current directory have been processed and that
directory is the root directory, processing of all select-
ed files and subdirectories is complete.
The foregoing predetermined backup sequence can be
more readily understood and appreciated with a simplified
example that will be discussed relative to FIGS. 5 and 6.
Referring in particular to FIG. 5, the root directory has
two files, FILE1 and FILE2, and two subdirectories, SUB1
and SUB2.
The subdirectory SUB1 includes the files FILE11 and
FILE12 and the subdirectories SUB11 and SUB12. The
subdirectory SUB11 contains the file FILE111, while the
subdirectory SUB12 includes the file FILE121.
The subdirectory SUB2 includes the files FILE21 and
FILE22, and the subdirectories SUB21 and SUB22. The
subdirectory SUB21 includes the file FILE211 while the
subdirectory SUB22 includes the file FILE221.
In the backup operation, the root directory is
selected as the current directory at the start of processing.
The first entry is FILE1, and appropriate headers
are generated and written to tape for that file. Next,
appropriate headers are generated and written to tape for
FILE2. The next entry is the subdirectory SUB1, and
appropriate headers are generated and written to tape for
the subdirectory entry SUB1. The current directory for
processing is then changed to SUB1. Appropriate headers
are generated and written to tape for the files FILE11 and
FILE12. Appropriate headers are generated and written to
tape for the subdirectory entry SUB11, and the current
directory for processing is changed to be SUB11. Headers
are then generated for the file FILE111. Since no further
entries in the subdirectory SUB11 are to be processed, a
Subdirectory End marker is written to tape, and the
current directory is changed to SUB1. The next entry in
that subdirectory to be processed is SUB12, and appropriate
headers are generated and written to tape for that
entry. The current directory is then changed to SUB12,
and appropriate headers are generated and written to tape
for the file entry FILE121. Since no further entries in
the subdirectory SUB12 are to be processed, a Subdirectory
End marker is written to tape, and the current directory
is changed to SUB1. Since no further entries in the sub-
directory SUB1 are to be processed, a Subdirectory End
marker is written to tape and the current directory is
changed to the root directory.
The next entry in the root directory to be processed
is the subdirectory SUB2, for which appropriate headers
are generated and written to tape. The current directory
is changed to the subdirectory SUB2, and processing then
continues in the same manner as described above for the
subdirectory SUB1 and its entries.
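The backup sequence walked through above can be sketched as a short recursive routine. The nested-list representation of the FIG. 5 tree and the record names `FILE`, `SUBDIR`, and `END` are assumptions for illustration, with file names following the walkthrough. The sketch also assumes that no Subdirectory End marker is emitted for the root directory, since the text states only that processing is complete at that point.

```python
def backup_sequence(entries):
    """Flatten a directory into the header and marker order described above.

    Each entry is (name, children); children is None for a file and a list
    of entries for a subdirectory.
    """
    out = []
    for name, children in entries:
        if children is None:
            out.append(("FILE", name))
        else:
            # A subdirectory becomes the current directory immediately.
            out.append(("SUBDIR", name))
            out.extend(backup_sequence(children))
            out.append(("END", None))  # Subdirectory End marker
    return out


# The directory tree of FIG. 5, files grouped ahead of subdirectories.
FIG5_ROOT = [
    ("FILE1", None),
    ("FILE2", None),
    ("SUB1", [("FILE11", None), ("FILE12", None),
              ("SUB11", [("FILE111", None)]),
              ("SUB12", [("FILE121", None)])]),
    ("SUB2", [("FILE21", None), ("FILE22", None),
              ("SUB21", [("FILE211", None)]),
              ("SUB22", [("FILE221", None)])]),
]
```

For this tree the sequence holds ten file records, six subdirectory records, and six Subdirectory End markers, and no record carries a path name.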
FIG. 6 conceptually depicts the file and subdirectory
information that would be included in the Directory
Table headers for the simplified example of FIG. 5, and
includes only file and subdirectory names without path
information. In particular, the subdirectory headers, the
Subdirectory End markers, and position in the sequence
unambiguously define position in the directory tree. By
following the Directory Table area data in the sequence it
was written and applying the rules about the current
directory and the Subdirectory End markers, the directory
tree of FIG. 5 will be reconstructed.
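The reconstruction rule can be sketched as follows: a stack holds the current directory, a subdirectory record pushes a name, and a Subdirectory End marker pops one. The record names and the use of "/" as a path separator are assumptions for illustration, and the sample sequence is an excerpt modeled on the FIG. 5 example rather than the complete table.

```python
def rebuild_paths(sequence):
    """Recover full path names from bare names plus Subdirectory End markers."""
    stack, paths = [], []
    for kind, name in sequence:
        if kind == "FILE":
            paths.append("/".join(stack + [name]))
        elif kind == "SUBDIR":
            stack.append(name)                 # descend into the subdirectory
            paths.append("/".join(stack) + "/")
        elif kind == "END":
            stack.pop()                        # return to the parent directory
    return paths


SAMPLE = [
    ("FILE", "FILE1"),
    ("SUBDIR", "SUB1"), ("FILE", "FILE11"),
    ("SUBDIR", "SUB11"), ("FILE", "FILE111"), ("END", None),
    ("SUBDIR", "SUB12"), ("FILE", "FILE121"), ("END", None),
    ("END", None),
    ("SUBDIR", "SUB2"), ("FILE", "FILE21"), ("END", None),
]
```

The same walk supports selective restoration, since the full path of any entry is known as soon as its record is read.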
It is pointed out that for ease of selectively
restoring selected files or subdirectories, the files in a
given directory should be grouped together so that they
can be written to tape sequentially. With such grouping
of files, the subdirectories in a given directory will
necessarily also be grouped together.
The File Set tape file structure for image data is
schematically depicted in FIG. 7. As with the file/sub-
directory File Set, the image File Set tape file includes
a Signature header 167 and a File Set Information header
169. Next is an Image Information header 171 that includes
the image data. The Image Information header 171
identifies the image data and is operating system specific.
In other words, it is not intended that image data be
transported to operating systems different from the one
utilized for writing the tape. The end of an image File
Set tape file is indicated by a Tape File End Marker 125,
which can be followed by one or more padding null markers
127.
Physical Tape Organization
The physical sequence of tape files on a tape is
dependent on the type of tape device utilized. With tape
devices that are capable of rewriting tape blocks (a pre-
determined number of bytes handled as a logical unit), one
copy of the Tape Header and Volume Table tape files per
tape is sufficient, although copies may be included for
reliability. With tape devices that are not capable of
rewriting blocks, the latest version of the Tape Header
and Volume Table tape files must be written after the
latest File Set tape file written to tape.
A further aspect of the disclosed data storage
method is directed to a logical organization of the memory
on a tape which allows recovery of stored data downstream
of a region of catastrophic corruption. In particular, the
memory on tape is organized into a sequence of tape
segments, one such segment being schematically depicted in
FIG. 8 with the storage bytes in sequence left to right
and top to bottom. Each segment comprises a group of tape
blocks (not shown) totalling a predetermined amount of
memory such as 32K bytes. Each segment includes three
areas: a data area, an overhead area, and an error correction
code (ECC) area. A segment can also contain defective tape
blocks (not shown, since such blocks can be at different
locations within the segment areas).
The headers and markers described above relative to
FIGS. 2-7 are stored sequentially in the segment data
areas, where a header can span one or more data areas of
contiguous tape segments. The segment overhead area
includes segment control and management information, and
includes a Header Link field which contains the location
of the first header within the segment, as schematically
depicted in FIG. 8. This information allows the restore
process to continue after encountering a non-recoverable
segment.
The segment ECC area includes the results of the ECC
calculations on the segment data and overhead areas.
Known error correction codes that can be utilized include
those generally referred to as Reed-Solomon codes.
A Particular Tape Specification
By way of illustrative example, the following sets
forth a tape specification generally referred to herein as
a Universal Tape Format or UTF, in accordance with the
foregoing logical tape formatting technique. Field and
data lengths are in bytes where fixed, and data lengths
that are not fixed are indicated by a question mark ("?").
Values to be determined in the backup process are also
indicated by question marks.
1.0 Tape Segment
A tape segment is a group of tape blocks totaling
32k bytes that are operated on as a unit. Each segment
comprises a data area, an overhead area, an ECC area, and
defective tape blocks. The following is an outline of the tape segment.
Length   Value   Description
?        ?       Segment Data Area
?        ?       Segment Overhead Area
?        ?       Segment ECC Area
?        ?       Segment Defective Tape Blocks
1.1 Segment Data Area
The segment data area contains the Universal Tape Format
data. The size of the data area is dependent on the size
of the overhead, the ECC area and the number of the
defective tape blocks. The size can range from 512 bytes
to 32k.
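The dependence of the data area size on the other segment areas can be sketched as simple arithmetic. The 512-byte tape block size is an assumption suggested by the stated minimum, and the function and parameter names are hypothetical; the overhead and ECC sizes are configuration values that this specification leaves open.

```python
SEGMENT_SIZE = 32 * 1024  # 32k bytes per segment, as stated above


def data_area_size(overhead, ecc, defective_blocks,
                   block_size=512, segment_size=SEGMENT_SIZE):
    """Bytes left for the segment data area after the other areas are deducted.

    block_size is an assumed tape block size, not a value given in the text.
    """
    size = segment_size - overhead - ecc - defective_blocks * block_size
    if size < 0:
        raise ValueError("segment areas exceed the segment size")
    return size
```

For example, with a 2-byte overhead area, a 3072-byte ECC area, and one defective block, the data area would hold 29182 bytes under these assumed values.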
1.2 Segment Overhead Area
The Segment Overhead Area contains information that is
used for managing segments and includes a Header Link
field defined below. The size of the Segment Overhead
Area can be configured for each file set (see File Set
Information Header further below).
1.2.1 Header Link
The Header Link is a two byte field that contains the
offset, from the beginning of the segment, of the first
Universal Tape Format Header within the segment. Its
value will be -1 if there is no header within the segment.
The Header Link allows the restore process to continue
restoring after encountering an unrecoverable segment. A
File Set can be configured to not use Header Links in
order to allow the segment data area to be equal to a
multiple of the disk sector size during image backup (see
File Set Information Header).
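The use of the Header Link during restoration can be sketched as below. The sketch assumes that the two-byte field is a signed little-endian integer located at the start of the Segment Overhead Area, and it models each segment as a dictionary with hypothetical `readable` and `overhead` keys; none of these details is fixed by the text above.

```python
import struct

NO_HEADER = -1  # Header Link value when no header begins in the segment


def read_header_link(overhead):
    """Decode the two-byte Header Link; position and byte order are assumed."""
    (link,) = struct.unpack_from("<h", overhead, 0)
    return link


def find_resume_point(segments, bad_index):
    """Return (segment_index, offset) of the first header after a bad segment.

    Returns None if no later readable segment contains the start of a header.
    """
    for i in range(bad_index + 1, len(segments)):
        seg = segments[i]
        if not seg["readable"]:
            continue  # skip further unrecoverable segments
        link = read_header_link(seg["overhead"])
        if link != NO_HEADER:
            return i, link
    return None
```

A segment whose Header Link is -1 holds only the continuation of a header that began earlier, so the search moves on to the next segment.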
1.3 Segment ECC Area
The segment ECC area contains the results of the ECC
calculations on the segment data/overhead area. The size
of the ECC area can be 0k for tape devices with internal
ECC.
1.4 Segment Defective Tape Blocks
Some devices can have segments that contain defective tape
blocks. The effective size of the segment is reduced by
the size of all defective tape blocks within the segment.
The segment is represented as a smaller segment even
though the defects can be physically anywhere within the
segment.
Other devices can have segments that are shorter than 32k.
This can happen when a partially filled segment at the end
of a Volume Table or File Set is truncated in order to
conserve space on the tape.
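The relationship between the segment areas can be sketched as simple arithmetic. The tape block size and the ECC area size are device and file set dependent and are not fixed by this section, so they are passed in as parameters; the function name data_area_size is hypothetical.

```python
SEGMENT_SIZE = 32 * 1024  # nominal tape segment size in bytes


def data_area_size(overhead_size: int, ecc_size: int,
                   defective_blocks: int, block_size: int) -> int:
    """Bytes left for Universal Tape Format data in one segment."""
    # Defective blocks shrink the effective segment, wherever they lie.
    effective = SEGMENT_SIZE - defective_blocks * block_size
    size = effective - overhead_size - ecc_size
    if size < 0:
        raise ValueError("overhead, ECC and defects exceed the segment size")
    return size
```

For example, with a 2 byte overhead area, an assumed 3072 byte ECC area, and one defective block of an assumed 1024 bytes, 28670 bytes remain for the data area.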
2.0 Tape Files
The Universal Tape Format includes three different tape
files that are written to tape. The Tape Header tape file
contains general information about the tape cartridge and
the location of the latest version of the Volume Table.
The Volume Table tape file contains information about all
File Sets on a tape cartridge (or sequence of cartridges).
The File Set tape file contains either disk file/sub-
directory data or disk image data.
Each tape file contains a Universal Tape Format data
stream which consists of a sequence of markers, headers,
and header data. These markers, headers, and header data
will be defined in detail later in this specification.
2.1 Tape Layout
The sequence of tape files (Tape Header, Volume Table,
File Set) on a tape is dependent on the type of tape
device that is supported. Tape devices that can rewrite
blocks on the tape need to keep only one copy of the Tape
Header and Volume Table per tape cartridge. Tape devices
that cannot rewrite blocks on the tape must append the
latest copy of the Tape Header and Volume Table following
the last File Set.
2.1.1 Tape Layout For Devices That Support Block Rewrite
Only one copy each of the Tape Header and the Volume Table
needs to be written to the tape. These can be written at
the beginning of the tape for easy access. Also, in
addition to the Universal Tape Format tape files, some
devices may require a Media Header that contains tape
capacity, utilization, and defect mapping information.
This header is not part of the Universal Tape Format data
stream and is not defined here. The following shows the
sequence of tape files for tape devices that support
rewriting of blocks.
Physical tape:
Media Header (capacity, utilization, defect map info)
Universal tape format:
Tape Header (1 segment)
Volume Table (1 or more segments)
File Set (1 or more segments)
File Set (1 or more segments)
etc.
2.1.2 Tape Layout For Devices That Do Not Support Block
Rewrite
One copy each of the Tape Header and the Volume Table
needs to be written to the tape following every File Set.
A Tape Header will always appear at the end of recorded
data. The following shows the sequence of tape files for
tape devices that do not support block rewrite.
Physical tape:
Universal tape format:
Tape Header (1k data + 1k ECC)
Filemark
File Set (1 or more segments)
Filemark
Volume Table (1 or more segments)
Filemark
Tape Header (1k data + 1k ECC)
Filemark
etc.
2.2 Tape Header Tape File
The Tape Header tape file contains general information
about the tape cartridge and the location of the latest
version of the Volume Table. The following shows the
sequence of markers and headers that are in the Tape
Header tape file.
Tape Header Contents:
Tape Header Signature "UTF"
Universal Tape Format Information Header
Tape File End Marker
Null Markers
2.3 Volume Table Tape File
The Volume Table tape file contains information about all
File Sets on a tape cartridge (or sequence of cartridges).
There is one File Set Information header for each file
set. The following shows the sequence of markers and
headers that are in the Volume Table tape file.
Volume Table Contents:
Volume Table Signature "VTBL"
File Set Information Header
File Set Information Header
etc.
Tape File End Marker
Null Markers
2.4 File Set Tape File
The File Set tape file contains either disk file/sub-
directory data or disk image data.
2.4.1 File/Subdirectory File Set Tape File
The File/Subdirectory File Set contains the disk file data
that is generated by a backup operation. The following
shows the sequence of markers and headers that are in the
File/Subdirectory File Set.
File/Subdirectory File Set Contents:
File Set Information Area:
File Set Signature "FSET"
File Set Information Header
File Data Area:
(file/subdirectory names, attributes, and data)
File Data Area Start Marker
(next 4 headers repeated for each file):
File Information Header
Operating System Attributes Header
Data Fork Header
Resource Fork Header
(next 2 headers repeated for each
subdirectory):
Subdirectory Information Header
Operating System Attributes Header
(all files and subdirectories in this
subdirectory)
Subdirectory End Marker
Directory Table Area:
(file/subdirectory names, attributes)
Directory Table Area Start Marker
(next 2 headers repeated for each file):
File Information Header
Operating System Attributes Header
(next 2 headers repeated for each
subdirectory):
Subdirectory Information Header
Operating System Attributes Header
(all files and subdirectories in this
subdirectory)
Subdirectory End Marker
etc.
Tape File End Marker
Null Markers
File Set Information Area - This area contains a duplicate
of the File Set Information header that is in the Volume
Table. Some of the fields in the header are left null
because the information is not available until the end of
the file set.
File Data Area - This area contains the file/subdirectory
names, attributes, and data from the backup operation.
There can be multiple occurrences of file data areas.
Directory Table Area - This area contains only file/-
subdirectory names and possibly attributes from the backup
operation. The purpose of the table is to provide direct
access capability to files and subdirectories in the file
data area. There can be multiple occurrences of directory
table areas and each directory table area refers to a
companion file data area that immediately precedes the
table.
2.4.2 Image File Set Tape File
The Image File Set contains the disk image data that is
generated by a backup operation. The following shows the
sequence of markers and headers that are in the Image File
Set.
Image File Set Contents:
File Set Information Area:
File Set Signature "FSET"
File Set Information Header
Image Data Area:
Image Information Header
(Image Data)
File Set Information Area - This area contains a duplicate
of the File Set Information header that is in the Volume
Table. Some of the fields in the header are left null
because the information is not available until the end of
the file set.
Image Data Area - This area contains the disk image data.
Only the Image Information header is used in this area.
The rest of the data is operating system specific.
3.0 Universal Tape Format Markers And Headers
The Universal Tape Format defines structures called
markers and headers that are used to organize subdirectory
and file information and data onto a tape. The Universal
Tape Format is designed to allow the transportation of
files among different operating systems, and to allow
enhancements for new or changing operating systems.
The following table summarizes all the Universal Tape
Format markers and headers.
Tag Subtag Description
0 0 Null Marker
0 1 File Data Area Start Marker
0 2 Directory Table Area Start Marker
0 3 Subdirectory End Marker
0 4 Compression/Encryption Algorithm Start Marker
0 5 Tape File End Marker
1 0 Tape Header Signature "UTF "
1 1 Volume Table Signature "VTBL"
1 2 File Set Signature "FSET"
2 * Universal Tape Format Information Header
3 O/S File Set Information Header
4 O/S Image Information Header
5 O/S File Information Header
6 O/S Subdirectory Information Header
7 O/S System Object Information Header
8 O/S Operating System Attributes Header
9 0 Data Fork Header
9 1 Resource Fork Header
10 * Set Path Header
11 * Compression/Encryption Header
3.1 Data Formats
3.1.1 Marker and Header Alignment
All marker and header structures are aligned to start on
an even byte boundary relative to the start of a tape
segment. If a header structure ends on an odd byte
boundary, then a null byte will be appended to it to align
the next marker or header on an even byte boundary.
This appended null byte is not considered part of the
header length.
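The alignment rule can be illustrated with a small calculation. The sketch assumes the header starts on an even boundary, as required above, and uses the 4 byte header tag described under Headers; the function name bytes_on_tape is hypothetical.

```python
HEADER_TAG_SIZE = 4  # Tag (1) + Subtag (1) + Header Data Length (2)


def bytes_on_tape(header_data_length: int) -> int:
    """Space a header occupies, including the optional null alignment byte.

    The alignment byte is not counted in the Header Data Length field.
    """
    total = HEADER_TAG_SIZE + header_data_length
    return total + (total & 1)  # add one null byte if the header ends on an odd byte
```

Because the tag is 4 bytes long and the header starts on an even boundary, a null byte is appended exactly when the Header Data Length is odd.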
3.1.2 Byte Ordering
All integer and long fields are formatted in a low to high
byte order (Intel format) unless specifically declared
otherwise. The Image Information and Operating System
Attributes headers can have other byte orderings defined.
3.1.3 Date/Time Field
All Date/Time fields are formatted as shown in the
following table unless specifically declared otherwise.
The Image Information and Operating System Attributes
headers can have other date/time formats defined.
Length Value Description
1 ? Second
1 ? Minute
1 ? Hour
1 ? Day
1 ? Month
1 ? Year (from 1950)
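A minimal sketch of encoding and decoding the 6 byte Date/Time field follows. The function names are hypothetical, and the sketch assumes each component is an unsigned byte, which limits the representable years to 1950 through 2205.

```python
from datetime import datetime

BASE_YEAR = 1950


def pack_datetime(value: datetime) -> bytes:
    """Encode as Second, Minute, Hour, Day, Month, Year (from 1950)."""
    # bytes() raises ValueError if the year offset does not fit in one byte.
    return bytes([value.second, value.minute, value.hour,
                  value.day, value.month, value.year - BASE_YEAR])


def unpack_datetime(field: bytes) -> datetime:
    """Decode a 6 byte Date/Time field."""
    if len(field) != 6:
        raise ValueError("Date/Time field must be 6 bytes")
    second, minute, hour, day, month, year = field
    return datetime(BASE_YEAR + year, month, day, hour, minute, second)
```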
3.1.4 Cartridge/Block Field
All Cartridge/Block fields are formatted as shown in the
following table unless specifically declared otherwise.
The Cartridge/Block field contains addressing information
that allows direct accessing of segments.
Length Value Description
3 ? Block Number
1 ? Cartridge Number
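The Cartridge/Block field can be decoded as sketched below. The sketch assumes the 3 byte Block Number follows the low to high byte order of section 3.1.2; the function names are hypothetical.

```python
from typing import Tuple


def unpack_cartridge_block(field: bytes) -> Tuple[int, int]:
    """Return (cartridge number, block number) from a 4 byte field."""
    if len(field) != 4:
        raise ValueError("Cartridge/Block field must be 4 bytes")
    block = int.from_bytes(field[:3], "little")  # 3 byte block number, low byte first
    cartridge = field[3]
    return cartridge, block


def pack_cartridge_block(cartridge: int, block: int) -> bytes:
    """Encode a cartridge number and block number as a 4 byte field."""
    return block.to_bytes(3, "little") + bytes([cartridge])
```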
3.2 Markers
Markers are 2 byte tags that are used to insert nulls into
the data stream or to mark a transition in the data
stream. There is no specific data associated with markers.
The following is a generalized representation of all markers.
Length Value Description
1 0 Tag = Marker
1 ? Subtag (Marker Type)
Tag - This code is used to identify the structure as a
marker. All markers have a tag value of 0.
Subtag - This code is used to identify the marker type.
3.2.1 Null Marker
The Null Marker is used to pad buffer space to a conven-
ient boundary.
Length Value Description
1 0 Tag = Marker
1 0 Subtag = Null
3.2.2 File Data Area Start Marker
The File Data Area Start Marker marks the transition to a
file data area in a file set. The file data areas contain
the file and subdirectory information and data for the
file set.
Length Value Description
1 0 Tag = Marker
1 1 Subtag = File Data Area Start
3.2.3 Directory Table Area Start Marker
The Directory Table Area Start Marker marks the transition
to a directory table area in a file set. The directory
table areas contain only the file and subdirectory infor-
mation (no data) for the file set. The directory table is
used for selective restore operations.
Length Value Description
1 0 Tag = Marker
1 2 Subtag = Directory Table Area
Start
3.2.4 Subdirectory End Marker
The Subdirectory End Marker marks the end of the current
subdirectory. The parent of this subdirectory then
becomes the current subdirectory.
Length Value Description
1 0 Tag = Marker
1 3 Subtag = Subdirectory End
3.2.5 Compression/Encryption Algorithm Start Marker
The Compression/Encryption Algorithm Start Marker marks
the starting (or restarting) of the compression and
encryption algorithms.
Length Value Description
1 0 Tag = Marker
1 4 Subtag = Compression/
Encryption Algorithm Start
3.2.6 Tape File End Marker
The Tape File End Marker is used to mark the end of
meaningful Universal Tape Format data in this tape file
(Tape Header, Volume Table, or File Set). The Tape File
End Marker will be followed by Null Markers to fill to the
end of the current segment.
Length Value Description
1 0 Tag = Marker
1 5 Subtag = Tape File End
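The marker encodings of sections 3.2.1 through 3.2.6 can be collected as in the sketch below, which also shows how a tape file might be closed with a Tape File End Marker and Null Markers. The names Marker, encode_marker and close_tape_file are hypothetical, and the caller is assumed to know how many bytes of the segment data area are already used.

```python
from enum import IntEnum


class Marker(IntEnum):
    NULL = 0
    FILE_DATA_AREA_START = 1
    DIRECTORY_TABLE_AREA_START = 2
    SUBDIRECTORY_END = 3
    COMPRESSION_ENCRYPTION_START = 4
    TAPE_FILE_END = 5


def encode_marker(marker: Marker) -> bytes:
    """A marker is two bytes: Tag = 0 followed by the Subtag."""
    return bytes([0, marker])


def close_tape_file(used: int, data_area_size: int) -> bytes:
    """Tape File End Marker plus Null Markers filling the rest of the segment."""
    remaining = data_area_size - used
    if remaining < 2 or remaining % 2:
        raise ValueError("no even-sized room left for the end marker")
    null_count = (remaining - 2) // 2
    return encode_marker(Marker.TAPE_FILE_END) + encode_marker(Marker.NULL) * null_count
```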
3.3 Headers
Headers are variable length structures that are made up of
a 4 byte tag followed by header data. The following is a
generalized representation of all headers.
Length Value Description
1 ? Tag
1 ? Subtag
2 ? Header Data Length
? ? Header Data
Tag - This code is used to identify the header type. All
headers have tag values of 1 thru 255.
Subtag - This code is used to either further identify
the header type or to identify the operating system type
for this header.
Header Data Length - This is the length of the header
data. This length does not include the 4 byte tag (Tag,
Subtag, Header Data Length) and it does not include the
null byte which may follow the header data to align the
next marker or header on an even byte boundary.
Header Data - This is data that is specific to each
header type.
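The generalized marker and header layouts suggest a simple walk over a Universal Tape Format data stream, sketched below. The sketch assumes the buffer starts on an even boundary relative to the segment, that the Header Data Length is stored low byte first, and that nothing other than markers and headers appears in the buffer. Raw data that follows certain headers (for example image data) would need extra handling that is not shown. The name iter_items is hypothetical.

```python
import struct
from typing import Iterator, Tuple


def iter_items(stream: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (tag, subtag, header data) for each marker or header."""
    pos = 0
    while pos + 2 <= len(stream):
        tag, subtag = stream[pos], stream[pos + 1]
        if tag == 0:
            # Markers are 2 bytes and carry no data.
            yield tag, subtag, b""
            pos += 2
            continue
        if pos + 4 > len(stream):
            raise ValueError("truncated header tag")
        (length,) = struct.unpack_from("<H", stream, pos + 2)
        data = stream[pos + 4:pos + 4 + length]
        if len(data) != length:
            raise ValueError("truncated header data")
        yield tag, subtag, data
        pos += 4 + length
        pos += pos & 1  # skip the null byte that realigns to an even boundary
```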
3.3.1 Signature Headers
The Signature Headers are used to identify tape files.
They are always the first header that appears in a tape
file.
Length Value Description
1 1 Tag = Signature
1 ? Subtag (Signature Type)
0 = Tape Header Signature
1 = Volume Table Signature
2 = File Set Signature
2 4 Header Data Length
4 ? Signature
"UTF " = Tape Header Signature
"VTBL" = Volume Table Signature
"FSET" = File Set Signature
Subtag - This identifies the type of signature that
follows.
Signature - This is a 4 byte string that provides a unique
ID for each tape file type (e.g. Tape Header, Volume
Table, File Set).
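As an illustration, a Signature Header can be built and recognized as sketched below. The byte order of the Header Data Length follows section 3.1.2; the names build_signature_header and identify_tape_file are hypothetical.

```python
import struct
from typing import Optional

SIGNATURES = {0: b"UTF ", 1: b"VTBL", 2: b"FSET"}
TAPE_FILE_NAMES = {0: "Tape Header", 1: "Volume Table", 2: "File Set"}


def build_signature_header(subtag: int) -> bytes:
    """Tag = 1, Subtag, Header Data Length = 4, then the 4 byte signature."""
    signature = SIGNATURES[subtag]
    return bytes([1, subtag]) + struct.pack("<H", len(signature)) + signature


def identify_tape_file(first_bytes: bytes) -> Optional[str]:
    """Name the tape file type from its first 8 bytes, or None if unrecognized."""
    for subtag, name in TAPE_FILE_NAMES.items():
        if first_bytes[:8] == build_signature_header(subtag):
            return name
    return None
```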
3.3.2 Universal Tape Format Information Header
The Universal Tape Format Information Header is used to
define the contents of the tape and to locate the latest
version of the volume table.
Length Value Description
1 2 Tag = Universal Tape Format
Information
1 ? Subtag (Tape Header Flags)
Bit:
0: 1 = Continued File Set
1: 1 = End of Media
2 ? Header Data Length
2 30 Total Fixed Fields Length
2 ? Universal Tape Format Version
4 ? Cartridge/Block Number
[3] Block Number = 0
[1] Cartridge Number
6 ? Cartridge Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
8 ? Random ID
4 ? Volume Table Cartridge/Block
[3] Block Number
[1] Cartridge Number
6 ? Volume Table Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
2 ? Cartridge Label Length
? ? Cartridge Label
(null byte fill to even byte
boundary)
Subtag - Following are the definitions for each Tape
Header Flag bit.
Continued File Set - When this bit is set to 1, it
identifies the first file set on this cartridge as
being a continuation from the previous cartridge.
End of Media - When this bit is set to 1, it identifies
this cartridge as being full. No more data
can be written to this cartridge.
Total Fixed Fields Length - This is the length of all the
fixed fields in the header data. This length is measured
from the Universal Tape Format Version field and includes
all fields up to but not including the Cartridge Label
Length field. This length will allow for the future
addition of fixed fields.
Universal Tape Format Version - This is a unique number
that identifies the version of the Universal Tape Format
that generated this tape.
Cartridge/Block Number - This is a sequence number that
identifies each cartridge in a multi-cartridge set. The
cartridge number starts with 1 for the first cartridge and
is incremented for each cartridge that follows in the
multi-cartridge set. The block number is always 0.
Cartridge Date/Time - This is the date and time that this
cartridge was first written. This date/time will be
different on each cartridge in a multi-cartridge set.
Random ID - This is a random number that identifies
cartridges belonging to the same multi-cartridge set. All
cartridges in a multi-cartridge set will have the same
Random ID.
Volume Table Cartridge/Block - This is the cartridge and
block number of the segment in which the last volume table
was written.
Volume Table Date/Time - This is the date and time that
the last volume table was written.
Cartridge Label Length - This is the length of the
Cartridge Label text string. This length does not include
the null byte that may follow the text string.
Cartridge Label - This is a text string that is used to
label each cartridge. All cartridges in a multi-cartridge
set will have the same cartridge label. A null byte will
be appended to align on an even byte boundary if the text
string ends on an odd byte boundary.
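A sketch of decoding the header data of the Universal Tape Format Information Header follows. The 30 byte fixed portion is 2 + 4 + 6 + 8 + 4 + 6 bytes, measured from the Version field. The sketch locates the Cartridge Label Length through Total Fixed Fields Length rather than a hard coded offset, so that fixed fields added later are skipped. Flag bit 0 is assumed to be the least significant bit, and the names TapeInfo and parse_tape_info are hypothetical.

```python
import struct
from typing import NamedTuple


class TapeInfo(NamedTuple):
    continued_file_set: bool
    end_of_media: bool
    version: int
    cartridge_number: int
    random_id: bytes
    volume_table_cartridge: int
    volume_table_block: int
    label: bytes


def parse_tape_info(subtag: int, data: bytes) -> TapeInfo:
    """Decode the header data that follows the 4 byte header tag."""
    fixed_length, version = struct.unpack_from("<HH", data, 0)
    cartridge_number = data[7]            # bytes 4-6 hold Block Number (always 0)
    random_id = data[14:22]               # bytes 8-13 hold Cartridge Date/Time
    volume_table_block = int.from_bytes(data[22:25], "little")
    volume_table_cartridge = data[25]     # bytes 26-31 hold Volume Table Date/Time
    label_pos = 2 + fixed_length          # skips any fixed fields added later
    (label_length,) = struct.unpack_from("<H", data, label_pos)
    label = data[label_pos + 2:label_pos + 2 + label_length]
    return TapeInfo(bool(subtag & 0x01), bool(subtag & 0x02), version,
                    cartridge_number, random_id, volume_table_cartridge,
                    volume_table_block, label)
```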
3.3.3 File Set Information Header
The File Set Information Header is used to maintain file
set information in the Volume Table. A duplicate of the
File Set Information Header is placed at the beginning of
each file set.
Length Value Description
1 3 Tag = File Set Information
1 ? Subtag (Operating System Type)
0 = reserved
1 = Msdos
2 = OS/2
3 = Novell
4 = Unix
5 = Macintosh
6 - 255 = reserved
2 ? Header Data Length
2 46 Total Fixed Fields Length
2 ? File Set Type
0 = Image (Disk)
1 = Image (Volume)
2 = File (All)
3 = File (Selected)
4 = File (Changed)
2 ? Multi-File Set ID
4 ? Start Cartridge/Block
[3] Block Number
[1] Cartridge Number
4 ? End Cartridge/Block
[3] Block Number
[1] Cartridge Number
6 ? Creation Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
4 ? File Count
4 ? Subdirectory Count
4 ? System Object Count
4 ? Corrupted Count
4 ? Total Files Size (1024 byte units)
4 ? Total Tape Size (1024 byte units)
1 ? Compression Algorithm
1 ? Encryption Algorithm
1 ? ECC Algorithm
1 ? Segment Overhead Field Size
2 ? File Set Label Length
? ? File Set Label
(null byte fill to even byte
boundary)
2 ? Disk/Volume Name Length
? ? Disk/Volume Name
(null byte fill to even byte
boundary)
2 ? Password Length
? ? Password
(null byte fill to even byte
boundary)
2 ? Comment Length
? ? Comment
(null byte fill to even byte
boundary)
2 ? Directory Table Addresses Length
? ? Directory Table Addresses
[3] Directory Table Block Number
[1] Directory Table Cartridge
Number
[2] Directory Table Offset
etc.
Subtag - This identifies the operating system that
generated this file set.
Total Fixed Fields Length - This is the length of all the
fixed fields in the header data. This length is measured
from the File Set Type field and includes all fields up to
but not including the File Set Label Length field. This
length will allow for the future addition of fixed fields.
File Set Type - This identifies the type of this file
set. Following are the definitions of each file set type.
Image (Disk) - This type of file set contains disk
image data. This includes all volumes or partitions
on the selected disk.
Image (Volume) - This type of file set contains
volume image data. This includes only one selected
volume or partition on the selected disk.
File (All) - This type of file set contains sub-
directory and file data. This includes all sub-
directories and files from one selected volume.
. , .. . : . . . - . , , ~ .
. . . :
:- .: . - - . .
, , , . , .- , . .
2~8~78
1 File ~Selected) - This type of file set contains
subdirectory and file data. This includes only
selected subdirectories and files from one selected
volume.
File (Changed) - This type of file set contains
subdirectory and file data. This includes only
changed (since the last backup operation) subdirectories
and files from one selected volume.
Multi-File Set ID - This is used to associate the multi-
ple file sets that were created during a multiple disk
volume backup operation. The ID starts at 1 and is
incremented for every backup operation that is performed
against this volume table.
Start Cartridge/Block - This is the cartridge and block
number of the segment in which the file set starts.
End Cartridge/Block - This is the cartridge and block
number of the segment in which the file set ends.
Creation Date/Time - This is the date and time that the
file set was created.
File Count - This is the count of files that are in the
file set.
Subdirectory Count - This is the count of subdirectories
that are in the file set.
System Object Count - This is the count of system
objects that are in the file set.
Corrupted Count - This is the count of corrupted files,
subdirectories, and system objects that are in the file
set.
Total Files Size (1024 byte units) - This is the total
size of all files in the file set. The unit of measure-
ment is 1024 bytes.
Total Tape Size (1024 byte units) - This is the total size
of all tape blocks in the file set. It is the sum of the
Segment Data, Segment Overhead, and Segment ECC Areas for
all segments in the file set. The unit of measurement is
1024 bytes.
Compression Algorithm - This identifies the compression
algorithm that was applied to the file set. It is set to
0 if no compression was applied.
Encryption Algorithm - This identifies the encryption
algorithm that was applied to the file set. It is set to
0 if no encryption was applied.
ECC Algorithm - This identifies the ECC algorithm that
was applied to the file set. It is set to 0 if no ECC was
applied.
Segment Overhead Area Size - This is the size of the
Segment Overhead Area for this file set. It is set to 0
if there is no Segment Overhead Area.
File Set Label Length - This is the length of the File
Set Label text string. This length does not include the
null byte that may follow the text string.
File Set Label - This is a text string that is used to
label each file set. All file sets in a volume set will
have the same file set label. A null byte will be appended
to align on an even byte boundary if the text string ends
on an odd byte boundary.
Disk/Volume Name Length - This is the length of the
Disk/Volume Name text string. This length does not
include the null byte that may follow the text string.
Disk/Volume Name - This is a text string that is used to
identify the disk/volume that is associated with this file
set. A null byte will be appended to align on an even
byte boundary if the text string ends on an odd byte
boundary.
Password Length - This is the length of the Password
string. This length does not include the null byte that
may follow the string.
Password - This is an encrypted text string that is used
to protect the file set. A null byte will be appended to
align on an even byte boundary if the encrypted text
string ends on an odd byte boundary.
Comment Length - This is the length of the Comment text
string. This length does not include the null byte that
may follow the text string.
Comment - This is a text string that is used to contain
comments for the file set. A null byte will be appended
to align on an even byte boundary if the text string ends
on an odd byte boundary.
Directory Table Addresses Length - This is the length of
the Directory Table Addresses structure.
Directory Table Addresses - This is a table of Directory
Table Addresses. Each entry in this table contains the
address of the Directory Table Start Marker that starts a
directory table area in the file set. With the address
information contained in this table all the directory
table areas can be located and assembled to produce a
complete directory of all files and subdirectories in
this file set. Following are the definitions of the
fields in each table entry.
Directory Table Block Number - This is the block
number of the segment in which the directory table
area starts.
Directory Table Cartridge Number - This is the
number of the cartridge on which the directory
table area starts.
Directory Table Offset - This is the offset from the
beginning of the segment where the directory table
area starts.
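The Directory Table Addresses structure can be decoded as sketched below. The sketch assumes that Directory Table Addresses Length is a byte count, that each entry is 6 bytes (3 byte block number, 1 byte cartridge number, 2 byte offset), and that multi-byte values are stored low byte first. The function name is hypothetical.

```python
from typing import List, Tuple

ENTRY_SIZE = 6  # [3] block number + [1] cartridge number + [2] offset


def parse_directory_table_addresses(table: bytes) -> List[Tuple[int, int, int]]:
    """Return (cartridge, block, offset) for every directory table area."""
    if len(table) % ENTRY_SIZE:
        raise ValueError("table length is not a multiple of the entry size")
    entries = []
    for start in range(0, len(table), ENTRY_SIZE):
        block = int.from_bytes(table[start:start + 3], "little")
        cartridge = table[start + 3]
        offset = int.from_bytes(table[start + 4:start + 6], "little")
        entries.append((cartridge, block, offset))
    return entries
```

Visiting the returned addresses in order allows the directory table areas to be read and assembled into a complete directory without scanning the file data areas.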
3.3.4 Image Information Header
The Image Information Header is used to identify image
data. The Image Information header data is operating
system dependent. It is not intended that image data be
transported to other operating systems. The Image Infor-
mation Header is followed by image data.
Length Value Description
1 4 Tag = Image Information
1 ? Subtag (Operating System Type)
0 = reserved
1 = Msdos
2 = OS/2
3 = Novell
4 = Unix
5 = Macintosh
6 - 255 = reserved
2 ? Header Data Length
? ? Operating System Specific
Definitions
Subtag - This identifies the operating system that
generated this image file set.
Operating System Specific Definitions - This is operating
system dependent data.
3.3.5 File Information Header
The File Information Header is used to contain generic
information about files that is common to all file
systems. This header is used in both the file data and
directory table areas of the file set.
Length Value Description
1 5 Tag = File Information
1 ? Subtag (Operating System Type)
0 = reserved
1 = Msdos
2 = OS/2
3 = Novell
4 = Unix
5 = Macintosh
6 - 255 = reserved
2 ? Header Data Length
2 ? Attributes
Bit:
0: 1 = Read Only
1: 1 = Hidden
2: 1 = System
3: 1 = Volume Label
4: 1 = Directory
5: 1 = Archive
6-13: 0 (Reserved)
14: 1 = Corrupted
15: 0 = File Data Area
1 = Directory Table Area
6 ? Modification Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
4 ? Data Fork Length
4 ? Resource Fork Length
4 ? Header Cartridge/Block
[3] Block Number
[1] Cartridge Number
2 ? Header Offset
? ? File Name
Subtag - This identifies the operating system this file
belongs to.
Attributes - Following are the definitions for each
attribute bit.
Read Only - When this bit is set to 1, it identifies
the file as being protected from write and delete
operations.
Hidden - When this bit is set to 1, it identifies
the file as being excluded from normal access.
System - When this bit is set to 1, it identifies
the file as containing system code or data.
Volume Label - This bit is always set to 0 for File
Information Headers.
Directory - This bit is always set to 0 for File
Information Headers.
Archive - When this bit is set to 1, it identifies
the file as having been modified since the last
backup operation.
Corrupted - When this bit is set to 1, it identifies
the file as being corrupted. This can be caused by
a disk error or a file size change during the
backup process.
File Data or Directory Table Area - This bit identi-
fies the area of the file set (File Data or Direc-
tory Table Area) that this header belongs to. It
will be set to 0 if the header belongs to the File
Data Area or it will be set to 1 if the header
belongs to the Directory Table Area.
Modification Date/Time - This is the date and time that
this file was last modified.
Data Fork Length - This is the total length of the data
fork for this file.
Resource Fork Length - This is the total length of the
resource fork for this file.
Header Cartridge/Block - This is the cartridge and block
number of the segment (in the File Data Area) in which
this File Information Header and data starts.
Header Offset - This is the offset from the beginning of
the segment (in the File Data Area) where this File
Information Header and data starts.
File Name - This is the file name. The length of the name
can be determined by subtracting 22 (the fixed length
portion of the header data) from the Header Data Length.
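The 22 byte fixed portion of the File Information Header data (2 + 6 + 4 + 4 + 4 + 2) and the name that follows it can be decoded as sketched below. The sketch assumes low to high byte order and that attribute bit 0 is the least significant bit. The names FIXED, FileInfo and parse_file_info are hypothetical.

```python
import struct
from typing import NamedTuple

# Attributes, Date/Time, Data Fork Length, Resource Fork Length,
# Block Number, Cartridge Number, Header Offset
FIXED = struct.Struct("<H6sII3sBH")


class FileInfo(NamedTuple):
    read_only: bool
    hidden: bool
    system: bool
    archive: bool
    corrupted: bool
    in_directory_table: bool
    data_fork_length: int
    resource_fork_length: int
    name: bytes


def parse_file_info(data: bytes) -> FileInfo:
    """Decode File Information Header data (the bytes after the 4 byte tag)."""
    attrs, _mtime, data_len, rsrc_len, _block, _cartridge, _offset = FIXED.unpack_from(data, 0)
    name = data[FIXED.size:]  # name length = Header Data Length - 22
    return FileInfo(bool(attrs & 0x0001), bool(attrs & 0x0002), bool(attrs & 0x0004),
                    bool(attrs & 0x0020), bool(attrs & 0x4000), bool(attrs & 0x8000),
                    data_len, rsrc_len, name)
```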
3.3.6 Subdirectory Information Header
The Subdirectory Information Header is used to
contain generic information about subdirectories that is
common to all file systems. This header is used in both
the file data and directory table areas of the file set.
Length Value Description
1 6 Tag = Subdirectory Information
1 ? Subtag (Operating System Type)
0 = reserved
1 = Msdos
2 = OS/2
3 = Novell
4 = Unix
5 = Macintosh
6 - 255 = reserved
2 ? Header Data Length
2 ? Attributes
Bit:
0: 1 = Read Only
1: 1 = Hidden
2: 1 = System
3: 1 = Volume Label
4: 1 = Directory
5: 1 = Archive
6-13: 0 (Reserved)
14: 1 = Corrupted
15: 0 = File Data Area
1 = Directory Table Area
6 ? Modification Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
4 ? Data Fork Length
4 ? Resource Fork Length
4 ? Header Cartridge/Block
[3] Block Number
[1] Cartridge Number
2 ? Header Offset
? ? Subdirectory Name
Subtag - This identifies the operating system this sub-
directory belongs to.
Attributes - Following are the definitions for each
attribute bit.
Read Only - When this bit is set to 1, it identifies
the subdirectory as being protected from write and
delete operations.
Hidden - When this bit is set to 1, it identifies
the subdirectory as being excluded from normal
access.
System - When this bit is set to 1, it identifies
the subdirectory as containing system code or data.
Volume Label - This bit is always set to 0 for
Subdirectory Information Headers.
Directory - This bit is always set to 1 for
Subdirectory Information Headers.
Archive - When this bit is set to 1, it identifies
the subdirectory as having been modified since the
last backup operation.
Corrupted - When this bit is set to 1, it identifies
the subdirectory as being corrupted. This can be
caused by a disk error during the backup process.
File Data or Directory Table Area - This bit
identifies the area of the file set (File Data or
Directory Table Area) that this header belongs to.
It will be set to 0 if the header belongs to the
File Data Area or it will be set to 1 if the header
belongs to the Directory Table Area.
Modification Date/Time - This is the date and time that
this subdirectory was last modified.
Data Fork Length - This is always set to 0 for Sub-
directory Information Headers.
Resource Fork Length - This is always set to 0 for Sub-
directory Information Headers.
Header Cartridge/Block - This is the cartridge and block
number of the segment (in the File Data Area) in which
this Subdirectory Information Header starts.
Header Offset - This is the offset from the beginning of
the segment (in the File Data Area) where this
Subdirectory Information Header starts.
Subdirectory Name - This is the subdirectory name.
The length of the name can be determined by subtracting 22
(the fixed length portion of the header data) from the
Header Data Length.
3.3.7 System Object Information Header
The System Object Information Header is used to contain
generic information about system objects. System objects
are objects that are neither files nor subdirectories such
as server information, peripheral access information,
group lists, user lists, etc. This header is used in both
the file data and directory table areas of the file set.
Length Value Description
1 7 Tag = System Object Information
1 ? Subtag (Operating System Type)
0 = reserved
1 = Msdos
2 = OS/2
3 = Novell
4 = Unix
5 = Macintosh
6 - 255 = reserved
2 ? Header Data Length
2 ? Attributes
Bit:
0: 1 = Read Only
1: 1 = Hidden
2: 1 = System
3: 1 = Volume Label
4: 1 = Directory
5: 1 = Archive
6-13: 0 (Reserved)
14: 1 = Corrupted
15: 0 = File Data Area
1 = Directory Table Area
6 ? Modification Date/Time
[1] Second
[1] Minute
[1] Hour
[1] Day
[1] Month
[1] Year (from 1950)
4 ? Data Fork Length
4 ? Resource Fork Length
4 ? Header Cartridge/Block
[3] Block Number
[1] Cartridge Number
2 ? Header Offset
? ? System Object Name
Subtag - This identifies the operating system this system
object belongs to.
Attributes - Following are the definitions for each
attribute bit.
Read Only - This bit is always set to 0 for System
Object Information Headers.
Hidden - This bit is always set to 0 for System
Object Information Headers.
System - This bit is always set to 0 for System
Object Information Headers.
Volume Label - When this bit is set to 1, it iden-
tifies the system object as being a volume label.
Directory - This bit is always set to 0 for System
Object Information Headers.
Archive - This bit is always set to 0 for System Object
Information Headers.
Corrupted - When this bit is set to 1, it identifies the
system object as being corrupted. This can be caused by a
disk error during the backup process.
File Data or Directory Table Area - This bit identi-
fies the area of the file set (File Data or Direc-
tory Table Area) that this header belongs to. It
will be set to 0 if the header belongs to the File
Data Area or it will be set to 1 if the header
belongs to the Directory Table Area.
Modification Date/Time - This is the date and time that
this system object was last modified.
Data Fork Length - This is the total length of the data
fork for this system object.
Resource Fork Length - This is the total length of
the resource fork for this system object.
Header Cartridge/Block - This is the cartridge and block
number of the segment (in the File Data Area) in which
this System Object Information Header and data starts.
Header Offset - This is the offset from the beginning of
the segment (in the File Data Area) where this System
Object Information Header and data starts.
System Object Name - This is the system object name. The
length of the name can be determined by subtracting 22
(the fixed-length portion of the header data) from the
Header Data Length.
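The field layout above can be sketched as a small decoder. This is an illustrative sketch only: the function name and the returned dictionary keys are hypothetical, and two assumptions are made that the text does not state for this header, namely that multi-byte fields are stored low byte first and that the name is plain ASCII.

```python
import struct
from datetime import datetime

FIXED_LEN = 22  # fixed-length portion of the header data, as stated in the text

# Bit positions taken from the Attributes field definition.
ATTRIBUTE_BITS = {
    0: "read_only", 1: "hidden", 2: "system", 3: "volume_label",
    4: "directory", 5: "archive", 14: "corrupted", 15: "directory_table_area",
}

def parse_system_object_header(buf: bytes) -> dict:
    """Decode one System Object Information Header (tag 7). Hypothetical helper."""
    # ASSUMPTION: all multi-byte fields are little-endian (low byte first).
    tag, subtag, data_len = struct.unpack_from("<BBH", buf, 0)
    if tag != 7:
        raise ValueError(f"unexpected tag {tag}")
    (attrs, sec, minute, hour, day, month, year, data_fork, resource_fork,
     block, cartridge, offset) = struct.unpack_from("<H6BII3sBH", buf, 4)
    # Name length = Header Data Length - 22.
    name = buf[4 + FIXED_LEN : 4 + data_len]
    return {
        "os_type": subtag,
        "attributes": {n for bit, n in ATTRIBUTE_BITS.items() if attrs >> bit & 1},
        "modified": datetime(1950 + year, month, day, hour, minute, sec),
        "data_fork_length": data_fork,
        "resource_fork_length": resource_fork,
        "block": int.from_bytes(block, "little"),
        "cartridge": cartridge,
        "header_offset": offset,
        "name": name.decode("ascii"),  # ASSUMPTION: ASCII name
    }
```

The year byte is added to 1950, and the 22 bytes consumed by the fixed fields (2 + 6 + 4 + 4 + 4 + 2) account for the constant subtracted from the Header Data Length.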
3.3.8 Operating System Attributes Header
The Operating System Attributes Headers are used to
contain operating system specific information about files,
subdirectories, and objects. These headers are used in
both the file data and directory table areas of the file
set.
Length  Value  Description
1       8      Tag = Operating System Attributes
1       ?      Subtag (Operating System Type)
                   0 = reserved
                   1 = Msdos
                   2 = OS/2
                   3 = Novell
                   4 = Unix
                   5 = Macintosh
                   6 - 255 = reserved
2       ?      Header Data Length
2       ?      Operating System Structure ID
?       ?      Operating System Specific Attributes
Subtag - This identifies the operating system that the
following attributes belong to.
Operating System Structure ID - This field contains a
unique identifier for the operating system specific
structure that follows. The identifier starts at 0
and is incremented for each structure that is defined for
each operating system. Identifiers are unique only within
an operating system type. The identifier is formatted in
a low to high byte order (Intel format).
Operating System Specific Attributes - This is a struc-
ture that contains operating system specific attributes.
The structure can be defined in whatever format is conven-
ient for the operating system. This includes the byte
ordering and date/time formats. All structures will be
published in order to allow foreign operating systems to
interpret the attributes.
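A reader for this header might look like the sketch below. The function name and return shape are hypothetical. It assumes the Header Data Length counts every byte after the length field, including the two-byte Structure ID, and that the length field itself is low byte first; the text states the byte order only for the Structure ID.

```python
import struct

# Operating system types from the Subtag definition; other values are reserved.
OS_TYPES = {1: "Msdos", 2: "OS/2", 3: "Novell", 4: "Unix", 5: "Macintosh"}

def read_os_attributes_header(buf: bytes, offset: int = 0):
    """Return (os_name, structure_id, payload, next_offset). Hypothetical helper."""
    tag, subtag, data_len = struct.unpack_from("<BBH", buf, offset)
    if tag != 8:
        raise ValueError(f"unexpected tag {tag}")
    # The Structure ID is low-to-high byte order (Intel format), per the text.
    (structure_id,) = struct.unpack_from("<H", buf, offset + 4)
    # ASSUMPTION: data_len covers the Structure ID plus the OS-specific bytes.
    end = offset + 4 + data_len
    payload = buf[offset + 6 : end]
    return OS_TYPES.get(subtag, "reserved"), structure_id, payload, end
```

Because identifiers are unique only within one operating system type, a restore program would look up the payload layout by the pair (subtag, structure ID) and could skip any pair it does not recognize by jumping to the returned next offset.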
3.3.9 Data/Resource Fork Header
The Data/Resource Fork Header is used to contain the data
that is associated with file forks. If a fork length is
greater than 64k, then the data will be divided into
multiple headers. Only the data and resource forks are
defined, but in the future more may be added.
Length  Value  Description
1       9      Tag = Data / Resource Fork
1       ?      Subtag (Data or Resource Fork Type)
                   0 = Data Fork
                   1 = Resource Fork
                   2 - 255 = reserved (other forks?)
2       ?      Header Data Length
?       ?      Fork Data
Subtag - This identifies the fork that the data is
associated with.
Fork Data - This is file fork data.
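The division of a long fork into multiple headers can be sketched as a generator. The names are hypothetical, and the chunk limit of 65,535 bytes is an assumption derived from the two-byte Header Data Length field rather than a figure given in the text; the length is again assumed to be stored low byte first.

```python
import struct

MAX_CHUNK = 0xFFFF  # ASSUMPTION: largest value a 2-byte Header Data Length can hold

def fork_headers(data: bytes, resource: bool = False):
    """Yield one Data/Resource Fork Header (tag 9) per chunk of fork data."""
    subtag = 1 if resource else 0  # 0 = Data Fork, 1 = Resource Fork
    for start in range(0, len(data), MAX_CHUNK):
        chunk = data[start : start + MAX_CHUNK]
        # ASSUMPTION: little-endian length field.
        yield struct.pack("<BBH", 9, subtag, len(chunk)) + chunk
```

A restore program would concatenate the Fork Data of consecutive headers with the same subtag to rebuild the fork.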
3.3.10 Set Path Header
The Set Path Header is used to change the current sub-
directory path to the specified subdirectory path from the
root directory.
Length  Value  Description
1       10     Tag = Set Path
1       ?      Subtag (Data or Directory Table Area)
                   0 = File Data Area
                   1 = Directory Table Area
2       ?      Header Data Length
2       ?      Subdirectory Name Length
?       ?      Subdirectory Name
               (null byte fill to even byte boundary)
2       ?      Subdirectory Name Length
?       ?      Subdirectory Name
               (null byte fill to even byte boundary)
               etc.
2       0      Termination
Subtag - This identifies the area (File Data or Directory
Table) of the file set that this header is associated
with.
Subdirectory Name Length - This is the length of the
following Subdirectory Name text string. This length does
not include the null byte that may follow the text string.
Subdirectory Name - This is a text string that gives one
subdirectory name of the total path to be set. The name
does not include any path separator. A null byte will be
appended to align on an even byte boundary if the text
string ends on an odd byte boundary.
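The length-prefixed, padded name list can be illustrated with an encoder and decoder pair. Both function names are hypothetical. The sketch assumes little-endian length fields, ASCII names, non-empty path components, and a header that begins on an even byte boundary, so that an odd-length name is exactly the case that needs a null fill byte.

```python
import struct

def encode_set_path(components, directory_table_area: bool = False) -> bytes:
    """Build a Set Path Header (tag 10) from path components listed from the root."""
    body = bytearray()
    for name in components:
        raw = name.encode("ascii")             # ASSUMPTION: ASCII names
        body += struct.pack("<H", len(raw)) + raw  # length excludes the fill byte
        if len(raw) % 2:
            body += b"\x00"                    # null fill to an even boundary
    body += struct.pack("<H", 0)               # Termination
    return struct.pack("<BBH", 10, int(directory_table_area), len(body)) + bytes(body)

def decode_set_path(buf: bytes) -> list:
    """Return the list of subdirectory names carried by a Set Path Header."""
    tag, _subtag, data_len = struct.unpack_from("<BBH", buf, 0)
    if tag != 10:
        raise ValueError(f"unexpected tag {tag}")
    pos, end, parts = 4, 4 + data_len, []
    while pos < end:
        (n,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        if n == 0:                             # Termination reached
            break
        parts.append(buf[pos : pos + n].decode("ascii"))
        pos += n + (n % 2)                     # skip the fill byte, if any
    return parts
```

Since the names carry no path separator, joining them with the host system's own separator is left to the restoring program.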
3.3.11 Compression/Encryption Header
The Compression/Encryption Header is used to compress
and/or encrypt the Universal Tape Format data stream.
Length  Value  Description
1       11     Tag = Compression/Encryption
1       ?      Subtag (Compressed/Encrypted Flags)
                   Bit:
                   0: 1 = Compressed
                   1: 1 = Encrypted
2       ?      Header Data Length
?       ?      Compressed and/or Encrypted Data
Subtag - Following are the definitions for the Compressed/
Encrypted Flag bits.
Compressed - When this bit is set to 1, it identi-
fies the header data as being compressed.
Encrypted - When this bit is set to 1, it identifies
the header data as being encrypted.
Compressed and/or Encrypted Data - This is the data that
is compressed and/or encrypted. This can include any part
of the Universal Tape format data stream (markers, head-
ers, and header data) except other Compression/Encryption
headers.
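As a rough illustration of how the flag bits wrap a portion of the data stream, the sketch below compresses a byte string and prefixes it with a Compression/Encryption Header. The text names no compression or encryption algorithm; zlib is used here purely as a stand-in, the function names are hypothetical, and the length field is assumed to be little-endian.

```python
import struct
import zlib

COMPRESSED = 0x01  # Subtag bit 0
ENCRYPTED = 0x02   # Subtag bit 1

def wrap_compressed(stream: bytes) -> bytes:
    """Wrap part of the data stream in a Compression/Encryption Header (tag 11)."""
    packed = zlib.compress(stream)  # ASSUMPTION: zlib stands in for an unnamed codec
    if len(packed) > 0xFFFF:
        raise ValueError("payload does not fit a 2-byte Header Data Length")
    return struct.pack("<BBH", 11, COMPRESSED, len(packed)) + packed

def unwrap(buf: bytes) -> bytes:
    """Recover the wrapped stream; encrypted payloads are not handled in this sketch."""
    tag, flags, n = struct.unpack_from("<BBH", buf, 0)
    if tag != 11:
        raise ValueError(f"unexpected tag {tag}")
    if flags & ENCRYPTED:
        raise NotImplementedError("no cipher is specified in the text")
    data = buf[4 : 4 + n]
    return zlib.decompress(data) if flags & COMPRESSED else data
```

The recovered bytes would then be parsed as ordinary markers and headers; per the text they never contain a nested Compression/Encryption Header.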
4.0 Operating System Specific Attributes Headers
4.1 Msdos Specific Attributes Headers
4.2 OS/2 Specific Attributes Headers
4.3 Novell Specific Attributes Headers
4.3.1 Novell Directory Attributes Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       3      Subtag = Novell
2       ?      Header Data Length
2       1      Operating System Structure ID
2       ?      Owner
4       ?      Creation Date and Time
1       ?      Max Rights
4.3.2 Novell File Attributes Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       3      Subtag = Novell
2       ?      Header Data Length
2       2      Operating System Structure ID
2       ?      Owner
2       ?      Creation Date
2       ?      Access Date
4       ?      Update Date and Time
4       ?      Archive Date and Time
1       ?      Attribute
1       ?      Extended Attribute
4.3.3 Novell Trustee Information Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       3      Subtag = Novell
2       ?      Header Data Length
2       3      Operating System Structure ID
?       ?      Trustee List
                   [4] Trustee ID
                   [1] Trustee Rights
                   [1] reserved
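The repeating six-byte trustee entries can be read with a short loop, sketched below. The function name is hypothetical. The byte order of the Trustee ID is not given in the text (operating system specific structures may use any ordering), so it is exposed as a parameter with an assumed default, and the entry count is inferred from the Header Data Length on the assumption that the length includes the two-byte Structure ID.

```python
import struct

def parse_trustee_header(buf: bytes, byteorder: str = "little") -> list:
    """Return [(trustee_id, rights), ...] from a Novell Trustee Information Header."""
    tag, subtag, data_len, structure_id = struct.unpack_from("<BBHH", buf, 0)
    if (tag, subtag, structure_id) != (8, 3, 3):
        raise ValueError("not a Novell Trustee Information Header")
    # ASSUMPTION: data_len counts the Structure ID plus all trustee entries.
    body = buf[6 : 4 + data_len]
    if len(body) % 6:
        raise ValueError("trustee list is not a whole number of 6-byte entries")
    # Each entry: [4] Trustee ID, [1] Trustee Rights, [1] reserved (ignored).
    return [
        (int.from_bytes(body[i : i + 4], byteorder), body[i + 4])
        for i in range(0, len(body), 6)
    ]
```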
4.3.4 Novell Msdos Name Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       3      Subtag = Novell
2       ?      Header Data Length
2       4      Operating System Structure ID
?       ?      Msdos Name
4.3.5 Novell AFP Information Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       3      Subtag = Novell
2       ?      Header Data Length
2       5      Operating System Structure ID
2       ?      Owner
2       ?      Creation Date
2       ?      Access Date
4       ?      Update Date and Time
4       ?      Archive Date and Time
1       ?      Attribute
1       ?      Extended Attribute
2       ?      Access Privileges
4.4 Unix Specific Attributes Headers
4.5 Macintosh Specific Attributes Headers
4.5.1 Macintosh Finder Information Header
Length  Value  Description
1       8      Tag = Operating System Attributes
1       5      Subtag = Macintosh
2       ?      Header Data Length
2       1      Operating System Structure ID
16      ?      Finder Information
16      ?      Extended Finder Information
4       ?      Creation Date
4       ?      Owner
4       ?      Group
4       ?      Access
5.0 File Set Compression and Encryption
The above-described tape formatting technique is
readily implemented in a back-up/restore application
program, which can be similar to known back-up/restore
application programs, for example. Generally, the operating
system information required to generate the headers in
accordance with the disclosed tape formatting technique
would be based on information in the application program
as well as information obtained by the application program
from the host operating system and the user, for example,
pursuant to known techniques, such as operating system
calls and reading parameter values in predetermined memory
locations. Further, the application program would include
appropriate operating system calls for file and directory
management, for example, to open or close a file, read a
file, change the current directory, create a file, and
make a directory. Also, the application program would
include or utilize appropriate buffer management to
compensate for overhead processing time as well as differ-
ences in access, read, and write times for different
storage devices. In terms of user interface, the applica-
tion program could include options for back-up of a file
set that includes an entire logical or physical disk, or
selected files and subdirectories. For restore, similar
capabilities would be provided for restoring an entire
file set or selected files and subdirectories of a file
set.
Illustrative Example of Back-Up and Restore
Set forth in FIGS. 9-11 are flow diagrams for
illustrative implementations of a back-up process and an
associated restore process in accordance with the dis-
closed tape formatting technique. For ease of explana-
tion, these illustrative implementations are directed to
the back-up of an entire disk and the restoration of an
entire file set, and persons skilled in the pertinent art
will appreciate that such processes are readily modified
to provide for selective back-up and restoration.
Also for ease of explanation, the following does not
include procedures for writing one or more Tape Header
Files and one or more Volume Table Files as appropriate
for the particular tape device, which is readily achieved,
for example pursuant to known techniques.
Referring particularly to FIG. 9, set forth therein
is the illustrative example of a back-up process. At 211
a File Set Signature header is generated and copied to a
tape buffer which buffers the data that is to be written
to tape. At 213 a File Set Information header is gener-
ated and copied to the tape buffer. Next, a File Data
Area Start marker is generated and copied to the tape
buffer at 215. At 217 a Directory Table Area Start marker
is generated and copied to the directory table buffer, in
accordance with the directory table generation specified
by the disclosed tape formatting technique. At 219 the
root directory is selected as the current directory for
processing.
At 221 a search is made for the first entry in the
current directory. At 223 a determination is made as to
whether an entry was found in the current directory. If
yes, a determination is made at 225 as to whether the
entry found is a subdirectory. If the determination at
225 is no, which indicates that the entry is a file, a
call to the subroutine DIR TABLE (discussed further herein
relative to FIG. 10) is made. Upon return from the
subroutine, a File Information header is generated and
copied to the directory table buffer at 229 and to the
tape buffer at 231. At 233 Operating System Attributes
headers, if any, are generated and copied to the tape
buffer. At 235 Data Fork headers are generated as re-
quired and copied to the tape buffer. Any Resource Fork
headers as required are generated and copied to the tape
buffer at 237. It should be appreciated that generating
the File Information header and any Operating System
Attributes headers involves obtaining the attributes of
the file identified by the entry, and generating Data Fork
headers and Resource Fork headers involves reading the
file identified by the entry.
At 239 a search is made for the next entry in the
current directory, and processing transfers to 223,
described above.
Referring again to the determination at 225, if the
determination is yes, the entry found is a subdirectory, a
call to the subroutine DIR TABLE is made at 241. Upon
return, a Subdirectory Information header is generated and
copied to the directory table buffer at 243 and to the
tape buffer at 245. At 247 Operating System Attributes
headers, if any, are generated and copied into the tape
buffer; and at 249 the current directory for processing is
changed to the subdirectory that was found at 225.
Process flow then continues with 239, described above.
Referring again to the determination at 223, if the
determination is no, there are no further entries in the
current directory, a call to the subroutine DIR TABLE is
made at 251. Upon return, a Subdirectory End marker is
generated and copied to the directory table buffer at 253
and to the tape buffer at 255. A determination is then
made at 257 as to whether the current directory is the
root directory. If no, the current directory is changed
to the parent directory, and processing continues at 239,
described above.
If the determination at 257 is yes, the current
directory is the root directory, the directory table
buffer is copied to the tape buffer at 261. A Tape File
End marker is generated and copied to the tape buffer at
263, and null markers are generated and copied to the tape
buffer to pad the tape buffer as required at 265. Back-up
of the specified File Set is then complete.
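The ordering of markers and headers produced by the FIG. 9 flow can be sketched with symbolic records in place of encoded bytes. This is a simplified illustration rather than the disclosed implementation: the function name and record names are hypothetical, the disk is modelled as nested dictionaries, recursion replaces the change-directory loop of the flow diagram, and Operating System Attributes headers, Resource Fork headers, the DIR TABLE check, and null-marker padding are omitted.

```python
def back_up(root: dict) -> list:
    """Return the sequence of records written to tape for an in-memory tree.

    ASSUMPTION: a dict value is a subdirectory, a bytes value is a file's data fork.
    """
    tape = ["FileSetSignature", "FileSetInformation", "FileDataAreaStart"]  # 211-215
    dir_table = ["DirectoryTableAreaStart"]                                 # 217

    def walk(directory: dict) -> None:
        for name, entry in directory.items():          # 221 / 239: next entry
            if isinstance(entry, dict):                # 225: subdirectory
                rec = ("SubdirectoryInformation", name)
                dir_table.append(rec)                  # 243
                tape.append(rec)                       # 245
                walk(entry)                            # 249: descend
            else:                                      # file
                rec = ("FileInformation", name)
                dir_table.append(rec)                  # 229
                tape.append(rec)                       # 231
                tape.append(("DataFork", entry))       # 235
        dir_table.append("SubdirectoryEnd")            # 253
        tape.append("SubdirectoryEnd")                 # 255

    walk(root)                                         # 219: start at the root
    tape.extend(dir_table)                             # 261
    tape.append("TapeFileEnd")                         # 263
    return tape
```

Note that the root directory also receives a Subdirectory End marker, since the marker is written before the root test at 257, and that the directory table mirrors the information headers and end markers but carries no fork data.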
Referring now to FIG. 10, the subroutine DIR TABLE
begins with a determination at 267 as to whether the
directory table buffer is full. If no, a return is made
from the subroutine. If yes, the directory table buffer
is full, at 269 the contents of the directory table buffer
are copied to the tape buffer and the directory table
buffer is cleared.
At 271 a File Data Area Start marker is generated
and copied to the tape buffer, and at 273 a Directory
Table Area Start Marker is generated and copied to the
directory table buffer. A return is then made from the
subroutine.
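The DIR TABLE subroutine amounts to a flush-when-full check, which might be sketched as follows. The function name is hypothetical, the buffers are modelled as Python lists of symbolic records, and the capacity is counted in records; a real implementation would presumably measure the buffer in bytes.

```python
def dir_table_check(dir_table: list, tape: list, capacity: int) -> None:
    """Flush the directory table buffer to the tape buffer when it is full."""
    if len(dir_table) < capacity:                # 267: not full, simply return
        return
    tape.extend(dir_table)                       # 269: copy table to tape buffer
    dir_table.clear()                            #      and clear the table
    tape.append("FileDataAreaStart")             # 271: resume the file data area
    dir_table.append("DirectoryTableAreaStart")  # 273: begin a fresh table
```

The effect is that a file set with a large directory tree is written as alternating file data areas and directory table areas rather than one of each.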
Referring now to FIG. 11, set forth therein is the
illustrative example of a restore process. At 311 the
root directory is selected as the current directory for
processing. At 313 the File Set Signature header is
processed, and at 315 the File Set Information header is
processed.
The particular processing of the Signature header
would include a determination as to whether it had a valid
signature. The File Set Information header is processed,
for example, to determine its presence since the informa-
tion therein would have been previously processed relative
to the Volume Table Tape File, for example in accordance
with known techniques.
At 317 the next marker or header on the tape is
read. A determination is then made at 319 as to whether a
Data Area Start marker was read. If no, a determination
is made at 321 as to whether a Tape File End marker was
read. If yes, indicating the end of a File Set tape file,
processing stops. If the determination at 321 is no,
processing transfers to 317, described above.
Referring again to the determination at 319, if yes,
a Data Area Start marker was read, the next marker or
header is read at 323. A determination is then made at
325 as to whether a Directory Table Start marker was read.
If yes, processing continues at 317, described above. If
the determination at 325 is no, a determination is made at
327 as to whether a Tape File End marker was read. If
yes, processing stops.
If the determination at 327 is no, a determination
is made at 329 as to whether a Subdirectory End marker was
read. If yes, a determination is made at 331 as to
whether the current directory is the root directory. If
yes, control transfers to 323, described above. If the
determination at 331 is no, the current directory is not
the root directory, the current directory is changed to the
parent directory, and processing continues at 323.
Referring again to 329, if the determination is no,
a Subdirectory End Marker was not read, a determination is
made at 335 as to whether a File Information header was
obtained. If yes, the File Information header is pro-
cessed at 337. Next, any Operating System attributes
headers are processed at 339. Data Fork headers are
processed at 341, and Resource Fork headers are processed
at 343. Processing then continues at 323, described
above.
Processing of the File Information header and any
Operating System Attributes headers generally involves
opening the file specified by the File Information header
and appropriately changing the attributes of the file
pursuant to the File Information header and any Operating
System Attributes headers, or creating a file as defined
by the File Information header and any Operating System
Attributes headers. Processing of the Data Fork headers
and any Resource Fork headers involves copying the data in
such headers to the file.
If the determination at 335 is no, a File Informa-
tion header was not read, a determination is made at 345
as to whether a Subdirectory Information header was read.
If no, processing continues at 323, described above. If
yes, a Subdirectory Information header was obtained, the
Subdirectory Information header is processed at 347. Any
operating system attributes headers are processed at 349,
and at 351 the current directory for processing is changed
to the subdirectory identified by the subdirectory infor-
mation header just read. Processing then continues at 323.
Processing of the Subdirectory Information header
generally involves creating the specified subdirectory if
it does not already exist, and assigning or changing the
attributes of the subdirectory in accordance with the
Subdirectory Information header and any Operating System
Attributes headers. Also, the current directory would be
changed to the subdirectory defined by the Subdirectory
Information header being processed.
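The restore flow of FIG. 11 can likewise be sketched as a loop over symbolic records. The function name and record names are hypothetical, the restored disk is modelled as nested dictionaries, a stack stands in for the current directory, and the processing of the File Set Signature, File Set Information, Operating System Attributes, and Resource Fork headers is left out.

```python
def restore(tape: list) -> dict:
    """Rebuild an in-memory tree from a sequence of symbolic tape records."""
    root = {}
    stack = [root]            # 311: stack[-1] is the current directory
    in_data_area = False
    current_file = None
    for rec in tape:                                   # 317 / 323: read next record
        if rec == "TapeFileEnd":                       # 321 / 327: stop
            break
        if not in_data_area:                           # 319: wait for a data area
            in_data_area = rec == "FileDataAreaStart"
            continue
        if rec == "DirectoryTableAreaStart":           # 325: skip the table
            in_data_area = False
        elif rec == "SubdirectoryEnd":                 # 329
            if len(stack) > 1:                         # 331: not at the root
                stack.pop()                            #      change to the parent
        elif isinstance(rec, tuple):
            kind, value = rec
            if kind == "FileInformation":              # 335-337: create the file
                current_file = value
                stack[-1][value] = b""
            elif kind == "DataFork":                   # 341: copy data to the file
                stack[-1][current_file] += value
            elif kind == "SubdirectoryInformation":    # 345-351: create and enter
                stack.append(stack[-1].setdefault(value, {}))
    return root
```

Records in a directory table area are skipped entirely during a full restore, because every file and subdirectory they describe has already been encountered in the preceding file data area.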
The foregoing has been a disclosure of a tape
formatting technique that advantageously produces data
tapes which can be transported to computers having differ-
ent operating systems, efficiently utilizes storage,
provides for efficient identification and access of stored
files and subdirectories, and provides for partial data
recovery in the event of catastrophic data corruption.
Although the foregoing has been a description and
illustration of specific embodiments of the invention,
various modifications and changes thereto can be made by
persons skilled in the art without departing from the
scope and spirit of the invention as defined by the
following claims.