Note: Descriptions are shown in the official language in which they were submitted.
CA 02350721 2003-06-26
METHOD AND APPARATUS FOR CHUNK BASED TRANSACTION LOGGING WITH
ASYNCHRONOUS INPUT/OUTPUT FOR A DATABASE MANAGEMENT SYSTEM
Field of the Invention
This invention relates to the field of transaction logging in the storage of data in database
management systems (DBMS) such as relational database management systems.
Background of the Invention
For reliable operation of a database management system, integrity and consistency are needed for
the storage and updating of data for the database management system. To facilitate this
requirement, one of the key features of a mature database management system is transaction
logging. For each change to the database content, there are one or more log records that describe
this change. Each log record is assigned a specific sequence number and log records are
eventually transferred to permanent storage in the same order as this sequence number. These log
records must eventually be placed on permanent storage in order to ensure consistency and to
provide a history of changes to the database. For relational database management systems
(RDBMS) such as DB2, transaction logging is an integral process involved in making changes to
the data contents of the database tables stored by the RDBMS. These changes include inserting,
deleting or updating the database contents. For mature database management systems, transaction
log records are first logged by database connection agents in a transaction log buffer in memory
before being transferred to permanent storage such as a disk. The database manager generally has
a dedicated thread of execution called the logger to manage the transferring of data from the log
buffer to permanent storage. The purpose of the transaction log buffer is to reduce the cost of
small changes to the database contents by making the cost of logging of these small changes
relatively inexpensive. Placing transaction log records into a log buffer is relatively simple when
a single database application is involved since there is no contention for the use of the transaction
log buffer by other applications. The speed at which transaction logging takes place is dependent
on the amount of data that the single application provides for transaction logging and the
frequency with which the application makes changes to the database contents. Transaction logging
for a single database application can be handled in a simple serial fashion and there is no contention
CA9-2001-0030 l
on the transaction log buffer. However, the situation becomes more complex for a database
management system which can serve multiple database connections simultaneously. An efficient
transaction logging mechanism is required that can handle the logging of data by multiple
database applications to multiple database tables on permanent storage. With multiple database
applications, contention for the transaction log buffer can inhibit transaction logging performance
due to the serialized access to the transaction log buffer. For workloads which involve a high
frequency of transactions that make changes to the database contents, transaction logging can
become a performance bottleneck.
In prior art, changes made to the database by multiple database applications making use of the
RDBMS involved granting exclusive access to the log buffer to one database connection agent (a
database connection agent in this context is a task or process that resides in the RDBMS engine
and performs various operations on behalf of a database application). The access to the log buffer
is sequential, requiring other database connection agents to wait until the current database
application's data had been copied into the log buffer. This serialized access can have a large
negative impact on transaction logging throughput since only one database connection agent can
copy transaction log records at any given time.
The following generally summarizes the steps taken in prior art when a transaction log record
was copied into the transaction log buffer:
1. Obtain a latch (a latch in this context is simply a high speed mutually exclusive
locking mechanism) which protects the transaction log buffer from having more than one
database connection agent copying a transaction log record into it. Once the transaction log
buffer is latched by one database connection agent, other database connection agents must wait
until the latch is freed and then obtained by the waiting database connection agent.
2. Copy the transaction log record into the transaction log buffer starting at the
current log buffer offset.
3. Update the transaction log buffer offset, which marks the location that the next
transaction log record should be copied to.
4. Release the latch which protects the transaction log buffer so that other
database connection agents can attempt to obtain the latch.
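By way of illustration only, the four prior art steps might be sketched as follows. This is a minimal sketch under stated assumptions, not code from any actual RDBMS: the class name `SerializedLogBuffer`, the method `append` and the use of a Python `threading.Lock` to stand in for the latch are all hypothetical.

```python
import threading


class SerializedLogBuffer:
    """Prior-art style log buffer: the latch is held during the whole copy."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.offset = 0                    # current log buffer offset
        self.latch = threading.Lock()      # stands in for the log buffer latch

    def append(self, record: bytes) -> int:
        """Copy one log record into the buffer and return where it starts."""
        with self.latch:                   # step 1: obtain the latch
            start = self.offset
            end = start + len(record)
            if end > len(self.data):
                raise BufferError("log buffer full")
            self.data[start:end] = record  # step 2: copy at the current offset
            self.offset = end              # step 3: update the offset
        # step 4: the latch is released when the `with` block exits
        return start
```

Because the copy in step 2 happens while the latch is held, every other agent calling `append` waits for the full duration of that copy, however large the record is.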
One problem in the prior art is that there is too much serialization in accessing the log buffer. It
appears that there is no recognition in the prior art that the latch is only needed to determine the
sequential order of the log records and the location in the log buffer for loading log record data.
We have found that it is not necessary (at least for the purpose of determining the sequential
order of log records) to hold the latch while copying log record data into the log buffer. If
database connection agents are allowed to copy their log record data after releasing the latch (so
multiple database connection agents can be copying concurrently), a problem is not knowing
when database connection agents will have finished copying data into the log buffer.
We have found that the logger, which is used to transfer data records from the transaction log
buffer to main storage, needs to know when it can start transferring data from the log buffer to
permanent storage. One aspect of this invention addresses this.
The prior art also had to contend with two competing goals when trying to improve transaction
logging performance: one being application response time and the second being overall
transaction logging throughput. This issue is related to the transferring of data from the log buffer
to permanent storage. The performance of a storage device such as a disk usually varies with the
amount of data for each transfer, and generally increases with the transfer size up to a certain
level, depending on system characteristics, after which the performance gain becomes smaller for
increasing transfer size. A storage device has a transfer size (or range of transfer sizes) at which
data transferring becomes efficient and levels off with larger transfer sizes. However, during the
operation of the logger, there may be different amounts of data available in the log buffer
ready to be transferred to permanent storage. If the logger is implemented with logic to initiate
the data transfer for all data available in the log buffer, the size of the transfer may be very large,
resulting in poor response time for some of the applications making use of the RDBMS. It is also
possible that the amount of data to be transferred may be too small for good efficiency, which
can result in poor system throughput. Taken to the extreme, if the logger writes each transaction
record to storage one at a time, then each change to the database would have to endure the cost of
writing a log record to permanent storage. While writing a specific log record to storage
immediately benefits the individual database connection agent which made the request, the
overall throughput is hindered, as will be appreciated by those skilled in the art. On the other
hand, the overall throughput can benefit from waiting for a number of transaction records to be
copied into the log buffer and then transferring them all to permanent storage at the same time.
The longer the transaction logging task waits, the worse the potential response time to an
individual request and the shorter the transaction logging task waits, the worse the overall
throughput. One aspect of this invention addresses this issue by providing a preselected data
transfer amount for use by the logger in transferring data from the log buffer to permanent
storage.
Another potential problem with prior art is that the transaction logging task (performed by the
logger) was responsible for both writing transaction records to permanent storage and performing
other work. The other work may involve responding to various requests (for example, reading
some of the transaction records that have been previously written to the storage) from potentially
thousands of database connection agents. While the transaction logging task is handling the
transfer of transaction records to permanent storage, it is not able to perform other work and
while the transaction logging task is performing other work, it is not able to transfer transaction
records to permanent storage. The invention can increase the time available for the logging task
to perform these other duties. The performance of these other duties is not the subject matter of
this invention and will not be discussed further.
Summary of the Invention
The invention herein seeks to overcome the disadvantages of the prior art by providing an
improved method and apparatus for the logging of transaction data from multiple database
applications to be transferred to permanent storage for a database management system.
As may be apparent herein there are a number of aspects to the invention herein; for instance:
a method is provided for storing transaction data for a plurality of database applications in a
multiple access database management system having permanent storage and a transaction log
buffer to store data for the database applications before transferring the data to the permanent
storage, wherein database connection agents associated with the applications are used to store
data in the transaction log buffer, comprising:
granting exclusive write access reservations in the transaction log buffer to a plurality of
the database connection agents; and,
allowing database connection agents to write transaction data records to previously
granted write access reservations while granting exclusive access reservations to other database
connection agents.
From another perspective the method includes:
granting an exclusive write access reservation in the transaction log buffer to a database
connection agent;
granting an exclusive write access reservation in the transaction log buffer to another
database connection agent; while allowing the previous database connection agent to write a data
record to its respective write access reservation in the transaction log buffer.
Also, write access reservations may be granted sequentially in the transaction log buffer to
database connection agents; while allowing database connection agents to write data records to
previously granted write access reservations in the transaction log buffer.
The connection agents preferably write data records independently of and concurrently with other
database connection agents to previously granted write access reservations in the transaction log
buffer.
The transaction log buffer may be divided into a multiplicity of contiguous logical sections of a
predetermined size for the purpose of sequentially granting contiguous write access reservations
in the logical sections of the transaction log buffer to database connection agents; while allowing
other database connection agents to write data records concurrently to previously granted write
access reservations in the transaction log buffer.
When the reservations fill a logical section, the process may beneficially sequentially grant
contiguous write access reservations in a next logical section of the transaction log buffer.
The method may include determining if all reservations in a logical section have been filled with
data records, copying transaction data records from a data filled logical section to the permanent
storage; and, subsequently making the data filled logical section available for data storage.
From another aspect the method provides for storing transaction data for a multiplicity of
database applications in a multiple access database management system having permanent
storage, and a transaction log buffer divided into a multiplicity of logical sections of
predetermined equal size, to temporarily store data for the database applications before
transferring the data to the permanent storage, wherein database connection agents associated
with the database applications are used to store data in the transaction log buffer, and a logger is
used to copy data from the transaction log buffer to the permanent storage, comprising:
allowing database connection agents to write data records asynchronously to any
previously granted write access reservations in the transaction log buffer;
sequentially granting contiguous write access reservations in a logical section of the
transaction log buffer to database connection agents until the reservations fill the logical section;
and in turn sequentially granting contiguous write access reservations in a next logical section
of the transaction log buffer;
counting the number of reservations made in a logical section and counting the number of
reservations in the logical section that have been filled with data records;
comparing the number of reservations made in a logical section with the number of
reservations filled with data in the logical section;
when the number of reservations made and the number of reservations filled are equal,
marking the logical section as a filled logical section;
allowing data records from the filled logical section to be copied to permanent storage;
copying transaction data records from the data filled logical section to the permanent storage;
determining when all data records from the filled logical section have been copied to
permanent storage;
making the data filled logical section available for data storage when all data records
from the data filled logical section have been copied to permanent storage.
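By way of illustration only, the counting and comparing steps of this aspect might be sketched as follows. This is a single-threaded sketch under stated assumptions: the class `LogicalSection` and its method names are hypothetical, the `storage` list is a placeholder for permanent storage, and treating a section as filled only once its reservations also cover the whole section is an interpretation of the text above rather than something it states in these words.

```python
class LogicalSection:
    """One logical section of the log buffer with made/filled counts."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.reserved_bytes = 0   # how much of the section has been reserved
        self.made = 0             # reservations made in this section
        self.filled = 0           # reservations filled with data records
        self.available = True     # section may accept new reservations

    def reserve(self, nbytes: int) -> None:
        if not self.available or self.reserved_bytes + nbytes > self.size:
            raise ValueError("section cannot hold this reservation")
        self.reserved_bytes += nbytes
        self.made += 1

    def record_written(self) -> None:
        self.filled += 1

    def is_filled(self) -> bool:
        # Assumption: "filled" also requires reservations to cover the section.
        return self.reserved_bytes == self.size and self.made == self.filled

    def copy_to_storage(self, storage: list) -> bool:
        """Copy the section out if it is filled, then make it available again."""
        if not self.is_filled():
            return False
        self.available = False            # being copied, not reusable yet
        storage.append(self.size)         # placeholder for the real transfer
        self.reserved_bytes = self.made = self.filled = 0
        self.available = True             # reusable once the copy completes
        return True
```

A section with one outstanding reservation reports `is_filled()` as false, so `copy_to_storage` leaves it alone until the last record has been written.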
Advantageously, in the method the logger may be used to copy data records from the data filled
logical section to the permanent storage; and, the data filled logical section will be marked
available for data storage after the logger has completed copying data records from the data filled
logical section.
From another aspect the method of storing transaction data for a multiplicity of database
applications in a multiple access database management system having permanent storage and a
transaction log buffer, having a buffer latch to ensure exclusive access for establishing
reservations, the transaction log buffer being adapted to store data for the database applications
before transferring the data to the permanent storage, wherein database connection agents
associated with the database applications are used to store data in the transaction log buffer,
includes:
granting control of the buffer latch at a first position in the transaction log buffer to a first
database connection agent requesting write access to write a predetermined transaction data
record to the transaction buffer;
determining the size of the data record;
updating the position of the log buffer offset while holding the buffer latch to a second
position to define a first data reservation in the transaction log buffer between the first and
second positions corresponding to the size of the data record;
granting exclusive write access to the first data reservation to the first database
connection agent;
releasing the control of the latch from the first database connection agent;
granting control of the buffer latch at the second position in the transaction log buffer to a
second database connection agent;
allowing the first database connection agent to copy the predetermined transaction data
record into the first data reservation.
More specifically the method may include granting contiguous write access reservations in the
transaction log buffer to database connection agents; while allowing database connection agents
to write data records asynchronously to previously granted write access reservations in the
transaction log buffer by:
a) granting control of the buffer latch at an initial position in the transaction log buffer to
a database connection agent requesting write access to write a predetermined transaction data
record to the transaction buffer to define a beginning position of a data reservation for writing;
b) determining the size of the data record of the database connection agent;
c) updating the position of the log buffer offset while holding the latch to an end position
for writing to define i) a data reservation for the database applications in the transaction log
buffer between the beginning and end positions corresponding to the size of the data record, and
ii) an initial position for writing a next data reservation;
d) granting exclusive write access to the data reservation to the database connection
agent;
releasing the control of the latch from the database connection agent;
e) granting control of the buffer latch to another database connection agent at the initial
position for writing a next data reservation in the transaction log buffer;
f) repeating steps a to e, and
allowing any database connection agents to copy predetermined transaction data records
into their respective data reservations even while the buffer latch is under control of
any other database connection agent.
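By way of illustration only, steps a) to f) might be sketched as follows, with the latch held only while a reservation is defined and the copy performed afterwards. This is a sketch under stated assumptions: `ReservingLogBuffer`, `reserve`, `write` and `agent` are hypothetical names, a `threading.Lock` stands in for the buffer latch, and the unlatched copy relies on each agent writing only to its own disjoint byte range of a CPython `bytearray`.

```python
import threading


class ReservingLogBuffer:
    """Log buffer where the latch covers only the reservation step."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.offset = 0                   # log buffer offset
        self.latch = threading.Lock()     # stands in for the buffer latch

    def reserve(self, nbytes):
        """Steps a) to d): claim [start, end) while holding the latch."""
        with self.latch:
            start = self.offset           # beginning position
            end = start + nbytes          # end position from the record size
            if end > len(self.data):
                raise BufferError("log buffer full")
            self.offset = end             # initial position of the next reservation
        return start, end                 # the latch is already released here

    def write(self, reservation, record):
        """Copy the record into its own reservation without the latch."""
        start, end = reservation
        if len(record) != end - start:
            raise ValueError("record does not match its reservation")
        self.data[start:end] = record


def agent(buf, record):
    """One database connection agent logging one record."""
    buf.write(buf.reserve(len(record)), record)
```

Running several `agent` calls on separate threads leaves each record intact in its own slot, in whatever order the latch happened to be granted.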
Another aspect of the invention provides a data processing system for storing transaction data for a
plurality of database applications in a multiple access database management system having
permanent storage and a transaction log buffer to store data for the database applications before
transferring the data to the permanent storage, wherein database connection agents associated
with the applications are used to store data in the transaction log buffer, comprising:
means for granting exclusive write access reservations in the transaction log buffer to a
plurality of the database connection agents; and,
means for allowing database connection agents to write transaction data records to
previously granted write access reservations while granting exclusive access reservations to other
database connection agents.
In addition the invention advantageously provides software that may be incorporated in an article
including:
a computer-readable signal bearing medium wherein said medium is a recordable data
storage medium or a modulated carrier signal;
means in the medium for storing transaction data for a plurality of database applications
in a multiple access database management system having permanent storage and a transaction log
buffer to store data for said database applications before transferring said data to said permanent
storage, wherein database connection agents associated with said applications are used to store
data in said transaction log buffer, comprising:
means for granting exclusive write access reservations in said transaction log buffer to a
plurality of said database connection agents; and,
means for allowing database connection agents to write transaction data records to
previously granted write access reservations while granting exclusive access reservations to other
database connection agents.
Another advantageous aspect of the invention provides a computer program product adaptable
for embodiment on a computer readable signal-bearing medium comprising computer program
code means adapted to perform the steps of the method of the invention when said program is
run on a computer.
Glossary
A number of terms are used in this application and are discussed herein:
Database connection agent A database connection agent is a thread or process that performs a
task on behalf of a database application; it may run within the DBMS engine and may report
when the task has been accomplished.
Log buffer (transaction log buffer) As contemplated herein a database management system has
a transaction log buffer that is shared among various database connection agents of the database
management system in order to store data in permanent storage of the database management
system.
Log record (transaction log record) A log record is a collection of data that represents one or
more changes to the data contents of a database management system.
Logger (transaction logging process or task) This is a special process or task in the DBMS
which controls the copying of data records from the transaction log buffer to permanent storage,
which is usually implemented as disk storage.
Log buffer offset This term marks the location in the log buffer at which a reservation will
be started in preparation for the loading of a log record by a database connection agent. The value
of the offset is relative to the beginning of the log buffer.
Latch A latch in the context of this invention may be implemented as a high speed
mutually exclusive locking mechanism that can be used to serialize access to a resource in shared
memory or storage. The transaction log buffer has a latch, for which a database connection
agent acquires control to prevent another database connection agent from obtaining write access
to the buffer. If one database connection agent has control of the latch of the transaction log,
another database connection agent would have to wait until the first database connection agent
releases the latch before it could gain write access to the transaction log. The latch serializes access
to the transaction log buffer. If one database connection agent acquires the latch, other database
connection agents have to wait until the latch is released.
Chunk For the purposes of this invention the transaction log buffer may be divided into
logical divisions which we will call 'chunks'. In a preferred embodiment of the invention herein
a counter is provided for each chunk that is used to track when the chunk has been filled.
Current Chunk For the purposes of this invention, the current chunk is the chunk which
contains the log buffer offset.
Chunk Counter There is one chunk counter for each chunk in the log buffer. This is used
to keep track of outstanding copies that need to be loaded into the log buffer.
Reservation A reservation is a portion of the log buffer which has been reserved for the
exclusive use by one database connection agent for the purpose of loading data.
Under a preferred embodiment of the present invention, the log buffer of a database management
system is divided into logical divisions called chunks. The size of a chunk is chosen to be large
enough that writing its contents to permanent storage is reasonably efficient and small enough
that if the logger needs to wait until a chunk is full before writing its contents to permanent
storage, it won't seriously impact database application response time.
In another aspect of this invention, a database connection agent working on behalf of a database
application no longer copies transaction log records into the log buffer while holding the log
buffer latch. Instead, the latch need only be held long enough to determine the location in the log
buffer to which a log record is to be copied, and to update the log buffer offset to represent the
location in the log buffer for the start of the next log record. To indicate that the log record data
has not yet been copied into the log buffer, the chunk counter is also incremented. In one
embodiment of the present invention the incrementing of the chunk counter is done while
holding the latch. If the log record spans multiple chunks, the counter for each of the affected
chunks is incremented. The log buffer latch is then released in one embodiment of the present
invention. After releasing the latch, the database connection agent then copies its transaction log
record data to the buffer location (reservation) that was determined when the latch was held. The
chunk counter is decremented after the data is copied into the log buffer. Again, if the log record
spans multiple chunks, the counter for each of the pertinent chunks is decremented. The log
buffer latch does not need to be held for the decrementing of the chunk counter.
Under the preferred embodiment of the invention, the logger is still responsible for writing the
contents of the log buffer to permanent storage. The chunk counter allows the logger to
determine that a chunk (i.e. a section of the log buffer) is filled with log record data. If the chunk
counter is greater than zero, there are still outstanding log records to be copied into the chunk. If
the chunk counter is zero, there are no outstanding log records to be copied to the chunk. Once
the chunk has been filled, the logger is then free to write the contents of the chunk
asynchronously to permanent storage. If multiple chunks are filled, the logger will issue
asynchronous write requests (the requests are preferably issued sequentially, but the actual
writing of the data from the log buffer to permanent storage will occur asynchronously) for each
of the chunks. The chunk size should preferably be chosen for good balance for both efficient
transference of data to storage (i.e. system throughput) and application response time. Writing
chunks asynchronously gives the logger an advantage since it can perform other work while the
log records are being transferred to permanent storage.
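By way of illustration only, the logger's handling of filled chunks might be sketched as follows. This is a sketch under stated assumptions: `flush_filled_chunks` and `write_chunk` are hypothetical names, a `bytearray` called `disk` stands in for permanent storage, a thread pool stands in for the platform's asynchronous input/output facility, and a chunk is treated as filled when the log buffer offset has moved past it and its counter is zero.

```python
from concurrent.futures import ThreadPoolExecutor


def write_chunk(disk: bytearray, position: int, payload: bytes) -> int:
    """Stand-in for one write request to permanent storage."""
    disk[position:position + len(payload)] = payload
    return len(payload)


def flush_filled_chunks(data, chunk_size, counters, offset, next_chunk, disk, pool):
    """Issue one asynchronous write per consecutive filled chunk.

    Requests are submitted in chunk order; the writes themselves complete
    whenever the pool gets to them. Returns the index of the first chunk
    not yet submitted, together with the pending futures.
    """
    futures = []
    while next_chunk < len(counters):
        lo, hi = next_chunk * chunk_size, (next_chunk + 1) * chunk_size
        fully_reserved = hi <= offset           # assumption, see lead-in
        if not fully_reserved or counters[next_chunk] != 0:
            break                               # outstanding copies remain
        futures.append(pool.submit(write_chunk, disk, lo, bytes(data[lo:hi])))
        next_chunk += 1
    return next_chunk, futures
```

The caller can go on to other work after `flush_filled_chunks` returns and collect the futures later, which is the advantage the paragraph above attributes to asynchronous writing.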
Under an aspect of the invention, the latch is required only for each database connection agent to
make its reservation in the log buffer. The serialization of the latch results in the sequential order
of these reservations in the log buffer, which can be used to sequentially order the log records to
write them to the permanent storage. The writing of the log data to its reservation by one
database connection agent is done independently of any other database connection agent making
a reservation. Multiple database connection agents independently write log data to their
respective reservations. The tracking of the completion of the transference of data into the log
buffer is performed by chunk counters.
Brief Description of the Drawings
The present invention may be best understood from the following detailed description taken in
conjunction with the accompanying drawings of which:
Fig. 1 depicts a diagram of the transaction data storage system of the invention depicting
transaction log buffer, main storage and processes used to load data into the main storage using
the transaction log buffer;
Fig. 2 depicts a flow chart of a logging task for storing transaction data in a log buffer;
and,
Fig. 3 depicts a flow chart of a logger's tasks for transferring transaction data from the log
buffer to permanent storage.
Preferred Embodiment of the Invention
In multiple access database management systems shared storage is used to store data from many
database applications and accordingly a number of applications frequently have data records that
need to be stored in the database of the database management system in order to keep the
database current. As indicated previously under prior art transaction storage systems the serial
nature of the reservation and storage systems presents a bottleneck to data storage and
processing.
Referring to Fig. 1 which shows the basic structure of the transaction data storage system of the
invention it may be seen that a transaction log buffer 1 is used to temporarily store transaction
records 4 of database connection agents 3 intended for storage in permanent storage, in this case
disk storage 2.
A number of database connection agents 3 are provided to attend to the storage of transaction
data records 4 of database applications using the database management system. The database
connection agents are assigned by the database management system to each database application
using the database management system. In the preferred embodiment of this invention in a
database management system contemplated herein the database connection agents are operable
by the database management system to operate asynchronously of each other so that transaction
data of each database application can theoretically be loaded into the log buffer without being
constrained to wait for the loading of transaction data of another database application.
As the transaction log buffer 1 is structured as memory shared among a number of database
connection agents it has a latch 6 which is used to serialize access to it to prevent the data of one
database connection agent from being overwritten by the data of another database connection
agent. The database management system of the invention assigns exclusive control of the latch 6
to a database connection agent 3 requesting the storage of a data record 4 in order to prevent
another database connection agent 3 from obtaining write access to the log buffer 1. When one
database connection agent 3 has control of the latch 6 of the transaction log buffer 1, another
database connection agent 3 must wait until the first database connection agent 3 releases the
latch 6 before it can gain write access to the transaction log buffer. As may be appreciated from
this discussion the latch 6 serializes access to the transaction log buffer 1 for the purpose of
allowing a database connection agent to make a reservation for the storage of data. Unlike prior
art storage systems the latch is not used to serialize data loading into the buffer. Loading of the
log buffer is allowed asynchronously once a reservation is made, as discussed below.
Under the preferred embodiment of the present invention the transaction logging control system
of the database management system operates by first granting, to a requesting database
connection agent 3 a reservation 5 for exclusive write access to the transaction log buffer 1 by
granting to the database connection agent, the latch 6 of the transaction log buffer 1. The
database connection agent determines the starting position of its reservation 5 from the value of
the log buffer offset after obtaining the latch. The database connection agent uses the size of its
log record to update the buffer offset by incrementing the buffer offset value sufficiently for the
storage of the log record. This updated buffer offset will be used to mark the end of the
reservation, and the beginning of another reservation. In addition the database connection agent
will increment chunk counter 8 for the chunk in which the reservation is contained (This counter
will be decremented after a reservation is filled with data). Once this has been done, the database
connection agent 3 will release the latch 6. After a reservation 5 is made for the database
connection agent 3 the database connection agent can load its data record 4 in its reservation 5 in
the transaction log buffer 1. Because of the high speed of operation of the latch 6 a number of
reservations 5 may possibly be made before data is loaded in one reservation 5. Database
connection agents are able to transfer their data into their established reservations,
asynchronously of each other without overwriting data of another database connection agent. As
may be appreciated from this discussion the storage of data by database connection agents is not
restricted to sequential operation and thus can be performed at a higher rate than would be possible if
restricted to sequential operation. It may be preferable to have the buffer filled
with contiguous reservations.
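The reservation step described above can be illustrated with a minimal Python sketch. It is not the implementation of the preferred embodiment: the class `LogBuffer`, its methods `reserve` and `load`, and the buffer size are hypothetical names chosen for illustration, a `threading.Lock` stands in for latch 6, and chunk counters and buffer reuse are left out. The point shown is only that the latch is held while the buffer offset is read and advanced, and that the copy into the reservation happens without it.

```python
import threading


class LogBuffer:
    """Hypothetical log buffer: the latch guards only the offset update."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.offset = 0                 # next free position (the buffer offset)
        self.latch = threading.Lock()   # stands in for latch 6

    def reserve(self, length):
        """Return the starting position of a reservation of `length` bytes."""
        with self.latch:
            start = self.offset
            if start + length > len(self.data):
                raise ValueError("log buffer full (reuse is not modelled here)")
            self.offset = start + length   # end of this reservation, start of the next
        return start

    def load(self, start, record):
        """Copy a record into its reservation; no latch is held here."""
        self.data[start:start + len(record)] = record


def agent(buf, record, starts):
    start = buf.reserve(len(record))
    buf.load(start, record)
    starts.append((start, record))


buf = LogBuffer(1024)
starts = []
records = [bytes([i]) * (i + 1) for i in range(8)]   # records of different sizes
threads = [threading.Thread(target=agent, args=(buf, r, starts)) for r in records]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each agent receives a disjoint range of the buffer, the copies may run in any order without one record overwriting another, and the reservations end up contiguous.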
In order to increase the efficiency of data transfer from the transaction log buffer
to main storage it has been found effective to divide up the transaction log buffer
into a number of sections 7 (called chunks in this embodiment). As may be appreciated
from Fig. 1, as reservations are established sequentially within the transaction log
buffer, the reservations will first fill chunk 1 then chunk 2, etc. As will be
understood by those skilled in the art, when one of these chunks has its reservations
filled with data it will be efficient to transfer that data to permanent storage. This
is done by the logger 12. Chunk counter 8 of each chunk is used to determine when a
chunk becomes filled. As discussed above, the counter of a chunk is incremented when a
data reservation is made in it and decremented when the database connection agent's
data is stored in it. In the preferred embodiment the process of incrementing the
counter is performed when a database connection agent has acquired the latch, before
it releases it. If the counter starts at 0 when the chunk is empty it will return to
zero when all reservations made in the chunk are filled with data. In the situation
where a reservation of sufficient size is established straddling multiple chunks the
chunk counters of all affected chunks are incremented on reservation establishment
and decremented when the reservation is filled.
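The counter bookkeeping for a reservation that straddles several chunks can be worked through in a short Python sketch. The chunk size, the number of chunks and the helper names `chunks_touched`, `on_reserve` and `on_filled` are assumptions made for illustration; the sketch is single-threaded and only demonstrates the arithmetic, whereas in the scheme described above the increment happens under the latch and the decrement would need to be atomic.

```python
CHUNK_SIZE = 16   # hypothetical chunk size in bytes


def chunks_touched(start, length, chunk_size=CHUNK_SIZE):
    """Indices of every chunk overlapped by the reservation [start, start + length)."""
    first = start // chunk_size
    last = (start + length - 1) // chunk_size   # assumes length > 0
    return range(first, last + 1)


counters = [0] * 4   # one counter per chunk, 0 when the chunk is empty


def on_reserve(start, length):
    for c in chunks_touched(start, length):
        counters[c] += 1   # performed while holding the latch in the described scheme


def on_filled(start, length):
    for c in chunks_touched(start, length):
        counters[c] -= 1   # would be an atomic decrement in a multi-threaded system


on_reserve(0, 10)    # bytes 0..9: entirely inside chunk 0
on_reserve(10, 30)   # bytes 10..39: straddles chunks 0, 1 and 2
after_reserve = list(counters)      # [2, 1, 1, 0]
on_filled(10, 30)
after_first_fill = list(counters)   # [1, 0, 0, 0]
on_filled(0, 10)                    # counters are now all back to 0
```

Note that in this sketch chunk 2 has a counter of 0 while only part of it is reserved, so a zero counter alone does not show that a chunk is full; the logger also has to establish that all of the chunk's space has been reserved.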
When a chunk is filled, logger 12 will access the chunk and write the contents of all
of the reservations in it to permanent storage. When the records have been copied the
logger can notify the relevant database connection agents that the records have been
copied successfully. As can be seen from above the logger can advantageously operate
asynchronously of the loading of data in the transaction log buffer.
Some transaction logging requests have to wait for the log record data to be written
to disk. Other logging requests are completed when the log record data is written into
the log buffer. The logger may need to do some processing (such as calculating a page
check sum) before it can issue an I/O request to write log data to permanent storage.
After the I/O request to write log data to permanent storage completes, the logger may
need to inform one or more database connection
agents that are waiting for their log record data to be on permanent storage
that this has been
achieved.
Referring to Fig. 3, which depicts the operation of the logger 12 of the preferred
embodiment of the invention, it may be seen that the logger 12 determines 30 whether a
chunk has all its space reserved and waits 31 for the counter to return to 0, meaning
that all agents have fully copied their records to the chunk. The logger then
prepares 32 the data in the chunk to be written to permanent storage and issues an
asynchronous write request to permanent storage to store the data. If, however, the
chunk referenced is not filled 40, the logger looks for other work to do and serves 41
a request for that work. It then can check 42 to determine if a previous asynchronous
write request to permanent storage has been completed. If not, it can check to
determine if another chunk is ready to be handled. If the I/O has completed
successfully 43, i.e. data in the relevant chunk has been successfully loaded into
permanent storage, then the chunk is marked for reuse 44 so that database connection
agents can reuse it. The logger can perform this reuse marking task.
The invention improves the efficiency of transaction logging I/O throughput between
the log buffer and database permanent storage by enabling the asynchronous logging of
transaction data. As we have discussed, objectives of the invention include improving
transaction data handling system throughput, as well as response time of applications
making changes to the database.
The solution in the preferred embodiment of the invention includes:
1. The logger tries to write a fixed size of log data to disk. This fixed size of data
is called a chunk. The log buffer size is bigger than the chunk size. When there is a
lot of log data (more than a chunk) in the log buffer, the logger breaks up the data
into multiple chunks and thus will issue multiple write requests, one for each chunk,
to write the log data to disk.
2. The logger uses the asynchronous I/O interface provided by the operating system. By
using asynchronous writes, the logger can issue write requests before previous write
requests have completed. If a chunk is filled with log record data and is ready to be
written to disk, the write request for the chunk is issued before the logger prepares
any new chunk to be written to disk. The write requests must be made in order and
collected in order ('collected' in this context means confirming that data for the
write requests have successfully been written to permanent storage).
3. In some cases, database management system processes or tasks may need to wait for
their log data to be written to permanent storage before continuing to perform other
work. By writing in chunk sizes that are large enough for efficient I/O and small
enough that the I/O completes quickly, database connection agents can be notified
frequently that their log data has been written to permanent storage. On the other
hand, if the logger used write sizes that are much larger than a chunk size, the
throughput of the logger would not benefit significantly and DBMS tasks would be
notified less frequently that their log data has been written to permanent storage.
Using write sizes that are larger than a chunk size will result in poor response time
for some of the database applications.
4. When one write request completes, the logger first checks if more log data is
available to be written to disk. If there is more log data to be written to disk, the
logger will issue the write request for this data before informing any DBMS tasks that
are waiting for the log data in the completed write. This means that each chunk should
have information about which log records are fully contained in it. The tasks that are
waiting for their log data to be written to disk can use this information to decide if
their requests are completed. Note that the maintenance of this information for chunks
is done by the DBMS tasks that make logging requests. Therefore, the overhead of
maintaining this information has almost no impact on the throughput of the logger.
5. The DBMS processes or tasks making logging requests are serialized only to the
extent needed to determine the position in the log buffer that the log data should be
copied into and to update the position for the next log record data. Copying of the log
record data into the log buffer is not serialized. This improves the speed at which
DBMS tasks can copy log record data into the log buffer since multiple DBMS tasks can
copy data into the log buffer at the same time. However, this method creates a problem
for the logger. The logger needs to ensure that all of the outstanding copies have
completed for a chunk before it can issue the write request. This is solved by
maintaining a counter for each chunk that represents the number of outstanding copies
for the chunk. The counter is incremented by any DBMS task that reserves space in the
log buffer chunk prior to copying its log data into the log buffer. The counter is
then subsequently decremented after the DBMS task copies the log data into the log
buffer. Note that the maintenance of this counter is performed by the DBMS tasks that
are making logging requests and not the logger itself. Therefore, this overhead has
almost no impact on the throughput of the logger process.
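The ordering rules of items 2 and 4 above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the logger of the preferred embodiment: a `ThreadPoolExecutor` stands in for the asynchronous I/O interface of the operating system, a `bytearray` named `disk` stands in for permanent storage, and the names `write_chunk`, `issue_next`, `ready`, `in_flight` and `events` are hypothetical. The sketch shows write requests being issued in order, collected in order, and a further write being issued before waiting tasks are informed of a completed one.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4                       # hypothetical chunk size in bytes
disk = bytearray(CHUNK_SIZE * 3)     # stand-in for permanent storage
events = []                          # order of logger actions, for inspection


def write_chunk(index, payload):
    """Hypothetical blocking write, run on a worker thread to mimic async I/O."""
    disk[index * CHUNK_SIZE:(index + 1) * CHUNK_SIZE] = payload
    return index


ready = deque([(0, b"aaaa"), (1, b"bbbb"), (2, b"cccc")])   # filled chunks, in order
in_flight = deque()                                          # futures, in issue order


def issue_next(pool):
    """Issue the write request for the next filled chunk, if there is one."""
    if ready:
        index, payload = ready.popleft()
        events.append(("issue", index))
        in_flight.append(pool.submit(write_chunk, index, payload))


with ThreadPoolExecutor(max_workers=2) as pool:
    issue_next(pool)
    issue_next(pool)                          # issued without waiting for the first write
    while in_flight:
        done = in_flight.popleft().result()   # collect strictly in issue order
        issue_next(pool)                      # more log data? issue that write first
        events.append(("notify", done))       # then inform the waiting tasks
```

Keeping the outstanding requests in a first-in, first-out queue is one simple way to collect completions in the same order in which the requests were made.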
The solution presented in the preferred embodiment improves the throughput of the
logger because the logger can have multiple write requests being handled by the disk
subsystem at any given time. Since the logger is using asynchronous I/O, it is free to
report the completion of previous write requests to any DBMS tasks that are waiting
and prepare for new write requests while log data is being written to permanent
storage. It also improves the response time of applications since the completion of
log data writes is reported more frequently than if the logger wrote very large
amounts of log data with each write to permanent storage. The improved efficiency in
reserving space and copying log data into the log buffer also helps the response time
for DBMS tasks that make requests which do not require waiting for the log data to be
written to permanent storage.
The preferred embodiment of the invention herein may be better understood by reference
to the following Pseudo Code:
A) Simplified version of Pseudo code:
Pseudo code:
DATABASE CONNECTION AGENT:
    get latch
    update position in log buffer
    increment counter for current chunk
    release latch
    copy transaction record to log buffer
    decrement counter
LOGGER:
    loop
        if (there is a new chunk)
        {
            wait for counter to be 0
            set up, issue async I/O
        }
        else
        {
            IO_complete = FALSE
            if (there is other request)
            {
                serve request
                check (but do not wait) for I/O completion, if there is I/O complete, set
                IO_complete = TRUE
            }
            else
            {
                wait for I/O completion
                IO_complete = TRUE
            }
            if (IO_complete)
            {
                mark chunk for reuse
            }
        }
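For readers who prefer something executable, the simplified pseudo code can be approximated by the following Python sketch. It rests on several assumptions that are not part of the pseudo code: the sizes are arbitrary and records never straddle a chunk, a single `threading.Condition` serves both as the latch and as the means of waking the logger (Python has no atomic integer, so the counter decrement briefly re-acquires it), a `ThreadPoolExecutor` mimics asynchronous I/O, `disk` is an in-memory stand-in for permanent storage, and the logger's "serve other request" branch is omitted.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE, NUM_CHUNKS, RECORD_SIZE = 8, 3, 4   # hypothetical sizes

buf = bytearray(CHUNK_SIZE * NUM_CHUNKS)   # transaction log buffer
disk = bytearray(len(buf))                 # stand-in for permanent storage
latch = threading.Condition()              # latch, also used to wake the logger
offset = 0                                 # next free position in the log buffer
counters = [0] * NUM_CHUNKS                # outstanding copies per chunk
reusable = [False] * NUM_CHUNKS


def agent(record):
    global offset
    with latch:                                # get latch
        start = offset
        offset += len(record)                  # update position in log buffer
        chunk = start // CHUNK_SIZE
        counters[chunk] += 1                   # increment counter for current chunk
    # latch released: the copy itself is not serialized
    buf[start:start + len(record)] = record    # copy transaction record to log buffer
    with latch:
        counters[chunk] -= 1                   # decrement counter
        latch.notify_all()


def write_chunk(chunk):
    lo = chunk * CHUNK_SIZE
    disk[lo:lo + CHUNK_SIZE] = buf[lo:lo + CHUNK_SIZE]
    return chunk


def logger():
    pending = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for chunk in range(NUM_CHUNKS):
            end = (chunk + 1) * CHUNK_SIZE
            with latch:                        # wait for counter to be 0 on a full chunk
                latch.wait_for(lambda: offset >= end and counters[chunk] == 0)
            pending.append(pool.submit(write_chunk, chunk))   # set up, issue async I/O
        for future in pending:                 # collect completions in issue order
            reusable[future.result()] = True   # mark chunk for reuse


records = [bytes([65 + i]) * RECORD_SIZE for i in range(6)]   # exactly fills 3 chunks
log_thread = threading.Thread(target=logger)
log_thread.start()
agents = [threading.Thread(target=agent, args=(r,)) for r in records]
for t in agents:
    t.start()
for t in agents + [log_thread]:
    t.join()
```

The logger in this sketch treats a chunk as ready only when the buffer offset has passed the end of the chunk and the chunk's counter is 0, which corresponds to the chunk being fully reserved and every reservation in it being filled.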