Write-Ahead Logging (WAL) in PostgreSQL and Redo Logs in Oracle have some things in common. Here I want to compare both with respect to database configuration.
WAL and Redo Logs both have the same purpose: to guarantee data integrity the database management system must write changes to some kind of stable storage. But instead of writing data directly to the location where it belongs and from where it will be retrieved later on, records describing the changes (in Oracle sometimes called change vectors) are written to dedicated log files. After the log has been flushed to disk the DBMS has enough information available to recreate the original data changes after a crash. The data eventually will be written to the correct location by an operation called checkpoint. When the checkpoint is completed the log is no longer needed for crash recovery (but it could still be valuable for log based replication or disaster recovery).
The invention of these logs is probably the consequence of plain old rotational disks being the prevailing kind of stable storage for most of the Information Age. The throughput of writing sequentially to a disk is much larger than writing single blocks to different spots on the disk. Therefore it will in general be much faster to write the log with a change description than to write the changed data itself to the database files.
The most visible difference between WAL and redo logs is their configuration.
An Oracle DBA has to decide on the number of redo logs (two or more; to
simplify things I will ignore mirrored redo logs here) and their size. So the
disk space used by the redo logs is fixed and will only change if the DBA
modifies the configuration. For PostgreSQL on the other hand the size of a
WAL file is always 16 megabytes (unless the server software has been
recompiled with a different value). The exact number of WAL files depends on
a couple of configuration parameters
wal_keep_segments) and the current load profile.
So what does this mean for the database operation? Both DBMS will reuse a log when the contained data is no longer needed for recovery. That implies that the checkpoint for the relevant data blocks must have finished. The log must also have been archived successfully if archiving is enabled.
Oracle can only use the pre-configured logs and will emit warnings or error
messages when a log cannot be reused. The message “
complete” in the alert log indicates that Oracle has to postpone
database changes because switching to the next redo log is not yet possible
due to the still running checkpoint. Obviously this reduces the performance
of the application. It gets worse when the archiver is unable to archive a
log. In this case the error message “
ORA-00257: archiver error.
Connect internal only, until freed” comes up and the database
freezes until the issue is resolved by the DBA.
PostgreSQL on the other hand will start to add more WAL files to the
pg_xlog when it can not reuse an existing log. This
prevents the database from stalling if checkpoints do not finish in time or
the archiver process can’t keep up with the work. Of course an adequate
amount of disk space must be reserved for PostgreSQL to make this possible
Both DBMS offer parameters to control the frequency of checkpoints. In
PostgreSQL the parameter
checkpoint_timeout is used to define
the maximum time interval between two checkpoints. Oracle has a similar
log_checkpoint_timeout. But nowadays the
checkpoint frequency is often automatically adjusted to satisfy the crash
recovery time. The parameter
fast_start_mttr_target sets the
target for the duration of a crash recovery and Oracle triggers checkpoints
to achieve that goal. So in this case the configuration of Oracle uses a more
business aligned number.
In both systems the amount of changes done also triggers a checkpoint.
PostgreSQL has the parameter
checkpoint_segments to set the
number of segments written to the WAL files before a checkpoint happens. As
said above, the segments normally have a size of 16 megabytes. So this gives
at least 16 megabytes written to WAL until this type of checkpoint is
triggered. For Oracle the parameter
specifies the number of OS blocks (containing 512 bytes on most platforms)
between checkpoints. A checkpoint is also triggered by a redo log switch and
therefore the amount of changes is also defined by the size of the redo logs.
A DBA can create redo logs much smaller than 16 megabytes so this could lead
to very frequent checkpoints in Oracle. In this case the DBA will have to
increase the size of the redo logs to reduce the frequency of these
checkpoints. In PostgreSQL the same can by achieved by increasing the
The first step when looking into checkpoint activity is to write the relevant
details to a logfile. PostgreSQL uses the parameter
log_checkpoints to write details of every checkpoint into the
server log. The log even shows if the checkpoint has been triggered by the
timeout (indicated by
checkpoint starting: time) or by the
amount of changes (indicated by
checkpoint starting: xlog).
Oracle uses the parameter
log_checkpoints_to_alert to write
details into the alert log.