Two-phase HDB commit via iprop log, + GC for log
We used to update the iprop log and HDB in different orders depending on the kadm5 operation, which led to various race conditions.

The iprop log now functions as a two-phase commit (with roll forward) log for HDB changes. The log is auto-truncated, keeping the latest entries that fit in a configurable maximum number of bytes (defaults to 50MB). See the log-max-size parameter description in krb5.conf(5).

The iprop log format and the protocol remain backwards-compatible with earlier versions of Heimdal. This is NOT a flag-day; there is NO need to update all the slaves at once with the master, though it is advisable in general. Rolling upgrades and downgrades should work.

The sequence of updates is now (with HDB and log open and locked):

a) check that the HDB operation will succeed if attempted,
b) append to iprop log and fsync() it,
c) write to HDB (which should fsync()),
d) mark last log record committed (no fsync in this case).

Every kadm5 write operation recovers transactions not yet confirmed as committed, thus there can be at most one unconfirmed commit on a master KDC.

Reads via kadm5_get_principal() also attempt to lock the log and, if successful, recover unconfirmed transactions; to do so, readers must have write access and must win any race to lock the iprop log.

The ipropd-master daemon also attempts to recover unconfirmed transactions when idle.

The log now starts with a nop record whose payload records the offset of the logical end of the log: the end of the last confirmed committed transaction. This is known as the "uber record". Its purpose is two-fold: to act as the confirmation of committed transactions, and to provide an O(1) method of finding the end of the log (i.e., without having to traverse the entire log front to back).

Two-phase commit makes all kadm5 writes single-operation atomic transactions (some kadm5 operations, such as renames of principals and changes to principals' aliases, use multiple low-level HDB write operations, but these still happen in one transaction). One can still hold a lock on the HDB across many operations (e.g., by using the lock command in kadmin -l or by calling kadm5_lock()) in order to push multiple transactions in sequence, but this sequence will not be atomic if the process or host crashes in the middle.

As before, HDB writes which do not go through the kadm5 API are excluded from all of this, but there should be no such writes.

Lastly, the iprop-log(1) command is enhanced as follows:

- The dump, last-version, truncate, and replay sub-commands now have an option to not lock the log. This is useful for inspecting a running system's log file, especially on slave KDCs.

- The dump, last-version, truncate, and replay sub-commands now take an optional iprop log file positional argument, so that they may be used to inspect log files other than the running system's configured/default log file.

Extensive code review and some re-writing for clarity by Viktor Dukhovni.
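The update sequence in the commit message (check, append and fsync, write HDB, mark committed) and the roll-forward recovery can be illustrated with a toy model. This is only a sketch under stated assumptions, not Heimdal's code or its log format: the names ToyLog, write and recover are hypothetical, the HDB is stood in for by a Python dict, locking is omitted, and the "uber record" is simplified to a fixed 8-byte header holding the confirmed end offset.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">Q")  # toy "uber record": offset of the confirmed end of the log


class ToyLog:
    """Toy append-only log (hypothetical format, not the iprop log format)."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(HEADER.pack(HEADER.size))  # empty log: end is right after the header

    def confirmed_end(self):
        # O(1): read the header instead of scanning the whole log.
        with open(self.path, "rb") as f:
            return HEADER.unpack(f.read(HEADER.size))[0]

    def append(self, key, value):
        payload = f"{key}={value}".encode()
        with open(self.path, "ab") as f:
            f.write(struct.pack(">I", len(payload)) + payload)
            f.flush()
            os.fsync(f.fileno())  # step b: the record must be durable before the HDB write

    def mark_committed(self):
        end = os.path.getsize(self.path)
        with open(self.path, "r+b") as f:
            f.write(HEADER.pack(end))  # step d: no fsync; recovery covers a lost marker

    def unconfirmed(self):
        """Return the (key, value) records logged after the confirmed end."""
        records = []
        with open(self.path, "rb") as f:
            f.seek(self.confirmed_end())
            while True:
                head = f.read(4)
                if len(head) < 4:
                    break
                (length,) = struct.unpack(">I", head)
                key, _, value = f.read(length).decode().partition("=")
                records.append((key, value))
        return records


def recover(log, db):
    """Roll forward records that were logged but never confirmed."""
    pending = log.unconfirmed()
    for key, value in pending:
        db[key] = value
    if pending:
        log.mark_committed()
    return len(pending)


def write(log, db, key, value, crash_before_hdb=False):
    recover(log, db)              # every write first recovers unconfirmed transactions
    if not key:                   # step a: check that the operation would succeed
        raise ValueError("empty key")
    log.append(key, value)        # step b
    if crash_before_hdb:
        return                    # simulated crash between steps b and c
    db[key] = value               # step c
    log.mark_committed()          # step d


path = os.path.join(tempfile.mkdtemp(), "toy.log")
log, db = ToyLog(path), {}
write(log, db, "alice", "v1")
write(log, db, "bob", "v1", crash_before_hdb=True)
pending_after_crash = len(log.unconfirmed())
recovered = recover(log, db)
```

Because the record is durable before the database is touched, a crash after step b loses nothing: the next writer (or an idle recovery pass) replays the single unconfirmed record and then confirms it.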
@@ -38,7 +38,7 @@
 .Nm iprop ,
 .Nm ipropd-master ,
 .Nm ipropd-slave
-.Nd propagate changes to a Heimdal Kerberos master KDC to slave KDCs
+.Nd propagate transactions from a Heimdal Kerberos master KDC to slave KDCs
 .Sh SYNOPSIS
 .Nm ipropd-master
 .Oo Fl c Ar string \*(Ba Xo
@@ -110,13 +110,18 @@ which sends the whole database to the slaves regularly,
 .Nm
 normally sends only the changes as they happen on the master.
 The master keeps track of all the changes by assigning a version
-number to every change to the database.
+number to every transaction to the database.
 The slaves know which was the latest version they saw, and in this
 way it can be determined if they are in sync or not.
-A log of all the changes is kept on the master.
+A log of all the transactions is kept on the master.
 When a slave is at an older version than the oldest one in the log,
 the whole database has to be sent.
 .Pp
+The log of transactions is also used to implement a two-phase commit
+(with roll-forward for recovery) method of updating the HDB.
+Transactions are first recorded in the log, then in the HDB, then
+the log is updated to mark the transaction as committed.
+.Pp
 The changes are propagated over a secure channel (on port 2121 by
 default).
 This should normally be defined as
@@ -175,6 +180,11 @@ like 5 min, 300 s, or simply a number of seconds.
 .Pa slaves ,
 .Pa slave-stats
 in the database directory.
+.Pa ipropd-master.pid ,
+.Pa ipropd-slave.pid
+in the database directory, or in the directory named by the
+.Ev HEIM_PIDFILE_DIR
+environment variable.
 .Sh SEE ALSO
 .Xr krb5.conf 5 ,
 .Xr hprop 8 ,
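The man page text in the diff describes how version numbers decide whether a slave is in sync, can be caught up from the log, or needs the whole database. A minimal sketch of that decision follows. It assumes a non-empty, ascending list of version numbers still held in the master's log, and plan_propagation is a hypothetical helper, not a Heimdal function. The boundary case follows the man page wording literally; the real daemon may treat it differently.

```python
def plan_propagation(log_versions, slave_version):
    """Decide how to bring a slave up to date (toy model of the man page text).

    log_versions: ascending version numbers still present in the master's log
                  (assumed non-empty for this sketch).
    slave_version: the latest version the slave reports having seen.
    """
    oldest, newest = log_versions[0], log_versions[-1]
    if slave_version >= newest:
        return "in-sync", []
    if slave_version < oldest:
        # Slave is older than the oldest logged transaction:
        # the whole database has to be sent.
        return "full", []
    # Otherwise send only the transactions the slave has not seen yet.
    return "incremental", [v for v in log_versions if v > slave_version]


current = plan_propagation([5, 6, 7], 7)
behind = plan_propagation([5, 6, 7], 5)
too_old = plan_propagation([5, 6, 7], 2)
```

This also shows why log truncation matters for slaves: the smaller the retained log, the more likely a lagging slave falls off its start and needs a full resend.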