A replica set member transitions to the ROLLBACK state if it has applied writes that a majority of the members in the replica set have no record of, and that same majority has applied writes inconsistent with those known to the isolated member.
The following conditions may trigger a rollback:
- The member has recently re-established communication with other members and is determining its appropriate status.
- The latest entry in the member's local oplog does not appear in the oplogs of its peers.
- Scanning backwards to the point where the oplogs agree reveals less than 300 MB of data to revert.
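The last two conditions can be illustrated with a minimal sketch. It is not MongoDB's implementation: the `OplogEntry` shape, the function names, and the reading of 300 MB as 300 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical oplog entry: (identity, size_in_bytes).
# Real oplog entries carry more fields; this shape is an assumption.
OplogEntry = Tuple[str, int]

# Assumed interpretation of the 300 MB limit described in the text.
ROLLBACK_LIMIT_BYTES = 300 * 1024 * 1024


def find_rollback_span(
    local: List[OplogEntry], peer: List[OplogEntry]
) -> Optional[Tuple[int, int]]:
    """Scan the local oplog from newest to oldest.

    Returns (index_of_common_point, bytes_to_revert), or None when the
    two oplogs share no entry at all.
    """
    peer_ids = {identity for identity, _ in peer}
    bytes_to_revert = 0
    for index in range(len(local) - 1, -1, -1):
        identity, size = local[index]
        if identity in peer_ids:
            return index, bytes_to_revert
        bytes_to_revert += size
    return None


def rollback_applies(local: List[OplogEntry], peer: List[OplogEntry]) -> bool:
    """True when the newest local entry is unknown to the peer and the
    data to revert stays under the assumed limit."""
    span = find_rollback_span(local, peer)
    if span is None:
        return False
    _, bytes_to_revert = span
    return 0 < bytes_to_revert < ROLLBACK_LIMIT_BYTES
```

For example, with a local oplog ending in two entries that the peer never saw, `find_rollback_span` returns the index of the last shared entry together with the combined size of those two entries.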
When a rollback occurs, MongoDB writes the rollback data to BSON files in the rollback/ folder under the database’s dbPath directory, using the naming pattern <database>.<collection>.<timestamp>.bson. You can view the contents of the rollback files using bsondump, and restore that data using mongorestore if required.
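As a rough sketch of how the rollback files might be located and inspected from a script, the helpers below list the .bson files in the rollback/ folder and build a bsondump command line for one of them. The helper names are hypothetical, and the sketch assumes bsondump is on the PATH and is invoked with the file path as its only argument.

```python
from pathlib import Path
from typing import List


def list_rollback_files(db_path: str) -> List[Path]:
    """Return the .bson files in <db_path>/rollback, sorted by name.

    Returns an empty list when no rollback folder exists yet.
    """
    rollback_dir = Path(db_path) / "rollback"
    if not rollback_dir.is_dir():
        return []
    return sorted(rollback_dir.glob("*.bson"))


def bsondump_command(bson_file: Path) -> List[str]:
    """Build the argument list for viewing one rollback file.

    Assumes the bsondump executable is available on the PATH.
    """
    return ["bsondump", str(bson_file)]
```

Each command could then be passed to `subprocess.run(..., check=True)` to print the documents that were rolled back.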
For more information, see Rollbacks During Replica Set Failover.
Rollbacks on secondary members
While a primary member is most likely to experience a rollback, it is possible to isolate several members in such a way that a secondary (that has never been primary) must roll back.
Example
In a five-member replica set, there is a constant stream of writes to the primary. A network partition isolates the primary and one secondary together. These members continue taking writes for a brief period, and the nearby secondary replicates these writes. This occurs regardless of WriteConcern, although clients do not receive write confirmation in this situation if w='majority' is set.
When the members of the replica set detect the partition, the primary steps down, and the group of three members still in contact with each other independently elects a new primary. If the new primary accepts new writes before the partition is resolved, the previous primary and its nearby secondary will require a rollback.
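The arithmetic behind this example can be sketched in a few lines. This is a simplification, not MongoDB's election protocol: it assumes every member has exactly one vote, and the member names and the `can_elect_primary` function are hypothetical.

```python
from typing import Set


def can_elect_primary(group: Set[str], total_members: int) -> bool:
    """A partitioned group can elect a primary only if it contains a
    strict majority of the replica set (assuming one vote per member)."""
    return len(group) > total_members // 2


# Five-member set split by a partition, as in the example above.
TOTAL_MEMBERS = 5
old_primary_side = {"old_primary", "nearby_secondary"}
other_side = {"member_c", "member_d", "member_e"}

# The two-member side cannot keep or elect a primary; the three-member side can.
old_side_can_elect = can_elect_primary(old_primary_side, TOTAL_MEMBERS)
other_side_can_elect = can_elect_primary(other_side, TOTAL_MEMBERS)
```

Under these assumptions, only the three-member group reaches a majority, which is why writes accepted on the two-member side are the ones that end up being rolled back.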
Rolling back operations without original document
During the rollback process, the member determines the affected document for each undesired oplog entry. It then requests these documents from one of the other members. The local documents, with the unacknowledged changes, are saved to the rollback directory, and the local copy in the database is overwritten with the peer-supplied version. The rolling-back member then removes the undesired entries from its local oplog and applies new entries supplied by its peers. The member remains in the ROLLBACK state throughout this period, because its view of the database is inconsistent until it passes the point in the peer oplog at which it last retrieved a document for local overwrite.
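The steps above can be sketched as a small in-memory simulation. The data shapes and the `roll_back` function are hypothetical, not MongoDB internals. Two assumptions go beyond the text: a document the peer does not hold is removed locally, and the final catch-up step (applying new peer oplog entries) is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical shapes, chosen only for illustration.
OplogEntry = Tuple[str, str]   # (entry identity, _id of the document it touched)
Collection = Dict[str, dict]   # _id -> document


def roll_back(
    local_oplog: List[OplogEntry],
    local_docs: Collection,
    peer_docs: Collection,
    common_index: int,
) -> List[dict]:
    """Revert every local oplog entry after common_index.

    Mutates local_oplog and local_docs in place and returns the local
    document versions that would be saved to the rollback directory.
    """
    undesired = local_oplog[common_index + 1:]
    saved: List[dict] = []
    # Visit each affected document once, in first-seen order.
    for doc_id in dict.fromkeys(doc_id for _, doc_id in undesired):
        if doc_id in local_docs:
            saved.append(local_docs[doc_id])
        if doc_id in peer_docs:
            # Overwrite the local copy with the peer-supplied version.
            local_docs[doc_id] = dict(peer_docs[doc_id])
        else:
            # Assumption: the peer has no such document, so drop it locally.
            local_docs.pop(doc_id, None)
    # Remove the undesired entries from the local oplog.
    del local_oplog[common_index + 1:]
    return saved
```

After this call the local oplog ends at the common point, and the member would still need to apply newer entries from its peers before leaving the ROLLBACK state.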