What are writeConflicts in MongoDB?

The writeConflicts metric is used by storage engines that implement optimistic concurrency control (for example, WiredTiger).
This metric is reported in:
  • mongod log lines for write commands
  • output of the db.serverStatus() command

Example: mongod log lines for write commands

The following mongod log entry displays an update query with its execution statistics. Note the writeConflicts:84 metric:
2017-08-06T0:50:04.461-0400 I COMMAND  [conn1022] update mydb.mycollection query: { _id: 1 } update: { $inc: { myCounter: 1 } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:84 numYields:85 locks:{ Global: { acquireCount: { r: 87, w: 87 } }, Database: { acquireCount: { w: 87 } }, Collection: { acquireCount: { w: 86 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } 2998ms
Most of the time writeConflicts is zero. writeConflicts:84 is an atypically high value for a command, especially in this case where it is only updating one document.

Example: output of the db.serverStatus() command

In the output of db.serverStatus(), metrics.operation.writeConflicts counts the number of write conflicts that the mongod process has encountered since its last restart:
mongo> db.serverStatus()
{
  "host" : "testhost123.corp.internal",
  "version" : "3.4.7",
  ...
  "metrics" : {
    ...,
    "operation" : {
      "scanAndOrder" : 71513,
      "writeConflicts" : 6490211
    },
    ...
  },
  ...
}
The important element to track from metrics.operation.writeConflicts is the rate of increase over time. For example, the following script determines the average write conflicts per second:
mongo> var wc_start = db.serverStatus().metrics.operation.writeConflicts;
mongo> sleep(60000); //1 min
mongo> var wc_end = db.serverStatus().metrics.operation.writeConflicts;
mongo> print(((wc_end - wc_start) / 60) + " write conflicts per sec on average.");
Note
db.currentOp() output does not include write conflict metrics.

What do writeConflicts indicate?

Every type of command that performs a write can encounter write conflicts, including less frequently run user commands such as creating indexes and dropping collections. Administrative commands that write to a system collection, such as createUser, replSetReconfig, and enableSharding, can also encounter conflicts, although this is very rare. In practice, you are only likely to see notable writeConflicts on your busiest, user-made, non-system collections.
A non-zero writeConflicts metric indicates that WiredTiger has detected an update that could potentially create a data concurrency violation. The update does not fail, nor is it sent back to the client with an error. Instead, the writeConflicts metric increments and the update is retried by MongoDB until it completes without a conflict. From the client perspective, the write succeeds and is otherwise normal except for increased latency.
Because WiredTiger uses optimistic updates for document-level concurrency, writeConflicts indicates that multiple clients are trying to update exactly the same document within the same fraction of a second. This window is the time between the execution of the WiredTiger API's WT_SESSION::begin_transaction() and WT_SESSION::commit_transaction() operations. There is no fixed minimum time for these operations. It could be double or even single-digit microseconds if the amount of data to be written is small and the server has no queues for CPU or memory channels. There is no fixed maximum time either, although it is somewhat dependent on the performance of the server.
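The optimistic pattern described above can be sketched with a toy in-memory model. This is not WiredTiger's implementation: ToyDocumentStore, begin(), commit() and simulateConcurrentIncrements() are hypothetical names, and the version counter merely stands in for the engine's real conflict detection. The sketch assumes that all pending writers open their transactions before any of them commits.

```javascript
// Toy model of optimistic concurrency control; NOT WiredTiger's code.
class ToyDocumentStore {
  constructor(value) {
    this.value = value;
    this.version = 0; // bumped on every successful commit
  }

  // Loosely analogous to beginning a transaction: remember what was read.
  begin() {
    return { readVersion: this.version, readValue: this.value };
  }

  // Loosely analogous to committing: succeed only if nobody else
  // committed since this transaction began.
  commit(txn, newValue) {
    if (txn.readVersion !== this.version) {
      return false; // conflict: another writer committed first
    }
    this.value = newValue;
    this.version += 1;
    return true;
  }
}

// Apply `writers` increments that all begin before any of them commits,
// retrying the losers round by round until every increment is applied.
function simulateConcurrentIncrements(store, writers) {
  let writeConflicts = 0;
  let pending = writers;
  while (pending > 0) {
    // Every pending writer opens its transaction on the same snapshot.
    const txns = [];
    for (let i = 0; i < pending; i++) {
      txns.push(store.begin());
    }
    let committed = 0;
    for (const txn of txns) {
      if (store.commit(txn, txn.readValue + 1)) {
        committed += 1;
      } else {
        writeConflicts += 1; // counted, then retried in the next round
      }
    }
    pending -= committed;
  }
  return writeConflicts;
}

const toyStore = new ToyDocumentStore(0);
const toyConflicts = simulateConcurrentIncrements(toyStore, 3);
console.log(toyStore.value, toyConflicts); // prints: 3 3
```

All three increments are applied, so the client sees only success, but three attempts were discarded and retried along the way.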

Internal writeConflicts process

  1. The storage engine returns a status value of WT_ROLLBACK from the document-update function.
  2. The MongoDB integration layer wrapping the storage engine API throws a WriteConflictException.
  3. That exception is caught at a higher scope of the operation's execution thread.
  4. The operation's writeConflicts metric increments.
  5. The operation is resubmitted at increasing intervals in following cycles of WiredTiger transactions until it succeeds:
    • It is retried immediately three times.
    • After 4 retries a sleep of 1ms is applied before each successive retry.
    • After 10 retries the sleep period increases to 5ms.
    • After 100 retries the sleep period increases to 10ms.
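The retry schedule in step 5 can be written as a small lookup function. This is a sketch of the schedule as listed above, not MongoDB's source code. The function names are hypothetical, and the exact boundaries (whether the 1ms sleep starts at the 4th or the 5th retry, for instance) are an assumption, because the list can be read either way.

```javascript
// Sleep in milliseconds applied before a retry, given how many attempts
// of the operation have conflicted so far.
// ASSUMPTION: thresholds are read as "fewer than 4", "fewer than 10", etc.
function retrySleepMillis(conflictedAttempts) {
  if (conflictedAttempts < 4) return 0;   // the first retries are immediate
  if (conflictedAttempts < 10) return 1;
  if (conflictedAttempts < 100) return 5;
  return 10;
}

// Total time an operation spends sleeping if it conflicts `n` times.
function totalSleepMillis(n) {
  let total = 0;
  for (let attempt = 1; attempt <= n; attempt++) {
    total += retrySleepMillis(attempt);
  }
  return total;
}

console.log(totalSleepMillis(84)); // prints: 381
```

Under these assumptions, an operation that conflicts 84 times spends 381ms sleeping between retries, in addition to the time spent on the discarded attempts themselves.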

When does writeConflicts indicate a problem?

Small number of writeConflicts/sec

A small number of writeConflicts per second does not usually indicate a problem. For example, 10 conflicts per second has no noticeable performance impact on a server that processes a thousand operations per second. Data concurrency is preserved with only a small latency cost for the conflicting updates, which is measured in milliseconds or even microseconds between one storage engine logical commit and the next. The difference in average server latency is negligible, and is unlikely to be noticed by a client that must also deal with network latency.

Large number of writeConflicts/sec

A large number of writeConflicts per second can create performance issues. For example, 100 updates for the same document that arrive simultaneously are much slower than 100 updates for different documents. The reasons for this difference are:
  • The server must execute 100 commits serially, rather than 100 updates in parallel that are executed in a single or small number of commits.
  • The server performs far more CPU work than the number of committed updates suggests:
    • WiredTiger creates 100 Multi-Version Concurrency Control (MVCC) versions of the document, of which 1 is kept and the others are discarded.
    • 99 updates are retried. X updates, where X <= 99, reach the WiredTiger layer simultaneously. X MVCC versions of the document are created, and 1 is committed while the others are discarded.
    • The remaining updates are retried in the same fashion until all are successful. To commit n updates, many more uncommitted versions must be created and discarded. In the worst case, where every pending update collides in every round, the total is n + (n-1) + ... + 1 = n(n+1)/2 versions, so the work grows quadratically with n rather than linearly.
The back-off timing logic in the internal retry process might reduce the number of attempted-but-discarded updates, but not if the application continues to queue more updates to the same document.
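The worst case described above can be counted directly. The sketch below assumes that in every round all pending updates reach the storage layer together, exactly one commits, and the rest are retried. worstCaseAttempts() is a hypothetical helper, not a MongoDB function.

```javascript
// Count the document versions created and discarded when n updates to
// the same document collide in every round (worst case).
function worstCaseAttempts(n) {
  let created = 0;   // versions created across all rounds
  let discarded = 0; // versions thrown away because of a conflict
  for (let pending = n; pending > 0; pending--) {
    created += pending;       // every pending update creates a version
    discarded += pending - 1; // all but one are discarded
  }
  return { created, discarded };
}

const worstCase = worstCaseAttempts(100);
console.log(worstCase.created, worstCase.discarded); // prints: 5050 4950
```

For 100 simultaneous updates, 5050 versions are created in order to commit 100 of them. The same 100 updates spread over 100 different documents would create only 100.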

What to do if you see a large number of writeConflicts/sec

WiredTiger write conflicts are caused by several application processes, or threads within a multi-threaded application, all competing to update the same document at the same time. To address the issue, first identify the offending operation, then improve the corresponding update logic in your application.

Identify the operation causing conflicts in the mongod log file

An operation causing conflicts records a writeConflicts:XXX field in the logs, where XXX is a number greater than 50, and likely over 100.
At the default log verbosity a command is logged only if it takes longer than the slowOpThresholdMs span to execute. Write conflicts often result in slow operations, but you can lower slowOpThresholdMs to capture operations below the 100ms default, or run the db.setLogLevel(1, "command") command to log all commands.
Search the logs for operations with a high number of writeConflicts. For example, you can use grep to find operations with 100 or more writeConflicts:
grep 'writeConflicts:[1-9][0-9][0-9]' [pathToLogfile]
Depending on the situation, you may need to lower the conflict threshold. For example, this variation looks for operations with 50 or more conflicts. The second alternative is needed so that values from 100 to 499 are not missed:
grep -E 'writeConflicts:([5-9][0-9]|[1-9][0-9]{2,})' [pathToLogfile]
If nothing appears in the logs but the serverStatus metrics indicate a high rate of writeConflicts/sec, raise the log verbosity and wait until the issue occurs again.
In situations where conflicts appear in many different operations, you may need to refine your search. For example, this script finds the top 20 commands with the highest number of writeConflicts to date:
grep 'writeConflicts:[1-9][0-9]*' [pathToLogfile] |
sed 's/^.*writeConflicts:\([1-9][0-9]*\).*$/\1\t\0/' |
sort -nr | sed 's/^[1-9][0-9]*\t//' |
head -n 20
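The same ranking can be done in JavaScript if sed is not available. The sketch below is one possible approach: topWriteConflicts() is a hypothetical helper, and the sample log lines are made up and heavily shortened for illustration.

```javascript
// Return the `limit` log lines with the highest writeConflicts value,
// highest first. Lines without a non-zero writeConflicts field are skipped.
function topWriteConflicts(logText, limit) {
  const pattern = /writeConflicts:(\d+)/;
  return logText
    .split("\n")
    .map((line) => {
      const match = pattern.exec(line);
      return match ? { conflicts: Number(match[1]), line } : null;
    })
    .filter((entry) => entry !== null && entry.conflicts > 0)
    .sort((a, b) => b.conflicts - a.conflicts)
    .slice(0, limit);
}

// Made-up, shortened log lines; real mongod entries carry more fields.
const sampleLog = [
  "update mydb.a writeConflicts:84 numYields:85 2998ms",
  "update mydb.b writeConflicts:0 numYields:0 3ms",
  "update mydb.c writeConflicts:120 numYields:121 4100ms",
].join("\n");

const topEntries = topWriteConflicts(sampleLog, 2);
for (const entry of topEntries) {
  console.log(entry.conflicts, entry.line);
}
```

To run this against a real log under Node.js, the text could be loaded with require("fs").readFileSync(path, "utf8") and passed in as logText.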

Improving update operation patterns

After identifying the update operation that your application runs in parallel, consider:
  • what the application is trying to achieve
  • whether the operation is required
  • alternative methods that reduce the total number of updates to the same document
For example, if the document is an application-wide counter, you could delay the update for 100ms and combine other count increases that occur in that period before sending. That is, you could send one update of {$inc: N} rather than N updates of {$inc: 1}.
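A minimal sketch of that batching idea follows. BatchedCounter and its sendIncrement callback are hypothetical. In a real application the callback would issue a single update with {$inc: {myCounter: n}} through your driver. Here it only records what would have been sent, so the example runs without a server.

```javascript
// Accumulates increments in memory and sends them as one update.
class BatchedCounter {
  // `sendIncrement` is a hypothetical callback that receives the combined
  // increment and is expected to send one update to the server.
  constructor(sendIncrement) {
    this.sendIncrement = sendIncrement;
    this.pending = 0;
  }

  // Record an increment locally instead of sending it immediately.
  increment(by = 1) {
    this.pending += by;
  }

  // Send the accumulated total, if any, and reset the buffer.
  flush() {
    if (this.pending === 0) return;
    const n = this.pending;
    this.pending = 0;
    this.sendIncrement(n);
  }
}

// Stand-in for the database: collect the update documents that would be sent.
const sentUpdates = [];
const counter = new BatchedCounter((n) => {
  sentUpdates.push({ $inc: { myCounter: n } });
});

for (let i = 0; i < 5; i++) counter.increment();
counter.flush();
counter.flush(); // nothing pending, so nothing is sent
console.log(JSON.stringify(sentUpdates)); // prints: [{"$inc":{"myCounter":5}}]
```

In practice flush() could be called on a timer, for example setInterval(() => counter.flush(), 100). The trade-off is that increments still held in memory are lost if the process exits before the next flush.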
