Mainnet transaction delays

Incident Report for Base

Postmortem

Incident summary

On April 9th at approximately 01:15 UTC, Base’s batcher experienced a growing backlog of data pending submission to L1, caused by a sudden spike in transaction volume on Base. To help stabilize the network, the Base sequencer reduced block sizes so the batcher could catch up (posting all remaining and new data to L1 blobs). This lasted roughly 3 hours, during which block sizes were reduced to about 33% (20 Mgas) of the 60 Mgas target.

While block sizes were reduced, some users experienced delayed or failed transactions, as transactions sat in the mempool longer than expected and some were eventually dropped.

Once the backlog cleared and normal batcher operations resumed, block sizes increased to process the pending activity. This spike in block size caused some node operators to fall behind, leading to syncing issues with their nodes.

This issue has been fully resolved. At no point were user funds at risk — all assets remained safe throughout the incident.

Impact

  • User transactions: Some transactions were delayed or dropped from the mempool for a brief period, leading some users to experience failures and need to resubmit.
  • Node operators: Larger blocks put extra strain on node resources, resulting in slower synchronization and, in some cases, stalled nodes.

Resolution

  • The batcher has fully recovered, and blocks are now being processed as expected.
  • We have published fresh snapshot links (see below) to help node operators quickly restore nodes that have fallen behind.

Recommendations for node operators

  • Restore from snapshot if behind: If your node falls significantly behind the chain head, restoring from a snapshot is the fastest way to resync (see the sync-check sketch after this list for one way to measure how far behind you are).
  • Ensure your setup is optimized: Monitor your node logs for signs of high resource usage or network latency, and ensure your setup has sufficient CPU, RAM, and disk I/O capacity to handle larger blocks. We recommend using a locally attached NVMe SSD for optimal disk performance.
  • Consider redundancy & load-balancing: Running a secondary node or using a load balancer can help maintain uptime and stability if you process a significant transaction volume.
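As a quick way to check how far behind your node is, the sketch below compares your node’s head to a reference RPC over standard JSON-RPC. The endpoints and the 3,000-block threshold are assumptions, not official values; substitute your own node’s RPC URL and whatever lag you consider acceptable.

```python
import json
import urllib.request

# Placeholder endpoints: point these at your own node and a trusted reference RPC.
LOCAL_RPC = "http://localhost:8545"
REFERENCE_RPC = "https://mainnet.base.org"

def block_number(rpc_url: str) -> int:
    """Return the latest block number reported by a JSON-RPC endpoint."""
    payload = json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_blockNumber",
        "params": [],
        "id": 1,
    }).encode()
    req = urllib.request.Request(
        rpc_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return int(json.load(resp)["result"], 16)

local = block_number(LOCAL_RPC)
reference = block_number(REFERENCE_RPC)
lag = reference - local
print(f"local head: {local}, reference head: {reference}, lag: {lag} blocks")

# Base produces a block roughly every 2 seconds, so a lag of a few thousand
# blocks means tens of minutes behind. The threshold here is an assumption;
# tune it to your tolerance before deciding to restore from a snapshot.
if lag > 3000:
    print("Node is significantly behind; consider restoring from a snapshot.")
```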

Next steps

  • Refine batcher logic: We are iterating on our batching mechanism to prevent large backlog scenarios due to L1 congestion and ensure smoother block size transitions.
  • Improve mempool management: We plan to scale our transaction pool infrastructure so fewer transactions are dropped during demand spikes.
  • Expand operator guidance: Additional tools and documentation will be released for troubleshooting and faster sync recovery.

User guidance

  • Resubmit dropped transactions: If your transaction was dropped as a result of the issues described above, simply resubmit it with an appropriate gas fee (a programmatic sketch follows this list).
  • Utilize stable RPC providers: If your self-hosted node remains out of sync, consider routing through a third-party RPC provider while you recover or resync.
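For users resubmitting programmatically, the sketch below rebuilds and re-sends a simple transfer using the account’s pending nonce and EIP-1559 fee fields, assuming web3.py v7. The endpoint, addresses, key, value, and fee numbers are placeholders for illustration, not recommendations; most wallet users can simply re-send the transaction from their wallet instead.

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.base.org"))  # or your own RPC

SENDER = "0xYourAddress"          # placeholder
RECIPIENT = "0xRecipientAddress"  # placeholder
PRIVATE_KEY = "0x..."             # placeholder; never hard-code real keys

# Reuse the pending nonce so the new transaction replaces (rather than queues
# behind) anything still lingering for this account.
nonce = w3.eth.get_transaction_count(SENDER, "pending")

base_fee = w3.eth.get_block("latest")["baseFeePerGas"]
priority_fee = w3.to_wei(0.01, "gwei")  # modest tip; raise it if still stuck

tx = {
    "chainId": 8453,               # Base mainnet
    "nonce": nonce,
    "to": RECIPIENT,
    "value": w3.to_wei(0.001, "ether"),
    "gas": 21_000,                 # plain ETH transfer; contract calls need more
    "maxFeePerGas": base_fee * 2 + priority_fee,
    "maxPriorityFeePerGas": priority_fee,
}

signed = w3.eth.account.sign_transaction(tx, PRIVATE_KEY)
# Older web3.py/eth-account versions expose this as signed.rawTransaction.
tx_hash = w3.eth.send_raw_transaction(signed.raw_transaction)
print("resubmitted:", tx_hash.hex())
```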

For support or questions, please don’t hesitate to contact us in the Base Discord; we're happy to help you there.

Posted Apr 09, 2025 - 19:50 UTC

Resolved

This incident has been resolved.
Posted Apr 09, 2025 - 19:36 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Apr 09, 2025 - 04:39 UTC

Update

We are continuing to investigate this issue.
Posted Apr 09, 2025 - 03:23 UTC

Investigating

We are currently investigating this issue.
Posted Apr 09, 2025 - 03:21 UTC
This incident affected: Mainnet (Transaction pool).