This is the BNB Smart Chain Annual Storage Report 2023, written by NodeReal. It serves as a comprehensive guide to the storage status of BSC over the past year and to the efforts being made to address the challenges posed by its large storage size.
1. TLDR
The 3rd year since the launch of BNB Smart Chain (BSC) in Aug 2020 has come to a close. Despite experiencing a lower traffic volume in 2022 compared to the market surge in 2021, BSC has proven to be resilient and maintained its stability by consistently generating blocks and updating its world state. This report provides a comprehensive summary of the storage status of BSC, with a focus on the History Data and the World State:
- In 2022, BSC generated more than 10 million blocks, increasing the storage size of the history data (blocks, transactions, receipts, and code) from ~658 GB to ~1.07 TB (a growth of ~63%). The storage size of the whole MPT also increased from ~205 GB to ~361 GB (a growth of ~76%). The total storage size of a pruned BSC full node reached ~1.60 TB, which is about 2.46 times the size of a pruned Ethereum full node (~650 GB).
- The large storage size of BSC poses several challenges: expensive hardware requirements; prolonged node synchronization times; reduced performance when accessing large storage; write amplification…
- To address these storage issues, several potential solutions have been proposed, such as EIP-4444, the State Expiry proposal from NodeReal, and the recent BNB Greenfield.
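The growth figures in the summary can be reproduced with a few lines of arithmetic. The sketch below is only a sanity check on the numbers quoted above. It assumes 1 TB = 1000 GB and takes the starting trie size to be ~205 GB; the variable and function names are ours, not part of any BSC tooling.

```python
# Sizes quoted in the summary, in GB (assuming 1 TB = 1000 GB).
history_start_gb, history_end_gb = 658, 1070   # ~658 GB -> ~1.07 TB
trie_start_gb, trie_end_gb = 205, 361          # ~205 GB -> ~361 GB
bsc_pruned_gb, eth_pruned_gb = 1600, 650       # ~1.60 TB vs ~650 GB


def growth_pct(start: float, end: float) -> float:
    """Relative growth from start to end, in percent."""
    return (end - start) / start * 100


history_growth = growth_pct(history_start_gb, history_end_gb)
trie_growth = growth_pct(trie_start_gb, trie_end_gb)
size_ratio = bsc_pruned_gb / eth_pruned_gb

print(f"history data growth: ~{history_growth:.0f}%")
print(f"trie growth:         ~{trie_growth:.0f}%")
print(f"BSC / Ethereum:      ~{size_ratio:.2f}x")
```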
2. Storage Overview
We set up a full node and synced it to block 24353658, which was produced on 30th Dec 2022. We then pruned the state to keep only the latest 128 trie layers and used `db inspect` to get the storage layout, which is shown below:
The Components
The storage of BSC is mainly composed of 4 parts:
- HistoryData: Header, Body, Receipt, Code.
- WorldState(Trie): the world state is kept in a Merkle Patricia Trie (MPT), which is used to calculate the state root hash.
- WorldState(Snapshot): it also holds the whole world state, but keeps only the leaf nodes as flattened Key-Value (KV) pairs for fast access. It is much smaller than the MPT because it does not contain the intermediate MPT nodes.
- MetaData: mostly intermediate index information, which can be removed and rebuilt from historical data. Its size varies depending on how much of this intermediate data is kept.
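To illustrate why a flat snapshot is smaller than a trie holding the same state, here is a toy sketch. It is not BSC's MPT: it builds a plain radix-16 trie without path compression (a real MPT uses extension nodes and stores hashes, so the actual ratio differs), and the keys are hypothetical hash prefixes rather than real account keys. The only point is that a trie stores intermediate nodes on top of the leaves, while the snapshot stores one entry per leaf.

```python
import hashlib


def build_trie(keys):
    """Insert hex-string keys into a nested-dict radix-16 trie (toy model)."""
    root = {}
    for key in keys:
        node = root
        for nibble in key:
            node = node.setdefault(nibble, {})
    return root


def count_nodes(node):
    """Return (intermediate, leaf) node counts for a non-empty trie."""
    if not node:                      # no children: this is a leaf
        return 0, 1
    inner, leaves = 1, 0              # count this intermediate node
    for child in node.values():
        child_inner, child_leaves = count_nodes(child)
        inner += child_inner
        leaves += child_leaves
    return inner, leaves


# Hypothetical keys: short hash prefixes standing in for hashed account keys.
keys = [hashlib.sha256(str(i).encode()).hexdigest()[:8] for i in range(1000)]

snapshot = {key: b"value" for key in keys}   # flat KV: one entry per leaf
inner, leaves = count_nodes(build_trie(keys))

print(f"snapshot entries:        {len(snapshot)}")
print(f"trie leaf nodes:         {leaves}")
print(f"trie intermediate nodes: {inner}")
```

Both structures hold the same leaves, but only the trie pays for the intermediate nodes needed to compute a root hash.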
3. History Data
History data is composed of 4 parts:
- Header: the block header, which carries the general information of the block. Its size is almost fixed for each block.
- Body: the block body, which contains the transaction payloads.
- Receipt: the receipts record the logs that each transaction may emit during execution. They are a critical part of cross-chain communication.
- Code: the smart contract’s bytecode stored on-chain.
In this section, we will go through each part of the storage data and compare it with the corresponding data from 2021.
We also have an earlier storage report, taken from a pruned node and based on the state around 17th Jan 2022. Although the exact block number was not recorded, we use it as the baseline to measure how storage grew over the past year.
3.1. Header & Block
The first block in 2022 is https://bsctrace.com/block/13969661 and the last block of 2022 is https://bsctrace.com/block/24393652. That is 10,423,992 blocks generated in 2022, equivalent to ~3.025 seconds per block. This is close to the expected block period of 3 seconds for BSC, indicating that the network ran smoothly throughout the year.
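The block count and average block interval follow directly from the two block numbers. A minimal check, assuming a 365-day year and counting both boundary blocks as part of 2022:

```python
first_block_2022 = 13_969_661
last_block_2022 = 24_393_652

# Both the first and the last block belong to 2022, hence the +1.
blocks_in_2022 = last_block_2022 - first_block_2022 + 1

seconds_in_2022 = 365 * 24 * 60 * 60   # 2022 was not a leap year
avg_block_time = seconds_in_2022 / blocks_in_2022

print(f"blocks generated in 2022: {blocks_in_2022:,}")
print(f"average block interval:   ~{avg_block_time:.3f} s")
```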
3.2. Receipts & Code
3.3. Transaction Size
BscScan tracks the daily transaction volume, allowing us to calculate the number of transactions executed over the past year.
The Transactions Per Second (TPS) metric displayed below only represents actual traffic on the BSC network, not its capacity. The BSC network is capable of handling a much higher TPS.
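For reference, an average TPS figure of this kind can be derived from daily transaction counts as sketched below. The function name and the sample counts are hypothetical placeholders for illustration, not measurements from BSC.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_tps(daily_tx_counts):
    """Average transactions per second over the given days of actual traffic."""
    if not daily_tx_counts:
        raise ValueError("need at least one day of data")
    total_transactions = sum(daily_tx_counts)
    return total_transactions / (len(daily_tx_counts) * SECONDS_PER_DAY)


# Hypothetical daily transaction counts (placeholders, not real BSC data).
sample_daily_counts = [3_000_000, 3_500_000, 2_800_000]
sample_tps = average_tps(sample_daily_counts)

print(f"average TPS over {len(sample_daily_counts)} days: {sample_tps:.1f}")
```

Because the input is observed traffic, the result says nothing about the maximum throughput the network could sustain.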
4. The WorldState
We added some logs to dump the account info based on the same block 24353658.
As you may know, there are 2 types of accounts in BSC:
- Externally-owned Account (EOA)
- Contract Account (CA)
If you are unfamiliar with these 2 account types, you may refer to types-of-account.
Please note: the following account information was collected from the pruned snapshot, not from the MPT. Currently, the snapshot may contain some redundant accounts that have not been pruned from the database, so the account statistics below are not fully accurate. Even so, the data gives a general overview of the account layout. We will improve the `db inspect` tool in the future to get accurate data.
4.1. Account Overview
There are ~135 million active accounts in total: ~⅓ are contract accounts and ~⅔ are EOAs.
Please note: the number of active accounts is different from the number of unique addresses shown at https://bscscan.com/chart/address, because the unique addresses include those that are no longer active, e.g. self-destroyed addresses.
4.2. Snapshot & Trie & KV Pair
4.3. Big Contract Account
Big contract accounts are those with large storage, i.e. a massive number of KV pairs written by the contract. They consume a lot of storage and have a deep storage MPT; they also suffer from storage amplification and can impact performance.
We added some logs to dump the top 20 contracts with the largest number of KV pairs.
Since the DB only stores the hash of the account address, it is not possible to obtain the original account address directly.
We attempted to identify the original addresses of these large players and have listed the top 5 below:
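Because a hash cannot be inverted, one possible way to recover addresses is to hash a list of known candidate addresses and match the results against the stored keys. The sketch below illustrates that idea only; it is not necessarily the method used for this report. BSC, like go-ethereum, keys accounts by the Keccak-256 hash of the address, but Python's standard library has no Keccak-256, so the example uses `hashlib.sha3_256` purely as a stand-in (it produces different digests). The function names and addresses are hypothetical.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    """Stand-in only: SHA3-256 is NOT Keccak-256. Swap in a real
    Keccak-256 implementation to match keys from an actual node."""
    return hashlib.sha3_256(data).digest()


def resolve_addresses(hashed_keys, candidates, hash_fn=stand_in_hash):
    """Map each stored hash to the candidate address that produces it."""
    wanted = set(hashed_keys)
    resolved = {}
    for address in candidates:
        hex_part = address[2:] if address.startswith("0x") else address
        digest = hash_fn(bytes.fromhex(hex_part))
        if digest in wanted:
            resolved[digest] = address
    return resolved


# Hypothetical 20-byte addresses, for illustration only.
known_addr = "0x" + "11" * 20
other_addr = "0x" + "22" * 20
unknown_addr = "0x" + "33" * 20      # not in our candidate list

db_keys = [stand_in_hash(bytes.fromhex(a[2:])) for a in (known_addr, unknown_addr)]
resolved = resolve_addresses(db_keys, [known_addr, other_addr])

for key in db_keys:
    print(key.hex()[:16], "->", resolved.get(key, "unresolved"))
```

Any hashed key whose address is missing from the candidate list stays unresolved, which is why only part of a top-N list may be identifiable.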
5. Conclusion
In conclusion, the BNB Smart Chain Annual Storage Report 2023 presents an overview of the storage status of the BSC network over the past year. Most of the data in this report is as expected. What surprised us a little is the big contract accounts: the No. 1 contract account uses almost 100M KV pairs, which is more than the total number of EOAs, and the top 20 contract accounts occupy 26.58% of the total KV pairs.
This report points to directions for optimizing the storage and for removing cold storage. In fact, NodeReal is already working on this, including efforts on the proposal BEP(Idea): State Expiry On BNB Chain and on BNB Greenfield.
Last but not least, this report serves as a call to action, urging all stakeholders to work together to find solutions and make the BSC network more efficient and cost-effective. Let’s BUIDL for REAL.
About NodeReal
NodeReal is a one-stop blockchain infrastructure and service provider that embraces the high-speed blockchain era and empowers developers by “Make your Web3 Real”. We provide scalable, reliable, and efficient blockchain solutions for everyone, aiming to support the adoption, growth, and long-term success of the Web3 ecosystem.