# Ledger History

Radiance proposes a neutral file format to capture Solana ledger history.

It also ships various reference tools to work with this file format.

## Ledger content

*Ledger data* is broadly defined as transaction and consensus data that can be trustlessly validated.

On Solana, ledger data is made up of only two classes of information.
* Proof-of-History[^1] parameters (contain block hashes)
* Transactions (contain user txs and consensus txs)

[^1]: PoH is a cryptographic delay function based on recursive SHA-256 hashing.

### Entries

The Solana protocol propagates ledger data in the form of _entries_.

On the wire, entries are additionally _shredded_ into network packets with erasure coding,
but this bears no relevance to the Solana ledger itself.

Entries have the following schema.

```python
class Entry:
    num_hashes: uint64
    prev_hash: Hash
    transactions: list[Tx]
```

Note that a _block_ on Solana is made up of multiple entries.
But on the ledger itself, the concept of blocks is only implied.

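
To make the role of `num_hashes` concrete, the sketch below advances a PoH state by recursive SHA-256 hashing, as described in the footnote above, and checks a short chain of entries against it. The helper names (`next_hash`, `verify_chain`) and the `mixin` parameter are illustrative. The rule that a transaction digest is mixed into the final hash is an assumption about the PoH construction, not something this document specifies.

```python
import hashlib
from typing import List, Optional, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_hash(start: bytes, num_hashes: int, mixin: Optional[bytes] = None) -> bytes:
    """Advance the PoH state `start` by `num_hashes` SHA-256 iterations.

    Assumption: when `mixin` is given, the last iteration hashes
    state || mixin instead of the state alone.
    """
    if num_hashes == 0:
        return start
    state = start
    for _ in range(num_hashes - 1):
        state = sha256(state)
    return sha256(state + mixin) if mixin is not None else sha256(state)


def verify_chain(start: bytes, entries: List[Tuple[int, bytes]]) -> bool:
    """Check that each (num_hashes, hash) pair follows from the previous hash."""
    state = start
    for num_hashes, expected in entries:
        state = next_hash(state, num_hashes)
        if state != expected:
            return False
    return True
```

Because every hash depends on the one before it, an altered or missing entry causes `verify_chain` to fail at that position.
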
### Existing Formats

We find that the following representations of ledger data are widely used.

1. **Shreds** as UDP packets
   - Used in the peer-to-peer network
   - Hard to capture and archive for long-term storage
2. Archives of **blockstore** databases
   - Archives of the RocksDB database used in the Solana Labs validator implementation
   - Technically implementation-defined, forces the use of librocksdb (C++)
3. Google Cloud **Bigtable** integration for Solana RPC nodes
   - Closed source
   - Locked to one specific vendor
   - Lacks PoH data

We introduce a new format better suited for long-term archival and public distribution than the existing alternatives.

## CARv1 File Format

The **Content-addressable ARchive** is a streaming container format for blobs (files without a name).

[IPLD CARv1 Specification](https://ipld.io/specs/transport/car/carv1/)

### Content Addressing

_Why not .tar.zst, .7z, .rar, etc.?_

Unlike with traditional archive formats, all blobs in CARs are content-addressed with a hash function.
Blobs are referred to by CIDs (content identifiers), which unambiguously identify their exact byte contents.

Leveraging the [IPLD Merkle-DAG](https://docs.ipfs.tech/concepts/merkle-dag/) construction,
blobs can recursively refer to other CIDs to build arbitrarily complex acyclic graphs of data.

Thus, if users know and trust a root CID (~35 bytes), they can safely retrieve blobs from any untrusted source.
Notably, users can verify whether untrusted blobs match exactly what was requested.

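
The verification step can be sketched with the Python standard library alone. The example below builds a CIDv1 for a raw blob hashed with SHA-256 and compares it to the CID that was requested. The choice of the raw codec and SHA-256 is an assumption made for illustration, and the function names are hypothetical.

```python
import base64
import hashlib

RAW_CODEC = 0x55   # multicodec code for "raw"
SHA2_256 = 0x12    # multihash code for SHA-256


def cid_v1_raw(data: bytes) -> bytes:
    """Binary CIDv1: version, codec, then multihash (code, length, digest).

    All four prefix values are below 0x80, so each varint is a single byte.
    """
    digest = hashlib.sha256(data).digest()
    return bytes([0x01, RAW_CODEC, SHA2_256, len(digest)]) + digest


def cid_to_str(cid: bytes) -> str:
    """Multibase text form: 'b' prefix plus lowercase, unpadded base32."""
    return "b" + base64.b32encode(cid).decode("ascii").lower().rstrip("=")


def verify_blob(expected_cid: str, data: bytes) -> bool:
    """Return True if `data` hashes to the CID the caller asked for."""
    return cid_to_str(cid_v1_raw(data)) == expected_cid
```

A binary CID of this kind is 36 bytes long (4 prefix bytes plus a 32-byte digest), which matches the "~35 bytes" figure above.
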
### Determinism

Ledger CAR files are reproducible and deterministic.
Independent node operators would generate byte-for-byte identical CAR files for the same extent of ledger history,
regardless of where that data is sourced from.

### Header

The header of the ledger CAR file is set to the following.

```json
{
  "roots": ["bafkqaaa"],
  "version": 1
}
```

Rationale: The CAR file does not have a single root, so we place the "empty" multihash there instead,
as recommended by the [CARv1 spec](https://ipld.io/specs/transport/car/carv1/#number-of-roots).

This implies that every ledger CAR file starts with the following byte content (hex).

```
19 a2 65 72 6f 6f 74 73
81 d8 2a 45 00 01 55 00
00 67 76 65 72 73 69 6f
6e 01
```

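
The byte sequence above can be reproduced by hand: a one-byte varint length prefix followed by the DAG-CBOR encoding of the header map. The sketch below uses a minimal, purpose-built CBOR encoder that only handles the small values needed here; the helper names are illustrative and not part of any reference tool.

```python
def cbor_head(major: int, n: int) -> bytes:
    """CBOR initial byte(s) for a major type with a small argument (n < 256)."""
    if n < 24:
        return bytes([(major << 5) | n])
    return bytes([(major << 5) | 24, n])


def cbor_text(s: str) -> bytes:
    raw = s.encode("utf-8")
    return cbor_head(3, len(raw)) + raw


def cbor_bytes(b: bytes) -> bytes:
    return cbor_head(2, len(b)) + b


# Binary form of "bafkqaaa": CIDv1, raw codec (0x55), identity multihash of length 0.
EMPTY_CID = bytes([0x01, 0x55, 0x00, 0x00])


def car_header() -> bytes:
    # DAG-CBOR link: tag 42 wrapping a byte string with a 0x00 multibase prefix.
    link = cbor_head(6, 42) + cbor_bytes(b"\x00" + EMPTY_CID)
    body = (
        cbor_head(5, 2)                                   # map with two entries
        + cbor_text("roots") + cbor_head(4, 1) + link     # one-element array
        + cbor_text("version") + cbor_head(0, 1)          # unsigned int 1
    )
    assert len(body) < 0x80   # length fits in a single-byte varint
    return bytes([len(body)]) + body
```

The leading `19` is therefore the length of the CBOR body (25 bytes), not part of the CBOR data itself.
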
### IPLD data types

#### Transactions

Each Solana transaction is mapped to an IPLD block in native (bincode) serialization.

```
type Transaction bytes
```

#### Entries

```
type Entry struct {
  numHashes Int
  hash Hash
  txs TransactionList
} representation tuple

type TransactionList [ &Transaction ]
```

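
Under the IPLD `tuple` representation, a struct is serialized as a list of its field values in declaration order, without field names. The sketch below shows that mapping for `Entry` using plain Python values. The `Entry` dataclass, its method names, and the use of CID strings to stand in for links are illustrative assumptions; the actual codec-level encoding is not shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    num_hashes: int
    hash: bytes
    txs: List[str] = field(default_factory=list)  # links, shown here as CID strings

    def to_tuple(self) -> list:
        # Field order must match the schema: numHashes, hash, txs.
        return [self.num_hashes, self.hash, list(self.txs)]

    @classmethod
    def from_tuple(cls, values: list) -> "Entry":
        num_hashes, hash_, txs = values
        return cls(num_hashes, hash_, list(txs))
```

Since field names are omitted, readers and writers must agree on the schema's field order to interpret the list.
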
#### Blocks

```
type Block struct {
  slot Int
  entries [ Link ]
  shredding [ Shredding ]
} representation tuple
```