Purge accounts with lamports=0 on rooted forks (#6315)

sakridge 2019-10-23 12:46:48 -07:00 committed by GitHub
parent 6ce115ec95
commit f1172617cc
6 changed files with 188 additions and 39 deletions

View File

@ -0,0 +1,55 @@
# Snapshot Verification
## Problem
When a validator boots up from a snapshot, it needs a way to quickly verify that the account set matches what the rest of the network sees. A potential attacker could give the validator an incorrect state and then try to convince it to accept a transaction that would otherwise be rejected.
## Solution
Currently the bank hash is derived by hashing the delta state of the accounts in a slot, which is then combined with the previous bank hash value. The problem with this is that the list of hashes grows on the order of the number of slots processed by the chain and becomes a burden to both transmit and verify.
Another naive method could be to create a merkle tree of the account state. This has the downside that with each account update, the merkle tree would have to be recomputed from the entire account state of all live accounts in the system.
To verify the snapshot, we do the following:
On account store of non-zero lamport accounts, we hash the following data:
* Account owner
* Account data
* Account pubkey
* Account lamports balance
* Fork the account is stored on
Use this resulting hash value as input to an expansion function which expands the hash value into an image value. The function will create a 440 byte block of data where the first 32 bytes are the hash value, and the next 440 - 32 bytes are generated from a ChaCha RNG with the hash as the seed.
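For illustration, here is a minimal sketch of this hash-then-expand step. It is not the runtime's actual implementation: the `sha2` and `rand_chacha` crates, the field ordering, and the `IMAGE_SIZE` constant are assumptions made only for this example.

```rust
use rand::{RngCore, SeedableRng};
use rand_chacha::ChaCha20Rng;
use sha2::{Digest, Sha256};

// Assumed image size from the proposal: 440 bytes per account.
const IMAGE_SIZE: usize = 440;

/// Hash the account fields listed above into a 32-byte value (illustrative only).
fn hash_account(fork: u64, lamports: u64, data: &[u8], owner: &[u8; 32], pubkey: &[u8; 32]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    hasher.update(fork.to_le_bytes());
    hasher.update(lamports.to_le_bytes());
    hasher.update(data);
    hasher.update(owner);
    hasher.update(pubkey);
    hasher.finalize().into()
}

/// Expand the 32-byte hash into a 440-byte image: the first 32 bytes are the hash
/// itself, the remaining bytes are drawn from a ChaCha RNG seeded with the hash.
fn expand_to_image(hash: [u8; 32]) -> [u8; IMAGE_SIZE] {
    let mut image = [0u8; IMAGE_SIZE];
    image[..32].copy_from_slice(&hash);
    let mut rng = ChaCha20Rng::from_seed(hash);
    rng.fill_bytes(&mut image[32..]);
    image
}
```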
The account images are then combined with xor. The previous account image is xored into the state (canceling its earlier contribution) and the new account image is also xored into the state.
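A sketch of how the running xor state could be maintained when an account changes is shown below; `xor_in` and `update_state` are illustrative names (not from the codebase) and build on the `expand_to_image` sketch above.

```rust
/// Xor an account image into the running state (illustrative only).
fn xor_in(state: &mut [u8; IMAGE_SIZE], image: &[u8; IMAGE_SIZE]) {
    for (s, b) in state.iter_mut().zip(image.iter()) {
        *s ^= *b;
    }
}

/// Apply an account update to the running xor state.
fn update_state(state: &mut [u8; IMAGE_SIZE], old_hash: Option<[u8; 32]>, new_hash: [u8; 32]) {
    // Xoring the previous image a second time cancels its earlier contribution...
    if let Some(old) = old_hash {
        xor_in(state, &expand_to_image(old));
    }
    // ...and xoring the new image folds in the account's current state.
    xor_in(state, &expand_to_image(new_hash));
}
```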
Voting and sysvar hash values are computed from the hash of the resulting full image value.
On boot, when the validator loads from a snapshot, it verifies the hash value against its account set. It then uses SPV to display the percentage of the network that voted for the given hash value.
The resulting value can be verified by a validator to be the result of xoring all current account states together.
A snapshot must be purged of zero-lamport accounts before creation and during verification, since zero-lamport accounts do not affect the hash value but may cause a validator bank to read an account as not present when it really should be.
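As a rough sketch, the snapshot flow introduced by this change can be pictured as follows, using the `purge_zero_lamport_accounts` and `verify_snapshot_bank` methods added to `Bank` in this commit; serialization details and error handling are omitted.

```rust
use solana_runtime::bank::Bank;

// Writer side: drop zero-lamport accounts on rooted forks before the bank is serialized.
fn create_snapshot_sketch(bank: &Bank) {
    bank.purge_zero_lamport_accounts();
    // ... serialize the bank and its account storage to the snapshot directory ...
}

// Reader side: the rebuilt bank must match the expected hash and must not contain
// zero-lamport accounts that could shield real account state.
fn load_snapshot_sketch(bank: &Bank) {
    if !bank.verify_snapshot_bank() {
        panic!("Snapshot bank failed to verify");
    }
}
```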
An attack on the xor state could be made to influence its value.
The 440 byte image size comes from this paper, and makes an xor collision with 0 (or any other given bit pattern) infeasible: [https://link.springer.com/content/pdf/10.1007%2F3-540-45708-9_19.pdf](https://link.springer.com/content/pdf/10.1007%2F3-540-45708-9_19.pdf)
The math provides 128 bit security in this case:
```text
O(k * 2^(n/(1+lg(k))))
k=2^40 accounts
n=440
2^(40) * 2^(448 * 8 / 41) ~= O(2^(128))
```

View File

@ -2,44 +2,8 @@
## Problem
When a validator boots up from a snapshot, it needs a way to quickly verify that the account set matches what the rest of the network sees. A potential attacker could give the validator an incorrect state and then try to convince it to accept a transaction that would otherwise be rejected.
Snapshot verification of the account states is implemented, but the bank hash of the snapshot, which is used to verify, is falsifiable.
## Solution
Currently the bank hash is derived by hashing the delta state of the accounts in a slot, which is then combined with the previous bank hash value. The problem with this is that the list of hashes grows on the order of the number of slots processed by the chain and becomes a burden to both transmit and verify.
Another naive method could be to create a merkle tree of the account state. This has the downside that with each account update which removes an account state from the tree, the merkle tree would have to be recomputed from the entire account state of all live accounts in the system.
To verify the snapshot, we propose the following:
On account store, hash the following data:
* Account owner
* Account data
* Account pubkey
* Account lamports balance
* Fork the account is stored on
Use this resulting hash value as input to an expansion function which expands the hash value into an image value. The function will create a 440 byte block of data where the first 32 bytes are the hash value, and the next 440 - 32 bytes are generated from a ChaCha RNG with the hash as the seed.
The account images are then combined with xor. The previous account image is xored into the state (canceling its earlier contribution) and the new account image is also xored into the state.
Voting and sysvar hash values are computed from the hash of the resulting full image value.
On boot, when the validator loads from a snapshot, it verifies the hash value against its account set. It then uses SPV to display the percentage of the network that voted for the given hash value.
The resulting value can be verified by a validator to be the result of xoring all current account states together.
An attack on the xor state could be made to influence its value.
The 440 byte image size comes from this paper, and makes an xor collision with 0 (or any other given bit pattern) infeasible: [https://link.springer.com/content/pdf/10.1007%2F3-540-45708-9_19.pdf](https://link.springer.com/content/pdf/10.1007%2F3-540-45708-9_19.pdf)
The math provides 128 bit security in this case:
```text
O(k * 2^(n/(1+lg(k))))
k=2^40 accounts
n=440
2^(40) * 2^(448 * 8 / 41) ~= O(2^(128))
```
Use the simple payment verification (SPV) solution to verify the on-chain vote transactions that vote for the bank hash value.

View File

@ -150,6 +150,7 @@ where
}
pub fn add_snapshot<P: AsRef<Path>>(snapshot_path: P, bank: &Bank) -> Result<()> {
// Drop zero-lamport accounts on rooted forks before the bank is snapshotted.
bank.purge_zero_lamport_accounts();
let slot = bank.slot();
// snapshot_path/slot
let slot_snapshot_dir = get_bank_snapshot_dir(snapshot_path, slot);
@ -220,7 +221,7 @@ pub fn bank_from_archive<P: AsRef<Path>>(
unpacked_accounts_dir,
)?;
-if !bank.verify_hash_internal_state() {
+if !bank.verify_snapshot_bank() {
panic!("Snapshot bank failed to verify");
}

View File

@ -559,6 +559,26 @@ impl AccountsDB {
false
}
pub fn purge_zero_lamport_accounts(&self, ancestors: &HashMap<u64, usize>) {
// Collect pubkeys of zero-lamport accounts whose entries sit on rooted forks.
let accounts_index = self.accounts_index.read().unwrap();
let mut purges = Vec::new();
accounts_index.scan_accounts(ancestors, |pubkey, (account_info, fork)| {
if account_info.lamports == 0 && accounts_index.is_root(fork) {
purges.push(*pubkey);
}
});
drop(accounts_index);
// Purge the collected pubkeys from the index, gathering the rooted entries to reclaim.
let mut reclaims = Vec::new();
let mut accounts_index = self.accounts_index.write().unwrap();
for purge in &purges {
reclaims.extend(accounts_index.purge(purge));
}
let last_root = accounts_index.last_root;
drop(accounts_index);
// Remove the reclaimed account entries from storage and clean up any forks left dead.
let mut dead_forks = self.remove_dead_accounts(reclaims);
self.cleanup_dead_forks(&mut dead_forks, last_root);
}
pub fn scan_accounts<F, A>(&self, ancestors: &HashMap<Fork, usize>, scan_func: F) -> A
where
F: Fn(&mut A, Option<(&Pubkey, Account, Fork)>) -> (),
@ -746,6 +766,10 @@ impl AccountsDB {
}
pub fn hash_account_data(fork: Fork, lamports: u64, data: &[u8], pubkey: &Pubkey) -> Hash {
// Zero-lamport accounts do not contribute to the bank hash.
if lamports == 0 {
return Hash::default();
}
let mut hasher = Hasher::default();
let mut buf = [0u8; 8];

View File

@ -29,6 +29,17 @@ impl<T: Clone> AccountsIndex<T> {
}
}
/// Remove all rooted entries for `pubkey` from the index, returning them so the
/// caller can reclaim their storage. Entries on unrooted forks are kept.
pub fn purge(&mut self, pubkey: &Pubkey) -> Vec<(Fork, T)> {
let mut list = self.account_maps.get(&pubkey).unwrap().write().unwrap();
let reclaims = list
.iter()
.filter(|(fork, _)| self.is_root(*fork))
.cloned()
.collect();
list.retain(|(fork, _)| !self.is_root(*fork));
reclaims
}
// find the latest fork and T in a list for a given ancestor
// returns index into 'list' if found, None if not.
fn latest_fork(&self, ancestors: &HashMap<Fork, usize>, list: &[(Fork, T)]) -> Option<usize> {

View File

@ -1403,6 +1403,28 @@ impl Bank {
.verify_hash_internal_state(self.slot(), &self.ancestors)
}
/// A snapshot bank should be purged of 0 lamport accounts which are not part of the hash
/// calculation and could shield other real accounts.
pub fn verify_snapshot_bank(&self) -> bool {
self.rc
.accounts
.verify_hash_internal_state(self.slot(), &self.ancestors)
&& !self.has_accounts_with_zero_lamports()
}
fn has_accounts_with_zero_lamports(&self) -> bool {
self.rc.accounts.accounts_db.scan_accounts(
&self.ancestors,
|collector: &mut bool, option| {
if let Some((_, account, _)) = option {
if account.lamports == 0 {
*collector = true;
}
}
},
)
}
/// Return the number of ticks per slot
pub fn ticks_per_slot(&self) -> u64 {
self.ticks_per_slot
@ -1584,6 +1606,13 @@ impl Bank {
.accounts
.commit_credits(&self.ancestors, self.slot());
}
pub fn purge_zero_lamport_accounts(&self) {
self.rc
.accounts
.accounts_db
.purge_zero_lamport_accounts(&self.ancestors);
}
}
impl Drop for Bank {
@ -1753,6 +1782,71 @@ mod tests {
);
}
fn assert_no_zero_balance_accounts(bank: &Arc<Bank>) {
assert!(!bank.has_accounts_with_zero_lamports());
}
// Test that purging 0-lamport accounts works.
#[test]
fn test_purge_empty_accounts() {
solana_logger::setup();
let (genesis_block, mint_keypair) = create_genesis_block(500_000);
let parent = Arc::new(Bank::new(&genesis_block));
let mut bank = parent;
for _ in 0..10 {
let blockhash = bank.last_blockhash();
let pubkey = Pubkey::new_rand();
let tx = system_transaction::transfer_now(&mint_keypair, &pubkey, 0, blockhash);
bank.process_transaction(&tx).unwrap();
bank.squash();
bank = Arc::new(new_from_parent(&bank));
}
bank.purge_zero_lamport_accounts();
assert_no_zero_balance_accounts(&bank);
let bank0 = Arc::new(new_from_parent(&bank));
let blockhash = bank.last_blockhash();
let keypair = Keypair::new();
let tx = system_transaction::transfer_now(&mint_keypair, &keypair.pubkey(), 10, blockhash);
bank0.process_transaction(&tx).unwrap();
let bank1 = Arc::new(new_from_parent(&bank0));
let pubkey = Pubkey::new_rand();
let blockhash = bank.last_blockhash();
let tx = system_transaction::transfer_now(&keypair, &pubkey, 10, blockhash);
bank1.process_transaction(&tx).unwrap();
assert_eq!(bank0.get_account(&keypair.pubkey()).unwrap().lamports, 10);
assert_eq!(bank1.get_account(&keypair.pubkey()), None);
// Neither fork is rooted yet, so purging leaves the zero-lamport entry on bank1 in place.
bank0.purge_zero_lamport_accounts();
assert_eq!(bank0.get_account(&keypair.pubkey()).unwrap().lamports, 10);
assert_eq!(bank1.get_account(&keypair.pubkey()), None);
bank1.purge_zero_lamport_accounts();
assert_eq!(bank0.get_account(&keypair.pubkey()).unwrap().lamports, 10);
assert_eq!(bank1.get_account(&keypair.pubkey()), None);
assert!(bank0.verify_hash_internal_state());
// Squash and then verify hash_internal value
bank0.squash();
assert!(bank0.verify_hash_internal_state());
bank1.squash();
assert!(bank1.verify_hash_internal_state());
// keypair should have 0 tokens on both forks
assert_eq!(bank0.get_account(&keypair.pubkey()), None);
assert_eq!(bank1.get_account(&keypair.pubkey()), None);
bank1.purge_zero_lamport_accounts();
assert!(bank1.verify_hash_internal_state());
assert_no_zero_balance_accounts(&bank1);
}
#[test]
fn test_two_payments_to_one_party() {
let (genesis_block, mint_keypair) = create_genesis_block(10_000);