Cleanup and update Smart Contracts Engine RFC to what is currently in the code (#1539)

* Cleanup and update to the state of the code

* update

* render

* render

* comments on memory allocation
anatoly yakovenko 2018-10-19 06:08:49 -07:00 committed by GitHub
parent 0bd1412562
commit 0423cafbeb
1 changed files with 101 additions and 131 deletions


@ -1,10 +1,17 @@
# Smart Contracts Engine # Smart Contracts Engine
The goal of this RFC is to define a set of constraints for APIs and runtime such that we can execute our smart contracts safely on massively parallel hardware such as a GPU. Our runtime is built around an OS *syscall* primitive. The difference in blockchain is that now the OS does a cryptographic check of memory region ownership before accessing the memory in the Solana kernel. The goal of this RFC is to define a set of constraints for APIs and smart contracts runtime such that we can execute our contracts safely on massively parallel hardware such as a GPU.
## Version ## Version
version 0.2 Version 0.3
## Definitions
* Transaction - an atomic operation with multiple instructions. All Instructions must complete successfully for the transaction to be committed.
* Instruction - a call to a program that modifies Account token balances and Account specific userdata state. A single transaction may have multiple Instructions with different Accounts and Programs.
* Program - Programs are code that modifies Account token balances and Account specific userdata state.
* Account - A single instance of state. Accounts are looked up by account Pubkeys and are associated with a Program's Pubkey.
## Toolchain Stack ## Toolchain Stack
@ -39,173 +46,136 @@ In Figure 1 an untrusted client, creates a program in the front-end language of
## Runtime ## Runtime
The goal with the runtime is to have a general purpose execution environment that is highly parallelizable and doesn't require dynamic resource management. The goal is to execute as many contracts as possible in parallel, and have them pass or fail without a destructive state change. The goal with the runtime is to have a general purpose execution environment that is highly parallelizable. To achieve this goal the runtime forces each Instruction to specify all of its memory dependencies up front, and therefore a single Instruction cannot cause a dynamic memory allocation. An explicit memory allocation Instruction, `SystemProgram::CreateAccount`, is the only way to allocate new memory in the engine. A Transaction may compose multiple Instructions, including `SystemProgram::CreateAccount`, into a single atomic sequence, which allows memory allocation to achieve a result that is similar to dynamic allocation.
### State ### State
State is addressed by an account which is at the moment simply the Pubkey. Our goal is to eliminate memory allocation from within the smart contract itself. Thus the client of the contract provides all the state that is necessary for the contract to execute in the transaction itself. The runtime interacts with the contract through a state transition function, which takes a mapping of [(Pubkey,State)] and returns [(Pubkey, State')]. The State is an opaque type to the runtime, a `Vec<u8>`, the contents of which the contract has full control over. State is addressed by an Account which is at the moment simply the Pubkey. Our goal is to eliminate memory allocation from within the program itself. Thus the client of the program provides all the state that is necessary for the program to execute in the transaction itself. The runtime interacts with the program through an entry point with a well-defined interface. The userdata stored in an Account is an opaque type to the runtime, a `Vec<u8>`, the contents of which the program code has full control over.
### Call Structure ### Transaction structure
``` ```
/// Call definition /// An atomic transaction
/// Signed portion
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone)] #[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct CallData { pub struct Transaction {
/// Each Pubkey in this vector is mapped to a corresponding `Page` that is loaded for contract execution /// A digital signature of `account_keys`, `program_ids`, `last_id`, `fee` and `instructions`, signed by `Pubkey`.
/// In a simple pay transaction `key[0]` is the token owner's key and `key[1]` is the recipient's key. pub signature: Signature,
pub keys: Vec<Pubkey>,
/// The Pubkeys that are required to have a proof. The proofs are a `Vec<Signature>` which is encoded alongside this data structure /// The `Pubkeys` that are executing this transaction userdata. The meaning of each key is
/// Each Signature signs the `required_proofs` vector as well as the `keys` vectors. The transaction is valid if and only if all /// program-specific.
/// the required signatures are present and the public key vector is unchanged between signatures. /// * account_keys[0] - Typically this is the `caller` public key. `signature` is verified with account_keys[0].
pub required_proofs: Vec<u8>, /// In the future which key pays the fee and which keys have signatures would be configurable.
/// * account_keys[1] - Typically this is the program context or the recipient of the tokens
pub account_keys: Vec<Pubkey>,
/// PoH data /// The ID of a recent ledger entry.
/// last PoH hash observed by the sender
pub last_id: Hash, pub last_id: Hash,
/// Program /// The number of tokens paid for processing and storage of this transaction.
/// The address of the program we want to call. ContractId is just a Pubkey that is the address of the loaded code that will execute this Call.
pub contract_id: ContractId,
/// OS scheduling fee
pub fee: i64, pub fee: i64,
/// struct version to prevent duplicate spends
/// Calls with a version <= Page.version are rejected /// Keys identifying programs in the instructions vector.
pub version: u64, pub program_ids: Vec<Pubkey>,
/// method to call in the contract /// Programs that will be executed in sequence and committed in one atomic transaction if all
pub method: u8, /// succeed.
/// userdata in bytes pub instructions: Vec<Instruction>,
}
```
The Transaction structure specifies a list of Pubkeys and signatures for those keys, and a sequential list of instructions that will operate over the states associated with the `account_keys`. For the transaction to be committed, all the instructions must execute successfully; if any instruction aborts, the whole transaction fails to commit.
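The all-or-nothing commit described above can be illustrated with a minimal sketch. It assumes a simplified `Account` and represents each instruction as a plain function pointer; `execute_atomically`, `move_one_token` and `always_fails` are hypothetical names for illustration and are not part of the RFC or the codebase.

```rust
/// Simplified, hypothetical account type used only for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct Account {
    pub tokens: i64,
    pub userdata: Vec<u8>,
}

/// Stand-in for a program entry point: it may mutate the accounts or fail.
pub type Instruction = fn(&mut [Account]) -> Result<(), String>;

/// Runs every instruction against a scratch copy of the accounts and
/// writes the copy back only if all instructions succeed.
pub fn execute_atomically(
    accounts: &mut Vec<Account>,
    instructions: &[Instruction],
) -> Result<(), String> {
    let mut scratch = accounts.clone();
    for &instruction in instructions {
        // The first failure returns early, leaving `accounts` untouched.
        instruction(scratch.as_mut_slice())?;
    }
    *accounts = scratch;
    Ok(())
}

/// Example instruction: moves one token from accounts[0] to accounts[1].
pub fn move_one_token(accounts: &mut [Account]) -> Result<(), String> {
    if accounts.len() < 2 || accounts[0].tokens < 1 {
        return Err("insufficient tokens".to_string());
    }
    accounts[0].tokens -= 1;
    accounts[1].tokens += 1;
    Ok(())
}

/// Example instruction that always aborts.
pub fn always_fails(_accounts: &mut [Account]) -> Result<(), String> {
    Err("aborted".to_string())
}
```

Copying the accounts is the simplest way to show the semantics; a real engine could use a different rollback strategy.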
### Account structure
Accounts maintain token state as well as program specific memory.
```
/// An Account with userdata that is stored on chain
pub struct Account {
/// tokens in the account
pub tokens: i64,
/// user data
/// A transaction can write to its userdata
pub userdata: Vec<u8>, pub userdata: Vec<u8>,
} /// program id this Account belongs to
pub program_id: Pubkey,
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct Call {
/// Signatures and Keys
/// (signature, key index)
/// This vector contains a tuple of signatures, and the key index the signature is for
/// proofs[0] is always key[0]
pub proofs: Vec<Signature>,
pub data: CallData,
} }
``` ```
At its core, this is just a set of Pubkeys and Signatures with a bit of metadata. The contract Pubkey routes this transaction into that contract's entry point. `version` is used for dropping retransmitted requests. # Transaction Engine
Contracts should be able to read any state that is part of runtime, but only write to state that the contract allocated. At its core, the engine looks up all the Pubkeys, maps them to accounts, and routes them to the `program_id` entry point.
### Execution ## Execution
Calls batched and processed in a pipeline Transactions are batched and processed in a pipeline
``` ```
+-----------+ +-------------+ +--------------+ +--------------------+ +-----------+ +-------------+ +--------------+ +--------------------+
| sigverify |--->| lock memory |--->| validate fee |--->| allocate new pages |---> | sigverify |--->| lock memory |--->| validate fee |--->| allocate accounts |--->
+-----------+ +-------------+ +--------------+ +--------------------+ +-----------+ +-------------+ +--------------+ +--------------------+
+------------+ +---------+ +--------------+ +--------------+ +------------+ +---------+ +--------------+ +--------------+
--->| load pages |--->| execute |--->|unlock memory |--->| commit pages | --->| load data |--->| execute |--->| commit data |-->|unlock memory |
+------------+ +---------+ +--------------+ +--------------+ +------------+ +---------+ +--------------+ +--------------+
``` ```
At the `execute` stage, the loaded pages have no data dependencies, so all the contracts can be executed in parallel. At the `execute` stage, the loaded pages have no data dependencies, so all the programs can be executed in parallel.
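One way to see why the `execute` stage has no data dependencies is to sketch what a `lock memory` stage could do. The following is an assumption about how such a stage might work, not a description of the actual implementation: a transaction is admitted to the batch only if none of its account keys is already locked by an earlier transaction in the same batch. `Pubkey` as a 32-byte array and the function `lock_batch` are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a public key.
pub type Pubkey = [u8; 32];

/// Given the account keys of each transaction, returns the indices of the
/// transactions admitted to the batch. Admitted transactions never share an
/// account key, so they can be executed in parallel.
pub fn lock_batch(transactions: &[Vec<Pubkey>]) -> Vec<usize> {
    let mut locked: HashSet<Pubkey> = HashSet::new();
    let mut admitted = Vec::new();
    for (index, keys) in transactions.iter().enumerate() {
        // Locking is all-or-nothing: skip the transaction if any key is taken.
        if keys.iter().any(|key| locked.contains(key)) {
            continue;
        }
        locked.extend(keys.iter().copied());
        admitted.push(index);
    }
    admitted
}
```

In this sketch a skipped transaction locks nothing, so a later transaction that only overlaps with the skipped one can still be admitted.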
## Memory Management
```
pub struct Page {
/// key that indexes this page
/// prove ownership of this key to spend from this Page
owner: Pubkey,
/// contract that owns this page
/// contract can write to the data that is in `memory` vector
contract: Pubkey,
/// balance that belongs to owner
balance: u64,
/// version of the structure, public for testing
version: u64,
/// hash of the page data
memhash: Hash,
/// The following could be in a separate structure
memory: Vec<u8>,
}
```
The guarantee that runtime enforces: The runtime enforces the following rules:
1. The contract code is the only code that will modify the contents of `memory`
2. Total balances on all the pages is equal before and after execution of a call 1. The `program_id` code is the only code that will modify the contents of `Account::userdata` of Accounts that have been assigned to it. This means that upon assignment the userdata vector is guaranteed to be zeroed.
3. Balances of each of the pages not owned by the contract must be equal to or greater after the call than before the call. 2. Total balances on all the accounts is equal before and after execution of a Transaction.
3. Balances of each of the accounts not assigned to `program_id` must be equal to or greater after the Transaction than before the transaction.
4. All Instructions in the Transaction executed without a failure.
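Rules 1 to 3 can be expressed as a comparison of account state before and after a program runs. The sketch below is a simplified illustration under assumptions: `verify_rules` is a hypothetical name, `Pubkey` is modelled as a 32-byte array, and the check is applied to a single program invocation rather than to a whole Transaction.

```rust
/// Hypothetical stand-in for a public key.
pub type Pubkey = [u8; 32];

/// Simplified account mirroring the fields described in this RFC.
#[derive(Clone, Debug, PartialEq)]
pub struct Account {
    pub tokens: i64,
    pub userdata: Vec<u8>,
    pub program_id: Pubkey,
}

/// Compares the accounts before (`pre`) and after (`post`) a program ran.
pub fn verify_rules(
    program_id: &Pubkey,
    pre: &[Account],
    post: &[Account],
) -> Result<(), &'static str> {
    if pre.len() != post.len() {
        return Err("account count changed");
    }
    for (before, after) in pre.iter().zip(post) {
        if before.program_id != *program_id {
            // Rule 1: only the assigned program may modify userdata.
            if before.userdata != after.userdata {
                return Err("userdata of a foreign account was modified");
            }
            // Rule 3: a program cannot spend tokens of accounts it does not own.
            if after.tokens < before.tokens {
                return Err("tokens of a foreign account were spent");
            }
        }
    }
    // Rule 2: the total number of tokens is conserved.
    let pre_total: i64 = pre.iter().map(|a| a.tokens).sum();
    let post_total: i64 = post.iter().map(|a| a.tokens).sum();
    if pre_total != post_total {
        return Err("total balance changed");
    }
    Ok(())
}
```

Rule 4 is not covered here; it follows from aborting the whole Transaction as soon as any Instruction fails.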
## Entry Point ## Entry Point
Execution of the contract involves mapping the contract's public key to an entry point which takes a pointer to the transaction, and an array of loaded pages. Execution of the program involves mapping the Program's public key to an entry point which takes a pointer to the transaction, and an array of loaded pages.
``` ```
// Find the method pub fn process_transaction(
match (tx.contract, tx.method) { tx: &Transaction,
// system interface pix: usize,
// everyone has the same reallocate accounts: &mut [&mut Account],
(_, 0) => system_0_realloc(&tx, &mut call_pages), ) -> Result<()>;
(_, 1) => system_1_assign(&tx, &mut call_pages),
// contract methods
(DEFAULT_CONTRACT, 128) => default_contract_128_move_funds(&tx, &mut call_pages),
(contract, method) => //...
``` ```
The first 127 methods are reserved for the system interface, which implements allocation and assignment of memory. The rest, including the contract for moving funds are implemented by the contract itself.
## System Interface ## System Interface
``` ```
/// SYSTEM interface, same for every contract, methods 0 to 127 pub enum SystemProgram {
/// method 0 /// Create a new account
/// reallocate /// * Transaction::keys[0] - source
/// spend the funds from the call to the first recipient's /// * Transaction::keys[1] - new account key
pub fn system_0_realloc(call: &Call, pages: &mut Vec<Page>) { /// * tokens - number of tokens to transfer to the new account
if call.contract == DEFAULT_CONTRACT { /// * space - memory to allocate if greater than zero
let size: u64 = deserialize(&call.userdata).unwrap(); /// * program_id - the program id of the new account
pages[0].memory.resize(size as usize, 0u8); CreateAccount {
} tokens: i64,
} space: u64,
/// method 1 program_id: Pubkey,
/// assign },
/// assign the page to a contract /// Assign account to a program
pub fn system_1_assign(call: &Call, pages: &mut Vec<Page>) { /// * Transaction::keys[0] - account to assign
let contract = deserialize(&call.userdata).unwrap(); Assign { program_id: Pubkey },
if call.contract == DEFAULT_CONTRACT { /// Move tokens
pages[0].contract = contract; /// * Transaction::keys[0] - source
//zero out the memory in pages[0].memory /// * Transaction::keys[1] - destination
//Contracts need to own the state of that data otherwise a user could fabricate the state and Move { tokens: i64 },
//manipulate the contract
pages[0].memory.clear();
}
} }
``` ```
The first method resizes the memory that is associated with the caller's page. The second system call assigns the page to the contract. Both methods check if the current contract is 0, otherwise the method does nothing and the caller spent their fees. The interface is best described by the `Instruction::userdata` that the user encodes.
* `CreateAccount` - This allows the user to create and assign an Account to a Program.
This ensures that when memory is assigned to the contract the initial state of all the bytes is 0, and the contract itself is the only thing that can modify that state. * `Assign` - allows the user to assign an existing account to a `Program`.
* `Move` - moves tokens between `Account`s that are associated with `SystemProgram`. This cannot be used to move tokens of other `Account`s. Programs need to implement their own version of Move.
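The three instructions above could be handled roughly as follows. This is a sketch under stated assumptions, not the actual implementation: `process` and `SYSTEM_PROGRAM_ID` are hypothetical names, `Pubkey` is modelled as a 32-byte array, the all-zero key standing for the system program is an arbitrary choice, and zeroing `userdata` on `Assign` is one reading of the rule that memory is zero initialized on assignment.

```rust
/// Hypothetical stand-in for a public key.
pub type Pubkey = [u8; 32];

/// Assumed id of the system program; the real value is not given in the RFC.
pub const SYSTEM_PROGRAM_ID: Pubkey = [0u8; 32];

#[derive(Clone, Debug, PartialEq)]
pub struct Account {
    pub tokens: i64,
    pub userdata: Vec<u8>,
    pub program_id: Pubkey,
}

pub enum SystemProgram {
    CreateAccount { tokens: i64, space: u64, program_id: Pubkey },
    Assign { program_id: Pubkey },
    Move { tokens: i64 },
}

/// Applies one system instruction; accounts[0] is the source or the account
/// to assign, accounts[1] is the new account or the destination.
pub fn process(
    instruction: &SystemProgram,
    accounts: &mut [Account],
) -> Result<(), &'static str> {
    match *instruction {
        SystemProgram::CreateAccount { tokens, space, program_id } => {
            if accounts.len() < 2 {
                return Err("expected source and new account");
            }
            if tokens < 0 || accounts[0].tokens < tokens {
                return Err("insufficient tokens");
            }
            accounts[0].tokens -= tokens;
            accounts[1].tokens += tokens;
            // Newly allocated memory starts out zeroed.
            accounts[1].userdata = vec![0u8; space as usize];
            accounts[1].program_id = program_id;
            Ok(())
        }
        SystemProgram::Assign { program_id } => {
            if accounts.is_empty() {
                return Err("expected an account to assign");
            }
            // Zero the memory so the new owner starts from a known state.
            accounts[0].userdata.fill(0);
            accounts[0].program_id = program_id;
            Ok(())
        }
        SystemProgram::Move { tokens } => {
            if accounts.len() < 2 {
                return Err("expected source and destination");
            }
            if accounts[0].program_id != SYSTEM_PROGRAM_ID {
                return Err("source is not owned by the system program");
            }
            if tokens < 0 || accounts[0].tokens < tokens {
                return Err("insufficient tokens");
            }
            accounts[0].tokens -= tokens;
            accounts[1].tokens += tokens;
            Ok(())
        }
    }
}
```

Once an account has been assigned to another program, `Move` rejects it as a source in this sketch, which matches the statement that Programs need to implement their own version of Move.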
## Simplest contract
```
/// DEFAULT_CONTRACT interface
/// All contracts start with 128
/// method 128
/// move_funds
/// spend the funds from the call to the first recipient's
pub fn default_contract_128_move_funds(call: &Call, pages: &mut Vec<Page>) {
let amount: u64 = deserialize(&call.userdata).unwrap();
if pages[0].balance >= amount {
pages[0].balance -= amount;
pages[1].balance += amount;
}
}
```
This simply moves the amount from page[0], which is the caller's page, to page[1], which is the recipient's page.
## Notes ## Notes
1. There is no dynamic memory allocation. 1. There is no dynamic memory allocation. Clients need to call the `SystemProgram` to create memory before passing it to another program. This Instruction can be composed into a single Transaction with the call to the program itself.
2. Persistent Memory is allocated to a Key with ownership 2. Runtime guarantees that when memory is assigned to the `Program` it is zero initialized.
3. Contracts can `call` to update key owned state 3. Runtime guarantees that `Program`'s code is the only thing that can modify memory that is assigned to it
4. `call` is just a *syscall* that does a cryptographic check of memory ownership 4. Runtime guarantees that the `Program` can only spend tokens that are in `Account`s that are assigned to it
5. Kernel guarantees that when memory is assigned to the contract its state is 0 5. Runtime guarantees the balances belonging to `Account`s are balanced before and after the transaction
6. Kernel guarantees that contract is the only thing that can modify memory that is assigned to it 6. Runtime guarantees that multiple instructions all executed successfully when a transaction is committed.
7. Kernel guarantees that the contract can only spend tokens that are in pages that are assigned to it
8. Kernel guarantees the balances belonging to pages are balanced before and after the call # Future Work
* Continuations and Signals for long running Transactions. https://github.com/solana-labs/solana/issues/1485