# Smart Contracts Engine
The goal of this RFC is to define a set of constraints for APIs and the runtime such that we can safely execute our smart contracts on massively parallel hardware such as a GPU.

## Toolchain Stack

    +---------------------+       +---------------------+
    |                     |       |                     |
    |   +------------+    |       |   +------------+    |
    |   |            |    |       |   |            |    |
    |   |  frontend  |    |       |   |  verifier  |    |
    |   |            |    |       |   |            |    |
    |   +-----+------+    |       |   +-----+------+    |
    |         |           |       |         |           |
    |         |           |       |         |           |
    |   +-----+------+    |       |   +-----+------+    |
    |   |            |    |       |   |            |    |
    |   |    llvm    |    |       |   |   loader   |    |
    |   |            |    +------>+   |            |    |
    |   +-----+------+    |       |   +-----+------+    |
    |         |           |       |         |           |
    |         |           |       |         |           |
    |   +-----+------+    |       |   +-----+------+    |
    |   |            |    |       |   |            |    |
    |   |    ELF     |    |       |   |  runtime   |    |
    |   |            |    |       |   |            |    |
    |   +------------+    |       |   +------------+    |
    |                     |       |                     |
    |        client       |       |       solana        |
    +---------------------+       +---------------------+

[Figure 1. Smart Contracts Stack]

In Figure 1, an untrusted client writes a program in the front-end language of her choice (such as C, C++, Rust, or Lua) and compiles it with LLVM to a position-independent shared object ELF targeting BPF bytecode. Solana will safely load and execute the ELF.

## Bytecode
Our bytecode is based on the Berkeley Packet Filter (BPF). The requirements for BPF overlap almost exactly with our own:

1. Deterministic amount of time to execute the code
2. Bytecode that is portable between machine instruction sets
3. Verified memory accesses
4. Fast to load the object, verify the bytecode and JIT to local machine instruction set

For 1, this means that loops are unrolled, and any jump back is guarded by a check against the number of instructions executed so far. If the limit is reached, the program yields its execution, which involves saving the stack and the current instruction index.
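
The guard on backward jumps can be illustrated with a toy interpreter. This is only a sketch under stated assumptions: the three-opcode instruction set, `struct vm_state`, and `vm_run` are hypothetical names invented for illustration and are not real BPF opcodes or part of the proposed runtime. A single accumulator stands in for the stack that would need to be saved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical toy instruction set, not real BPF opcodes.
enum op { OP_ADD, OP_JMP_BACK_IF_LT, OP_EXIT };

struct insn {
    enum op op;
    int64_t imm;    // OP_ADD: value added to acc; OP_JMP_BACK_IF_LT: loop bound
    size_t target;  // OP_JMP_BACK_IF_LT: instruction index to jump back to
};

// State that is saved when a program yields so that it can resume later.
struct vm_state {
    size_t pc;          // current instruction index
    int64_t acc;        // stands in for the saved stack
    uint64_t executed;  // instructions executed in the current slice
};

enum vm_status { VM_DONE, VM_YIELDED };

// Run until OP_EXIT, or yield at a backward jump once the budget is used up.
enum vm_status vm_run(const struct insn *prog, size_t len,
                      struct vm_state *st, uint64_t budget)
{
    st->executed = 0;
    while (st->pc < len) {
        const struct insn *in = &prog[st->pc];
        st->executed++;
        switch (in->op) {
        case OP_ADD:
            st->acc += in->imm;
            st->pc++;
            break;
        case OP_JMP_BACK_IF_LT:
            if (st->acc < in->imm) {
                assert(in->target <= st->pc);  // only backward jumps are guarded
                st->pc = in->target;
                // Guard: check the instruction count on every jump back.
                if (st->executed >= budget) {
                    return VM_YIELDED;  // pc and acc are already saved in *st
                }
            } else {
                st->pc++;
            }
            break;
        case OP_EXIT:
            return VM_DONE;
        }
    }
    return VM_DONE;
}
```

Calling `vm_run` again with the same `struct vm_state` resumes the program where it yielded. Because the check only happens at backward jumps, straight-line code pays nothing for it.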

For 2, BPF bytecode already maps easily to x86-64, arm64, and other instruction sets.

For 3, every load and store that is relative can be checked to be within the expected memory that is passed into the ELF. Dynamic loads and stores can do a runtime check against available memory; these will be slow and should be avoided.
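
A runtime check for a dynamic load might look like the following sketch. The names `struct region`, `access_ok`, and `checked_load_u32` are hypothetical and only illustrate the idea; the RFC does not define how the runtime describes the memory passed into the program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical description of the memory passed into the program.
struct region {
    uint8_t *base;
    size_t len;
};

// True if [offset, offset + size) lies entirely inside the region.
// Written so that offset + size is never computed and cannot overflow.
bool access_ok(const struct region *r, size_t offset, size_t size)
{
    return size <= r->len && offset <= r->len - size;
}

// Runtime-checked 32-bit load: fails instead of reading out of range.
bool checked_load_u32(const struct region *r, size_t offset, uint32_t *out)
{
    assert(r->base != NULL);
    if (!access_ok(r, offset, sizeof *out)) {
        return false;
    }
    memcpy(out, r->base + offset, sizeof *out);  // memcpy tolerates unaligned offsets
    return true;
}
```

The extra compare and branch on every dynamic access is the cost the paragraph above warns about. Accesses at a fixed relative offset can instead be validated once, when the bytecode is verified.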

For 4, the program is a statically linked PIC ELF with just a single RX segment. Effectively, we link a shared object with `-fpic -target bpf` and a linker script that collects everything into a single RX segment. Writable globals are not supported at the moment.

## Loader
The loader is our first smart contract. The job of this contract is to load the actual program with its own instance data. The loader expects the shared object to implement the following methods:
```
void map(const struct module_data *module_data, struct transaction *tx, uint8_t *scratch);

void reduce(
    const struct module_data *module_data,
    const struct transaction *txs,
    uint32_t num,
    const struct reduce *reductions,
    uint32_t num_rs,
    struct reduce *reduced
);

void finalize(
    const struct module_data *module_data,
    const struct transaction *txs,
    uint32_t num,
    struct reduce *reduce
);
```
The `module_data` structure is configured by the client. It contains the `struct solana_module` structure at the top, which defines how to calculate how much buffer to provide for each step.

A client first sends a transaction to create a new loader instance:

* `Solana_NewLoader(Loader instance PubKey, proof of key ownership, space I need for my ELF)`

The client then sends a series of transactions to load its ELF into the loader instance it created, and creates a new instance from it:

* `Loader_UploadElf(Loader instance PubKey, proof of key ownership, pos start, pos end, data)`
* `Loader_NewInstance(Loader instance PubKey, proof of key ownership, Instance PubKey, proof of key ownership)`

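
Because the ELF arrives in pieces, each upload transaction has to be placed at the right offset in the space reserved for it. The sketch below shows one way that could work. It assumes that `pos end` is exclusive and that chunks may arrive in any order; `struct loader_instance` and `upload_chunk` are hypothetical names, and the proof of key ownership is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Hypothetical in-memory view of a loader instance.
struct loader_instance {
    uint8_t *elf;       // buffer reserved when the instance was created
    uint32_t capacity;  // the space requested for the ELF
};

// Copy data into elf[pos_start, pos_end).
// Rejects empty, reversed, or out-of-range chunks without writing anything.
bool upload_chunk(struct loader_instance *li, uint32_t pos_start,
                  uint32_t pos_end, const uint8_t *data)
{
    assert(li->elf != NULL);
    if (pos_start >= pos_end || pos_end > li->capacity) {
        return false;
    }
    memcpy(li->elf + pos_start, data, pos_end - pos_start);
    return true;
}
```

Addressing each chunk by explicit positions means an upload can be retried or reordered without corrupting the ELF, since writing the same range twice with the same data has no further effect.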
The client then sends a series of transactions to load its module data into the instance it created:

* `Instance_UploadModuleData(Instance PubKey, proof of key ownership, pos start, pos end, data)`

```
struct module_hdr {
    struct pubkey owner;
    uint32_t map_scratch_size;
    uint32_t map_data_size;
    uint32_t reduce_size;
    uint32_t reduce_scratch_size;
    uint32_t finalize_scratch_size;
};
```

At this point the client may need to upload more R user data to the OS via further transactions to the loader.

* `Instance_Start(Instance PubKey, proof of key ownership)`

Once the instance has started, clients can begin sending transactions to it.

## Parallelizable Runtime
To parallelize smart contract execution, we plan to break contracts up into distinct interfaces: Map/Collect/Reduce/Finalize.

### Map and Collect
```
struct transaction {
    struct transaction_msg msg;
    uint8_t favorite;
};

struct module_data {
    struct module_hdr hdr;
};

void map(const struct module_data *module_data, struct transaction *tx, uint8_t *scratch)
{
    // msg.userdata is a fixed-size buffer, defined by the network protocol,
    // that carries input from the user via the transaction
    tx->favorite = tx->msg.userdata[0];
    // ask the runtime to collect this transaction for the reduce stage
    collect(&tx->msg);
}
```
The contract's object file implements a map function and lays out memory that is allocated per transaction. It then tells the runtime to collect this transaction for further processing if it's accepted by the contract. The mapped memory is stored as part of the transaction, and only transactions that succeed in a `collect` call will get accepted by this contract and move to the next stage.
### Reduce
```
struct reduce {
    struct reduce_hdr hdr;
    uint64_t votes[256];
};

void reduce(
    const struct module_data *module_data,
    const struct transaction *txs,
    uint32_t num,
    const struct reduce *reductions,
    uint32_t num_rs,
    struct reduce *reduced
) {
    // txs holds only transactions that map() passed to collect()
    for (uint32_t i = 0; i < num; ++i) {
        reduced->votes[txs[i].favorite] += txs[i].msg.amount;
    }
    // fold in the partial results of earlier reduce calls
    for (uint32_t i = 0; i < num_rs; ++i) {
        for (int j = 0; j < 256; ++j) {
            reduced->votes[j] += reductions[i].votes[j];
        }
    }
}
```
Reduce allows the contract to accumulate the results of all the `collect` and `reduce` calls into a single structure.
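
Since a reduce call can fold in earlier reductions, batches of transactions can be reduced independently and merged afterwards. The sketch below illustrates this with simplified stand-ins for the RFC's structures: `struct tx`, `struct tally`, `reduce_votes`, and `reduce_in_two_batches` are hypothetical names, and the runtime-defined headers are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Simplified stand-ins for the transaction and reduce structures.
struct tx { uint64_t amount; uint8_t favorite; };
struct tally { uint64_t votes[256]; };

// Same shape as reduce(): fold in transactions, then fold in partial tallies.
void reduce_votes(const struct tx *txs, uint32_t num,
                  const struct tally *parts, uint32_t num_parts,
                  struct tally *out)
{
    assert(out != NULL);
    for (uint32_t i = 0; i < num; ++i) {
        out->votes[txs[i].favorite] += txs[i].amount;
    }
    for (uint32_t i = 0; i < num_parts; ++i) {
        for (int j = 0; j < 256; ++j) {
            out->votes[j] += parts[i].votes[j];
        }
    }
}

// Reduce two halves independently, as parallel workers could, then merge them.
void reduce_in_two_batches(const struct tx *txs, uint32_t num, struct tally *out)
{
    struct tally parts[2];
    uint32_t half = num / 2;
    memset(parts, 0, sizeof parts);
    memset(out, 0, sizeof *out);
    reduce_votes(txs, half, NULL, 0, &parts[0]);
    reduce_votes(txs + half, num - half, NULL, 0, &parts[1]);
    reduce_votes(NULL, 0, parts, 2, out);
}
```

Because addition is associative and commutative, the merged tally matches what a single pass over all transactions would produce, which is the property that lets the runtime split the work across many threads.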
### Finalize
Finalize is then called when some final condition occurs. This could be when the time expires on the contract, or a direct call to finalize itself, such as `finalize(reduce)`.

```
// MIN and spend() are assumed to be provided by the runtime headers
void finalize(
    const struct module_data *module_data,
    const struct transaction *txs,
    uint32_t num,
    struct reduce *reduce
) {
    uint32_t i, s = 0;
    uint64_t total = 0;
    uint8_t max = 0;
    // find the winning choice and the total amount voted
    for (i = 0; i < 256; ++i) {
        if (reduce->votes[max] < reduce->votes[i]) {
            max = (uint8_t)i;
        }
        total += reduce->votes[i];
    }
    // now we have to spend the transactions
    for (i = 0; i < num; ++i) {
        const struct transaction *dst = &txs[i];
        if (dst->favorite != max) {
            continue;
        }
        // the winner's share of the total, proportional to its own amount
        uint64_t award = total * dst->msg.amount / reduce->votes[max];
        for (; s < num; ++s) {
            const struct transaction *src = &txs[s];
            if (src->favorite == max) {
                continue;
            }
            uint64_t amt = MIN(src->msg.amount, award);
            // mark the src transaction as spent
            spend(&src->msg, amt, dst->msg.from);
            award -= amt;
            if (award == 0) {
                break;
            }
        }
    }
    // spend the rounding errors on myself
    for (; s < num; ++s) {
        const struct transaction *src = &txs[s];
        spend(&src->msg, src->msg.amount, module_data->hdr.owner);
    }
}
```
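
The award computed in `finalize` uses integer division, which is where the rounding errors mentioned in the code come from. A minimal worked example, with `award_for` as a hypothetical helper name that isolates that one expression:

```c
#include <assert.h>
#include <stdint.h>

// Share of the total for one winning transaction: total * stake / winning_votes.
// Integer division rounds down, so the shares can sum to less than total.
uint64_t award_for(uint64_t total, uint64_t stake, uint64_t winning_votes)
{
    assert(winning_votes != 0);
    return total * stake / winning_votes;  // assumes total * stake fits in 64 bits
}
```

With a total of 100 and two winning transactions of 10 and 20 (30 winning votes), `award_for` yields 33 and 66. The one unit left over is the kind of remainder that the last loop in `finalize` spends to the module owner.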
## Notes
1. There is no dynamic memory allocation.
2. Transactions are tracked by the runtime, not by the contract.
3. Transactions must be spent. If they are not spent, the runtime can cancel them and refund them minus fees.