direct mapping: misc fixes (#32649)

* transaction_context: update make_data_mut comment

* bpf_loader: cpi: pass SerializeAccountMetadata to CallerAccount::from*

We now have a way to provide CallerAccount with trusted values coming
from our internal serialization code and not from untrusted vm space

* bpf_loader: direct_mapping: enforce account info pointers to be immutable

When direct mapping is enabled, we might need to update account data
memory regions across CPI calls. Since the only way we have to retrieve
the regions is based on their vm addresses, we enforce vm addresses to
be stable.  Accounts can still be mutated and resized of course, but it
must be done in place.

This also locks all other AccountInfo pointers, since there's no legitimate
reason to make them point to anything else.

* bpf_loader: cpi: access ref_to_len_in_vm through VmValue

Direct mapping needs to translate vm values at each access since
permissions of the underlying memory might have changed.

* direct mapping: improve memory permission tracking across CPI calls

Ensure that the data and realloc regions of an account always track the
account's permissions. In order to do this, we also need to split
realloc regions in their own self contained regions, where before we
had:

[account fields][account data][account realloc + more account fields + next account fields][next account data][...]

we now have:

[account fields][account data][account realloc][more account fields + next account fields][next account data][...]

Tested in TEST_[FORBID|ALLOW]_WRITE_AFTER_OWNERSHIP_CHANGE*

Additionally when direct mapping is on, we must update all perms at once before
doing account data updates. Otherwise, updating an account might write into
another account whose perms we haven't updated yet. Tested in
TEST_FORBID_LEN_UPDATE_AFTER_OWNERSHIP_CHANGE.

* bpf_loader: serialization: address review comment don't return vm_addr from push_account_region

* bpf_loader: rename push_account_region to push_account_data_region

* cpi: fix slow edge case zeroing extra account capacity after shrinking an account

When returning from CPI we need to zero all the account memory up to the
original length only if we know we're potentially dealing with uninitialized
memory.

When we know that the spare capacity has deterministic content, we only need to
zero new_len..prev_len.

This fixes a slow edge case that was triggerable by the following scenario:

- load a large account (say 10MB) into the vm
- shrink to 10 bytes - would memset 10..10MB
- shrink to 9 bytes - would memset 9..10MB
- shrink to 8 bytes - would memset 8..10MB
- ...

Now instead in the scenario above the following will happen:

- load a large account (say 10MB) into the vm
- shrink to 10 bytes - memsets 10..10MB
- shrink to 9 bytes - memsets 9..10
- shrink to 8 bytes - memset 8..9
- ...

* bpf_loader: add account_data_region_memory_state helper

Shared between serialization and CPI to figure out the MemoryState of an
account.

* cpi: direct_mapping: error out if ref_to_len_in_vm points to account memory

If ref_to_len_in_vm is allowed to be in account memory, calles could mutate it,
essentially letting callees directly mutate callers memory.

* bpf_loader: direct_mapping: map AccessViolation -> InstructionError

Return the proper ReadonlyDataModified / ExecutableDataModified /
ExternalAccountDataModified depending on where the violation occurs

* bpf_loader: cpi: remove unnecessary infallible slice::get call
This commit is contained in:
Alessandro Decina 2023-08-30 16:57:24 +07:00 committed by GitHub
parent b8dc5daedb
commit 0f41719918
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
9 changed files with 797 additions and 266 deletions

1
Cargo.lock generated
View File

@ -5372,6 +5372,7 @@ dependencies = [
"log",
"memoffset 0.9.0",
"rand 0.8.5",
"scopeguard",
"solana-measure",
"solana-program-runtime",
"solana-sdk",

View File

@ -153,6 +153,10 @@ pub struct SyscallContext {
#[derive(Debug, Clone)]
pub struct SerializedAccountMetadata {
pub original_data_len: usize,
pub vm_data_addr: u64,
pub vm_key_addr: u64,
pub vm_lamports_addr: u64,
pub vm_owner_addr: u64,
}
pub struct InvokeContext<'a> {

View File

@ -15,6 +15,7 @@ byteorder = { workspace = true }
libsecp256k1 = { workspace = true }
log = { workspace = true }
rand = { workspace = true }
scopeguard = { workspace = true }
solana-measure = { workspace = true }
solana-program-runtime = { workspace = true }
solana-sdk = { workspace = true }

View File

@ -1495,12 +1495,19 @@ fn execute<'a, 'b: 'a>(
let log_collector = invoke_context.get_log_collector();
let transaction_context = &invoke_context.transaction_context;
let instruction_context = transaction_context.get_current_instruction_context()?;
let program_id = *instruction_context.get_last_program_key(transaction_context)?;
let (program_id, is_loader_deprecated) = {
let program_account =
instruction_context.try_borrow_last_program_account(transaction_context)?;
(
*program_account.get_key(),
*program_account.get_owner() == bpf_loader_deprecated::id(),
)
};
#[cfg(any(target_os = "windows", not(target_arch = "x86_64")))]
let use_jit = false;
#[cfg(all(not(target_os = "windows"), target_arch = "x86_64"))]
let use_jit = executable.get_compiled_program().is_some();
let bpf_account_data_direct_mapping = invoke_context
let direct_mapping = invoke_context
.feature_set
.is_active(&bpf_account_data_direct_mapping::id());
@ -1511,18 +1518,26 @@ fn execute<'a, 'b: 'a>(
invoke_context
.feature_set
.is_active(&cap_bpf_program_instruction_accounts::ID),
!bpf_account_data_direct_mapping,
!direct_mapping,
)?;
serialize_time.stop();
// save the account addresses so in case of AccessViolation below we can
// map to InstructionError::ReadonlyDataModified, which is easier to
// diagnose from developers
let account_region_addrs = regions
// save the account addresses so in case we hit an AccessViolation error we
// can map to a more specific error
let account_region_addrs = accounts_metadata
.iter()
.map(|r| r.vm_addr..r.vm_addr.saturating_add(r.len))
.map(|m| {
let vm_end = m
.vm_data_addr
.saturating_add(m.original_data_len as u64)
.saturating_add(if !is_loader_deprecated {
MAX_PERMITTED_DATA_INCREASE as u64
} else {
0
});
m.vm_data_addr..vm_end
})
.collect::<Vec<_>>();
let addr_is_account_data = |addr: u64| account_region_addrs.iter().any(|r| r.contains(&addr));
let mut create_vm_time = Measure::start("create_vm");
let mut execute_time;
@ -1574,22 +1589,43 @@ fn execute<'a, 'b: 'a>(
};
Err(Box::new(error) as Box<dyn std::error::Error>)
}
ProgramResult::Err(error) => {
let error = match error.downcast_ref() {
Some(EbpfError::AccessViolation(
ProgramResult::Err(mut error) => {
if direct_mapping {
if let Some(EbpfError::AccessViolation(
_pc,
AccessType::Store,
address,
_size,
_section_name,
)) if addr_is_account_data(*address) => {
// We can get here if direct_mapping is enabled and a program tries to
// write to a readonly account. Map the error to ReadonlyDataModified so
// it's easier for devs to diagnose what happened.
Box::new(InstructionError::ReadonlyDataModified)
)) = error.downcast_ref()
{
// If direct_mapping is enabled and a program tries to write to a readonly
// region we'll get a memory access violation. Map it to a more specific
// error so it's easier for developers to see what happened.
if let Some((instruction_account_index, _)) = account_region_addrs
.iter()
.enumerate()
.find(|(_, vm_region)| vm_region.contains(address))
{
let transaction_context = &invoke_context.transaction_context;
let instruction_context =
transaction_context.get_current_instruction_context()?;
let account = instruction_context.try_borrow_instruction_account(
transaction_context,
instruction_account_index as IndexOfAccount,
)?;
error = Box::new(if account.is_executable() {
InstructionError::ExecutableDataModified
} else if account.is_writable() {
InstructionError::ExternalAccountDataModified
} else {
InstructionError::ReadonlyDataModified
})
}
}
_ => error,
};
}
Err(error)
}
_ => Ok(()),
@ -1615,12 +1651,8 @@ fn execute<'a, 'b: 'a>(
let mut deserialize_time = Measure::start("deserialize");
let execute_or_deserialize_result = execution_result.and_then(|_| {
deserialize_parameters(
invoke_context,
parameter_bytes.as_slice(),
!bpf_account_data_direct_mapping,
)
.map_err(|error| Box::new(error) as Box<dyn std::error::Error>)
deserialize_parameters(invoke_context, parameter_bytes.as_slice(), !direct_mapping)
.map_err(|error| Box::new(error) as Box<dyn std::error::Error>)
});
deserialize_time.stop();

View File

@ -6,7 +6,7 @@ use {
solana_rbpf::{
aligned_memory::{AlignedMemory, Pod},
ebpf::{HOST_ALIGN, MM_INPUT_START},
memory_region::MemoryRegion,
memory_region::{MemoryRegion, MemoryState},
},
solana_sdk::{
bpf_loader_deprecated,
@ -55,8 +55,12 @@ impl Serializer {
self.buffer.fill_write(num, value)
}
pub fn write<T: Pod>(&mut self, value: T) {
pub fn write<T: Pod>(&mut self, value: T) -> u64 {
self.debug_assert_alignment::<T>();
let vaddr = self
.vaddr
.saturating_add(self.buffer.len() as u64)
.saturating_sub(self.region_start as u64);
// Safety:
// in serialize_parameters_(aligned|unaligned) first we compute the
// required size then we write into the newly allocated buffer. There's
@ -68,22 +72,38 @@ impl Serializer {
unsafe {
self.buffer.write_unchecked(value);
}
vaddr
}
fn write_all(&mut self, value: &[u8]) {
fn write_all(&mut self, value: &[u8]) -> u64 {
let vaddr = self
.vaddr
.saturating_add(self.buffer.len() as u64)
.saturating_sub(self.region_start as u64);
// Safety:
// see write() - the buffer is guaranteed to be large enough
unsafe {
self.buffer.write_all_unchecked(value);
}
vaddr
}
fn write_account(&mut self, account: &mut BorrowedAccount<'_>) -> Result<(), InstructionError> {
if self.copy_account_data {
fn write_account(
&mut self,
account: &mut BorrowedAccount<'_>,
) -> Result<u64, InstructionError> {
let vm_data_addr = if self.copy_account_data {
let vm_data_addr = self.vaddr.saturating_add(self.buffer.len() as u64);
self.write_all(account.get_data());
vm_data_addr
} else {
self.push_account_region(account)?;
}
self.push_region(true);
let vaddr = self.vaddr;
self.push_account_data_region(account)?;
vaddr
};
if self.aligned {
let align_offset =
@ -100,59 +120,55 @@ impl Serializer {
self.fill_write(MAX_PERMITTED_DATA_INCREASE + BPF_ALIGN_OF_U128, 0)
.map_err(|_| InstructionError::InvalidArgument)?;
self.region_start += BPF_ALIGN_OF_U128.saturating_sub(align_offset);
// put the realloc padding in its own region
self.push_region(account.can_data_be_changed().is_ok());
}
}
Ok(())
Ok(vm_data_addr)
}
fn push_account_region(
fn push_account_data_region(
&mut self,
account: &mut BorrowedAccount<'_>,
) -> Result<(), InstructionError> {
self.push_region();
let account_len = account.get_data().len();
if account_len > 0 {
let region = if account.can_data_be_changed().is_ok() {
if account.is_shared() {
// If the account is still shared it means it wasn't written to yet during this
// transaction. We map it as CoW and it'll be copied the first time something
// tries to write into it.
let index_in_transaction = account.get_index_in_transaction();
MemoryRegion::new_cow(
account.get_data(),
self.vaddr,
index_in_transaction as u64,
)
} else {
// The account isn't shared anymore, meaning it was written to earlier during
// this transaction. We can just map as writable no need for any fancy CoW
// business.
if !account.get_data().is_empty() {
let region = match account_data_region_memory_state(account) {
MemoryState::Readable => MemoryRegion::new_readonly(account.get_data(), self.vaddr),
MemoryState::Writable => {
MemoryRegion::new_writable(account.get_data_mut()?, self.vaddr)
}
} else {
MemoryRegion::new_readonly(account.get_data(), self.vaddr)
MemoryState::Cow(index_in_transaction) => {
MemoryRegion::new_cow(account.get_data(), self.vaddr, index_in_transaction)
}
};
self.vaddr += region.len;
self.regions.push(region);
}
Ok(())
}
fn push_region(&mut self) {
fn push_region(&mut self, writable: bool) {
let range = self.region_start..self.buffer.len();
let region = MemoryRegion::new_writable(
self.buffer.as_slice_mut().get_mut(range.clone()).unwrap(),
self.vaddr,
);
let region = if writable {
MemoryRegion::new_writable(
self.buffer.as_slice_mut().get_mut(range.clone()).unwrap(),
self.vaddr,
)
} else {
MemoryRegion::new_readonly(
self.buffer.as_slice().get(range.clone()).unwrap(),
self.vaddr,
)
};
self.regions.push(region);
self.region_start = range.end;
self.vaddr += range.len() as u64;
}
fn finish(mut self) -> (AlignedMemory<HOST_ALIGN>, Vec<MemoryRegion>) {
self.push_region();
self.push_region(true);
debug_assert_eq!(self.region_start, self.buffer.len());
(self.buffer, self.regions)
}
@ -315,19 +331,23 @@ fn serialize_parameters_unaligned(
s.write(position as u8);
}
SerializeAccount::Account(_, mut account) => {
accounts_metadata.push(SerializedAccountMetadata {
original_data_len: account.get_data().len(),
});
s.write::<u8>(NON_DUP_MARKER);
s.write::<u8>(account.is_signer() as u8);
s.write::<u8>(account.is_writable() as u8);
s.write_all(account.get_key().as_ref());
s.write::<u64>(account.get_lamports().to_le());
let vm_key_addr = s.write_all(account.get_key().as_ref());
let vm_lamports_addr = s.write::<u64>(account.get_lamports().to_le());
s.write::<u64>((account.get_data().len() as u64).to_le());
s.write_account(&mut account)?;
s.write_all(account.get_owner().as_ref());
let vm_data_addr = s.write_account(&mut account)?;
let vm_owner_addr = s.write_all(account.get_owner().as_ref());
s.write::<u8>(account.is_executable() as u8);
s.write::<u64>((account.get_rent_epoch()).to_le());
accounts_metadata.push(SerializedAccountMetadata {
original_data_len: account.get_data().len(),
vm_key_addr,
vm_lamports_addr,
vm_owner_addr,
vm_data_addr,
});
}
};
}
@ -406,7 +426,7 @@ fn serialize_parameters_aligned(
),
InstructionError,
> {
let mut account_lengths = Vec::with_capacity(accounts.len());
let mut accounts_metadata = Vec::with_capacity(accounts.len());
// Calculate size in order to alloc once
let mut size = size_of::<u64>();
for account in &accounts {
@ -444,23 +464,27 @@ fn serialize_parameters_aligned(
for account in accounts {
match account {
SerializeAccount::Account(_, mut borrowed_account) => {
account_lengths.push(SerializedAccountMetadata {
original_data_len: borrowed_account.get_data().len(),
});
s.write::<u8>(NON_DUP_MARKER);
s.write::<u8>(borrowed_account.is_signer() as u8);
s.write::<u8>(borrowed_account.is_writable() as u8);
s.write::<u8>(borrowed_account.is_executable() as u8);
s.write_all(&[0u8, 0, 0, 0]);
s.write_all(borrowed_account.get_key().as_ref());
s.write_all(borrowed_account.get_owner().as_ref());
s.write::<u64>(borrowed_account.get_lamports().to_le());
let vm_key_addr = s.write_all(borrowed_account.get_key().as_ref());
let vm_owner_addr = s.write_all(borrowed_account.get_owner().as_ref());
let vm_lamports_addr = s.write::<u64>(borrowed_account.get_lamports().to_le());
s.write::<u64>((borrowed_account.get_data().len() as u64).to_le());
s.write_account(&mut borrowed_account)?;
let vm_data_addr = s.write_account(&mut borrowed_account)?;
s.write::<u64>((borrowed_account.get_rent_epoch()).to_le());
accounts_metadata.push(SerializedAccountMetadata {
original_data_len: borrowed_account.get_data().len(),
vm_key_addr,
vm_owner_addr,
vm_lamports_addr,
vm_data_addr,
});
}
SerializeAccount::Duplicate(position) => {
account_lengths.push(account_lengths.get(position as usize).unwrap().clone());
accounts_metadata.push(accounts_metadata.get(position as usize).unwrap().clone());
s.write::<u8>(position as u8);
s.write_all(&[0u8, 0, 0, 0, 0, 0, 0]);
}
@ -471,7 +495,7 @@ fn serialize_parameters_aligned(
s.write_all(program_id.as_ref());
let (mem, regions) = s.finish();
Ok((mem, regions, account_lengths))
Ok((mem, regions, accounts_metadata))
}
pub fn deserialize_parameters_aligned<I: IntoIterator<Item = usize>>(
@ -579,6 +603,18 @@ pub fn deserialize_parameters_aligned<I: IntoIterator<Item = usize>>(
Ok(())
}
pub(crate) fn account_data_region_memory_state(account: &BorrowedAccount<'_>) -> MemoryState {
if account.can_data_be_changed().is_ok() {
if account.is_shared() {
MemoryState::Cow(account.get_index_in_transaction() as u64)
} else {
MemoryState::Writable
}
} else {
MemoryState::Readable
}
}
#[cfg(test)]
#[allow(clippy::indexing_slicing)]
mod tests {

File diff suppressed because it is too large Load Diff

View File

@ -123,6 +123,8 @@ pub enum SyscallError {
},
#[error("InvalidAttribute")]
InvalidAttribute,
#[error("Invalid pointer")]
InvalidPointer,
}
type Error = Box<dyn std::error::Error>;

View File

@ -4613,6 +4613,7 @@ dependencies = [
"libsecp256k1 0.6.0",
"log",
"rand 0.8.5",
"scopeguard",
"solana-measure",
"solana-program-runtime",
"solana-sdk",

View File

@ -1000,8 +1000,8 @@ impl<'a> BorrowedAccount<'a> {
// transaction reallocs, we don't have to copy the whole account data a
// second time to fullfill the realloc.
//
// NOTE: The account memory region CoW code in Serializer::push_account_region() implements
// the same logic and must be kept in sync.
// NOTE: The account memory region CoW code in bpf_loader::create_vm() implements the same
// logic and must be kept in sync.
if self.account.is_shared() {
self.account.reserve(MAX_PERMITTED_DATA_INCREASE);
}