# Shared security resources

This stage sets up security resources and configurations which impact the whole organization, or are shared across the hierarchy to other projects and teams.

The design of this stage is fairly general, and provides a reference example for Cloud KMS and a VPC Service Controls configuration that sets up three perimeters (landing, development, production), their related bridge perimeters, and provides variables to configure their resources, access levels, and directional policies.

Expanding this stage to include other security-related services like Secret Manager is fairly simple, by reusing the implementation provided for Cloud KMS and leveraging the broad permissions granted to the automation service account on the top-level Security folder.

The following diagram illustrates the high-level design of created resources and a schema of the VPC SC design, which can be adapted to specific requirements via variables:

*Security diagram*

## Design overview and choices

Project-level security resources are grouped into two separate projects, one per environment. This setup matches requirements we frequently observe in real life and provides enough separation without needlessly complicating operations.

Cloud KMS is configured and designed mainly to encrypt GCP resources with customer-managed encryption keys (CMEK), but it can also be used to create crypto keys that encrypt application data.
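
As an illustration of what consuming these keys can look like, here is a hedged sketch of a downstream Terraform resource encrypted with a CMEK created by this stage; the project, keyring, and key names are placeholders, not values produced by this stage:

```hcl
# Hypothetical downstream usage: a Compute Engine disk encrypted with a
# customer-managed key; replace the id with a real key created here.
resource "google_compute_disk" "example" {
  project = "my-dev-project"
  name    = "cmek-disk"
  zone    = "europe-west1-b"
  size    = 10
  disk_encryption_key {
    kms_key_self_link = "projects/xxx-dev-sec-core-0/locations/europe-west1/keyRings/dev-europe-west1/cryptoKeys/compute"
  }
}
```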

IAM for management-related operations is already assigned at the folder level to the security team by the previous stage, but more granularity can be added here at the project level, to grant control of separate services across environments to different actors.

### Cloud KMS

A reference Cloud KMS implementation is part of this stage, providing a simple way of managing centralized keys that are then shared and consumed widely across the organization to enable customer-managed encryption. The implementation is also easy to clone and modify to support other services like Secret Manager.

The Cloud KMS configuration allows defining keys by name (typically matching the downstream service that uses them) in different locations, either based on a common default or a per-key setting. It then takes care internally of provisioning the relevant keyrings and creating keys in the appropriate location.

IAM roles on keys can be configured at the logical level for all locations where a logical key is created. Their management can also be delegated via delegated role grants exposed through a simple variable, to allow other identities to set IAM policies on keys. This is particularly useful in setups like project factories, making it possible to configure IAM bindings during project creation for team groups or service agent accounts (compute, storage, etc.).

### VPC Service Controls

This stage also provisions the VPC Service Controls configuration on demand for the whole organization, implementing the straightforward design illustrated above:

- one perimeter for each environment
- one perimeter for centralized services and the landing VPC
- bridge perimeters to connect the landing perimeter to each environment

The VPC SC configuration is set to dry-run mode, but switching to enforced mode is a simple operation that involves modifying a few lines of code highlighted by ad hoc comments. Variables are designed to enable easy centralized management of VPC Service Controls, including access levels and ingress/egress rules as described below.

Some care needs to be taken with project membership in perimeters, which can only be managed here rather than being delegated (fully or partially) to other stages, until the Google provider feature request allowing project-level association in both enforced and dry-run modes is implemented.

## How to run this stage

This stage is meant to be executed after the resource management stage has run, as it leverages the folder and automation resources created there. The relevant user groups must also exist, but that's one of the requirements for the previous stages too, so if you ran those successfully, you're good to go.

It's possible to run this stage in isolation, but that's outside the scope of this document, and you would need to refer to the code for the bootstrap stage for the required roles.

Before running this stage, you need to ensure you have the correct credentials and permissions, and customize variables by assigning values that match your configuration.

### Providers configuration

The default way of making sure you have the correct permissions is to use the identity of the service account pre-created for this stage during bootstrap, and to be a member of the group that can impersonate it via provider-level configuration (gcp-devops or organization-admins).

To simplify setup, the previous stage pre-configures a valid providers file in its output, and optionally writes it to a local file if the outputs_location variable is set to a valid path.
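
To give an idea of what gets linked, this is a minimal sketch of the shape of such a providers file; the bucket and service account names are placeholders, the real values are produced by the previous stage:

```hcl
# 02-security-providers.tf (illustrative sketch, the actual file is generated)
terraform {
  backend "gcs" {
    bucket                      = "xxx-prod-iac-security-0"
    impersonate_service_account = "xxx-prod-resman-security-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
  }
}

provider "google" {
  impersonate_service_account = "xxx-prod-resman-security-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
}

provider "google-beta" {
  impersonate_service_account = "xxx-prod-resman-security-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
}
```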

If you have set a valid value for outputs_location in the resource management stage, simply link the relevant providers.tf file from this stage's folder in the path you specified:

```bash
# `outputs_location` is set to `~/fast-config`
ln -s ~/fast-config/providers/02-security-providers.tf .
```

If you have not configured outputs_location in resource management, you can derive the providers file from that stage's outputs:

```bash
cd ../01-resman
terraform output -json providers | jq -r '.["02-security"]' \
  > ../02-security/providers.tf
```

### Variable configuration

There are two broad sets of variables you will need to fill in:

- variables shared by other stages (organization id, billing account id, etc.), or derived from a resource managed by a different stage (folder id, automation project id, etc.)
- variables specific to resources managed by this stage

To avoid the tedious job of filling in the first group of variables with values derived from other stages' outputs, the same mechanism used above for the provider configuration can be used to leverage pre-configured .tfvars files.

If you configured a valid path for outputs_location in the previous stages, simply link the relevant terraform-*.auto.tfvars.json files from this stage's output folder (under the path you specified), where the * above is set to the name of the stage that produced it. For this stage, two .tfvars files are available:

```bash
# `outputs_location` is set to `~/fast-config`
ln -s ~/fast-config/tfvars/00-bootstrap.auto.tfvars.json .
ln -s ~/fast-config/tfvars/01-resman.auto.tfvars.json .
# also copy the tfvars file used for the bootstrap stage
cp ../00-bootstrap/terraform.tfvars .
```

A second set of optional variables is specific to this stage. If you need to customize them, add their values to the file copied from bootstrap.

Refer to the Variables table at the bottom of this document for a full list of variables, their origin (e.g., a stage or specific to this one), and descriptions of their meaning. The sections below also describe some of the possible customizations.

Once done, you can run this stage:

```bash
terraform init
terraform apply
```

## Customizations

### KMS keys

Cloud KMS configuration is split into two variables:

- kms_defaults configures the locations and rotation period, used for keys that don't specifically configure them
- kms_keys configures the actual keys to create, and also allows configuring their IAM bindings and labels, and overriding locations and rotation period. When configuring locations for a key, please consider the limitations each cloud product may have.

The additional kms_restricted_admins variable allows granting roles/cloudkms.admin to specified principals, restricted via delegated role grants so that it only allows granting the roles needed for encryption/decryption on keys. This allows safe delegation of key management to subsequent Terraform stages like the Project Factory, for example to grant usage access on relevant keys to the service agent accounts for compute, storage, etc.
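
Delegated role grants work via an IAM condition on setIamPolicy that restricts which roles the grantee can manage. The following is a rough, hypothetical illustration of the mechanism using stock provider resources (this stage wires the equivalent up internally; the project and principal below are placeholders):

```hcl
# Grant KMS admin, but only allow managing grants of the
# encrypt/decrypt role on keys.
resource "google_project_iam_member" "kms_restricted_admin" {
  project = "xxx-dev-sec-core-0"
  role    = "roles/cloudkms.admin"
  member  = "serviceAccount:xxx-dev-resman-pf-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
  condition {
    title      = "kms_restricted_admin"
    expression = "api.getAttribute('iam.googleapis.com/modifiedGrantsByRole', []).hasOnly(['roles/cloudkms.cryptoKeyEncrypterDecrypter'])"
  }
}
```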

To support these scenarios, key IAM bindings are configured by default to be additive, enabling other stages or Terraform configurations to safely co-manage bindings on the same keys. If this is not desired, follow the comments in the core-dev.tf and core-prod.tf files to switch to authoritative bindings on keys.
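
To make the distinction concrete, this minimal sketch contrasts the two binding styles using stock provider resources rather than this stage's actual code (the key id is a placeholder):

```hcl
# Additive: manages a single member and co-exists with bindings
# created by other stages or tools (the default in this stage).
resource "google_kms_crypto_key_iam_member" "additive" {
  crypto_key_id = "projects/xxx-dev-sec-core-0/locations/europe-west1/keyRings/dev/cryptoKeys/compute"
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "user:user1@example.com"
}

# Authoritative: owns the complete member list for the role, removing
# any members added outside this configuration.
resource "google_kms_crypto_key_iam_binding" "authoritative" {
  crypto_key_id = "projects/xxx-dev-sec-core-0/locations/europe-west1/keyRings/dev/cryptoKeys/compute"
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  members       = ["user:user1@example.com"]
}
```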

An example of how to configure keys:

```hcl
# terraform.tfvars

kms_defaults = {
  locations       = ["europe-west1", "europe-west3", "global"]
  rotation_period = "7776000s"
}
kms_keys = {
  compute = {
    iam = {
      "roles/cloudkms.cryptoKeyEncrypterDecrypter" = [
        "user:user1@example.com"
      ]
    }
    labels          = { service = "compute" }
    locations       = null
    rotation_period = null
  }
  storage = {
    iam             = null
    labels          = { service = "storage" }
    locations       = ["europe"]
    rotation_period = null
  }
}
```

This configuration creates one keyring for each specified location, with the relevant keys on each keyring. With the example above, the compute key is created in the three default locations, while the storage key is only created in the europe multi-region.

### VPC Service Controls configuration

A set of variables allows configuring the VPC SC perimeters described above:

- vpc_sc_perimeters configures the three regular perimeters, including their project membership via the resources attribute
- vpc_sc_access_levels configures access levels, which can then be associated to perimeters by key via their access_levels attribute
- vpc_sc_egress_policies configures directional egress policies, which can then be associated to perimeters by key via their egress_policies attribute
- vpc_sc_ingress_policies configures directional ingress policies, which can then be associated to perimeters by key via their ingress_policies attribute

This allows configuring VPC SC in a fairly flexible and concise way, without repeating similar definitions. Bridge perimeter configuration is computed automatically to allow communication between regular perimeters: landing <-> prod and landing <-> dev.

#### Dry-run vs. enforced

The VPC SC configuration is set up by default in dry-run mode, to allow easy experimentation and detection of violations before enforcement. Once everything is set up correctly, switching to enforced mode needs to be done in code, by changing the vpc_sc_explicit_dry_run_spec local.
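
The change amounts to flipping a single local; a sketch of what it looks like, assuming the wiring described by the in-code comments:

```hcl
locals {
  # set to false to enforce the perimeters instead of using the dry-run spec
  vpc_sc_explicit_dry_run_spec = true
}
```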

#### Access levels

Access levels are defined via the vpc_sc_access_levels variable, and referenced by key in perimeter definitions:

```hcl
vpc_sc_access_levels = {
  onprem = {
    conditions = [{
      ip_subnetworks = ["101.101.101.0/24"]
    }]
  }
}
```

#### Ingress and Egress policies

Ingress and egress policies are defined via the vpc_sc_ingress_policies and vpc_sc_egress_policies variables, and referenced by key in perimeter definitions:

```hcl
vpc_sc_egress_policies = {
  iac-gcs = {
    from = {
      identities = [
        "serviceAccount:xxx-prod-resman-security-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
      ]
    }
    to = {
      operations = [{
        method_selectors = ["*"]
        service_name     = "storage.googleapis.com"
      }]
      resources = ["projects/123456782"]
    }
  }
}
vpc_sc_ingress_policies = {
  iac = {
    from = {
      identities = [
        "serviceAccount:xxx-prod-resman-security-0@xxx-prod-iac-core-0.iam.gserviceaccount.com"
      ]
      access_levels = ["*"]
    }
    to = {
      operations = [{ method_selectors = [], service_name = "*" }]
      resources  = ["*"]
    }
  }
}
```

#### Perimeters

Regular perimeters are defined via the vpc_sc_perimeters variable, and bridge perimeters are automatically populated from it.

Support for independently adding projects to perimeters outside of this Terraform setup is pending resolution of a Google Terraform Provider issue, which tracks support for dry-run mode in the additive resource.

Access levels and egress/ingress policies are referenced in perimeters via keys.

```hcl
vpc_sc_perimeters = {
  dev = {
    egress_policies  = ["iac-gcs"]
    ingress_policies = ["iac"]
    resources        = ["projects/1111111111"]
  }
  landing = {
    egress_policies  = ["iac-gcs"]
    ingress_policies = ["iac"]
    resources        = ["projects/0000000000"]
  }
  prod = {
    access_levels    = ["onprem"]
    egress_policies  = ["iac-gcs"]
    ingress_policies = ["iac"]
    resources        = ["projects/2222222222"]
  }
}
```

## Notes

Some references that might be useful in setting up this stage:

## Files

| name | description | modules | resources |
|---|---|---|---|
| core-dev.tf | None | kms · project | google_project_iam_member |
| core-prod.tf | None | kms · project | google_project_iam_member |
| main.tf | Module-level locals and resources. |  |  |
| outputs.tf | Module outputs. |  | google_storage_bucket_object · local_file |
| variables.tf | Module variables. |  |  |
| vpc-sc.tf | None | vpc-sc |  |

## Variables

| name | description | type | required | default | producer |
|---|---|:---:|:---:|:---:|:---:|
| automation | Automation resources created by the bootstrap stage. | object({…}) | ✓ |  | 00-bootstrap |
| billing_account | Billing account id and organization id ('nnnnnnnn' or null). | object({…}) | ✓ |  | 00-bootstrap |
| folder_ids | Folder name => id mappings, the 'security' folder name must exist. | object({…}) | ✓ |  | 01-resman |
| organization | Organization details. | object({…}) | ✓ |  | 00-bootstrap |
| prefix | Prefix used for resources that need unique names. Use 9 characters or less. | string | ✓ |  | 00-bootstrap |
| service_accounts | Automation service accounts that can assign the encrypt/decrypt roles on keys. | object({…}) | ✓ |  | 01-resman |
| groups | Group names to grant organization-level permissions. | map(string) |  | {…} | 00-bootstrap |
| kms_defaults | Defaults used for KMS keys. | object({…}) |  | {…} |  |
| kms_keys | KMS keys to create, keyed by name. Null attributes will be interpolated with defaults. | map(object({…})) |  | {} |  |
| outputs_location | Path where providers, tfvars files, and lists for the following stages are written. Leave empty to disable. | string |  | null |  |
| vpc_sc_access_levels | VPC SC access level definitions. | map(object({…})) |  | {} |  |
| vpc_sc_egress_policies | VPC SC egress policy definitions. | map(object({…})) |  | {} |  |
| vpc_sc_ingress_policies | VPC SC ingress policy definitions. | map(object({…})) |  | {} |  |
| vpc_sc_perimeters | VPC SC regular perimeter definitions. | object({…}) |  | {} |  |

## Outputs

| name | description | sensitive | consumers |
|---|---|:---:|---|
| kms_keys | KMS key ids. |  |  |
| stage_perimeter_projects | Security project numbers. They can be added to perimeter resources. |  |  |
| tfvars | Terraform variable files for the following stages. |  |  |