Shared security resources and VPC Service Controls

This stage sets up security resources and configurations which impact the whole organization, or are shared across the hierarchy to other projects and teams.

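The design of this stage is fairly general, providing a reference Cloud KMS implementation and a VPC Service Controls configuration built around a single perimeter.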

Expanding this stage to include other security-related services like Secret Manager is fairly simple by adapting the provided implementation for Cloud KMS, and leveraging the broad permissions granted on the top-level Security folder to the automation service account used here.

The following diagram illustrates the high-level design of created resources and a schema of the VPC SC design:

Security diagram

Design overview and choices

Project-level security resources are grouped into two separate projects, one per environment. This setup matches requirements we frequently observe in real life and provides enough separation without needlessly complicating operations.

Cloud KMS is configured and designed mainly to encrypt GCP resources with customer-managed encryption keys (CMEK), but it can also be used to create cryptokeys that encrypt application data.

IAM for day-to-day operations is already assigned at the folder level to the security team by the previous stage, but more granularity can be added here at the project level to grant control of separate services across environments to different actors.

Cloud KMS

A reference Cloud KMS implementation is part of this stage. It provides a simple way of managing centralized keys that are then shared and consumed widely across the organization to enable customer-managed encryption. The implementation is also easy to clone and modify to support other services like Secret Manager.

The Cloud KMS configuration allows defining keys by name (typically matching the downstream service that uses them) in different locations. It then takes care internally of provisioning the relevant keyrings and creating keys in the appropriate location.

IAM roles on keys can be configured at the logical level for all locations where a logical key is created. Their management can also be delegated via delegated role grants exposed through a simple variable, to allow other identities to set IAM policies on keys. This is particularly useful in setups like project factories, making it possible to configure IAM bindings during project creation for team groups or service agent accounts (compute, storage, etc.).

VPC Service Controls

This stage also provisions the VPC Service Controls configuration that protects the whole organization, implementing a simplified design that leverages a single perimeter and optionally provides automatic enrollment of projects in the perimeter.

The VPC SC configuration is controlled via the top-level vpc_sc variable, and is disabled by default unless vpc_sc.perimeter_default is populated. Access levels and ingress/egress policies can be defined in code via the respective vpc_sc variable attributes, or via YAML-based factories configured via the usual factories_config variable.

How to run this stage

This stage is meant to be executed after the resource management stage has run, as it leverages the automation service account and bucket created there, and additional resources configured in the bootstrap stage.

It's of course possible to run this stage in isolation, but that's outside the scope of this document, and you would need to refer to the code for the previous stages for the environmental requirements.

Before running this stage, you need to make sure you have the correct credentials and permissions, and localize variables by assigning values that match your configuration.

Provider and Terraform variables

As in all other FAST stages, this stage relies on the mechanism that passes variable values and pre-built provider files from one stage to the next.

The commands to link or copy the provider and terraform variable files can be easily derived from the stage-links.sh script in the FAST root folder, passing it a single argument with the local output files folder (if configured) or the GCS output bucket in the automation project (derived from stage 0 outputs). The following examples demonstrate both cases, and the resulting commands that then need to be copy/pasted and run.

../../stage-links.sh ~/fast-config

# copy and paste the following commands for '2-security'

ln -s ~/fast-config/providers/2-security-providers.tf ./
ln -s ~/fast-config/tfvars/0-globals.auto.tfvars.json ./
ln -s ~/fast-config/tfvars/0-bootstrap.auto.tfvars.json ./
ln -s ~/fast-config/tfvars/1-resman.auto.tfvars.json ./

../../stage-links.sh gs://xxx-prod-iac-core-outputs-0

# copy and paste the following commands for '2-security'

gcloud alpha storage cp gs://xxx-prod-iac-core-outputs-0/providers/2-security-providers.tf ./
gcloud alpha storage cp gs://xxx-prod-iac-core-outputs-0/tfvars/0-globals.auto.tfvars.json ./
gcloud alpha storage cp gs://xxx-prod-iac-core-outputs-0/tfvars/0-bootstrap.auto.tfvars.json ./
gcloud alpha storage cp gs://xxx-prod-iac-core-outputs-0/tfvars/1-resman.auto.tfvars.json ./
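
Before moving on, it can help to confirm that the links or copies actually landed in the stage folder. The following is a minimal sketch and not part of FAST: check_stage_files is a hypothetical helper name, and its file list is simply the four files shown in the examples above.

```shell
# Hypothetical helper (not part of FAST): report which of the files expected
# by this stage are missing from the given folder (default: current folder).
check_stage_files() {
  dir="${1:-.}"
  missing=0
  for f in 2-security-providers.tf 0-globals.auto.tfvars.json \
           0-bootstrap.auto.tfvars.json 1-resman.auto.tfvars.json; do
    # -e follows symlinks, so a dangling link is reported as missing too
    if [ ! -e "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it from the stage folder as `check_stage_files .`; a non-zero exit status means at least one file is missing or is a dangling symlink.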

Impersonating the automation service account

The preconfigured provider file uses impersonation to run with this stage's automation service account's credentials. The gcp-devops and organization-admins groups have the necessary IAM bindings in place to do that, so make sure the current user is a member of one of those groups.
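
One way to check this before running Terraform is to ask gcloud for an impersonated access token. This is a sketch under assumptions: can_impersonate is a hypothetical helper, and the email passed to it must be the automation service account for this stage, which should appear in the linked provider file.

```shell
# Hypothetical helper: succeeds only if the active gcloud user can mint an
# access token for the given service account via impersonation.
can_impersonate() {
  sa_email="$1"
  gcloud auth print-access-token \
    --impersonate-service-account="$sa_email" >/dev/null 2>&1
}

# Example usage (replace the placeholder with the real service account email):
# can_impersonate "SERVICE_ACCOUNT_EMAIL" && echo "impersonation works"
```

A failure here usually points to missing group membership or to gcloud being authenticated as a different user.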

Variable configuration

Variables in this stage -- like most other FAST stages -- are broadly divided into three separate sets:

  • variables which refer to global values for the whole organization (org id, billing account id, prefix, etc.), which are pre-populated via the 0-globals.auto.tfvars.json file linked or copied above
  • variables which refer to resources managed by previous stages, which are pre-populated here via the 0-bootstrap.auto.tfvars.json and 1-resman.auto.tfvars.json files linked or copied above
  • and finally variables that optionally control this stage's behaviour and customizations, and can be set in a custom terraform.tfvars file

The latter set is explained in the Customization sections below, and the full list can be found in the Variables table at the bottom of this document.

Note that the outputs_location variable is disabled by default: you need to explicitly set it in your terraform.tfvars file if you want this stage to generate output files. The following sample terraform.tfvars configures it; refer to the bootstrap stage documentation for more details:

outputs_location = "~/fast-config"

Using delayed billing association for projects

This configuration is possible but unsupported, and only exists for development purposes. Use it at your own risk:

  • temporarily switch billing_account.id to null in 0-globals.auto.tfvars.json
  • for each project resource in the project modules used in this stage (dev-sec-project, prod-sec-project):
    • apply using -target, for example terraform apply -target 'module.prod-sec-project.google_project.project[0]'
    • untaint the project resource after applying, for example terraform untaint 'module.prod-sec-project.google_project.project[0]'
  • go through the process to associate the billing account with the two projects
  • switch billing_account.id back to the real billing account id
  • resume applying normally
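
The per-project commands in the steps above can be generated rather than typed by hand. The sketch below only prints them, so nothing is applied until you copy and run the output. delayed_billing_commands is a hypothetical helper name; the module names and resource address are the ones quoted in the list.

```shell
# Hypothetical helper: print (not run) the targeted apply and untaint
# commands for the two project modules used in this stage.
delayed_billing_commands() {
  for module in dev-sec-project prod-sec-project; do
    addr="module.${module}.google_project.project[0]"
    echo "terraform apply -target '${addr}'"
    echo "terraform untaint '${addr}'"
  done
}

delayed_billing_commands
```

Billing association and the change to billing_account.id remain manual steps before and after these commands.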

Running the stage

Once provider and variable values are in place and the correct user is configured, the stage can be run:

terraform init
terraform apply
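
If you prefer to review changes before they are applied, for example when touching VPC SC settings, a saved plan can be used instead. This is a minimal sketch: plan_stage, apply_stage and the stage.tfplan file name are arbitrary choices for illustration, not FAST conventions.

```shell
# Initialize the stage and save the proposed changes to a plan file.
plan_stage() {
  terraform init && terraform plan -out=stage.tfplan
}

# Apply exactly the changes recorded in the saved plan file.
apply_stage() {
  terraform apply stage.tfplan
}
```

Applying a saved plan file does not ask for interactive confirmation, so review the output of plan_stage before calling apply_stage.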

Customizations

KMS keys

Cloud KMS configuration is controlled by kms_keys, which configures the actual keys to create, and also allows configuring their IAM bindings, labels, locations and rotation period. When configuring locations for a key, please consider the limitations each cloud product may have.

The additional kms_restricted_admins variable allows granting roles/cloudkms.admin to specified principals, restricted via delegated role grants so that it only allows granting the roles needed for encryption/decryption on keys. This allows safe delegation of key management to subsequent Terraform stages like the Project Factory, for example to grant usage access on relevant keys to the service agent accounts for compute, storage, etc.

To support these scenarios, key IAM bindings are configured by default to be additive, to enable other stages or Terraform configuration to safely co-manage bindings on the same keys. If this is not desired, follow the comments in the core-dev.tf and core-prod.tf files to switch to authoritative bindings on keys.

An example of how to configure keys:

# terraform.tfvars

kms_keys = {
  compute = {
    iam = {
      "roles/cloudkms.cryptoKeyEncrypterDecrypter" = [
        "user:user1@example.com"
      ]
    }
    labels          = { service = "compute" }
    locations       = ["europe-west1", "europe-west3", "global"]
    rotation_period = "7776000s"
  }
  storage = {
    iam             = null
    labels          = { service = "storage" }
    locations       = ["europe"]
    rotation_period = null
  }
}

This configuration creates one keyring for each specified location, and on each keyring the keys assigned to that location.
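
To make the expansion concrete, the sketch below lists the location/key pairs implied by the example above. It is only an illustration in shell, not how the stage is implemented: expand_keys and its key=location1,location2 argument format are hypothetical.

```shell
# Hypothetical helper: for each "key=loc1,loc2" argument, print one
# "location/key" line per location, in the order given.
expand_keys() {
  for spec in "$@"; do
    key="${spec%%=*}"        # text before the first '='
    locations="${spec#*=}"   # text after the first '='
    old_ifs="$IFS"
    IFS=','                  # split the location list on commas
    for loc in $locations; do
      echo "${loc}/${key}"
    done
    IFS="$old_ifs"
  done
}

expand_keys "compute=europe-west1,europe-west3,global" "storage=europe"
# prints:
# europe-west1/compute
# europe-west3/compute
# global/compute
# europe/storage
```

Each distinct location in the output corresponds to one keyring, so the example results in four keyrings holding one key each.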

VPC Service Controls configuration

The vpc_sc variable controls VPC-SC configuration and project auto-discovery via Cloud Asset Inventory. VPC-SC configuration can also leverage YAML factories via the factories_config variable. Both variables mostly pass through to the underlying vpc-sc module, which serves as a reference for their individual types.

The vpc_sc variable has the following attributes:

  • access_levels, egress_policies, ingress_policies define the corresponding objects, internally merged with any data coming from the YAML factories
  • perimeter_default configures the single organization-wide perimeter by referencing access levels and policies by key, setting included projects, and optionally turning on dry run mode
  • resource_discovery controls automatic discovery of projects via Asset Inventory, and allows defining inclusion and exclusion lists

A few things to note on the default perimeter:

  • writer identities for sinks defined in the bootstrap stage are passed through via output files, and automatically included in an ingress policy
  • the perimeter is brought up in enforced mode by default
  • project discovery is turned on by default and includes all projects in the organization

The following example configures the default perimeter with a single broad geo-based access level. It explicitly turns on dry run mode (the perimeter is enforced by default) and leverages auto discovery of projects. Refer to the vpc-sc module for details on how to configure ingress/egress policies, and how to leverage the YAML factories.

The YAML file below leverages factories to configure the broad geo-based access level (the factory path can be changed via the factories_config variable). The terraform.tfvars snippet that follows it references the access level by key in perimeter_default:

# data/vpc-sc/access-levels/geo-default.yaml
conditions:
  - regions:
      - IT
      - ES

# terraform.tfvars

vpc_sc = {
  perimeter_default = {
    access_levels = ["geo-default"]
    # dry run is disabled by default
    dry_run = true
    # resource discovery is enabled by default
  }
}

Notes

Some references that might be useful in setting up this stage:

Files

| name | description | modules | resources |
|---|---|---|---|
| core-dev.tf | None | kms · project | |
| core-prod.tf | None | kms · project | |
| main.tf | Module-level locals and resources. | folder | |
| outputs.tf | Module outputs. | | google_storage_bucket_object · local_file |
| variables.tf | Module variables. | | |
| vpc-sc.tf | None | projects-data-source · vpc-sc | |

Variables

| name | description | type | required | default | producer |
|---|---|---|---|---|---|
| automation | Automation resources created by the bootstrap stage. | object({…}) | ✓ | | 0-bootstrap |
| billing_account | Billing account id. If billing account is not part of the same org set is_org_level to false. | object({…}) | ✓ | | 0-bootstrap |
| folder_ids | Folder name => id mappings, the 'security' folder name must exist. | object({…}) | ✓ | | 1-resman |
| organization | Organization details. | object({…}) | ✓ | | 0-bootstrap |
| prefix | Prefix used for resources that need unique names. Use 9 characters or less. | string | ✓ | | 0-bootstrap |
| service_accounts | Automation service accounts that can assign the encrypt/decrypt roles on keys. | object({…}) | ✓ | | 1-resman |
| essential_contacts | Email used for essential contacts, unset if null. | string | | null | |
| factories_config | Paths to folders that enable factory functionality. | object({…}) | | {} | |
| kms_keys | KMS keys to create, keyed by name. | map(object({…})) | | {} | |
| logging | Log writer identities for organization / folders. | object({…}) | | null | 0-bootstrap |
| outputs_location | Path where providers, tfvars files, and lists for the following stages are written. Leave empty to disable. | string | | null | |
| vpc_sc | VPC SC configuration. | object({…}) | | {} | |

Outputs

| name | description | sensitive | consumers |
|---|---|---|---|
| kms_keys | KMS key ids. | | |
| tfvars | Terraform variable files for the following stages. | | |
| vpc_sc_perimeter_default | Raw default perimeter resource. | | |