cloud-foundation-fabric/modules/dataproc
Ludovico Magnocavallo 819894d2ba
IAM interface refactor (#1595)
* IAM modules refactor proposal

* policy

* subheading

* Update 20230816-iam-refactor.md

* log Julio's +1

* data-catalog-policy-tag

* dataproc

* dataproc

* folder

* folder

* folder

* folder

* project

* better filtering in test examples

* project

* folder

* folder

* organization

* fix variable descriptions

* kms

* net-vpc

* dataplex-datascan

* modules/iam-service-account

* modules/source-repository/

* blueprints/cloud-operations/vm-migration/

* blueprints/third-party-solutions/wordpress

* dataplex-datascan

* blueprints/cloud-operations/workload-identity-federation

* blueprints/data-solutions/cloudsql-multiregion/

* blueprints/data-solutions/composer-2

* Update 20230816-iam-refactor.md

* Update 20230816-iam-refactor.md

* capture discussion in architectural doc

* update variable names and refactor proposal

* project

* blueprints first round

* folder

* organization

* data-catalog-policy-tag

* re-enable folder inventory

* project module style fix

* dataproc

* source-repository

* source-repository tests

* dataplex-datascan

* dataplex-datascan tests

* net-vpc

* net-vpc test examples

* iam-service-account

* iam-service-account test examples

* kms

* boilerplate

* tfdoc

* fix module tests

* more blueprint fixes

* fix typo in data blueprints

* incomplete refactor of data platform foundations

* tfdoc

* data platform foundation

* refactor data platform foundation iam locals

* remove redundant example test

* shielded folder fix

* fix typo

* project factory

* project factory outputs

* tfdoc

* test workflow: less verbose tests, fix tf version

* re-enable -vv, shorter traceback, fix action version

* ignore github extension warning, re-enable action version

* fast bootstrap IAM, untested

* bootstrap stage IAM fixes

* stage 0 tests

* fast stage 1

* tenant stage 1

* minor changes to fast stage 0 and 1

* fast security stage

* fast mt stage 0

* fast mt stage 0

* fast pf
2023-08-20 09:44:20 +02:00
..
README.md IAM interface refactor (#1595) 2023-08-20 09:44:20 +02:00
iam.tf IAM interface refactor (#1595) 2023-08-20 09:44:20 +02:00
main.tf Fix Variables 2023-03-01 07:54:10 +01:00
outputs.tf Ensure all modules have an `id` output (#1410) 2023-06-02 16:07:22 +02:00
variables.tf IAM interface refactor (#1595) 2023-08-20 09:44:20 +02:00
versions.tf Moved allow_net_admin to enable_features flag. Bumped provider version to 4.76 2023-08-07 14:27:20 +01:00

README.md

Google Cloud Dataproc

This module Manages a Google Cloud Dataproc cluster resource, including IAM.

TODO

Examples

Simple

module "processing-dp-cluster-2" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
}
# tftest modules=1 resources=1

Cluster configuration

To set cluster configuration use the 'dataproc_config.cluster_config' variable.

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
  }
}
# tftest modules=1 resources=1

Cluster with CMEK encryption

To set cluster configuration use the Customer Managed Encryption key, set dataproc_config.encryption_config. variable. The Compute Engine service agent and the Cloud Storage service agent need to have CryptoKey Encrypter/Decrypter role on they configured KMS key (Documentation).

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
    encryption_config = {
      kms_key_name = "projects/project-id/locations/region/keyRings/key-ring-name/cryptoKeys/key-name"
    }
  }
}
# tftest modules=1 resources=1

IAM

IAM is managed via several variables that implement different features and levels of control:

  • iam and group_iam configure authoritative bindings that manage individual roles exclusively, and are internally merged
  • iam_bindings configure authoritative bindings with optional support for conditions, and are not internally merged with the previous two variables
  • iam_bindings_additive configure additive bindings via individual role/member pairs with optional support conditions

The authoritative and additive approaches can be used together, provided different roles are managed by each. Some care must also be taken with the groups_iam variable to ensure that variable keys are static values, so that Terraform is able to compute the dependency graph.

Refer to the project module for examples of the IAM interface.

Authoritative IAM

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  group_iam = {
    "gcp-data-engineers@example.net" = [
      "roles/dataproc.viewer"
    ]
  }
  iam = {
    "roles/dataproc.viewer" = [
      "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
    ]
  }
}
# tftest modules=1 resources=2

Additive IAM

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  iam_bindings_additive = {
    am1-viewer = {
      member = "user:am1@example.com"
      role   = "roles/dataproc.viewer"
    }
  }
}
# tftest modules=1 resources=2

Variables

name description type required default
name Cluster name. string
project_id Project ID. string
region Dataproc region. string
dataproc_config Dataproc cluster config. object({…}) {}
group_iam Authoritative IAM binding for organization groups, in {GROUP_EMAIL => [ROLES]} format. Group emails need to be static. Can be used in combination with the iam variable. map(list(string)) {}
iam IAM bindings in {ROLE => [MEMBERS]} format. map(list(string)) {}
iam_bindings Authoritative IAM bindings in {ROLE => {members = [], condition = {}}}. map(object({…})) {}
iam_bindings_additive Individual additive IAM bindings. Keys are arbitrary. map(object({…})) {}
labels The resource labels for instance to use to annotate any related underlying resources, such as Compute Engine VMs. map(string) {}
prefix Optional prefix used to generate project id and name. string null
service_account Service account to set on the Dataproc cluster. string null

Outputs

name description sensitive
bucket_names List of bucket names which have been assigned to the cluster.
http_ports The map of port descriptions to URLs.
id Fully qualified cluster id.
instance_names List of instance names which have been assigned to the cluster.
name The name of the cluster.