Google Cloud Dataproc
This module manages a Google Cloud Dataproc cluster resource, including IAM.
TODO
- Add support for Cloud Dataproc autoscaling policy.
Examples
Simple
module "processing-dp-cluster-2" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
}
# tftest modules=1 resources=1
Cluster configuration
To set the cluster configuration, use the dataproc_config.cluster_config variable.
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
  }
}
# tftest modules=1 resources=1
Cluster with CMEK encryption
To encrypt the cluster with a Customer Managed Encryption Key (CMEK), set the dataproc_config.encryption_config variable. The Compute Engine service agent and the Cloud Storage service agent need to have the CryptoKey Encrypter/Decrypter role on the configured KMS key (Documentation).
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
    encryption_config = {
      kms_key_name = "projects/project-id/locations/region/keyRings/key-ring-name/cryptoKeys/key-name"
    }
  }
}
# tftest modules=1 resources=1
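The role grants required for CMEK are not created by the example above. A minimal sketch of how they could be set up outside the module follows. It assumes the key already exists, that its ID matches the kms_key_name passed to the module, and that the service agents use the standard Google-managed addresses. The local value and the resource names are illustrative, not part of this module.

```hcl
# Assumption: same key ID as the kms_key_name passed to the module.
locals {
  kms_key_id = "projects/project-id/locations/region/keyRings/key-ring-name/cryptoKeys/key-name"
}

# Used to look up the project number for the Compute Engine service agent.
data "google_project" "project" {
  project_id = "my-project"
}

# Returns the email of the Cloud Storage service agent for the project.
data "google_storage_project_service_account" "gcs" {
  project = "my-project"
}

# Compute Engine service agent: CryptoKey Encrypter/Decrypter on the key.
resource "google_kms_crypto_key_iam_member" "compute_agent" {
  crypto_key_id = local.kms_key_id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:service-${data.google_project.project.number}@compute-system.iam.gserviceaccount.com"
}

# Cloud Storage service agent: CryptoKey Encrypter/Decrypter on the key.
resource "google_kms_crypto_key_iam_member" "gcs_agent" {
  crypto_key_id = local.kms_key_id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:${data.google_storage_project_service_account.gcs.email_address}"
}
```

The additive google_kms_crypto_key_iam_member resource is used here so that existing bindings on the key are left untouched.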
IAM
IAM is managed via several variables that implement different features and levels of control:
- iam and group_iam configure authoritative bindings that manage individual roles exclusively, and are internally merged
- iam_bindings configure authoritative bindings with optional support for conditions, and are not internally merged with the previous two variables
- iam_bindings_additive configure additive bindings via individual role/member pairs with optional support for conditions
The authoritative and additive approaches can be used together, provided different roles are managed by each. Some care must also be taken with the group_iam variable to ensure that variable keys are static values, so that Terraform is able to compute the dependency graph.
Refer to the project module for examples of the IAM interface.
Authoritative IAM
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  group_iam = {
    "gcp-data-engineers@example.net" = [
      "roles/dataproc.viewer"
    ]
  }
  iam = {
    "roles/dataproc.viewer" = [
      "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
    ]
  }
}
# tftest modules=1 resources=2
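The iam_bindings variable can be used in a similar way. The following is a minimal sketch, assuming the {KEY => {role = ROLE, members = []}} shape given in the variables table and leaving out the optional condition. The key name and the member are hypothetical, and the role is chosen so that it does not overlap with roles managed through iam or group_iam.

```hcl
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  iam_bindings = {
    # Arbitrary key (hypothetical name); it only identifies the binding.
    editor-binding = {
      role = "roles/dataproc.editor"
      members = [
        "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
      ]
    }
  }
}
```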
Additive IAM
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  iam_bindings_additive = {
    am1-viewer = {
      member = "user:am1@example.com"
      role   = "roles/dataproc.viewer"
    }
  }
}
# tftest modules=1 resources=2
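Authoritative and additive bindings can be combined in one module call as long as each role is managed by only one of the two approaches. The sketch below is illustrative only: the members and the additive key name are hypothetical, and the two roles are deliberately different.

```hcl
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  # Authoritative: this list is the complete set of members for the role.
  iam = {
    "roles/dataproc.viewer" = [
      "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
    ]
  }
  # Additive: adds one member to a different role without managing the rest.
  iam_bindings_additive = {
    am1-editor = {
      member = "user:am1@example.com"
      role   = "roles/dataproc.editor"
    }
  }
}
```

Managing the same role through both variables would make the authoritative binding remove the additive member on each apply, which is why the roles are kept separate.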
Variables
name | description | type | required | default |
---|---|---|---|---|
name | Cluster name. | string | ✓ | |
project_id | Project ID. | string | ✓ | |
region | Dataproc region. | string | ✓ | |
dataproc_config | Dataproc cluster config. | object({…}) | | {} |
group_iam | Authoritative IAM binding for organization groups, in {GROUP_EMAIL => [ROLES]} format. Group emails need to be static. Can be used in combination with the iam variable. | map(list(string)) | | {} |
iam | IAM bindings in {ROLE => [MEMBERS]} format. | map(list(string)) | | {} |
iam_bindings | Authoritative IAM bindings in {KEY => {role = ROLE, members = [], condition = {}}}. Keys are arbitrary. | map(object({…})) | | {} |
iam_bindings_additive | Individual additive IAM bindings. Keys are arbitrary. | map(object({…})) | | {} |
labels | The resource labels for instance to use to annotate any related underlying resources, such as Compute Engine VMs. | map(string) | | {} |
prefix | Optional prefix used to generate project id and name. | string | | null |
service_account | Service account to set on the Dataproc cluster. | string | | null |
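The labels and service_account variables are not covered by the examples. A minimal sketch of how they might be set is shown here. The label keys and values and the service account are hypothetical, and passing the service account as an email address is an assumption based on the variable's string type.

```hcl
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  # Hypothetical service account, assumed to be passed as an email address.
  service_account = "dataproc-sa@my-project.iam.gserviceaccount.com"
  # Hypothetical labels.
  labels = {
    environment = "dev"
    team        = "data"
  }
}
```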
Outputs
name | description | sensitive |
---|---|---|
bucket_names | List of bucket names which have been assigned to the cluster. | |
http_ports | The map of port descriptions to URLs. | |
id | Fully qualified cluster id. | |
instance_names | List of instance names which have been assigned to the cluster. | |
name | The name of the cluster. | |
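The outputs can be read from the calling configuration with the usual module reference syntax. The sketch below assumes a module block named processing-dp-cluster, as in the examples. The output names on the calling side are illustrative.

```hcl
# Re-export the fully qualified cluster id from the root module.
output "cluster_id" {
  description = "Fully qualified id of the Dataproc cluster."
  value       = module.processing-dp-cluster.id
}

# Re-export the buckets assigned to the cluster.
output "cluster_buckets" {
  description = "Buckets assigned to the Dataproc cluster."
  value       = module.processing-dp-cluster.bucket_names
}
```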