cloud-foundation-fabric/data-solutions/data-platform-foundations/02-resources

# Data Platform Foundations - Resources (Step 2)

This is the second step needed to deploy Data Platform Foundations. It creates the resources needed to store and process data in the projects created in the previous step. Please refer to the top-level README for prerequisites and for how to run the first step.

![Data Foundation - Phase 2](diagram.png)

The following resources will be created in each project:

- Common
- Landing
  - GCS
  - Pub/Sub
- Orchestration & Transformation
  - Dataflow
- DWH
  - BigQuery (L0/L1/L2)
  - GCS
- Datamart
  - BigQuery (views/tables)
  - GCS
  - Bigtable
## Running the example

In the previous step, we created the environment (projects and service account) which we are going to use in this step.

To create the resources, copy the output of the environment step (`project_ids`) and paste it into `terraform.tfvars`:

- Specify your variables in a `terraform.tfvars`; you can use the output from the environment stage:

```hcl
project_ids = {
  datamart       = "datamart-project_id"
  dwh            = "dwh-project_id"
  landing        = "landing-project_id"
  services       = "services-project_id"
  transformation = "transformation-project_id"
}
```
- The `providers.tf` file has been configured to impersonate the main service account.

- To launch Terraform:

```bash
terraform init
terraform plan
terraform apply
```
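For reference, an impersonation setup in `providers.tf` typically follows the pattern sketched below; the service account email is a placeholder, and the repository's actual file may be structured differently:

```hcl
# Sketch only: the real providers.tf in this repo may differ.
provider "google" {
  # Impersonate the main service account created in the environment step.
  # Replace the placeholder email with the one output by step 1.
  impersonate_service_account = "main-sa@services-project_id.iam.gserviceaccount.com"
}
```

With this in place, your own credentials only need the Service Account Token Creator role on the impersonated account, rather than direct permissions on every project.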

Once done testing, you can clean up resources by running `terraform destroy`.

## CMEK configuration

You can configure GCP resources to use existing CMEK keys by setting the `service_encryption_key_ids` variable. You need to specify a `global` and a `multiregional` key.
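As an illustration, a `terraform.tfvars` entry for this variable might look like the sketch below; the project, key ring, and key names are placeholders to be replaced with the IDs of your existing Cloud KMS keys:

```hcl
# Illustrative values only: substitute your own Cloud KMS key IDs.
service_encryption_key_ids = {
  global        = "projects/kms-project/locations/global/keyRings/my-keyring/cryptoKeys/global-key"
  multiregional = "projects/kms-project/locations/eu/keyRings/my-keyring/cryptoKeys/mr-key"
}
```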

## Variables

| name | description | type | required | default |
|---|---|---|---|---|
| project_ids | Project IDs. | object({...}) | ✓ | |
| admins | List of users allowed to impersonate the service account. | list(string) | | null |
| datamart_bq_datasets | Datamart BigQuery datasets. | map(object({...})) | | ... |
| dwh_bq_datasets | DWH BigQuery datasets. | map(object({...})) | | ... |
| landing_buckets | List of landing buckets to create. | map(object({...})) | | ... |
| landing_pubsub | List of landing Pub/Sub topics and subscriptions to create. | map(map(object({...}))) | | ... |
| landing_service_account | Landing service accounts list. | string | | sa-landing |
| service_account_names | Project service accounts list. | object({...}) | | ... |
| service_encryption_key_ids | Cloud KMS encryption keys in {LOCATION => [KEY_URL]} format. Keys belong to an existing project. | object({...}) | | ... |
| transformation_buckets | List of transformation buckets to create. | map(object({...})) | | ... |
| transformation_subnets | List of subnets to create in the transformation project. | list(object({...})) | | ... |
| transformation_vpc_name | Name of the VPC created in the transformation project. | string | | transformation-vpc |

## Outputs

| name | description | sensitive |
|---|---|---|
| datamart-datasets | List of BigQuery datasets created for the datamart project. | |
| dwh-datasets | List of BigQuery datasets created for the dwh project. | |
| landing-buckets | List of buckets created for the landing project. | |
| landing-pubsub | List of Pub/Sub topics and subscriptions created for the landing project. | |
| transformation-buckets | List of buckets created for the transformation project. | |
| transformation-vpc | Transformation VPC details. | |