# Data Platform Foundations - Resources (Step 2)

This is the second step in deploying Data Platform Foundations. It creates the resources needed to store and process data in the projects created in the [previous step](./../environment/). Please refer to the [top-level README](../README.md) for prerequisites and for how to run the first step.

![Data Foundation - Phase 2](./diagram.png "High-level diagram")

The resources that will be created in each project are:

- Common
- Landing
  - [x] GCS
  - [x] Pub/Sub
- Orchestration & Transformation
  - [x] Dataflow
- DWH
  - [x] BigQuery (L0/1/2)
  - [x] GCS
- Datamart
  - [x] BigQuery (views/table)
  - [x] GCS
  - [ ] Bigtable

## Running the example

In the previous step, we created the environment (projects and service account) that we are going to use in this step.

To create the resources, copy the output of the environment step (**project_ids**) and paste it into `terraform.tfvars`:

- Specify your variables in a `terraform.tfvars` file; you can use the output from the environment stage:

```hcl
project_ids = {
  datamart       = "datamart-project_id"
  dwh            = "dwh-project_id"
  landing        = "landing-project_id"
  services       = "services-project_id"
  transformation = "transformation-project_id"
}
```

- Get a key for the service account created in the environment stage:
  - Go to the services project
  - Go to the IAM page
  - Go to the service accounts section
  - Create a new key for the service account created in the previous step (**service_account**)
  - Download the JSON key into the current folder
- Make sure you have the right authentication setup: `export GOOGLE_APPLICATION_CREDENTIALS=PATH_TO_SERVICE_ACCOUNT_KEY.json`
- Run `terraform init` and `terraform apply`

Once done testing, you can clean up resources by running `terraform destroy`.

### CMEK configuration

You can configure GCP resources to use existing CMEK keys by setting the `service_encryption_key_ids` variable. You need to specify a `global` and a `multiregional` key.
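
As a minimal sketch of the CMEK setting described above, the variable could be set in `terraform.tfvars` along these lines. It assumes the variable expects full Cloud KMS key resource IDs; the project, key ring, and key names, as well as the `europe` location, are hypothetical placeholders and not values taken from this repository.

```hcl
# Hypothetical key IDs: replace the project, location, key ring, and key
# names with those of keys that already exist in your own KMS project.
service_encryption_key_ids = {
  global        = "projects/my-kms-project/locations/global/keyRings/my-keyring/cryptoKeys/my-global-key"
  multiregional = "projects/my-kms-project/locations/europe/keyRings/my-keyring/cryptoKeys/my-multiregional-key"
}
```

Both attributes default to `null`, so omitting the variable leaves the resources without a customer-managed key.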

<!-- BEGIN TFDOC -->
## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| project_ids | Project IDs. | <code title="object({ datamart = string dwh = string landing = string services = string transformation = string })">object({...})</code> | ✓ | |
| *datamart_bq_datasets* | Datamart Bigquery datasets | <code title="map(object({ iam = map(list(string)) location = string }))">map(object({...}))</code> | | <code title="{ bq_datamart_dataset = { location = &quot;EU&quot; iam = { } } }">...</code> |
| *dwh_bq_datasets* | DWH Bigquery datasets | <code title="map(object({ location = string iam = map(list(string)) }))">map(object({...}))</code> | | <code title="{ bq_raw_dataset = { iam = {} location = &quot;EU&quot; } }">...</code> |
| *landing_buckets* | List of landing buckets to create | <code title="map(object({ location = string name = string }))">map(object({...}))</code> | | <code title="{ raw-data = { location = &quot;EU&quot; name = &quot;raw-data&quot; } data-schema = { location = &quot;EU&quot; name = &quot;data-schema&quot; } }">...</code> |
| *landing_pubsub* | List of landing pubsub topics and subscriptions to create | <code title="map(map(object({ iam = map(list(string)) labels = map(string) options = object({ ack_deadline_seconds = number message_retention_duration = number retain_acked_messages = bool expiration_policy_ttl = number }) })))">map(map(object({...})))</code> | | <code title="{ landing-1 = { sub1 = { iam = { } labels = {} options = null } sub2 = { iam = {} labels = {}, options = null }, } }">...</code> |
| *landing_service_account* | landing service accounts list. | <code title="">string</code> | | <code title="">sa-landing</code> |
| *service_account_names* | Project service accounts list. | <code title="object({ datamart = string dwh = string landing = string services = string transformation = string })">object({...})</code> | | <code title="{ datamart = &quot;sa-datamart&quot; dwh = &quot;sa-datawh&quot; landing = &quot;sa-landing&quot; services = &quot;sa-services&quot; transformation = &quot;sa-transformation&quot; }">...</code> |
| *service_encryption_key_ids* | Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to existing project. | <code title="object({ multiregional = string global = string })">object({...})</code> | | <code title="{ multiregional = null global = null }">...</code> |
| *transformation_buckets* | List of transformation buckets to create | <code title="map(object({ location = string name = string }))">map(object({...}))</code> | | <code title="{ temp = { location = &quot;EU&quot; name = &quot;temp&quot; }, templates = { location = &quot;EU&quot; name = &quot;templates&quot; }, }">...</code> |
| *transformation_subnets* | List of subnets to create in the transformation Project. | <code title="list(object({ ip_cidr_range = string name = string region = string secondary_ip_range = map(string) }))">list(object({...}))</code> | | <code title="[ { ip_cidr_range = &quot;10.1.0.0/20&quot; name = &quot;transformation-subnet&quot; region = &quot;europe-west3&quot; secondary_ip_range = {} }, ]">...</code> |
| *transformation_vpc_name* | Name of the VPC created in the transformation Project. | <code title="">string</code> | | <code title="">transformation-vpc</code> |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| datamart-datasets | List of bigquery datasets created for the datamart project. | |
| dwh-datasets | List of bigquery datasets created for the dwh project. | |
| landing-buckets | List of buckets created for the landing project. | |
| landing-pubsub | List of pubsub topics and subscriptions created for the landing project. | |
| transformation-buckets | List of buckets created for the transformation project. | |
| transformation-vpc | Transformation VPC details | |
<!-- END TFDOC -->
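
The optional variables in the table above can be overridden in the same `terraform.tfvars` file. The following is a minimal sketch that follows the types listed in the table; the `bq_curated_dataset` name is a hypothetical addition, and the choice to keep a single landing bucket is only an illustration.

```hcl
# Illustrative overrides only: adjust names and locations to your needs.

# Keep a single landing bucket instead of the two default ones.
landing_buckets = {
  raw-data = {
    location = "EU"
    name     = "raw-data"
  }
}

# Keep the default raw dataset and add a second, hypothetical one.
dwh_bq_datasets = {
  bq_raw_dataset = {
    location = "EU"
    iam      = {}
  }
  bq_curated_dataset = {
    location = "EU"
    iam      = {}
  }
}
```

Because each of these variables is a map, setting it replaces the whole default value, so any default entry you still want has to be repeated.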