Replace existing data platform

This commit is contained in:
Lorenzo Caggioni 2022-02-05 08:51:11 +01:00
parent 2cdea57954
commit b65d153ec1
53 changed files with 237 additions and 1523 deletions

View File

@ -5,7 +5,7 @@ This section contains **[foundational examples](./foundations/)** that bootstrap
Currently available examples:
- **cloud operations** - [Resource tracking and remediation via Cloud Asset feeds](./cloud-operations/asset-inventory-feed-remediation), [Granular Cloud DNS IAM via Service Directory](./cloud-operations/dns-fine-grained-iam), [Granular Cloud DNS IAM for Shared VPC](./cloud-operations/dns-shared-vpc), [Compute Engine quota monitoring](./cloud-operations/quota-monitoring), [Scheduled Cloud Asset Inventory Export to Bigquery](./cloud-operations/scheduled-asset-inventory-export-bq), [Packer image builder](./cloud-operations/packer-image-builder), [On-prem SA key management](./cloud-operations/onprem-sa-key-management)
- **data solutions** - [GCE/GCS CMEK via centralized Cloud KMS](./data-solutions/cmek-via-centralized-kms/), [Cloud Storage to Bigquery with Cloud Dataflow](./data-solutions/gcs-to-bq-with-dataflow/)
- **data solutions** - [GCE/GCS CMEK via centralized Cloud KMS](./data-solutions/cmek-via-centralized-kms/), [Cloud Storage to Bigquery with Cloud Dataflow](./data-solutions/gcs-to-bq-with-dataflow/), [Data Platform Foundations](./data-solutions/data-platform-foundations/)
- **factories** - [The why and the how of resource factories](./factories/README.md)
- **foundations** - [single level hierarchy](./foundations/environments/) (environments), [multiple level hierarchy](./foundations/business-units/) (business units + environments)
- **networking** - [hub and spoke via peering](./networking/hub-and-spoke-peering/), [hub and spoke via VPN](./networking/hub-and-spoke-vpn/), [DNS and Google Private Access for on-premises](./networking/onprem-google-access-dns/), [Shared VPC with GKE support](./networking/shared-vpc-gke/), [ILB as next hop](./networking/ilb-next-hop), [PSC for on-premises Cloud Function invocation](./networking/private-cloud-function-from-onprem/), [decentralized firewall](./networking/decentralized-firewall)

View File

@ -18,7 +18,7 @@ All resources use CMEK hosted in Cloud KMS running in a centralized project. The
### Data Platform Foundations
<a href="./data-platform-foundations/" title="Data Platform Foundations"><img src="./data-platform-foundations/02-resources/diagram.png" align="left" width="280px"></a>
<a href="./data-platform-foundations/" title="Data Platform Foundations"><img src="./data-platform-foundations/images/overview_diagram.png" align="left" width="280px"></a>
This [example](./data-platform-foundations/) implements a robust and flexible Data Foundation on GCP that provides opinionated defaults, allowing customers to build and scale out additional data pipelines quickly and reliably.
<br clear="left">

View File

@ -1,72 +0,0 @@
# Data Platform Foundations - Environment (Step 1)
This is the first step needed to deploy Data Platform Foundations, which creates projects and service accounts. Please refer to the [top-level Data Platform README](../README.md) for prerequisites.
The projects that will be created are:
- Common services
- Landing
- Orchestration & Transformation
- DWH
- Datamart
A main service account (named `data-platform-main` by default) will be created under the common services project and granted owner permissions on all the projects in scope.
This is a high level diagram of the created resources:
![Environment - Phase 1](./diagram.png "High-level Environment diagram")
## Running the example
To create the infrastructure:
- specify your variables in a `terraform.tfvars` file:
```tfm
billing_account = "1234-1234-1234"
parent = "folders/12345678"
admins = ["user:xxxxx@yyyyy.com"]
```
- make sure you have the right authentication setup (application default credentials, or a service account key) with the right permissions
- **The output of this stage contains the values for the resources stage**
- the `admins` variable contains a list of principals allowed to impersonate the service accounts; these principals will be granted the `roles/iam.serviceAccountTokenCreator` role
- run `terraform init` and `terraform apply`
Once done testing, you can clean up resources by running `terraform destroy`.
### CMEK configuration
You can configure GCP resources to use existing CMEK keys by setting the `service_encryption_key_ids` variable. You need to specify both a `global` and a `multiregional` key.
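As a reference, a minimal `terraform.tfvars` entry might look like the following sketch (project, keyring, and key names are illustrative):

```hcl
# Illustrative values: replace with keys from your existing Cloud KMS project.
service_encryption_key_ids = {
  global        = "projects/my-kms-project/locations/global/keyRings/my-keyring/cryptoKeys/my-global-key"
  multiregional = "projects/my-kms-project/locations/europe/keyRings/my-keyring/cryptoKeys/my-eu-key"
}
```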
### VPC-SC configuration
You can assign projects to an existing VPC-SC standard perimeter by setting the `service_perimeter_standard` variable. You can retrieve the list of existing perimeters from the GCP console or with the following command:
```
gcloud access-context-manager perimeters list --format="json" | grep name
```
The script uses the `google_access_context_manager_service_perimeter_resource` Terraform resource. If this resource is used alongside the `vpc-sc` module, remember to uncomment the `lifecycle` block in the `vpc-sc` module so they don't fight over which resources should be in the perimeter.
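For example, in `terraform.tfvars` (replace the placeholders with your access policy and perimeter names):

```hcl
# Format: accessPolicies/ACCESS_POLICY_NAME/servicePerimeters/PERIMETER_NAME
service_perimeter_standard = "accessPolicies/123456789/servicePerimeters/my_perimeter"
```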
<!-- BEGIN TFDOC -->
## Variables
| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [billing_account_id](variables.tf#L21) | Billing account id. | <code>string</code> | ✓ | |
| [root_node](variables.tf#L50) | Parent folder or organization in 'folders/folder_id' or 'organizations/org_id' format. | <code>string</code> | ✓ | |
| [admins](variables.tf#L15) | List of users allowed to impersonate the service account. | <code>list&#40;string&#41;</code> | | <code>null</code> |
| [prefix](variables.tf#L26) | Prefix used to generate project id and name. | <code>string</code> | | <code>null</code> |
| [project_names](variables.tf#L32) | Override this variable if you need non-standard names. | <code title="object&#40;&#123;&#10; datamart &#61; string&#10; dwh &#61; string&#10; landing &#61; string&#10; services &#61; string&#10; transformation &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; datamart &#61; &#34;datamart&#34;&#10; dwh &#61; &#34;datawh&#34;&#10; landing &#61; &#34;landing&#34;&#10; services &#61; &#34;services&#34;&#10; transformation &#61; &#34;transformation&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [service_account_names](variables.tf#L55) | Override this variable if you need non-standard names. | <code title="object&#40;&#123;&#10; main &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; main &#61; &#34;data-platform-main&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [service_encryption_key_ids](variables.tf#L65) | Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to existing project. | <code title="object&#40;&#123;&#10; multiregional &#61; string&#10; global &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; multiregional &#61; null&#10; global &#61; null&#10;&#125;">&#123;&#8230;&#125;</code> |
| [service_perimeter_standard](variables.tf#L78) | VPC Service control standard perimeter name in the form of 'accessPolicies/ACCESS_POLICY_NAME/servicePerimeters/PERIMETER_NAME'. All projects will be added to the perimeter in enforced mode. | <code>string</code> | | <code>null</code> |
## Outputs
| name | description | sensitive |
|---|---|:---:|
| [project_ids](outputs.tf#L17) | Project ids for created projects. | |
| [service_account](outputs.tf#L28) | Main service account. | |
| [service_encryption_key_ids](outputs.tf#L33) | Cloud KMS encryption keys in {LOCATION => [KEY_URL]} format. | |
<!-- END TFDOC -->

Binary file not shown.

Before

Width:  |  Height:  |  Size: 275 KiB

View File

@ -1,162 +0,0 @@
/**
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
###############################################################################
# projects #
###############################################################################
module "project-datamart" {
source = "../../../../modules/project"
parent = var.root_node
billing_account = var.billing_account_id
prefix = var.prefix
name = var.project_names.datamart
services = [
"bigquery.googleapis.com",
"bigquerystorage.googleapis.com",
"bigqueryreservation.googleapis.com",
"storage.googleapis.com",
"storage-component.googleapis.com",
]
iam_additive = {
"roles/owner" = [module.sa-services-main.iam_email]
}
service_encryption_key_ids = {
bq = [var.service_encryption_key_ids.multiregional]
storage = [var.service_encryption_key_ids.multiregional]
}
# If used, remember to uncomment 'lifecycle' block in the
# modules/vpc-sc/google_access_context_manager_service_perimeter resource.
service_perimeter_standard = var.service_perimeter_standard
}
module "project-dwh" {
source = "../../../../modules/project"
parent = var.root_node
billing_account = var.billing_account_id
prefix = var.prefix
name = var.project_names.dwh
services = [
"bigquery.googleapis.com",
"bigquerystorage.googleapis.com",
"bigqueryreservation.googleapis.com",
"storage.googleapis.com",
"storage-component.googleapis.com",
]
iam_additive = {
"roles/owner" = [module.sa-services-main.iam_email]
}
service_encryption_key_ids = {
bq = [var.service_encryption_key_ids.multiregional]
storage = [var.service_encryption_key_ids.multiregional]
}
# If used, remember to uncomment 'lifecycle' block in the
# modules/vpc-sc/google_access_context_manager_service_perimeter resource.
service_perimeter_standard = var.service_perimeter_standard
}
module "project-landing" {
source = "../../../../modules/project"
parent = var.root_node
billing_account = var.billing_account_id
prefix = var.prefix
name = var.project_names.landing
services = [
"pubsub.googleapis.com",
"storage.googleapis.com",
"storage-component.googleapis.com",
]
iam_additive = {
"roles/owner" = [module.sa-services-main.iam_email]
}
service_encryption_key_ids = {
pubsub = [var.service_encryption_key_ids.global]
storage = [var.service_encryption_key_ids.multiregional]
}
# If used, remember to uncomment 'lifecycle' block in the
# modules/vpc-sc/google_access_context_manager_service_perimeter resource.
service_perimeter_standard = var.service_perimeter_standard
}
module "project-services" {
source = "../../../../modules/project"
parent = var.root_node
billing_account = var.billing_account_id
prefix = var.prefix
name = var.project_names.services
services = [
"bigquery.googleapis.com",
"cloudresourcemanager.googleapis.com",
"iam.googleapis.com",
"pubsub.googleapis.com",
"storage.googleapis.com",
"storage-component.googleapis.com",
"sourcerepo.googleapis.com",
"stackdriver.googleapis.com",
"cloudasset.googleapis.com",
"cloudkms.googleapis.com"
]
iam_additive = {
"roles/owner" = [module.sa-services-main.iam_email]
}
service_encryption_key_ids = {
storage = [var.service_encryption_key_ids.multiregional]
}
# If used, remember to uncomment 'lifecycle' block in the
# modules/vpc-sc/google_access_context_manager_service_perimeter resource.
service_perimeter_standard = var.service_perimeter_standard
}
module "project-transformation" {
source = "../../../../modules/project"
parent = var.root_node
billing_account = var.billing_account_id
prefix = var.prefix
name = var.project_names.transformation
services = [
"bigquery.googleapis.com",
"cloudbuild.googleapis.com",
"compute.googleapis.com",
"dataflow.googleapis.com",
"servicenetworking.googleapis.com",
"storage.googleapis.com",
"storage-component.googleapis.com",
]
iam_additive = {
"roles/owner" = [module.sa-services-main.iam_email]
}
service_encryption_key_ids = {
compute = [var.service_encryption_key_ids.global]
storage = [var.service_encryption_key_ids.multiregional]
dataflow = [var.service_encryption_key_ids.global]
}
# If used, remember to uncomment 'lifecycle' block in the
# modules/vpc-sc/google_access_context_manager_service_perimeter resource.
service_perimeter_standard = var.service_perimeter_standard
}
###############################################################################
# service accounts #
###############################################################################
module "sa-services-main" {
source = "../../../../modules/iam-service-account"
project_id = module.project-services.project_id
name = var.service_account_names.main
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

View File

@ -1,36 +0,0 @@
/**
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
output "project_ids" {
description = "Project ids for created projects."
value = {
datamart = module.project-datamart.project_id
dwh = module.project-dwh.project_id
landing = module.project-landing.project_id
services = module.project-services.project_id
transformation = module.project-transformation.project_id
}
}
output "service_account" {
description = "Main service account."
value = module.sa-services-main.email
}
output "service_encryption_key_ids" {
description = "Cloud KMS encryption keys in {LOCATION => [KEY_URL]} format."
value = var.service_encryption_key_ids
}

View File

@ -1,82 +0,0 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
variable "admins" {
description = "List of users allowed to impersonate the service account."
type = list(string)
default = null
}
variable "billing_account_id" {
description = "Billing account id."
type = string
}
variable "prefix" {
description = "Prefix used to generate project id and name."
type = string
default = null
}
variable "project_names" {
description = "Override this variable if you need non-standard names."
type = object({
datamart = string
dwh = string
landing = string
services = string
transformation = string
})
default = {
datamart = "datamart"
dwh = "datawh"
landing = "landing"
services = "services"
transformation = "transformation"
}
}
variable "root_node" {
description = "Parent folder or organization in 'folders/folder_id' or 'organizations/org_id' format."
type = string
}
variable "service_account_names" {
description = "Override this variable if you need non-standard names."
type = object({
main = string
})
default = {
main = "data-platform-main"
}
}
variable "service_encryption_key_ids" {
description = "Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to existing project."
type = object({
multiregional = string
global = string
})
default = {
multiregional = null
global = null
}
}
variable "service_perimeter_standard" {
description = "VPC Service control standard perimeter name in the form of 'accessPolicies/ACCESS_POLICY_NAME/servicePerimeters/PERIMETER_NAME'. All projects will be added to the perimeter in enforced mode."
type = string
default = null
}

View File

@ -1,29 +0,0 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
terraform {
required_version = ">= 1.0.0"
required_providers {
google = {
source = "hashicorp/google"
version = ">= 4.0.0"
}
google-beta = {
source = "hashicorp/google-beta"
version = ">= 4.0.0"
}
}
}

View File

@ -1,83 +0,0 @@
# Data Platform Foundations - Resources (Step 2)
This is the second step needed to deploy Data Platform Foundations, which creates the resources needed to store and process the data in the projects created in the [previous step](../01-environment/README.md). Please refer to the [top-level README](../README.md) for prerequisites and how to run the first step.
![Data Foundation - Phase 2](./diagram.png "High-level diagram")
The resources that will be created in each project are:
- Common
- Landing
- [x] GCS
- [x] Pub/Sub
- Orchestration & Transformation
- [x] Dataflow
- DWH
- [x] Bigquery (L0/1/2)
- [x] GCS
- Datamart
- [x] Bigquery (views/table)
- [x] GCS
- [ ] BigTable
## Running the example
In the previous step, we created the environment (projects and service account) which we are going to use in this step.
To create the resources, copy the output of the environment step (**project_ids**) and paste it into the `terraform.tfvars`:
- Specify your variables in a `terraform.tfvars`; you can use the output from the environment stage:
```tfm
project_ids = {
datamart = "datamart-project_id"
dwh = "dwh-project_id"
landing = "landing-project_id"
services = "services-project_id"
transformation = "transformation-project_id"
}
```
- The `providers.tf` file has been configured to impersonate the **main** service account (see the snippet after this list)
- To launch Terraform:
```bash
terraform plan
terraform apply
```
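For reference, the impersonation setup in `providers.tf` uses the main service account created in step 1 (default name `data-platform-main`):

```hcl
provider "google" {
  impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}

provider "google-beta" {
  impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}
```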
Once done testing, you can clean up resources by running `terraform destroy`.
### CMEK configuration
You can configure GCP resources to use existing CMEK keys by setting the `service_encryption_key_ids` variable. You need to specify both a `global` and a `multiregional` key.
<!-- BEGIN TFDOC -->
## Variables
| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [project_ids](variables.tf#L108) | Project IDs. | <code title="object&#40;&#123;&#10; datamart &#61; string&#10; dwh &#61; string&#10; landing &#61; string&#10; services &#61; string&#10; transformation &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | ✓ | |
| [admins](variables.tf#L16) | List of users allowed to impersonate the service account. | <code>list&#40;string&#41;</code> | | <code>null</code> |
| [datamart_bq_datasets](variables.tf#L22) | Datamart Bigquery datasets. | <code title="map&#40;object&#40;&#123;&#10; iam &#61; map&#40;list&#40;string&#41;&#41;&#10; location &#61; string&#10;&#125;&#41;&#41;">map&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;</code> | | <code title="&#123;&#10; bq_datamart_dataset &#61; &#123;&#10; location &#61; &#34;EU&#34;&#10; iam &#61; &#123;&#10; &#125;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [dwh_bq_datasets](variables.tf#L40) | DWH Bigquery datasets. | <code title="map&#40;object&#40;&#123;&#10; location &#61; string&#10; iam &#61; map&#40;list&#40;string&#41;&#41;&#10;&#125;&#41;&#41;">map&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;</code> | | <code title="&#123;&#10; bq_raw_dataset &#61; &#123;&#10; iam &#61; &#123;&#125;&#10; location &#61; &#34;EU&#34;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [landing_buckets](variables.tf#L54) | List of landing buckets to create. | <code title="map&#40;object&#40;&#123;&#10; location &#61; string&#10; name &#61; string&#10;&#125;&#41;&#41;">map&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;</code> | | <code title="&#123;&#10; raw-data &#61; &#123;&#10; location &#61; &#34;EU&#34;&#10; name &#61; &#34;raw-data&#34;&#10; &#125;&#10; data-schema &#61; &#123;&#10; location &#61; &#34;EU&#34;&#10; name &#61; &#34;data-schema&#34;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [landing_pubsub](variables.tf#L72) | List of landing pubsub topics and subscriptions to create. | <code title="map&#40;map&#40;object&#40;&#123;&#10; iam &#61; map&#40;list&#40;string&#41;&#41;&#10; labels &#61; map&#40;string&#41;&#10; options &#61; object&#40;&#123;&#10; ack_deadline_seconds &#61; number&#10; message_retention_duration &#61; number&#10; retain_acked_messages &#61; bool&#10; expiration_policy_ttl &#61; number&#10; &#125;&#41;&#10;&#125;&#41;&#41;&#41;">map&#40;map&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;&#41;</code> | | <code title="&#123;&#10; landing-1 &#61; &#123;&#10; sub1 &#61; &#123;&#10; iam &#61; &#123;&#10; &#125;&#10; labels &#61; &#123;&#125;&#10; options &#61; null&#10; &#125;&#10; sub2 &#61; &#123;&#10; iam &#61; &#123;&#125;&#10; labels &#61; &#123;&#125;,&#10; options &#61; null&#10; &#125;,&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [landing_service_account](variables.tf#L102) | landing service accounts list. | <code>string</code> | | <code>&#34;sa-landing&#34;</code> |
| [service_account_names](variables.tf#L119) | Project service accounts list. | <code title="object&#40;&#123;&#10; datamart &#61; string&#10; dwh &#61; string&#10; landing &#61; string&#10; services &#61; string&#10; transformation &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; datamart &#61; &#34;sa-datamart&#34;&#10; dwh &#61; &#34;sa-datawh&#34;&#10; landing &#61; &#34;sa-landing&#34;&#10; services &#61; &#34;sa-services&#34;&#10; transformation &#61; &#34;sa-transformation&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [service_encryption_key_ids](variables.tf#L137) | Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to existing project. | <code title="object&#40;&#123;&#10; multiregional &#61; string&#10; global &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; multiregional &#61; null&#10; global &#61; null&#10;&#125;">&#123;&#8230;&#125;</code> |
| [transformation_buckets](variables.tf#L149) | List of transformation buckets to create. | <code title="map&#40;object&#40;&#123;&#10; location &#61; string&#10; name &#61; string&#10;&#125;&#41;&#41;">map&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;</code> | | <code title="&#123;&#10; temp &#61; &#123;&#10; location &#61; &#34;EU&#34;&#10; name &#61; &#34;temp&#34;&#10; &#125;,&#10; templates &#61; &#123;&#10; location &#61; &#34;EU&#34;&#10; name &#61; &#34;templates&#34;&#10; &#125;,&#10;&#125;">&#123;&#8230;&#125;</code> |
| [transformation_subnets](variables.tf#L167) | List of subnets to create in the transformation Project. | <code title="list&#40;object&#40;&#123;&#10; ip_cidr_range &#61; string&#10; name &#61; string&#10; region &#61; string&#10; secondary_ip_range &#61; map&#40;string&#41;&#10;&#125;&#41;&#41;">list&#40;object&#40;&#123;&#8230;&#125;&#41;&#41;</code> | | <code title="&#91;&#10; &#123;&#10; ip_cidr_range &#61; &#34;10.1.0.0&#47;20&#34;&#10; name &#61; &#34;transformation-subnet&#34;&#10; region &#61; &#34;europe-west3&#34;&#10; secondary_ip_range &#61; &#123;&#125;&#10; &#125;,&#10;&#93;">&#91;&#8230;&#93;</code> |
| [transformation_vpc_name](variables.tf#L185) | Name of the VPC created in the transformation Project. | <code>string</code> | | <code>&#34;transformation-vpc&#34;</code> |
## Outputs
| name | description | sensitive |
|---|---|:---:|
| [datamart-datasets](outputs.tf#L17) | List of bigquery datasets created for the datamart project. | |
| [dwh-datasets](outputs.tf#L24) | List of bigquery datasets created for the dwh project. | |
| [landing-buckets](outputs.tf#L29) | List of buckets created for the landing project. | |
| [landing-pubsub](outputs.tf#L34) | List of pubsub topics and subscriptions created for the landing project. | |
| [transformation-buckets](outputs.tf#L44) | List of buckets created for the transformation project. | |
| [transformation-vpc](outputs.tf#L49) | Transformation VPC details. | |
<!-- END TFDOC -->

Binary file not shown.

Before

Width:  |  Height:  |  Size: 470 KiB

View File

@ -1,211 +0,0 @@
/**
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
###############################################################################
# IAM #
###############################################################################
module "datamart-sa" {
source = "../../../../modules/iam-service-account"
project_id = var.project_ids.datamart
name = var.service_account_names.datamart
iam_project_roles = {
"${var.project_ids.datamart}" = ["roles/editor"]
}
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
module "dwh-sa" {
source = "../../../../modules/iam-service-account"
project_id = var.project_ids.dwh
name = var.service_account_names.dwh
iam_project_roles = {
"${var.project_ids.dwh}" = ["roles/bigquery.admin"]
}
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
module "landing-sa" {
source = "../../../../modules/iam-service-account"
project_id = var.project_ids.landing
name = var.service_account_names.landing
iam_project_roles = {
"${var.project_ids.landing}" = [
"roles/pubsub.publisher",
"roles/storage.objectCreator"]
}
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
module "services-sa" {
source = "../../../../modules/iam-service-account"
project_id = var.project_ids.services
name = var.service_account_names.services
iam_project_roles = {
"${var.project_ids.services}" = ["roles/editor"]
}
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
module "transformation-sa" {
source = "../../../../modules/iam-service-account"
project_id = var.project_ids.transformation
name = var.service_account_names.transformation
iam_project_roles = {
"${var.project_ids.transformation}" = [
"roles/logging.logWriter",
"roles/monitoring.metricWriter",
"roles/dataflow.admin",
"roles/iam.serviceAccountUser",
"roles/bigquery.dataOwner",
"roles/bigquery.jobUser",
"roles/dataflow.worker",
"roles/bigquery.metadataViewer",
"roles/storage.objectViewer",
],
"${var.project_ids.landing}" = [
"roles/storage.objectViewer",
],
"${var.project_ids.dwh}" = [
"roles/bigquery.dataOwner",
"roles/bigquery.jobUser",
"roles/bigquery.metadataViewer",
]
}
iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
###############################################################################
# GCS #
###############################################################################
module "landing-buckets" {
source = "../../../../modules/gcs"
for_each = var.landing_buckets
project_id = var.project_ids.landing
prefix = var.project_ids.landing
name = each.value.name
location = each.value.location
iam = {
"roles/storage.objectCreator" = [module.landing-sa.iam_email]
"roles/storage.admin" = [module.transformation-sa.iam_email]
}
encryption_key = var.service_encryption_key_ids.multiregional
}
module "transformation-buckets" {
source = "../../../../modules/gcs"
for_each = var.transformation_buckets
project_id = var.project_ids.transformation
prefix = var.project_ids.transformation
name = each.value.name
location = each.value.location
iam = {
"roles/storage.admin" = [module.transformation-sa.iam_email]
}
encryption_key = var.service_encryption_key_ids.multiregional
}
###############################################################################
# Bigquery #
###############################################################################
module "datamart-bq" {
source = "../../../../modules/bigquery-dataset"
for_each = var.datamart_bq_datasets
project_id = var.project_ids.datamart
id = each.key
location = each.value.location
iam = {
for k, v in each.value.iam : k => (
k == "roles/bigquery.dataOwner"
? concat(v, [module.datamart-sa.iam_email])
: v
)
}
encryption_key = var.service_encryption_key_ids.multiregional
}
module "dwh-bq" {
source = "../../../../modules/bigquery-dataset"
for_each = var.dwh_bq_datasets
project_id = var.project_ids.dwh
id = each.key
location = each.value.location
iam = {
for k, v in each.value.iam : k => (
k == "roles/bigquery.dataOwner"
? concat(v, [module.dwh-sa.iam_email])
: v
)
}
encryption_key = var.service_encryption_key_ids.multiregional
}
###############################################################################
# Network #
###############################################################################
module "vpc-transformation" {
source = "../../../../modules/net-vpc"
project_id = var.project_ids.transformation
name = var.transformation_vpc_name
subnets = var.transformation_subnets
}
module "firewall" {
source = "../../../../modules/net-vpc-firewall"
project_id = var.project_ids.transformation
network = module.vpc-transformation.name
admin_ranges = []
http_source_ranges = []
https_source_ranges = []
ssh_source_ranges = []
custom_rules = {
iap-svc = {
description = "Dataflow service."
direction = "INGRESS"
action = "allow"
sources = ["dataflow"]
targets = ["dataflow"]
ranges = []
use_service_accounts = false
rules = [{ protocol = "tcp", ports = ["12345-12346"] }]
extra_attributes = {}
}
}
}
###############################################################################
# Pub/Sub #
###############################################################################
module "landing-pubsub" {
source = "../../../../modules/pubsub"
for_each = var.landing_pubsub
project_id = var.project_ids.landing
name = each.key
subscriptions = {
for k, v in each.value : k => { labels = v.labels, options = v.options }
}
subscription_iam = {
for k, v in each.value : k => merge(v.iam, {
"roles/pubsub.subscriber" = [module.transformation-sa.iam_email]
})
}
kms_key = var.service_encryption_key_ids.global
}

View File

@ -1,60 +0,0 @@
/**
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
output "datamart-datasets" {
description = "List of bigquery datasets created for the datamart project."
value = [
for k, datasets in module.datamart-bq : datasets.dataset_id
]
}
output "dwh-datasets" {
description = "List of bigquery datasets created for the dwh project."
value = [for k, datasets in module.dwh-bq : datasets.dataset_id]
}
output "landing-buckets" {
description = "List of buckets created for the landing project."
value = [for k, bucket in module.landing-buckets : bucket.name]
}
output "landing-pubsub" {
description = "List of pubsub topics and subscriptions created for the landing project."
value = {
for t in module.landing-pubsub : t.topic.name => {
id = t.topic.id
subscriptions = { for s in t.subscriptions : s.name => s.id }
}
}
}
output "transformation-buckets" {
description = "List of buckets created for the transformation project."
value = [for k, bucket in module.transformation-buckets : bucket.name]
}
output "transformation-vpc" {
description = "Transformation VPC details."
value = {
name = module.vpc-transformation.name
subnets = {
for k, s in module.vpc-transformation.subnets : k => {
ip_cidr_range = s.ip_cidr_range
region = s.region
}
}
}
}

View File

@ -1,23 +0,0 @@
/**
* Copyright 2022 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
provider "google" {
impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}
provider "google-beta" {
impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}

View File

@ -1,189 +0,0 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
variable "admins" {
description = "List of users allowed to impersonate the service account."
type = list(string)
default = null
}
variable "datamart_bq_datasets" {
description = "Datamart Bigquery datasets."
type = map(object({
iam = map(list(string))
location = string
}))
default = {
bq_datamart_dataset = {
location = "EU"
iam = {
# "roles/bigquery.dataOwner" = []
# "roles/bigquery.dataEditor" = []
# "roles/bigquery.dataViewer" = []
}
}
}
}
variable "dwh_bq_datasets" {
description = "DWH Bigquery datasets."
type = map(object({
location = string
iam = map(list(string))
}))
default = {
bq_raw_dataset = {
iam = {}
location = "EU"
}
}
}
variable "landing_buckets" {
description = "List of landing buckets to create."
type = map(object({
location = string
name = string
}))
default = {
raw-data = {
location = "EU"
name = "raw-data"
}
data-schema = {
location = "EU"
name = "data-schema"
}
}
}
variable "landing_pubsub" {
description = "List of landing pubsub topics and subscriptions to create."
type = map(map(object({
iam = map(list(string))
labels = map(string)
options = object({
ack_deadline_seconds = number
message_retention_duration = number
retain_acked_messages = bool
expiration_policy_ttl = number
})
})))
default = {
landing-1 = {
sub1 = {
iam = {
# "roles/pubsub.subscriber" = []
}
labels = {}
options = null
}
sub2 = {
iam = {}
labels = {},
options = null
},
}
}
}
variable "landing_service_account" {
description = "landing service accounts list."
type = string
default = "sa-landing"
}
variable "project_ids" {
description = "Project IDs."
type = object({
datamart = string
dwh = string
landing = string
services = string
transformation = string
})
}
variable "service_account_names" {
description = "Project service accounts list."
type = object({
datamart = string
dwh = string
landing = string
services = string
transformation = string
})
default = {
datamart = "sa-datamart"
dwh = "sa-datawh"
landing = "sa-landing"
services = "sa-services"
transformation = "sa-transformation"
}
}
variable "service_encryption_key_ids" {
description = "Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to existing project."
type = object({
multiregional = string
global = string
})
default = {
multiregional = null
global = null
}
}
variable "transformation_buckets" {
description = "List of transformation buckets to create."
type = map(object({
location = string
name = string
}))
default = {
temp = {
location = "EU"
name = "temp"
},
templates = {
location = "EU"
name = "templates"
},
}
}
variable "transformation_subnets" {
description = "List of subnets to create in the transformation Project."
type = list(object({
ip_cidr_range = string
name = string
region = string
secondary_ip_range = map(string)
}))
default = [
{
ip_cidr_range = "10.1.0.0/20"
name = "transformation-subnet"
region = "europe-west3"
secondary_ip_range = {}
},
]
}
variable "transformation_vpc_name" {
description = "Name of the VPC created in the transformation Project."
type = string
default = "transformation-vpc"
}

View File

@ -1,29 +0,0 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
terraform {
required_version = ">= 1.0.0"
required_providers {
google = {
source = "hashicorp/google"
version = ">= 4.0.0"
}
google-beta = {
source = "hashicorp/google-beta"
version = ">= 4.0.0"
}
}
}

View File

@ -1,8 +0,0 @@
# Manual pipeline Example
Once you have deployed the projects ([step 1](../01-environment/README.md)) and resources ([step 2](../02-resources/README.md)), you can use them to run your data pipelines.
Here we demo two pipelines:
* [GCS to Bigquery](./gcs_to_bigquery.md)
* [PubSub to Bigquery](./pubsub_to_bigquery.md)

View File

@ -1,140 +0,0 @@
# Manual pipeline Example: GCS to Bigquery
In this example we will publish person messages in the following format:
```bash
name,surname,1617898199
```
A Dataflow pipeline will read those messages and import them into a BigQuery table in the DWH project.
[TODO] An authorized view will be created in the datamart project to expose the table.
[TODO] Further automation is expected in the future.
## Set up the env vars
```bash
export DWH_PROJECT_ID=**dwh_project_id**
export LANDING_PROJECT_ID=**landing_project_id**
export TRANSFORMATION_PROJECT_ID=**transformation_project_id**
```
## Create BQ table
These steps should be performed as the DWH service account.
Run the following command to create a table:
```bash
gcloud --impersonate-service-account=sa-datawh@$DWH_PROJECT_ID.iam.gserviceaccount.com \
alpha bq tables create person \
--project=$DWH_PROJECT_ID --dataset=bq_raw_dataset \
--description "This is a Test Person table" \
--schema name=STRING,surname=STRING,timestamp=TIMESTAMP
```
## Produce CSV data file, JSON schema file and UDF JS file
These steps should be performed as the landing service account.
Let's now create a series of messages we can use to import:
```bash
for i in {0..10}
do
echo "Lorenzo,Caggioni,$(date +%s)" >> person.csv
done
```
and copy the file to the GCS bucket:
```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person.csv gs://$LANDING_PROJECT_ID-eu-raw-data
```
Let's create the data JSON schema:
```bash
cat <<'EOF' >> person_schema.json
{
"BigQuery Schema": [
{
"name": "name",
"type": "STRING"
},
{
"name": "surname",
"type": "STRING"
},
{
"name": "timestamp",
"type": "TIMESTAMP"
}
]
}
EOF
```
and copy the file to the GCS bucket:
```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person_schema.json gs://$LANDING_PROJECT_ID-eu-data-schema
```
Let's create the data UDF function to transform message data:
```bash
cat <<'EOF' >> person_udf.js
function transform(line) {
var values = line.split(',');
var obj = new Object();
obj.name = values[0];
obj.surname = values[1];
obj.timestamp = values[2];
var jsonString = JSON.stringify(obj);
return jsonString;
}
EOF
```
and copy the file to the GCS bucket:
```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person_udf.js gs://$LANDING_PROJECT_ID-eu-data-schema
```
If you want to check the files copied to GCS, you can use the transformation service account:
```bash
gsutil -i sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com ls gs://$LANDING_PROJECT_ID-eu-raw-data
gsutil -i sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com ls gs://$LANDING_PROJECT_ID-eu-data-schema
```
## Dataflow
These steps should be performed as the transformation service account.
Let's then start a Dataflow batch pipeline from a Google-provided template, using internal IPs only, the network and subnetwork created earlier, the appropriate service account, and the required parameters:
```bash
gcloud --impersonate-service-account=sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com dataflow jobs run test_batch_01 \
--gcs-location gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
--project $TRANSFORMATION_PROJECT_ID \
--region europe-west3 \
--disable-public-ips \
--network transformation-vpc \
--subnetwork regions/europe-west3/subnetworks/transformation-subnet \
--staging-location gs://$TRANSFORMATION_PROJECT_ID-eu-temp \
--service-account-email sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com \
--parameters \
javascriptTextTransformFunctionName=transform,\
JSONPath=gs://$LANDING_PROJECT_ID-eu-data-schema/person_schema.json,\
javascriptTextTransformGcsPath=gs://$LANDING_PROJECT_ID-eu-data-schema/person_udf.js,\
inputFilePattern=gs://$LANDING_PROJECT_ID-eu-raw-data/person.csv,\
outputTable=$DWH_PROJECT_ID:bq_raw_dataset.person,\
bigQueryLoadingTemporaryDirectory=gs://$TRANSFORMATION_PROJECT_ID-eu-temp
```

View File

@ -1,75 +0,0 @@
# Manual pipeline Example: PubSub to Bigquery
In this example we will publish person messages in the following format:
```txt
name: Name
surname: Surname
timestamp: 1617898199
```
A Dataflow pipeline will read those messages and import them into a BigQuery table in the DWH project.
An authorized view will be created in the datamart project to expose the table.
[TODO] Further automation is expected in the future.
## Set up the env vars
```bash
export DWH_PROJECT_ID=**dwh_project_id**
export LANDING_PROJECT_ID=**landing_project_id**
export TRANSFORMATION_PROJECT_ID=**transformation_project_id**
```
## Create BQ table
These steps should be performed as the DWH service account.
Run the following command to create a table:
```bash
gcloud --impersonate-service-account=sa-datawh@$DWH_PROJECT_ID.iam.gserviceaccount.com \
alpha bq tables create person \
--project=$DWH_PROJECT_ID --dataset=bq_raw_dataset \
--description "This is a Test Person table" \
--schema name=STRING,surname=STRING,timestamp=TIMESTAMP
```
## Produce PubSub messages
These steps should be performed as the landing service account.
Let's now create a series of messages we can use to import:
```bash
for i in {0..10}
do
gcloud --impersonate-service-account=sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com pubsub topics publish projects/$LANDING_PROJECT_ID/topics/landing-1 --message="{\"name\": \"Lorenzo\", \"surname\": \"Caggioni\", \"timestamp\": \"$(date +%s)\"}"
done
```
If you want to check the messages published, you can use the transformation service account and read a message (the message won't be acked and will stay in the subscription):
```bash
gcloud --impersonate-service-account=sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com pubsub subscriptions pull projects/$LANDING_PROJECT_ID/subscriptions/sub1
```
## Dataflow
These steps should be performed as the transformation service account.
Let's then start a Dataflow streaming pipeline from a Google-provided template, using internal IPs only, the network and subnetwork created earlier, the appropriate service account, and the required parameters:
```bash
gcloud --impersonate-service-account=sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com dataflow jobs run test_streaming01 \
--gcs-location gs://dataflow-templates/latest/PubSub_Subscription_to_BigQuery \
--project $TRANSFORMATION_PROJECT_ID \
--region europe-west3 \
--disable-public-ips \
--network transformation-vpc \
--subnetwork regions/europe-west3/subnetworks/transformation-subnet \
--staging-location gs://$TRANSFORMATION_PROJECT_ID-eu-temp \
--service-account-email sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com \
--parameters \
inputSubscription=projects/$LANDING_PROJECT_ID/subscriptions/sub1,\
outputTableSpec=$DWH_PROJECT_ID:bq_raw_dataset.person
```

View File

@ -1,26 +0,0 @@
{
"schema": {
"fields": [
{
"mode": "NULLABLE",
"name": "name",
"type": "STRING"
},
{
"mode": "NULLABLE",
"name": "surname",
"type": "STRING"
},
{
"mode": "NULLABLE",
"name": "age",
"type": "INTEGER"
},
{
"mode": "NULLABLE",
"name": "boolean_val",
"type": "BOOLEAN"
}
]
}
}

View File

@ -1,61 +1,251 @@
# Data Foundation Platform
# Data Platform
The goal of this example is to build a robust and flexible Data Foundation on GCP, providing opinionated defaults while still allowing customers to quickly and reliably build and scale out additional data pipelines.
This module implements an opinionated Data Platform (DP) architecture that creates and sets up projects and related resources that compose an end-to-end data environment.
The example is composed of three separate provisioning workflows, which are designed to be plugged together to create end-to-end Data Foundations that support multiple data pipelines on top.
The code is intentionally simple, as it's intended to provide a generic initial setup and then allow easy customizations to complete the implementation of the intended design.
1. **[Environment Setup](./01-environment/)**
*(once per environment)*
* projects
* VPC configuration
* Composer environment and identity
* shared buckets and datasets
1. **[Data Source Setup](./02-resources)**
*(once per data source)*
* landing and archive bucket
* internal and external identities
* domain specific datasets
1. **[Pipeline Setup](./03-pipeline)**
*(once per pipeline)*
* pipeline-specific tables and views
* pipeline code
* Composer DAG
The following diagram is a high-level reference of the resources created and managed here:
The resulting GCP architecture is outlined in this diagram
![Target architecture](./02-resources/diagram.png)
![Data Platform architecture overview](./images/overview_diagram.png "Data Platform architecture overview")
A demo pipeline is also part of this example: it can be built and run on top of the foundational infrastructure to quickly verify or test the setup.
A demo pipeline is also part of this example: it can be built and run on top of the foundational infrastructure to verify or test the setup quickly.
## Prerequisites
## Design overview and choices
In order to bring up this example, you will need
Despite its simplicity, this stage implements the basics of a design that we've seen working well for various customers.
The approach adapts to different high-level requirements:
- boundaries for each step
- clearly defined actors
- least privilege principle
- reliance on service account impersonation
The code in this example doesn't address Organization-level configurations (Organization policy, VPC-SC, centralized logs). We expect those to be managed by automation stages external to this script like those in [FAST](../../../fast).
### Project structure
The DP is designed to rely on several projects, one project per data stage. The stages identified are:
- landing
- load
- data lake
- orchestration
- transformation
- exposure
This separation into projects allows adhering to the least-privilege principle by using project-level roles.
The script will create the following projects:
- **Landing** Used to store temporary data. Data is pushed to Cloud Storage, BigQuery, or Cloud PubSub. Resources are configured with a customizable lifecycle policy.
- **Load** Used to load data from landing to data lake. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended.
- **Data Lake** Several projects distributed across 3 separate layers, to host progressively processed and refined data:
- **L0 - Raw data** Structured Data, stored in relevant formats: structured data stored in BigQuery, unstructured data stored on Cloud Storage with additional metadata stored in BigQuery (for example pictures stored in Cloud Storage and analysis of the images for Cloud Vision API stored in BigQuery).
- **L1 - Cleansed, aggregated and standardized data**
- **L2 - Curated layer**
- **Playground** Temporary tables that Data Analysts may use to perform R&D on data available in other Data Lake layers.
- **Orchestration** Used to host Cloud Composer, which orchestrates all tasks that move data across layers.
- **Transformation** Used to move data between Data Lake layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII.
- **Exposure** Used to host resources that share processed data with external systems. Depending on the access pattern, data can be presented via Cloud SQL, BigQuery, or Bigtable. For BigQuery data, we strongly suggest relying on [Authorized views](https://cloud.google.com/bigquery/docs/authorized-views).
### Roles
We assign roles on resources at the project level, granting the appropriate role via groups for humans and individual principals for service accounts, according to best practices.
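As a minimal sketch of this pattern (project, group, and role names are illustrative), the `project` module used in this example accepts additive IAM bindings keyed by role:

```hcl
# Illustrative only: bind a human group to a project-level role additively.
module "project-dtl-2" {
  source          = "../../../modules/project"
  name            = "dtl-2"
  parent          = "folders/123456789012"
  billing_account = "111111-222222-333333"
  iam_additive = {
    "roles/bigquery.dataViewer" = ["group:data-analysts@example.com"]
  }
}
```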
### Service accounts
Service account creation follows the least privilege principle: each service account performs a single task that requires access to a defined set of resources. For example, the Cloud Dataflow service account only has access to the landing project and the data lake L0 project.
Using service account keys within a data pipeline exposes you to several security risks deriving from a credential leak. This example shows how to leverage service account impersonation to avoid the need to create keys.
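A minimal sketch of how impersonation is enabled (names are illustrative): human groups are granted `roles/iam.serviceAccountTokenCreator` on the service account instead of receiving exported keys, following the same pattern used by the `iam-service-account` module calls in this example:

```hcl
# Illustrative only: let the data engineers group impersonate this service
# account; no service account key is ever created or downloaded.
module "load-sa" {
  source     = "../../../modules/iam-service-account"
  project_id = "my-load-project-id"
  name       = "load-df-0"
  iam = {
    "roles/iam.serviceAccountTokenCreator" = ["group:data-engineers@example.com"]
  }
}
```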
### Groups
We use three groups to control access to resources:
- *Data Engineers*. They handle and run the Data Hub, with read access to all resources in order to troubleshoot possible issues with pipelines. This team can also impersonate any service account.
- *Data Analysts*. They perform analysis on datasets, with read access to the Data Lake L2 project and BigQuery read/write access to the playground project.
- *Data Security*. They handle security configurations related to the Data Hub.
### Virtual Private Cloud (VPC) design
As is often the case in real-world configurations, this example accepts as input an existing [Shared-VPC](https://cloud.google.com/vpc/docs/shared-vpc) via the `network_config` variable.
If the `network_config` variable is not provided, one VPC will be created in each project that supports network resources (load, transformation and orchestration).
### IP ranges and subnetting
To deploy this example with self-managed VPCs you need the following ranges:
- one /24 for the load project VPC subnet used for Cloud Dataflow workers
- one /24 for the transformation VPC subnet used for Cloud Dataflow workers
- one /24 range for the orchestration VPC subnet used for Composer workers
- one /22 and one /24 range for the secondary ranges associated with the orchestration VPC subnet
If you are using Shared VPC, you need one subnet with one /22 and one /24 secondary range defined for Composer pods and services.
In both VPC scenarios, you also need these ranges for Composer:
- one /24 for Cloud SQL
- one /28 for the GKE control plane
- one /28 for the web server
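As a reference, these Composer ranges map onto the `composer_config` variable; its defaults (see the variables table below) encode one possible layout:

```hcl
composer_config = {
  ip_range_cloudsql   = "10.20.10.0/24"
  ip_range_gke_master = "10.20.11.0/28"
  ip_range_web_server = "10.20.11.16/28"
  policy_boolean      = null
  region              = "europe-west1"
  secondary_ip_range = {
    pods     = "10.10.8.0/22"
    services = "10.10.12.0/24"
  }
}
```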
### Resource naming conventions
Resources in the script use the following acronyms:
- `lnd` for `landing`
- `lod` for `load`
- `orc` for `orchestration`
- `trf` for `transformation`
- `dtl` for `Data Lake`
- `cmn` for `common`
- `plg` for `playground`
- a two-letter acronym for GCP products, for example `bq` for `BigQuery`, `df` for `Cloud Dataflow`, and so on
Resources follow the naming convention described below.
- `prefix-layer` for projects
- `prefix-layer[2]-gcp-product[2]-counter` for services and service accounts
### Encryption
We suggest a centralized approach to key management, where Security is the only team that can access encryption material, and keyrings and keys are managed in a project external to the DP.
![Centralized Cloud Key Management high-level diagram](./images/kms_diagram.png "Centralized Cloud Key Management high-level diagram")
To configure the use of Cloud KMS on resources, specify the key IDs in the `service_encryption_keys` variable. Key locations should match resource locations. Example:
```hcl
service_encryption_keys = {
bq = "KEY_URL_MULTIREGIONAL"
composer = "KEY_URL_REGIONAL"
dataflow = "KEY_URL_REGIONAL"
storage = "KEY_URL_MULTIREGIONAL"
pubsub = "KEY_URL_MULTIREGIONAL"
}
```
This step is optional and depends on customer policies and security best practices.
## Data Anonymization
We suggest using Cloud Data Loss Prevention to identify/mask/tokenize your confidential data.
While implementing a Data Loss Prevention strategy is out of scope for this example, we enable the service in two different projects so that [Cloud Data Loss Prevention templates](https://cloud.google.com/dlp/docs/concepts-templates) can be configured in one of two ways:
- during the ingestion phase, from Dataflow
- during the transformation phase, from [BigQuery](https://cloud.google.com/bigquery/docs/scan-with-dlp) or [Cloud Dataflow](https://cloud.google.com/architecture/running-automated-dataflow-pipeline-de-identify-pii-dataset)
Cloud Data Loss Prevention resources and templates should be stored in the security project:
![Centralized Cloud Data Loss Prevention high-level diagram](./images/dlp_diagram.png "Centralized Cloud Data Loss Prevention high-level diagram")
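As an illustrative sketch only (not part of this example's code), an inspection template could be hosted in the security project using the `google_data_loss_prevention_inspect_template` Terraform resource; the project name and info types below are assumptions:

```hcl
# Hypothetical sketch: a DLP inspection template hosted in the security project.
resource "google_data_loss_prevention_inspect_template" "pii" {
  parent       = "projects/my-security-project"
  display_name = "pii-inspect-template"
  description  = "Inspect template for common PII info types."
  inspect_config {
    info_types {
      name = "PERSON_NAME"
    }
    info_types {
      name = "EMAIL_ADDRESS"
    }
  }
}
```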
## How to run this script
To deploy this example on your GCP organization, you will need
- a folder or organization where new projects will be created
- a billing account that will be associated to new projects
- an identity (user or service account) with owner permissions on the folder or org, and billing user permissions on the billing account
- a billing account that will be associated with the new projects
## Bringing up the platform
The DP is meant to be executed by a Service Account (or a regular user) having this minimal set of permissions:
[![Open in Cloud Shell](https://gstatic.com/cloudssh/images/open-btn.svg)](https://ssh.cloud.google.com/cloudshell/editor?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fcloud-foundation-fabric.git&cloudshell_open_in_editor=README.md&cloudshell_workspace=examples%2Fdata-solutions%2Fdata-platform-foundations)
- **Org level**:
- `"compute.organizations.enableXpnResource"`
- `"compute.organizations.disableXpnResource"`
- `"compute.subnetworks.setIamPolicy"`
- **Folder level**:
- `"roles/logging.admin"`
- `"roles/owner"`
- `"roles/resourcemanager.folderAdmin"`
- `"roles/resourcemanager.projectCreator"`
- **Cloud Key Management Keys** (if Cloud Key Management keys are configured):
- `"roles/cloudkms.admin"` or Permissions: `cloudkms.cryptoKeys.getIamPolicy`, `cloudkms.cryptoKeys.list`, `cloudkms.cryptoKeys.setIamPolicy`
- **On the host project** for the Shared VPC/s
- `"roles/browser"`
- `"roles/compute.viewer"`
- `"roles/dns.admin"`
The end-to-end example is composed of two foundational steps and one optional step:
## Variable configuration
1. [Environment setup](./01-environment/)
1. [Data source setup](./02-resources/)
1. (Optional) [Pipeline setup](./03-pipeline/)
There are three sets of variables you will need to fill in:
The environment setup is designed to manage a single environment. Various strategies like workspaces, branching, or even separate clones can be used to support multiple environments.
```hcl
prefix = "myco"
project_create = {
parent = "folders/123456789012"
billing_account_id = "111111-222222-333333"
}
organization = {
domain = "domain.com"
}
```
## TODO
For finer details, check the variables in [`variables.tf`](./variables.tf) and update them according to the desired configuration.
| Description | Priority (1:High - 5:Low ) | Status | Remarks |
|-------------|----------|:------:|---------|
| DLP best practices in the pipeline | 2 | Not Started | |
| Add Composer with a static DAG running the example | 3 | Not Started | |
| Integrate [CI/CD composer data processing workflow framework](https://github.com/jaketf/ci-cd-for-data-processing-workflow) | 3 | Not Started | |
| Schema changes, how to handle | 4 | Not Started | |
| Data lineage | 4 | Not Started | |
| Data quality checks | 4 | Not Started | |
| Shared-VPC | 5 | Not Started | |
| Logging & monitoring | TBD | Not Started | |
| Orchestration for the ingestion pipeline (just in the readme) | TBD | Not Started | |
## Customizations
### Create Cloud Key Management keys as part of the DP
To create Cloud Key Management keys in the DP you can uncomment the Cloud Key Management resources configured in the [`06-common.tf`](./06-common.tf) file and update the Cloud Key Management key references in `local.service_encryption_keys.*` to point to the local resources created.
### Assign roles at BQ Dataset level
To handle multiple groups of `data-analysts` accessing the same Data Lake layer projects but only the datasets belonging to a specific group, you may want to assign roles at the BigQuery dataset level instead of at the project level.
To do this, you need to remove the project-level IAM binding for the `data-analysts` group and grant roles at the BigQuery dataset level using the `iam` variable on the `bigquery-dataset` modules.
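A minimal sketch of what this could look like, assuming the Data Lake datasets are created via the Fabric `bigquery-dataset` module; the module path, dataset id and group name are illustrative, so check the module's README for its full interface:

```hcl
module "datalake-l2-team-a" {
  source     = "../../../modules/bigquery-dataset"
  project_id = "myco-dtl-2"     # placeholder project id
  id         = "dtl_2_team_a"   # placeholder dataset id
  # grant dataset-level access to a single analysts group instead of
  # binding the role for all analysts at project level
  iam = {
    "roles/bigquery.dataViewer" = [
      "group:team-a-data-analysts@example.com"
    ]
  }
}
```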
## Demo pipeline
The application layer is out of scope of this script, but a Cloud Composer DAG is provided as a demo to move data from the `landing` area to the `DataLake L2` dataset.
Just follow the commands you find in the `demo_commands` Terraform output, go to the Cloud Composer Airflow UI, and run the `data_pipeline_dag`.
Description of commands:
- 01: copy sample data to the `landing` Cloud Storage bucket impersonating the `load` service account.
- 02: copy the sample data structure definition to the `orchestration` Cloud Storage bucket impersonating the `orchestration` service account.
- 03: copy the Cloud Composer DAG to the Cloud Composer Storage bucket impersonating the `orchestration` service account.
- 04: open the Cloud Composer Airflow UI and run the imported DAG.
- 05: run the BigQuery query to see the results.
<!-- BEGIN TFDOC -->
## Variables
| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [organization](variables.tf#L88) | Organization details. | <code title="object&#40;&#123;&#10; domain &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | ✓ | |
| [prefix](variables.tf#L95) | Unique prefix used for resource names. Not used for projects if 'project_create' is null. | <code>string</code> | ✓ | |
| [composer_config](variables.tf#L17) | | <code title="object&#40;&#123;&#10; ip_range_cloudsql &#61; string&#10; ip_range_gke_master &#61; string&#10; ip_range_web_server &#61; string&#10; policy_boolean &#61; map&#40;bool&#41;&#10; region &#61; string&#10; secondary_ip_range &#61; object&#40;&#123;&#10; pods &#61; string&#10; services &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; ip_range_cloudsql &#61; &#34;10.20.10.0&#47;24&#34;&#10; ip_range_gke_master &#61; &#34;10.20.11.0&#47;28&#34;&#10; ip_range_web_server &#61; &#34;10.20.11.16&#47;28&#34;&#10; policy_boolean &#61; null&#10; region &#61; &#34;europe-west1&#34;&#10; secondary_ip_range &#61; &#123;&#10; pods &#61; &#34;10.10.8.0&#47;22&#34;&#10; services &#61; &#34;10.10.12.0&#47;24&#34;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [data_force_destroy](variables.tf#L42) | Flag to set 'force_destroy' on data services like BigQuery or Cloud Storage. | <code>bool</code> | | <code>false</code> |
| [groups](variables.tf#L48) | Groups. | <code>map&#40;string&#41;</code> | | <code title="&#123;&#10; data-analysts &#61; &#34;gcp-data-analysts&#34;&#10; data-engineers &#61; &#34;gcp-data-engineers&#34;&#10; data-security &#61; &#34;gcp-data-security&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [location_config](variables.tf#L148) | Locations where resources will be deployed. Map to configure region and multiregion specs. | <code title="object&#40;&#123;&#10; region &#61; string&#10; multi_region &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; region &#61; &#34;europe-west1&#34;&#10; multi_region &#61; &#34;eu&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [network_config](variables.tf#L58) | Network configurations to use. Specify a shared VPC to use, if null networks will be created in projects. | <code title="object&#40;&#123;&#10; enable_cloud_nat &#61; bool&#10; host_project &#61; string&#10; network &#61; string&#10; vpc_subnet_range &#61; object&#40;&#123;&#10; load &#61; string&#10; transformation &#61; string&#10; orchestration &#61; string&#10; &#125;&#41;&#10; vpc_subnet_self_link &#61; object&#40;&#123;&#10; load &#61; string&#10; transformation &#61; string&#10; orchestration &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; enable_cloud_nat &#61; false&#10; host_project &#61; null&#10; network &#61; null&#10; vpc_subnet_range &#61; &#123;&#10; load &#61; &#34;10.10.0.0&#47;24&#34;&#10; transformation &#61; &#34;10.10.0.0&#47;24&#34;&#10; orchestration &#61; &#34;10.10.0.0&#47;24&#34;&#10; &#125;&#10; vpc_subnet_self_link &#61; null&#10;&#125;">&#123;&#8230;&#125;</code> |
| [project_create](variables.tf#L100) | Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format. | <code title="object&#40;&#123;&#10; billing_account_id &#61; string&#10; parent &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>null</code> |
| [project_id](variables.tf#L109) | Project id, references existing project if `project_create` is null. | <code title="object&#40;&#123;&#10; landing &#61; string&#10; load &#61; string&#10; orchestration &#61; string&#10; trasformation &#61; string&#10; datalake-l0 &#61; string&#10; datalake-l1 &#61; string&#10; datalake-l2 &#61; string&#10; datalake-playground &#61; string&#10; common &#61; string&#10; exposure &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; landing &#61; &#34;lnd&#34;&#10; load &#61; &#34;lod&#34;&#10; orchestration &#61; &#34;orc&#34;&#10; trasformation &#61; &#34;trf&#34;&#10; datalake-l0 &#61; &#34;dtl-0&#34;&#10; datalake-l1 &#61; &#34;dtl-1&#34;&#10; datalake-l2 &#61; &#34;dtl-2&#34;&#10; datalake-playground &#61; &#34;dtl-plg&#34;&#10; common &#61; &#34;cmn&#34;&#10; exposure &#61; &#34;exp&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [project_services](variables.tf#L137) | List of core services enabled on all projects. | <code>list&#40;string&#41;</code> | | <code title="&#91;&#10; &#34;cloudresourcemanager.googleapis.com&#34;,&#10; &#34;iam.googleapis.com&#34;,&#10; &#34;serviceusage.googleapis.com&#34;,&#10; &#34;stackdriver.googleapis.com&#34;&#10;&#93;">&#91;&#8230;&#93;</code> |
## Outputs
| name | description | sensitive |
|---|---|:---:|
| [bigquery-datasets](outputs.tf#L17) | BigQuery datasets. | |
| [demo_commands](outputs.tf#L93) | Demo commands. | |
| [gcs-buckets](outputs.tf#L28) | GCS buckets. | |
| [kms_keys](outputs.tf#L42) | Cloud KMS keys. | |
| [projects](outputs.tf#L47) | GCP projects information. | |
| [vpc_network](outputs.tf#L75) | VPC network. | |
| [vpc_subnet](outputs.tf#L84) | VPC subnetworks. | |
<!-- END TFDOC -->
## TODOs
Features to add in future releases:
- Add support for Column level access on BigQuery
- Add example templates for Data Catalog
- Add example on how to use Cloud Data Loss Prevention
- Add solution to handle Tables, Views, and Authorized Views lifecycle
- Add solution to handle Metadata lifecycle
## To Test/Fix
- Composer requires the "Require OS Login" org policy not to be enforced
- External Shared-VPC

View File

@ -1,251 +0,0 @@
# Data Platform
This module implements an opinionated Data Platform (DP) Architecture that creates and sets up projects and related resources that compose an end-to-end data environment.
The code is intentionally simple, as it's intended to provide a generic initial setup and then allow easy customizations to complete the implementation of the intended design.
The following diagram is a high-level reference of the resources created and managed here:
![Data Platform architecture overview](./images/overview_diagram.png "Data Platform architecture overview")
A demo pipeline is also part of this example: it can be built and run on top of the foundational infrastructure to verify or test the setup quickly.
## Design overview and choices
Despite its simplicity, this stage implements the basics of a design that we've seen working well for various customers.
The approach adapts to different high-level requirements:
- boundaries for each step
- clearly defined actors
- least privilege principle
- rely on service account impersonation
The code in this example doesn't address Organization-level configurations (Organization policy, VPC-SC, centralized logs). We expect those to be managed by automation stages external to this script like those in [FAST](../../../fast).
### Project structure
The DP is designed to rely on several projects, one project per data stage. The stages identified are:
- landing
- load
- data lake
- orchestration
- transformation
- exposure
This separation into projects allows adhering to the least-privilege principle by using project-level roles.
The script will create the following projects:
- **Landing** Used to store temporary data. Data is pushed to Cloud Storage, BigQuery, or Cloud PubSub. Resources are configured with a customizable lifecycle policy.
- **Load** Used to load data from landing to data lake. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended.
- **Data Lake** Several projects distributed across 3 separate layers, to host progressively processed and refined data:
- **L0 - Raw data** Data stored in the relevant format: structured data stored in BigQuery, unstructured data stored on Cloud Storage with additional metadata stored in BigQuery (for example, pictures stored in Cloud Storage and the corresponding Cloud Vision API analysis results stored in BigQuery).
- **L1 - Cleansed, aggregated and standardized data**
- **L2 - Curated layer**
- **Playground** Temporary tables that Data Analysts may use to perform R&D on data available in other Data Lake layers.
- **Orchestration** Used to host Cloud Composer, which orchestrates all tasks that move data across layers.
- **Transformation** Used to move data between Data Lake layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII.
- **Exposure** Used to host resources that share processed data with external systems. Depending on the access pattern, data can be presented via Cloud SQL, BigQuery, or Bigtable. For BigQuery data, we strongly suggest relying on [Authorized views](https://cloud.google.com/bigquery/docs/authorized-views).
### Roles
We assign roles on resources at the project level, granting the appropriate role via groups for humans and individual principals for service accounts, according to best practices.
### Service accounts
Service account creation follows the least privilege principle, with each service account performing a single task that requires access to a defined set of resources. For example, the Cloud Dataflow service account only has access to the landing project and the data lake L0 project.
Using service account keys within a data pipeline exposes you to several security risks stemming from credential leaks. This example shows how to leverage service account impersonation to avoid the need to create keys.
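As a minimal sketch of the approach, a Terraform run or pipeline can impersonate a service account instead of loading a key file, assuming the caller holds `roles/iam.serviceAccountTokenCreator` on it; the service account email below is a placeholder:

```hcl
provider "google" {
  # short-lived credentials are minted for this service account at runtime,
  # so no JSON key ever needs to be created or distributed
  impersonate_service_account = "lod-df-0@myco-lod.iam.gserviceaccount.com"
}
```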
### Groups
We use three groups to control access to resources:
- *Data Engineers* They handle and run the Data Hub, with read access to all resources in order to troubleshoot possible issues with pipelines. This team can also impersonate any service account.
- *Data Analysts*. They perform analysis on datasets, with read access to the data lake L2 project and BigQuery read/write access to the playground project.
- *Data Security*. They handle security configurations related to the Data Hub.
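Group names are not hardcoded: they can be remapped to your organization's naming scheme via the `groups` variable, whose documented default (see the Variables table below) is:

```hcl
groups = {
  data-analysts  = "gcp-data-analysts"
  data-engineers = "gcp-data-engineers"
  data-security  = "gcp-data-security"
}
```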
### Virtual Private Cloud (VPC) design
As is often the case in real-world configurations, this example accepts as input an existing [Shared-VPC](https://cloud.google.com/vpc/docs/shared-vpc) via the `network_config` variable.
If the `network_config` variable is not provided, one VPC will be created in each project that supports network resources (load, transformation and orchestration).
### IP ranges and subnetting
To deploy this example with self-managed VPCs you need the following ranges:
- one /24 for the load project VPC subnet used for Cloud Dataflow workers
- one /24 for the transformation VPC subnet used for Cloud Dataflow workers
- one /24 range for the orchestration VPC subnet used for Composer workers
- one /22 and one /24 ranges for the secondary ranges associated with the orchestration VPC subnet
If you are using Shared VPC, you need one subnet with one /22 and one /24 secondary range defined for Composer pods and services.
In both VPC scenarios, you also need these ranges for Composer:
- one /24 for Cloud SQL
- one /28 for the GKE control plane
- one /28 for the web server
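These ranges map onto the `composer_config` variable; its documented default (see the Variables table below), reproduced here for convenience, already satisfies the requirements above and can be adjusted to fit your own addressing plan:

```hcl
composer_config = {
  ip_range_cloudsql   = "10.20.10.0/24"  # Cloud SQL
  ip_range_gke_master = "10.20.11.0/28"  # GKE control plane
  ip_range_web_server = "10.20.11.16/28" # web server
  policy_boolean      = null
  region              = "europe-west1"
  secondary_ip_range = {
    pods     = "10.10.8.0/22"
    services = "10.10.12.0/24"
  }
}
```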
### Resource naming conventions
Resources in the script use the following acronyms:
- `lnd` for `landing`
- `lod` for `load`
- `orc` for `orchestration`
- `trf` for `transformation`
- `dtl` for `Data Lake`
- `cmn` for `common`
- `plg` for `playground`
- a two-letter acronym for GCP products, for example `bq` for `BigQuery`, `df` for `Cloud Dataflow`, ...
Resources follow the naming convention described below.
- `prefix-layer` for projects
- `prefix-layer[2]-gcp-product[2]-counter` for services and service accounts
### Encryption
We suggest a centralized approach to key management, where Security is the only team that can access encryption material, and keyrings and keys are managed in a project external to the DP.
![Centralized Cloud Key Management high-level diagram](./images/kms_diagram.png "Centralized Cloud Key Management high-level diagram")
To configure the use of Cloud KMS on resources, you have to specify the key id in the `service_encryption_keys` variable. Key locations should match resource locations. Example:

```hcl
service_encryption_keys = {
  bq       = "KEY_URL_MULTIREGIONAL"
  composer = "KEY_URL_REGIONAL"
  dataflow = "KEY_URL_REGIONAL"
  storage  = "KEY_URL_MULTIREGIONAL"
  pubsub   = "KEY_URL_MULTIREGIONAL"
}
```
This step is optional and depends on customer policies and security best practices.
## Data Anonymization
We suggest using Cloud Data Loss Prevention to identify/mask/tokenize your confidential data.
While implementing a Data Loss Prevention strategy is out of scope for this example, we enable the service in two different projects so that [Cloud Data Loss Prevention templates](https://cloud.google.com/dlp/docs/concepts-templates) can be configured in one of two ways:
- during the ingestion phase, from Dataflow
- during the transformation phase, from [BigQuery](https://cloud.google.com/bigquery/docs/scan-with-dlp) or [Cloud Dataflow](https://cloud.google.com/architecture/running-automated-dataflow-pipeline-de-identify-pii-dataset)
Cloud Data Loss Prevention resources and templates should be stored in the security project:
![Centralized Cloud Data Loss Prevention high-level diagram](./images/dlp_diagram.png "Centralized Cloud Data Loss Prevention high-level diagram")
## How to run this script
To deploy this example on your GCP organization, you will need
- a folder or organization where new projects will be created
- a billing account that will be associated with the new projects
The DP is meant to be executed by a Service Account (or a regular user) having this minimal set of permissions:
- **Org level**:
- `"compute.organizations.enableXpnResource"`
- `"compute.organizations.disableXpnResource"`
- `"compute.subnetworks.setIamPolicy"`
- **Folder level**:
- `"roles/logging.admin"`
- `"roles/owner"`
- `"roles/resourcemanager.folderAdmin"`
- `"roles/resourcemanager.projectCreator"`
- **Cloud Key Management Keys** (if Cloud Key Management keys are configured):
- `"roles/cloudkms.admin"` or Permissions: `cloudkms.cryptoKeys.getIamPolicy`, `cloudkms.cryptoKeys.list`, `cloudkms.cryptoKeys.setIamPolicy`
- **On the host project** for the Shared VPC/s
- `"roles/browser"`
- `"roles/compute.viewer"`
- `"roles/dns.admin"`
## Variable configuration
There are three sets of variables you will need to fill in:
```hcl
prefix = "myco"
project_create = {
parent = "folders/123456789012"
billing_account_id = "111111-222222-333333"
}
organization = {
domain = "domain.com"
}
```
For more details, check the variables in [`variables.tf`](./variables.tf) and update them according to the desired configuration.
## Customizations
### Create Cloud Key Management keys as part of the DP
To create Cloud Key Management keys in the DP you can uncomment the Cloud Key Management resources configured in the [`06-common.tf`](./06-common.tf) file and update the Cloud Key Management key references in `local.service_encryption_keys.*` to point to the local resources created.
### Assign roles at BQ Dataset level
To handle multiple groups of `data-analysts` accessing the same Data Lake layer projects but only the datasets belonging to a specific group, you may want to assign roles at the BigQuery dataset level instead of at the project level.
To do this, you need to remove the project-level IAM binding for the `data-analysts` group and grant roles at the BigQuery dataset level using the `iam` variable on the `bigquery-dataset` modules.
## Demo pipeline
The application layer is out of scope of this script, but a Cloud Composer DAG is provided as a demo to move data from the `landing` area to the `DataLake L2` dataset.
Just follow the commands you find in the `demo_commands` Terraform output, go to the Cloud Composer Airflow UI, and run the `data_pipeline_dag`.
Description of commands:
- 01: copy sample data to the `landing` Cloud Storage bucket impersonating the `load` service account.
- 02: copy the sample data structure definition to the `orchestration` Cloud Storage bucket impersonating the `orchestration` service account.
- 03: copy the Cloud Composer DAG to the Cloud Composer Storage bucket impersonating the `orchestration` service account.
- 04: open the Cloud Composer Airflow UI and run the imported DAG.
- 05: run the BigQuery query to see the results.
<!-- BEGIN TFDOC -->
## Variables
| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [organization](variables.tf#L88) | Organization details. | <code title="object&#40;&#123;&#10; domain &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | ✓ | |
| [prefix](variables.tf#L95) | Unique prefix used for resource names. Not used for projects if 'project_create' is null. | <code>string</code> | ✓ | |
| [composer_config](variables.tf#L17) | | <code title="object&#40;&#123;&#10; ip_range_cloudsql &#61; string&#10; ip_range_gke_master &#61; string&#10; ip_range_web_server &#61; string&#10; policy_boolean &#61; map&#40;bool&#41;&#10; region &#61; string&#10; secondary_ip_range &#61; object&#40;&#123;&#10; pods &#61; string&#10; services &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; ip_range_cloudsql &#61; &#34;10.20.10.0&#47;24&#34;&#10; ip_range_gke_master &#61; &#34;10.20.11.0&#47;28&#34;&#10; ip_range_web_server &#61; &#34;10.20.11.16&#47;28&#34;&#10; policy_boolean &#61; null&#10; region &#61; &#34;europe-west1&#34;&#10; secondary_ip_range &#61; &#123;&#10; pods &#61; &#34;10.10.8.0&#47;22&#34;&#10; services &#61; &#34;10.10.12.0&#47;24&#34;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [data_force_destroy](variables.tf#L42) | Flag to set 'force_destroy' on data services like BigQuery or Cloud Storage. | <code>bool</code> | | <code>false</code> |
| [groups](variables.tf#L48) | Groups. | <code>map&#40;string&#41;</code> | | <code title="&#123;&#10; data-analysts &#61; &#34;gcp-data-analysts&#34;&#10; data-engineers &#61; &#34;gcp-data-engineers&#34;&#10; data-security &#61; &#34;gcp-data-security&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [location_config](variables.tf#L148) | Locations where resources will be deployed. Map to configure region and multiregion specs. | <code title="object&#40;&#123;&#10; region &#61; string&#10; multi_region &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; region &#61; &#34;europe-west1&#34;&#10; multi_region &#61; &#34;eu&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [network_config](variables.tf#L58) | Network configurations to use. Specify a shared VPC to use, if null networks will be created in projects. | <code title="object&#40;&#123;&#10; enable_cloud_nat &#61; bool&#10; host_project &#61; string&#10; network &#61; string&#10; vpc_subnet_range &#61; object&#40;&#123;&#10; load &#61; string&#10; transformation &#61; string&#10; orchestration &#61; string&#10; &#125;&#41;&#10; vpc_subnet_self_link &#61; object&#40;&#123;&#10; load &#61; string&#10; transformation &#61; string&#10; orchestration &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; enable_cloud_nat &#61; false&#10; host_project &#61; null&#10; network &#61; null&#10; vpc_subnet_range &#61; &#123;&#10; load &#61; &#34;10.10.0.0&#47;24&#34;&#10; transformation &#61; &#34;10.10.0.0&#47;24&#34;&#10; orchestration &#61; &#34;10.10.0.0&#47;24&#34;&#10; &#125;&#10; vpc_subnet_self_link &#61; null&#10;&#125;">&#123;&#8230;&#125;</code> |
| [project_create](variables.tf#L100) | Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format. | <code title="object&#40;&#123;&#10; billing_account_id &#61; string&#10; parent &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>null</code> |
| [project_id](variables.tf#L109) | Project id, references existing project if `project_create` is null. | <code title="object&#40;&#123;&#10; landing &#61; string&#10; load &#61; string&#10; orchestration &#61; string&#10; trasformation &#61; string&#10; datalake-l0 &#61; string&#10; datalake-l1 &#61; string&#10; datalake-l2 &#61; string&#10; datalake-playground &#61; string&#10; common &#61; string&#10; exposure &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; landing &#61; &#34;lnd&#34;&#10; load &#61; &#34;lod&#34;&#10; orchestration &#61; &#34;orc&#34;&#10; trasformation &#61; &#34;trf&#34;&#10; datalake-l0 &#61; &#34;dtl-0&#34;&#10; datalake-l1 &#61; &#34;dtl-1&#34;&#10; datalake-l2 &#61; &#34;dtl-2&#34;&#10; datalake-playground &#61; &#34;dtl-plg&#34;&#10; common &#61; &#34;cmn&#34;&#10; exposure &#61; &#34;exp&#34;&#10;&#125;">&#123;&#8230;&#125;</code> |
| [project_services](variables.tf#L137) | List of core services enabled on all projects. | <code>list&#40;string&#41;</code> | | <code title="&#91;&#10; &#34;cloudresourcemanager.googleapis.com&#34;,&#10; &#34;iam.googleapis.com&#34;,&#10; &#34;serviceusage.googleapis.com&#34;,&#10; &#34;stackdriver.googleapis.com&#34;&#10;&#93;">&#91;&#8230;&#93;</code> |
## Outputs
| name | description | sensitive |
|---|---|:---:|
| [bigquery-datasets](outputs.tf#L17) | BigQuery datasets. | |
| [demo_commands](outputs.tf#L93) | Demo commands. | |
| [gcs-buckets](outputs.tf#L28) | GCS buckets. | |
| [kms_keys](outputs.tf#L42) | Cloud KMS keys. | |
| [projects](outputs.tf#L47) | GCP projects information. | |
| [vpc_network](outputs.tf#L75) | VPC network. | |
| [vpc_subnet](outputs.tf#L84) | VPC subnetworks. | |
<!-- END TFDOC -->
## TODOs
Features to add in future releases:
- Add support for Column level access on BigQuery
- Add example templates for Data Catalog
- Add example on how to use Cloud Data Loss Prevention
- Add solution to handle Tables, Views, and Authorized Views lifecycle
- Add solution to handle Metadata lifecycle
## To Test/Fix
- Composer requires the "Require OS Login" org policy not to be enforced
- External Shared-VPC