Merge branch 'master' into fast-dev
commit 9226024eb9
@ -7,6 +7,7 @@ All notable changes to this project will be documented in this file.

- new `net-glb` module for Global External Load Balancer
- new `project-factory` module in [`examples/factories`](./examples/factories)
- Module `project`: add missing Service Identity Accounts: artifactregistry, composer.
- new data-solutions example: Cloud Storage to BigQuery with Cloud Dataflow with least privileges

## [12.0.0] - 2022-01-11

@ -0,0 +1,151 @@

# Cloud Storage to BigQuery with Cloud Dataflow with least privileges

This example creates the infrastructure needed to run a [Cloud Dataflow](https://cloud.google.com/dataflow) pipeline to import data from [GCS](https://cloud.google.com/storage) to [BigQuery](https://cloud.google.com/bigquery). The example creates different service accounts with least privileges on resources. To run the pipeline, users listed in `data_eng_principals` can impersonate all those service accounts.

The solution will use:

- internal IPs for GCE and Cloud Dataflow instances
- Cloud NAT to let resources egress to the Internet, to run system updates and install packages
- [Service Account Impersonation](https://cloud.google.com/iam/docs/impersonating-service-accounts) to avoid the use of service account keys
- service accounts with least privilege on each resource
- (optional) CMEK encryption for the GCS bucket, Dataflow instances and BigQuery tables

The example is designed to match real-world use cases with a minimum amount of resources and some compromises listed below. It can be used as a starting point for more complex scenarios.

This is the high-level diagram:

![GCS to BigQuery high-level diagram](diagram.png "GCS to BigQuery high-level diagram")

## Moving to a real use case

To keep the example minimal and easy to read, we made some compromises. In a real-world use case, you may want to:

- configure a Shared VPC
- use only identity groups to assign roles
- use authoritative IAM role assignment
- split resources across different projects: data landing, data transformation, data lake, ...
- use VPC Service Controls to mitigate data exfiltration

## Managed resources and services

This sample creates several distinct groups of resources:

- projects
  - service project configured for GCS buckets, Dataflow instances, BigQuery tables and orchestration
- networking
  - VPC network
  - one subnet
  - firewall rules for [SSH access via IAP](https://cloud.google.com/iap/docs/using-tcp-forwarding) and open communication within the VPC
- IAM
  - one service account for uploading data into the GCS landing bucket
  - one service account for orchestration
  - one service account for Dataflow instances
  - one service account for BigQuery tables
- GCS
  - one bucket
- BigQuery
  - one dataset
  - one table. Tables are defined in Terraform for the purpose of the example; in a real scenario you would probably handle table creation in a separate Terraform state or with a different tool/pipeline (for example Dataform).

In this example you can also configure users or groups of users to assign them the viewer role on the created resources and the ability to impersonate service accounts, so they can test Dataflow pipelines before automating them with Composer or any other orchestration system.

## Deploy your environment

Run Terraform init:

```
$ terraform init
```

Configure the Terraform variables in your `terraform.tfvars` file. You need to specify at least the following variables:

```
project_id = "test-demo-tf-001"
prefix     = "prefix"
project_create = {
  billing_account_id = "0011322-334455-667788"
  parent             = "folders/123456789012"
}
data_eng_principals = ["user:your_email@domain.example"]
```

You can now run:

```
$ terraform apply
```

You should see the output of the Terraform script with the created resources and some pre-created commands for you to run in the following steps.

## Test your environment with Cloud Dataflow

We assume all the following steps are run as a user listed in `data_eng_principals`. You can authenticate as the user with:

```
$ gcloud init
$ gcloud auth application-default login
```

For the purpose of the example we will import from GCS to BigQuery a CSV file with the following structure:

```
name,surname,timestamp
```

We need to create three files:

- a `person.csv` file containing your data in the form `name,surname,timestamp`. An example line: `Lorenzo,Caggioni,1637771951`.
- a `person_udf.js` file containing the UDF JavaScript used by the Dataflow template.
- a `person_schema.json` file containing the table schema used to import the CSV.
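
For quick experimentation, the sample CSV can also be generated programmatically; a minimal Python sketch (the `write_person_csv` helper and the placeholder names are illustrative, not part of the example):

```python
import csv
import time


def write_person_csv(path, rows=3):
    # Write sample rows matching the name,surname,timestamp structure
    # expected by the pipeline; timestamps are epoch seconds.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for i in range(rows):
            writer.writerow(["Lorenzo", "Caggioni", int(time.time()) + i])


write_person_csv("person.csv")
```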

You can find examples of those files in the `./data-demo` folder. You can copy the example files into the GCS bucket with the command returned in the Terraform output as `command-01-gcs`. Below is an example:

```bash
gsutil -i gcs-landing@PROJECT.iam.gserviceaccount.com cp data-demo/* gs://LANDING_BUCKET
```

We can now run the Dataflow pipeline using the `gcloud` command returned in the Terraform output as `command-02-dataflow`. Below is an example:

```bash
gcloud --impersonate-service-account=orch-test@PROJECT.iam.gserviceaccount.com dataflow jobs run test_batch_01 \
    --gcs-location gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
    --project PROJECT \
    --region REGION \
    --disable-public-ips \
    --subnetwork https://www.googleapis.com/compute/v1/projects/PROJECT/regions/REGION/subnetworks/subnet \
    --staging-location gs://PROJECT-eu-df-tmplocation \
    --service-account-email df-test@PROJECT.iam.gserviceaccount.com \
    --parameters \
javascriptTextTransformFunctionName=transform,\
JSONPath=gs://PROJECT-eu-data/person_schema.json,\
javascriptTextTransformGcsPath=gs://PROJECT-eu-data/person_udf.js,\
inputFilePattern=gs://PROJECT-eu-data/person.csv,\
outputTable=PROJECT:bq_dataset.person,\
bigQueryLoadingTemporaryDirectory=gs://PROJECT-eu-df-tmplocation
```

You can check the data imported into BigQuery with the command returned in the Terraform output as `command-03-bq`. Below is an example:

```
bq query --use_legacy_sql=false 'SELECT * FROM `PROJECT.datalake.person` LIMIT 1000'
```

<!-- BEGIN TFDOC -->

## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| prefix | Unique prefix used for resource names. Not used for project if 'project_create' is null. | <code>string</code> | ✓ |  |
| project_id | Project id, references existing project if `project_create` is null. | <code>string</code> | ✓ |  |
| cmek_encryption | Flag to enable CMEK on GCP resources created. | <code>bool</code> |  | <code>false</code> |
| data_eng_principals | Groups with Service Account Token creator role on service accounts in IAM format, eg 'group:group@domain.com'. | <code>list(string)</code> |  | <code>[]</code> |
| project_create | Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format. | <code title="object({ billing_account_id = string parent = string })">object({…})</code> |  | <code>null</code> |
| region | The region where resources will be deployed. | <code>string</code> |  | <code>"europe-west1"</code> |
| vpc_subnet_range | IP range used for the VPC subnet created for the example. | <code>string</code> |  | <code>"10.0.0.0/20"</code> |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| bq_tables | BigQuery tables. |  |
| buckets | GCS bucket names. |  |
| command-01-gcs | gsutil command to copy data into the created bucket impersonating the service account. |  |
| command-02-dataflow | Command to run the Dataflow template impersonating the service account. |  |
| command-03-bq | BigQuery command to query imported data. |  |
| project_id | Project id. |  |
| serviceaccount | Service accounts. |  |

<!-- END TFDOC -->
@ -0,0 +1,30 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The `impersonate_service_account` option requires the identity running
# Terraform to have the `roles/iam.serviceAccountTokenCreator` role on the
# specified service account.

terraform {
  backend "gcs" {
    bucket                      = "BUCKET_NAME"
    prefix                      = "PREFIX"
    impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
  }
}

provider "google" {
  impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
}

provider "google-beta" {
  impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
}

@ -0,0 +1,11 @@

Lorenzo,Caggioni,1637771951
Lorenzo,Caggioni,1637771952
Lorenzo,Caggioni,1637771953
Lorenzo,Caggioni,1637771954
Lorenzo,Caggioni,1637771955
Lorenzo,Caggioni,1637771956
Lorenzo,Caggioni,1637771957
Lorenzo,Caggioni,1637771958
Lorenzo,Caggioni,1637771959
Lorenzo,Caggioni,1637771910
Lorenzo,Caggioni,1637771911

@ -0,0 +1,17 @@

[
  {
    "mode": "NULLABLE",
    "name": "name",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "surname",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "timestamp",
    "type": "TIMESTAMP"
  }
]

@ -0,0 +1,16 @@

{
  "BigQuery Schema": [
    {
      "name": "name",
      "type": "STRING"
    },
    {
      "name": "surname",
      "type": "STRING"
    },
    {
      "name": "timestamp",
      "type": "TIMESTAMP"
    }
  ]
}
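
A quick local check that a CSV row lines up with this schema (the `row_matches_schema` helper is illustrative; the schema is inlined from the file above):

```python
import json

# Content of person_schema.json, inlined for illustration.
SCHEMA = json.loads("""
{"BigQuery Schema": [
  {"name": "name", "type": "STRING"},
  {"name": "surname", "type": "STRING"},
  {"name": "timestamp", "type": "TIMESTAMP"}
]}
""")


def row_matches_schema(line, schema=SCHEMA):
    # A CSV row is well-formed when it has exactly one value per schema field.
    fields = schema["BigQuery Schema"]
    return len(line.split(",")) == len(fields)
```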
@ -0,0 +1,11 @@

/**
 * UDF used by the GCS_Text_to_BigQuery Dataflow template: maps a CSV line
 * to a JSON record matching the BigQuery table schema.
 */
function transform(line) {
  var values = line.split(',');

  var obj = new Object();
  obj.name = values[0];
  obj.surname = values[1];
  obj.timestamp = values[2];

  return JSON.stringify(obj);
}
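
As a sanity check, the UDF's mapping can be mirrored in Python and run locally (this `transform` mirror is illustrative, not part of the example):

```python
import json


def transform(line):
    # Python mirror of the JavaScript UDF: split a CSV line into
    # name, surname and timestamp, and emit a JSON record.
    name, surname, timestamp = line.split(",")
    return json.dumps({"name": name, "surname": surname, "timestamp": timestamp})


print(transform("Lorenzo,Caggioni,1637771951"))
```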
@ -0,0 +1,73 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

###############################################################################
#                                    GCS                                      #
###############################################################################

module "gcs-data" {
  source         = "../../../modules/gcs"
  project_id     = module.project.project_id
  prefix         = var.prefix
  name           = "data"
  location       = var.region
  storage_class  = "REGIONAL"
  encryption_key = var.cmek_encryption ? try(module.kms[0].keys.key-gcs.id, null) : null
  force_destroy  = true
}

module "gcs-df-tmp" {
  source         = "../../../modules/gcs"
  project_id     = module.project.project_id
  prefix         = var.prefix
  name           = "df-tmp"
  location       = var.region
  storage_class  = "REGIONAL"
  encryption_key = var.cmek_encryption ? try(module.kms[0].keys.key-gcs.id, null) : null
  force_destroy  = true
}

###############################################################################
#                                     BQ                                      #
###############################################################################

module "bigquery-dataset" {
  source     = "../../../modules/bigquery-dataset"
  project_id = module.project.project_id
  id         = "datalake"
  location   = var.region
  # Tables are defined in Terraform for the purpose of the example.
  # In a production environment you would probably handle table creation in a
  # separate Terraform state or with a different tool/pipeline (for example Dataform).
  tables = {
    person = {
      friendly_name = "Person. Dataflow import."
      labels        = {}
      partitioning = {
        field = null
        range = null # use start/end/interval for range
        time  = null
      }
      schema              = file("${path.module}/data-demo/person.json")
      deletion_protection = false
      options = {
        clustering      = null
        encryption_key  = var.cmek_encryption ? try(module.kms[0].keys.key-bq.id, null) : null
        expiration_time = null
      }
    }
  }
  encryption_key = var.cmek_encryption ? try(module.kms[0].keys.key-bq.id, null) : null
}
(binary image file, 39 KiB, not shown)
@ -0,0 +1,46 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

module "kms" {
  count      = var.cmek_encryption ? 1 : 0
  source     = "../../../modules/kms"
  project_id = module.project.project_id
  keyring = {
    name     = "${var.prefix}-keyring",
    location = var.region
  }
  keys = {
    key-df  = null
    key-gcs = null
    key-bq  = null
  }
  key_iam = {
    key-gcs = {
      "roles/cloudkms.cryptoKeyEncrypterDecrypter" = [
        "serviceAccount:${module.project.service_accounts.robots.storage}"
      ]
    },
    key-bq = {
      "roles/cloudkms.cryptoKeyEncrypterDecrypter" = [
        "serviceAccount:${module.project.service_accounts.robots.bq}"
      ]
    },
    key-df = {
      "roles/cloudkms.cryptoKeyEncrypterDecrypter" = [
        "serviceAccount:${module.project.service_accounts.robots.dataflow}",
        "serviceAccount:${module.project.service_accounts.robots.compute}",
      ]
    }
  }
}

@ -0,0 +1,110 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

locals {
  iam = {
    # GCS roles
    "roles/storage.objectAdmin" = [
      module.service-account-df.iam_email,
      module.service-account-landing.iam_email
    ]
    "roles/storage.objectViewer" = [
      module.service-account-orch.iam_email,
    ]
    # BigQuery roles
    "roles/bigquery.admin" = concat(
      [module.service-account-orch.iam_email],
      var.data_eng_principals
    )
    "roles/bigquery.dataEditor" = [
      module.service-account-df.iam_email,
      module.service-account-bq.iam_email
    ]
    "roles/bigquery.dataViewer" = [
      module.service-account-bq.iam_email,
      module.service-account-orch.iam_email
    ]
    "roles/bigquery.jobUser" = [
      module.service-account-df.iam_email,
      module.service-account-bq.iam_email
    ]
    "roles/bigquery.user" = [
      module.service-account-bq.iam_email,
      module.service-account-df.iam_email
    ]
    # common roles
    "roles/logging.logWriter" = [
      module.service-account-bq.iam_email,
      module.service-account-landing.iam_email,
      module.service-account-orch.iam_email,
    ]
    "roles/monitoring.metricWriter" = [
      module.service-account-bq.iam_email,
      module.service-account-landing.iam_email,
      module.service-account-orch.iam_email,
    ]
    "roles/iam.serviceAccountUser" = [
      module.service-account-orch.iam_email,
    ]
    "roles/iam.serviceAccountTokenCreator" = var.data_eng_principals
    "roles/viewer"                         = var.data_eng_principals
    # Dataflow roles
    "roles/dataflow.admin" = concat(
      [module.service-account-orch.iam_email],
      var.data_eng_principals
    )
    "roles/dataflow.worker" = [
      module.service-account-df.iam_email,
    ]
    # network roles
    "roles/compute.networkUser" = [
      module.service-account-df.iam_email,
      "serviceAccount:${module.project.service_accounts.robots.dataflow}"
    ]
  }
}

###############################################################################
#                                  Projects                                   #
###############################################################################

module "project" {
  source          = "../../../modules/project"
  name            = var.project_id
  parent          = try(var.project_create.parent, null)
  billing_account = try(var.project_create.billing_account_id, null)
  project_create  = var.project_create != null
  prefix          = var.project_create == null ? null : var.prefix
  services = [
    "bigquery.googleapis.com",
    "bigquerystorage.googleapis.com",
    "bigqueryreservation.googleapis.com",
    "cloudkms.googleapis.com",
    "compute.googleapis.com",
    "dataflow.googleapis.com",
    "servicenetworking.googleapis.com",
    "storage.googleapis.com",
    "storage-component.googleapis.com",
  ]
  # additive IAM bindings avoid disrupting bindings in existing projects
  iam          = var.project_create != null ? local.iam : {}
  iam_additive = var.project_create == null ? local.iam : {}
  service_config = {
    disable_on_destroy = false, disable_dependent_services = false
  }
}

@ -0,0 +1,75 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

output "bq_tables" {
  description = "BigQuery tables."
  value       = module.bigquery-dataset.table_ids
}

output "buckets" {
  description = "GCS bucket names."
  value = {
    data   = module.gcs-data.name
    df-tmp = module.gcs-df-tmp.name
  }
}

output "project_id" {
  description = "Project id."
  value       = module.project.project_id
}

output "serviceaccount" {
  description = "Service accounts."
  value = {
    bq      = module.service-account-bq.email
    df      = module.service-account-df.email
    orch    = module.service-account-orch.email
    landing = module.service-account-landing.email
  }
}

output "command-01-gcs" {
  description = "gsutil command to copy data into the created bucket impersonating the service account."
  value       = "gsutil -i ${module.service-account-landing.email} cp data-demo/* ${module.gcs-data.url}"
}

output "command-02-dataflow" {
  description = "Command to run the Dataflow template impersonating the service account."
  value       = <<EOT
gcloud --impersonate-service-account=${module.service-account-orch.email} dataflow jobs run test_batch_01 \
    --gcs-location gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
    --project ${module.project.project_id} \
    --region ${var.region} \
    --disable-public-ips \
    --subnetwork ${module.vpc.subnets[format("%s/%s", var.region, "subnet")].self_link} \
    --staging-location ${module.gcs-df-tmp.url} \
    --service-account-email ${module.service-account-df.email} \
    ${var.cmek_encryption ? format("--dataflow-kms-key=%s", module.kms[0].key_ids.key-df) : ""} \
    --parameters \
javascriptTextTransformFunctionName=transform,\
JSONPath=${module.gcs-data.url}/person_schema.json,\
javascriptTextTransformGcsPath=${module.gcs-data.url}/person_udf.js,\
inputFilePattern=${module.gcs-data.url}/person.csv,\
outputTable=${module.project.project_id}:${module.bigquery-dataset.dataset_id}.${module.bigquery-dataset.tables["person"].table_id},\
bigQueryLoadingTemporaryDirectory=${module.gcs-df-tmp.url}
EOT
}

output "command-03-bq" {
  description = "BigQuery command to query imported data."
  value       = <<EOT
bq query --project_id=${module.project.project_id} --use_legacy_sql=false 'SELECT * FROM `${module.project.project_id}.${module.bigquery-dataset.dataset_id}.${module.bigquery-dataset.tables["person"].table_id}` LIMIT 1000'
EOT
}

@ -0,0 +1,41 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

###############################################################################
#                              Service Accounts                               #
###############################################################################

module "service-account-bq" {
  source     = "../../../modules/iam-service-account"
  project_id = module.project.project_id
  name       = "bq-datalake"
}

module "service-account-landing" {
  source     = "../../../modules/iam-service-account"
  project_id = module.project.project_id
  name       = "gcs-landing"
}

module "service-account-orch" {
  source     = "../../../modules/iam-service-account"
  project_id = module.project.project_id
  name       = "orchestrator"
}

module "service-account-df" {
  source     = "../../../modules/iam-service-account"
  project_id = module.project.project_id
  name       = "df-loading"
}

@ -0,0 +1,3 @@

data_eng_principals = ["user:data-eng@domain.com"]
project_id          = "datalake-001"
prefix              = "prefix"

@ -0,0 +1,55 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

variable "cmek_encryption" {
  description = "Flag to enable CMEK on GCP resources created."
  type        = bool
  default     = false
}

variable "data_eng_principals" {
  description = "Groups with Service Account Token creator role on service accounts in IAM format, eg 'group:group@domain.com'."
  type        = list(string)
  default     = []
}

variable "prefix" {
  description = "Unique prefix used for resource names. Not used for project if 'project_create' is null."
  type        = string
}

variable "project_create" {
  description = "Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format."
  type = object({
    billing_account_id = string
    parent             = string
  })
  default = null
}

variable "project_id" {
  description = "Project id, references existing project if `project_create` is null."
  type        = string
}

variable "region" {
  description = "The region where resources will be deployed."
  type        = string
  default     = "europe-west1"
}

variable "vpc_subnet_range" {
  description = "IP range used for the VPC subnet created for the example."
  type        = string
  default     = "10.0.0.0/20"
}

@ -0,0 +1,29 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

terraform {
  required_version = ">= 1.0.0"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 4.0.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = ">= 4.0.0"
    }
  }
}

@ -0,0 +1,46 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

###############################################################################
#                                 Networking                                  #
###############################################################################

module "vpc" {
  source     = "../../../modules/net-vpc"
  project_id = module.project.project_id
  name       = "${var.prefix}-vpc"
  subnets = [
    {
      ip_cidr_range      = var.vpc_subnet_range
      name               = "subnet"
      region             = var.region
      secondary_ip_range = {}
    }
  ]
}

module "vpc-firewall" {
  source       = "../../../modules/net-vpc-firewall"
  project_id   = module.project.project_id
  network      = module.vpc.name
  admin_ranges = [var.vpc_subnet_range]
}

module "nat" {
  source         = "../../../modules/net-cloudnat"
  project_id     = module.project.project_id
  region         = var.region
  name           = "${var.prefix}-default"
  router_network = module.vpc.name
}

@ -0,0 +1,13 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

@ -0,0 +1,22 @@

/**
 * Copyright 2022 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

module "test" {
  source         = "../../../../../examples/data-solutions/gcs-to-bq-with-least-privileges/"
  project_create = var.project_create
  project_id     = var.project_id
  prefix         = var.prefix
}

@ -0,0 +1,39 @@

/**
 * Copyright 2022 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

variable "prefix" {
  description = "Unique prefix used for resource names. Not used for project if 'project_create' is null."
  type        = string
  default     = "prefix"
}

variable "project_create" {
  description = "Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format."
  type = object({
    billing_account_id = string
    parent             = string
  })
  default = {
    billing_account_id = "123456-123456-123456"
    parent             = "folders/12345678"
  }
}

variable "project_id" {
  description = "Project id, references existing project if `project_create` is null."
  type        = string
  default     = "datalake"
}

@ -0,0 +1,27 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


import os
import pytest


FIXTURES_DIR = os.path.join(os.path.dirname(__file__), 'fixture')


def test_resources(e2e_plan_runner):
  "Test that the plan works and the number of resources is as expected."
  modules, resources = e2e_plan_runner(FIXTURES_DIR)
  assert len(modules) == 11
  assert len(resources) == 43