Bugfixing Data Foundations (#310)

Bugfixing Data Foundations and impersonation support:

- Fixed SA permissions
- Use impersonation to avoid SA private key export
- Fixed required API enablement
- Added firewall rules required by Dataflow
- Added provider for SA impersonation

Parent: 8b69638f89
Commit: 15b2736a7c
@ -25,10 +25,12 @@ To create the infrastructure:

```tfm
billing_account = "1234-1234-1234"
parent          = "folders/12345678"
admins          = ["user:xxxxx@yyyyy.com"]
```

- make sure you have the right authentication setup (application default credentials, or a service account key) with the right permissions
- **The output of this stage contains the values for the resources stage**
- the `admins` variable contains a list of principals allowed to impersonate the service accounts. These principals will be given the `iam.serviceAccountTokenCreator` role
- run `terraform init` and `terraform apply`

Once done testing, you can clean up resources by running `terraform destroy`.
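As a concrete sketch, a `terraform.tfvars` granting impersonation to two principals might look like this (the identities below are placeholders, not values from a real deployment):

```tfm
billing_account = "1234-1234-1234"
parent          = "folders/12345678"
# These principals receive roles/iam.serviceAccountTokenCreator on the service accounts
admins = [
  "user:alice@example.com",
  "group:data-admins@example.com",
]
```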
@ -57,6 +59,9 @@ The script use 'google_access_context_manager_service_perimeter_resource' terraf

| *service_account_names* | Override this variable if you need non-standard names. | <code title="object({ main = string })">object({...})</code> | | <code title="{ main = "data-platform-main" }">...</code> |
| *service_encryption_key_ids* | Cloud KMS encryption key in {LOCATION => [KEY_URL]} format. Keys belong to an existing project. | <code title="object({ multiregional = string global = string })">object({...})</code> | | <code title="{ multiregional = null global = null }">...</code> |
| *service_perimeter_standard* | VPC Service Controls standard perimeter name in the form 'accessPolicies/ACCESS_POLICY_NAME/servicePerimeters/PERIMETER_NAME'. All projects will be added to the perimeter in enforced mode. | <code title="">string</code> | | <code title="">null</code> |
| *admins* | List of users allowed to impersonate the service account. | <code title="">list</code> | | <code title="">null</code> |
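As a sketch of how the optional variables above fit together in a `terraform.tfvars` (the KMS key path and names are placeholder assumptions, not defaults from a real deployment):

```tfm
service_account_names = {
  main = "data-platform-main"
}
service_encryption_key_ids = {
  multiregional = "projects/KMS_PROJECT/locations/eu/keyRings/KEYRING/cryptoKeys/KEY"
  global        = null
}
admins = ["user:alice@example.com"]
```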
## Outputs
@ -31,8 +31,9 @@ module "project-datamart" {

    "storage.googleapis.com",
    "storage-component.googleapis.com",
  ]
  iam_additive = {
    "roles/owner" = [module.sa-services-main.iam_email]
  }
  service_encryption_key_ids = {
    bq = [var.service_encryption_key_ids.multiregional]
@ -56,8 +57,8 @@ module "project-dwh" {

    "storage.googleapis.com",
    "storage-component.googleapis.com",
  ]
  iam_additive = {
    "roles/owner" = [module.sa-services-main.iam_email]
  }
  service_encryption_key_ids = {
    bq = [var.service_encryption_key_ids.multiregional]
@ -79,8 +80,8 @@ module "project-landing" {

    "storage.googleapis.com",
    "storage-component.googleapis.com",
  ]
  iam_additive = {
    "roles/owner" = [module.sa-services-main.iam_email]
  }
  service_encryption_key_ids = {
    pubsub = [var.service_encryption_key_ids.global]
@ -98,6 +99,10 @@ module "project-services" {

  prefix = var.prefix
  name   = var.project_names.services
  services = [
    "bigquery.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "iam.googleapis.com",
    "pubsub.googleapis.com",
    "storage.googleapis.com",
    "storage-component.googleapis.com",
    "sourcerepo.googleapis.com",
@ -105,8 +110,8 @@ module "project-services" {

    "cloudasset.googleapis.com",
    "cloudkms.googleapis.com"
  ]
  iam_additive = {
    "roles/owner" = [module.sa-services-main.iam_email]
  }
  service_encryption_key_ids = {
    storage = [var.service_encryption_key_ids.multiregional]
@ -123,6 +128,7 @@ module "project-transformation" {

  prefix = var.prefix
  name   = var.project_names.transformation
  services = [
    "bigquery.googleapis.com",
    "cloudbuild.googleapis.com",
    "compute.googleapis.com",
    "dataflow.googleapis.com",
@ -130,8 +136,8 @@ module "project-transformation" {

    "storage.googleapis.com",
    "storage-component.googleapis.com",
  ]
  iam_additive = {
    "roles/owner" = [module.sa-services-main.iam_email]
  }
  service_encryption_key_ids = {
    compute = [var.service_encryption_key_ids.global]
@ -151,4 +157,6 @@ module "sa-services-main" {

  source     = "../../../modules/iam-service-account"
  project_id = module.project-services.project_id
  name       = var.service_account_names.main
  iam        = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}
@ -12,6 +12,12 @@

# See the License for the specific language governing permissions and
# limitations under the License.

variable "admins" {
  description = "List of users allowed to impersonate the service account."
  type        = list(string)
  default     = null
}

variable "billing_account_id" {
  description = "Billing account id."
  type        = string
@ -26,7 +26,7 @@ In the previous step, we created the environment (projects and service account)

To create the resources, copy the output of the environment step (**project_ids**) and paste it into `terraform.tfvars`:

- Specify your variables in a `terraform.tfvars`; you can use the output from the environment stage

```tfm
project_ids = {
@ -38,15 +38,14 @@ project_ids = {

}
```

- The providers.tf file has been configured to impersonate the **main** service account

- To launch terraform:

```bash
terraform plan
terraform apply
```

Once done testing, you can clean up resources by running `terraform destroy`.
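For reference, the impersonation wiring in `providers.tf` amounts to a provider block along these lines (a sketch based on this example's `data-platform-main` default service account name):

```tfm
provider "google" {
  impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}
```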
### CMEK configuration
@ -68,6 +67,8 @@ You can configure GCP resources to use existing CMEK keys configuring the 'servi

| *transformation_buckets* | List of transformation buckets to create. | <code title="map(object({ location = string name = string }))">map(object({...}))</code> | | <code title="{ temp = { location = "EU" name = "temp" }, templates = { location = "EU" name = "templates" }, }">...</code> |
| *transformation_subnets* | List of subnets to create in the transformation project. | <code title="list(object({ ip_cidr_range = string name = string region = string secondary_ip_range = map(string) }))">list(object({...}))</code> | | <code title="[ { ip_cidr_range = "10.1.0.0/20" name = "transformation-subnet" region = "europe-west3" secondary_ip_range = {} }, ]">...</code> |
| *transformation_vpc_name* | Name of the VPC created in the transformation project. | <code title="">string</code> | | <code title="">transformation-vpc</code> |
| *admins* | List of users allowed to impersonate the service account. | <code title="">list</code> | | <code title="">null</code> |

## Outputs
@ -25,12 +25,18 @@ module "datamart-sa" {

  iam_project_roles = {
    "${var.project_ids.datamart}" = ["roles/editor"]
  }
  iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

module "dwh-sa" {
  source     = "../../../modules/iam-service-account"
  project_id = var.project_ids.dwh
  name       = var.service_account_names.dwh

  iam_project_roles = {
    "${var.project_ids.dwh}" = ["roles/bigquery.admin"]
  }
  iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

module "landing-sa" {
@ -38,8 +44,11 @@ module "landing-sa" {

  project_id = var.project_ids.landing
  name       = var.service_account_names.landing
  iam_project_roles = {
    "${var.project_ids.landing}" = [
      "roles/pubsub.publisher",
      "roles/storage.objectCreator",
    ]
  }
  iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

module "services-sa" {
@ -49,6 +58,7 @@ module "services-sa" {

  iam_project_roles = {
    "${var.project_ids.services}" = ["roles/editor"]
  }
  iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

module "transformation-sa" {
@ -66,8 +76,17 @@ module "transformation-sa" {

      "roles/dataflow.worker",
      "roles/bigquery.metadataViewer",
      "roles/storage.objectViewer",
    ],
    "${var.project_ids.landing}" = [
      "roles/storage.objectViewer",
    ],
    "${var.project_ids.dwh}" = [
      "roles/bigquery.dataOwner",
      "roles/bigquery.jobUser",
      "roles/bigquery.metadataViewer",
    ]
  }
  iam = var.admins != null ? { "roles/iam.serviceAccountTokenCreator" = var.admins } : {}
}

###############################################################################
@ -147,6 +166,31 @@ module "vpc-transformation" {

  subnets = var.transformation_subnets
}

module "firewall" {
  source               = "../../../modules/net-vpc-firewall"
  project_id           = var.project_ids.transformation
  network              = module.vpc-transformation.name
  admin_ranges_enabled = false
  admin_ranges         = [""]
  http_source_ranges   = []
  https_source_ranges  = []
  ssh_source_ranges    = []

  custom_rules = {
    iap-svc = {
      description          = "Dataflow service."
      direction            = "INGRESS"
      action               = "allow"
      sources              = ["dataflow"]
      targets              = ["dataflow"]
      ranges               = []
      use_service_accounts = false
      rules                = [{ protocol = "tcp", ports = ["12345-12346"] }]
      extra_attributes     = {}
    }
  }
}

###############################################################################
# Pub/Sub                                                                     #
###############################################################################
@ -0,0 +1,20 @@

# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

provider "google" {
  impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}

provider "google-beta" {
  impersonate_service_account = "data-platform-main@${var.project_ids.services}.iam.gserviceaccount.com"
}
@ -12,6 +12,13 @@

# See the License for the specific language governing permissions and
# limitations under the License.

variable "admins" {
  description = "List of users allowed to impersonate the service account."
  type        = list(string)
  default     = null
}

variable "datamart_bq_datasets" {
  description = "Datamart BigQuery datasets"
  type = map(object({
@ -3,43 +3,38 @@

In this example we will publish person messages in the following format:

```bash
name,surname,1617898199
```
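A minimal sketch of producing one record in this format from the shell (the field values below are placeholders):

```shell
# Build a single person record: name,surname,unix-timestamp
name="Name"
surname="Surname"
ts=$(date +%s)
record="${name},${surname},${ts}"
echo "$record"
```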
A Dataflow pipeline will read those messages and import them into a BigQuery table in the DWH project.

[TODO] An authorized view will be created in the datamart project to expose the table.
[TODO] Further automation is expected in future.

## Set up the env vars

```bash
export DWH_PROJECT_ID=**dwh_project_id**
export LANDING_PROJECT_ID=**landing_project_id**
export TRANSFORMATION_PROJECT_ID=**transformation_project_id**
```

## Create BQ table

Those steps should be done as the DWH Service Account.

You can run the command to create a table:

```bash
gcloud --impersonate-service-account=sa-datawh@$DWH_PROJECT_ID.iam.gserviceaccount.com \
  alpha bq tables create person \
  --project=$DWH_PROJECT_ID --dataset=bq_raw_dataset \
  --description "This is a Test Person table" \
  --schema name=STRING,surname=STRING,timestamp=TIMESTAMP
```

## Produce CSV data file, JSON schema file and UDF JS file

Those steps should be done as the landing Service Account.

Let's now create a series of messages we can use to import:

```bash
@ -52,7 +47,7 @@ done

and copy files to the GCS bucket:

```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person.csv gs://$LANDING_PROJECT_ID-eu-raw-data
```

Let's create the data JSON schema:
@ -81,7 +76,8 @@ EOF

and copy files to the GCS bucket:

```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person_schema.json gs://$LANDING_PROJECT_ID-eu-data-schema
```

Let's create the data UDF function to transform message data:
@ -105,47 +101,40 @@ EOF

and copy files to the GCS bucket:

```bash
gsutil -i sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com cp person_udf.js gs://$LANDING_PROJECT_ID-eu-data-schema
```

If you want to check files copied to GCS, you can use the Transformation service account:

```bash
gsutil -i sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com ls gs://$LANDING_PROJECT_ID-eu-raw-data
gsutil -i sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com ls gs://$LANDING_PROJECT_ID-eu-data-schema
```

## Dataflow

Those steps should be done as the transformation Service Account.

Let's then start a Dataflow batch pipeline using a Google-provided template, using internal-only IPs, the created network and subnetwork, the appropriate service account, and the requested parameters:

```bash
gcloud --impersonate-service-account=sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com dataflow jobs run test_batch_01 \
  --gcs-location gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
  --project $TRANSFORMATION_PROJECT_ID \
  --region europe-west3 \
  --disable-public-ips \
  --network transformation-vpc \
  --subnetwork regions/europe-west3/subnetworks/transformation-subnet \
  --staging-location gs://$TRANSFORMATION_PROJECT_ID-eu-temp \
  --service-account-email sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com \
  --parameters \
javascriptTextTransformFunctionName=transform,\
JSONPath=gs://$LANDING_PROJECT_ID-eu-data-schema/person_schema.json,\
javascriptTextTransformGcsPath=gs://$LANDING_PROJECT_ID-eu-data-schema/person_udf.js,\
inputFilePattern=gs://$LANDING_PROJECT_ID-eu-raw-data/person.csv,\
outputTable=$DWH_PROJECT_ID:bq_raw_dataset.person,\
bigQueryLoadingTemporaryDirectory=gs://$TRANSFORMATION_PROJECT_ID-eu-temp
```
@ -3,8 +3,8 @@

In this example we will publish person messages in the following format:

```txt
name: Name
surname: Surname
timestamp: 1617898199
```
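The publish loop later in this README sends these fields as a JSON payload; a minimal sketch of building one such message in the shell (the field values below are placeholders):

```shell
# Assemble the JSON payload with a unix timestamp, as the pubsub publish command does
name="Name"
surname="Surname"
ts=$(date +%s)
msg="{\"name\": \"${name}\", \"surname\": \"${surname}\", \"timestamp\": \"${ts}\"}"
echo "$msg"
```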
@ -12,85 +12,64 @@ a Dataflow pipeline will read those messages and import them into a Bigquery tab

An authorized view will be created in the datamart project to expose the table.

[TODO] Further automation is expected in future.

## Set up the env vars

```bash
export DWH_PROJECT_ID=**dwh_project_id**
export LANDING_PROJECT_ID=**landing_project_id**
export TRANSFORMATION_PROJECT_ID=**transformation_project_id**
```

## Create BQ table

Those steps should be done as the DWH Service Account.

You can run the command to create a table:

```bash
gcloud --impersonate-service-account=sa-datawh@$DWH_PROJECT_ID.iam.gserviceaccount.com \
  alpha bq tables create person \
  --project=$DWH_PROJECT_ID --dataset=bq_raw_dataset \
  --description "This is a Test Person table" \
  --schema name=STRING,surname=STRING,timestamp=TIMESTAMP
```

## Produce PubSub messages

Those steps should be done as the landing Service Account.

Let's now create a series of messages we can use to import:

```bash
for i in {0..10}
do
  gcloud --impersonate-service-account=sa-landing@$LANDING_PROJECT_ID.iam.gserviceaccount.com pubsub topics publish projects/$LANDING_PROJECT_ID/topics/landing-1 --message="{\"name\": \"Lorenzo\", \"surname\": \"Caggioni\", \"timestamp\": \"$(date +%s)\"}"
done
```

If you want to check messages published, you can use the Transformation service account and read a message (the message won't be acked and will stay in the subscription):

```bash
gcloud --impersonate-service-account=sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com pubsub subscriptions pull projects/$LANDING_PROJECT_ID/subscriptions/sub1
```

## Dataflow

Those steps should be done as the transformation Service Account.

Let's then start a Dataflow streaming pipeline using a Google-provided template, using internal-only IPs, the created network and subnetwork, the appropriate service account, and the requested parameters:

```bash
gcloud dataflow jobs run test_streaming01 \
  --gcs-location gs://dataflow-templates/latest/PubSub_Subscription_to_BigQuery \
  --project $TRANSFORMATION_PROJECT_ID \
  --region europe-west3 \
  --disable-public-ips \
  --network transformation-vpc \
  --subnetwork regions/europe-west3/subnetworks/transformation-subnet \
  --staging-location gs://$TRANSFORMATION_PROJECT_ID-eu-temp \
  --service-account-email sa-transformation@$TRANSFORMATION_PROJECT_ID.iam.gserviceaccount.com \
  --parameters \
inputSubscription=projects/$LANDING_PROJECT_ID/subscriptions/sub1,\
outputTableSpec=$DWH_PROJECT_ID:bq_raw_dataset.person
```
@ -24,4 +24,4 @@ def test_resources(e2e_plan_runner):

    "Test that plan works and the number of resources is as expected."
    modules, resources = e2e_plan_runner(FIXTURES_DIR)
    assert len(modules) == 6
    assert len(resources) == 53