cloud-foundation-fabric/blueprints/data-solutions

GCP Data Services blueprints

The blueprints in this folder implement typical data service topologies and end-to-end scenarios that allow testing specific features, such as Cloud KMS to encrypt your data or VPC-SC to mitigate data exfiltration.

They are meant to be used as minimal but complete starting points to create actual infrastructure, and as playgrounds to experiment with specific Google Cloud features.

Blueprints

Cloud SQL instance with multi-region read replicas

This blueprint creates a Cloud SQL instance with multi-region read replicas as described in the Cloud SQL for PostgreSQL disaster recovery article.


GCE and GCS CMEK via centralized Cloud KMS

This blueprint implements CMEK for GCS and GCE, via keys hosted in KMS running in a centralized project. The blueprint shows the basic resources and permissions for the typical use case of application projects implementing encryption at rest via a centrally managed KMS service.
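The pattern described above can be sketched with plain Google provider resources. This is a minimal illustration under stated assumptions, not the blueprint's actual code: the blueprint uses Fabric modules, while the variable names (`kms_project_id`, `service_project_id`, `location`, `bucket_name`) and the resource names here are hypothetical, and only the GCS side is shown.

```hcl
# Hypothetical inputs, named for illustration only.
variable "kms_project_id" { type = string }     # central project hosting KMS
variable "service_project_id" { type = string } # application project
variable "location" { type = string }           # must match for key ring and bucket
variable "bucket_name" { type = string }

# Key ring and key live in the centralized KMS project.
resource "google_kms_key_ring" "ring" {
  project  = var.kms_project_id
  name     = "example-keyring"
  location = var.location
}

resource "google_kms_crypto_key" "key" {
  name     = "example-key"
  key_ring = google_kms_key_ring.ring.id
}

# The GCS service agent of the application project needs
# encrypt/decrypt permission on the central key.
data "google_storage_project_service_account" "gcs" {
  project = var.service_project_id
}

resource "google_kms_crypto_key_iam_member" "gcs_encrypter" {
  crypto_key_id = google_kms_crypto_key.key.id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:${data.google_storage_project_service_account.gcs.email_address}"
}

# Bucket in the application project, encrypted at rest with the central key.
resource "google_storage_bucket" "bucket" {
  project  = var.service_project_id
  name     = var.bucket_name
  location = var.location

  encryption {
    default_kms_key_name = google_kms_crypto_key.key.id
  }

  # The IAM grant must exist before the bucket can use the key.
  depends_on = [google_kms_crypto_key_iam_member.gcs_encrypter]
}
```

Keeping the key in a separate project lets a central team control key lifecycle and access, while application projects only receive the single role needed to use the key.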


Cloud Composer version 2 private instance, supporting Shared VPC and external CMEK key

This blueprint creates a Cloud Composer version 2 instance on a VPC with a dedicated service account. The solution accepts a Shared VPC and Cloud KMS CMEK keys as inputs.


Data Platform Foundations

This blueprint implements a robust and flexible Data Foundation on GCP that provides opinionated defaults, allowing customers to build and scale out additional data pipelines quickly and reliably.


Data Playground starter with Cloud Vertex AI Notebook and GCS

This blueprint creates a Vertex AI Notebook running on a VPC with a private IP and a dedicated Service Account. A GCS bucket and a BigQuery dataset are created to store inputs and outputs of data experiments.


Cloud Storage to BigQuery with Cloud Dataflow with least privileges

This blueprint implements the resources required to run GCS to BigQuery Dataflow pipelines. The solution relies on a set of service accounts created following the least-privilege principle.
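As a rough illustration of the least-privilege idea, the sketch below gives a dedicated Dataflow worker service account only resource-level roles on one bucket and one dataset, instead of broad project-level storage or BigQuery roles. It is an assumption-laden example built on plain Google provider resources, not the blueprint's own configuration: the variable names, the account ID, and the exact set of roles are hypothetical, and a real pipeline would also need write access to a staging or temp location.

```hcl
# Hypothetical inputs, named for illustration only.
variable "project_id" { type = string }
variable "bucket_name" { type = string } # existing source bucket
variable "dataset_id" { type = string }  # existing destination dataset

# Dedicated identity for Dataflow workers.
resource "google_service_account" "df_worker" {
  project      = var.project_id
  account_id   = "df-worker-example"
  display_name = "Dataflow worker (example)"
}

# Project-level role required for the account to act as a Dataflow worker.
resource "google_project_iam_member" "df_worker" {
  project = var.project_id
  role    = "roles/dataflow.worker"
  member  = "serviceAccount:${google_service_account.df_worker.email}"
}

# Read-only access, scoped to the single source bucket.
resource "google_storage_bucket_iam_member" "df_read" {
  bucket = var.bucket_name
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${google_service_account.df_worker.email}"
}

# Write access, scoped to the single destination dataset.
resource "google_bigquery_dataset_iam_member" "df_write" {
  project    = var.project_id
  dataset_id = var.dataset_id
  role       = "roles/bigquery.dataEditor"
  member     = "serviceAccount:${google_service_account.df_worker.email}"
}
```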


SQL Server Always On Availability Groups

This blueprint implements SQL Server Always On Availability Groups using Fabric modules. It builds a two-node cluster with a fileshare witness instance in an existing VPC and adds the necessary firewall rules. The actual setup process (apart from Active Directory operations) has been scripted, so that as little manual work as possible needs to be performed.


MLOps with Vertex AI

This blueprint implements the infrastructure required for a fully functional MLOps environment using Vertex AI: activation of the required GCP services, Vertex Workbench, GCS buckets to host Vertex AI and Cloud Build artifacts, an Artifact Registry Docker repository to host custom images, the required service accounts, networking, and an optional Workload Identity Federation provider for GitHub integration.