Update to multiple README.md (#1379)

- blueprints/data-solutions/data-platform-foundations/README.md
- blueprints/factories/project-factory/README.md
- modules/net-ilb-l7/README.md
- modules/project/README.md
Alejandro Leal 2023-05-16 02:11:34 -04:00 committed by GitHub
parent 56132ffb03
commit 6a89d71e96
4 changed files with 7 additions and 7 deletions

blueprints/data-solutions/data-platform-foundations/README.md

@@ -2,7 +2,7 @@
This module implements an opinionated Data Platform Architecture that creates and setup projects and related resources that compose an end-to-end data environment.
-For a minimal Data Platform, plese refer to the [Minimal Data Platform](../data-platform-minimal/) blueprint.
+For a minimal Data Platform, please refer to the [Minimal Data Platform](../data-platform-minimal/) blueprint.
The code is intentionally simple, as it's intended to provide a generic initial setup and then allow easy customizations to complete the implementation of the intended design.
@@ -41,13 +41,13 @@ This separation into projects allows adhering to the least-privilege principle b
The script will create the following projects:
- **Drop off** Used to store temporary data. Data is pushed to Cloud Storage, BigQuery, or Cloud PubSub. Resources are configured with a customizable lifecycle policy.
-- **Load** Used to load data from the drop off zone to the data warehouse. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the scirpt and have separate `Load` projects.
+- **Load** Used to load data from the drop off zone to the data warehouse. The load is made with minimal to zero transformation logic (mainly `cast`). Anonymization or tokenization of Personally Identifiable Information (PII) can be implemented here or in the transformation stage, depending on your requirements. The use of [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates) is recommended. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the script and have separate `Load` projects.
- **Data Warehouse** Several projects distributed across 3 separate layers, to host progressively processed and refined data:
- **Landing - Raw data** Structured Data, stored in relevant formats: structured data stored in BigQuery, unstructured data stored on Cloud Storage with additional metadata stored in BigQuery (for example pictures stored in Cloud Storage and analysis of the images for Cloud Vision API stored in BigQuery).
- **Curated - Cleansed, aggregated and curated data**
- **Confidential - Curated and unencrypted layer**
- **Orchestration** Used to host Cloud Composer, which orchestrates all tasks that move data across layers.
-- **Transformation** Used to move data between Data Warehouse layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the scirpt and have separate `Tranformation` projects.
+- **Transformation** Used to move data between Data Warehouse layers. We strongly suggest relying on BigQuery Engine to perform the transformations. If BigQuery doesn't have the features needed to perform your transformations, you can use Cloud Dataflow with [Cloud Dataflow templates](https://cloud.google.com/dataflow/docs/concepts/dataflow-templates). This stage can also optionally anonymize or tokenize PII. When you need to handle workloads from different teams, if strong role separation is needed between them, we suggest to customize the script and have separate `Transformation` projects.
- **Exposure** Used to host resources that share processed data with external systems. Depending on the access pattern, data can be presented via Cloud SQL, BigQuery, or Bigtable. For BigQuery data, we strongly suggest relying on [Authorized views](https://cloud.google.com/bigquery/docs/authorized-views).
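
As an illustration of the exposure pattern in the bullet above, BigQuery data in a curated layer can be shared through an authorized view. The sketch below uses plain provider resources with hypothetical project, dataset, and table names; it is not part of the blueprint's own code.

```hcl
# Authorize an exposure-layer view to read a curated-layer dataset.
# Project, dataset and table names are hypothetical.
resource "google_bigquery_dataset_access" "authorized_view" {
  project    = "curated-project"
  dataset_id = "curated_layer"
  view {
    project_id = "exposure-project"
    dataset_id = "exposure_layer"
    table_id   = "customers_view"
  }
}
```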
### Roles

blueprints/factories/project-factory/README.md

@@ -83,7 +83,7 @@ module "projects" {
```yaml
# ./data/defaults.yaml
-# The following applies as overrideable defaults for all projects
+# The following applies as overridable defaults for all projects
# All attributes are required
billing_account_id: 012345-67890A-BCDEF0
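
As a generic illustration of how per-project YAML defaults like the file above are usually consumed (a sketch of a common Terraform pattern, not this module's internal code), the built-in `yamldecode`, `fileset`, and `merge` functions can combine the defaults with each project definition:

```hcl
locals {
  # Hypothetical layout: defaults.yaml plus one YAML file per project under ./data.
  defaults = yamldecode(file("${path.module}/data/defaults.yaml"))
  projects = {
    for f in fileset("${path.module}/data/projects", "*.yaml") :
    trimsuffix(f, ".yaml") => merge(
      local.defaults,
      yamldecode(file("${path.module}/data/projects/${f}"))
    )
  }
}
```

Because `merge` gives precedence to later arguments, values set in a project file override the defaults, which is what makes them overridable.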

modules/net-ilb-l7/README.md

@@ -2,7 +2,7 @@
This module allows managing Internal HTTP/HTTPS Load Balancers (L7 ILBs). It's designed to expose the full configuration of the underlying resources, and to facilitate common usage patterns by providing sensible defaults, and optionally managing prerequisite resources like health checks, instance groups, etc.
-Due to the complexity of the underlying resources, changes to the configuration that involve recreation of resources are best applied in stages, starting by disabling the configuration in the urlmap that references the resources that neeed recreation, then doing the same for the backend service, etc.
+Due to the complexity of the underlying resources, changes to the configuration that involve recreation of resources are best applied in stages, starting by disabling the configuration in the urlmap that references the resources that need recreation, then doing the same for the backend service, etc.
## Examples
@@ -109,7 +109,7 @@ You can leverage externally defined health checks for backend services, or have
Health check configuration is controlled via the `health_check_configs` variable, which behaves in a similar way to other LB modules in this repository.
-Defining different health checks fromt he default is very easy. You can for example replace the default HTTP health check with a TCP one and reference it in you backend service:
+Defining different health checks from the default is very easy. You can for example replace the default HTTP health check with a TCP one and reference it in you backend service:
```hcl
module "ilb-l7" {

modules/project/README.md

@@ -294,7 +294,7 @@ module "project" {
Organization policies can be loaded from a directory containing YAML files where each file defines one or more constraints. The structure of the YAML files is exactly the same as the `org_policies` variable.
-Note that contraints defined via `org_policies` take precedence over those in `org_policies_data_path`. In other words, if you specify the same contraint in a YAML file *and* in the `org_policies` variable, the latter will take priority.
+Note that constraints defined via `org_policies` take precedence over those in `org_policies_data_path`. In other words, if you specify the same constraint in a YAML file *and* in the `org_policies` variable, the latter will take priority.
The example below deploys a few organization policies split between two YAML files.
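
As a rough sketch of the precedence rule described above: the constraint name, data path, and attribute schema below are assumptions that may differ between module versions, and this is not the README's own example.

```hcl
module "project" {
  source          = "./fabric/modules/project"
  name            = "myproject"
  billing_account = var.billing_account_id
  parent          = "folders/1234567890"
  # Constraints are read from YAML files in this (hypothetical) directory.
  org_policies_data_path = "${path.module}/data/org-policies"
  # An inline entry with the same constraint name takes precedence over the
  # YAML definition; the attribute schema shown here is an assumption.
  org_policies = {
    "iam.disableServiceAccountKeyCreation" = {
      rules = [{ enforce = true }]
    }
  }
}
```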