# GKE Multitenant Blueprint

This blueprint presents an opinionated architecture to handle multiple homogeneous GKE clusters. The general idea is to deploy a single project hosting multiple clusters that leverage several useful GKE features.

The pattern used in this design is useful, for example, in cases where multiple clusters host or support the same workloads, such as in a multi-regional deployment. Furthermore, combined with Anthos Config Sync and proper RBAC, this architecture can be used to host multiple tenants (e.g. teams, applications) sharing the clusters.

This blueprint is used as part of the [FAST GKE stage](../../../fast/stages/3-gke-multitenant/) but it can also be used independently if desired.

<p align="center">
  <img src="diagram.png" alt="GKE multitenant">
</p>

The overall architecture is based on the following design decisions:

- All clusters are assumed to be [private](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters), therefore only [VPC-native clusters](https://cloud.google.com/kubernetes-engine/docs/concepts/alias-ips) are supported.
- Logging and monitoring are configured to use Cloud Operations for system components and user workloads.
- [GKE metering](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-usage-metering) is enabled by default, with data stored in a BigQuery dataset created within the project.
- Optional [GKE Fleet](https://cloud.google.com/kubernetes-engine/docs/fleets-overview) support, with the possibility to enable any of the following features:
  - [Fleet workload identity](https://cloud.google.com/anthos/fleet-management/docs/use-workload-identity)
  - [Anthos Config Management](https://cloud.google.com/anthos-config-management/docs/overview)
  - [Anthos Service Mesh](https://cloud.google.com/service-mesh/docs/overview)
  - [Anthos Identity Service](https://cloud.google.com/anthos/identity/setup/fleet)
  - [Multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services)
  - [Multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress)
- Support for [Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview), [Hierarchy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/hierarchy-controller), and [Policy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/policy-controller) when using Anthos Config Management.
- [Groups for GKE](https://cloud.google.com/kubernetes-engine/docs/how-to/google-groups-rbac) can be enabled to facilitate the creation of flexible RBAC policies referencing group principals.
- Support for [application-layer secrets encryption](https://cloud.google.com/kubernetes-engine/docs/how-to/encrypting-secrets).
- Support for customizing the peering configuration of the control plane VPC (e.g. to import/export routes to the peered network).
- Some features are enabled by default in all clusters:
  - [Intranode visibility](https://cloud.google.com/kubernetes-engine/docs/how-to/intranode-visibility)
  - [Dataplane V2](https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2)
  - [Shielded GKE nodes](https://cloud.google.com/kubernetes-engine/docs/how-to/shielded-gke-nodes)
  - [Workload identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity)
  - [Node local DNS cache](https://cloud.google.com/kubernetes-engine/docs/how-to/nodelocal-dns-cache)
  - [GCE persistent disk CSI driver](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver)
  - Node [auto-upgrade](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades) and [auto-repair](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-repair) for all node pools

<!--
- [GKE subsetting for L4 internal load balancers](https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer#subsetting) enabled by default in all clusters
-->

## Basic usage

The following example shows how to deploy two clusters, each with one node pool:

```hcl
locals {
  cluster_defaults = {
    private_cluster_config = {
      enable_private_endpoint = true
      master_global_access    = true
    }
  }
  subnet_self_links = {
    ew1 = "projects/prj-host/regions/europe-west1/subnetworks/gke-0"
    ew3 = "projects/prj-host/regions/europe-west3/subnetworks/gke-0"
  }
}

module "gke-fleet" {
  source             = "./fabric/blueprints/gke/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  group_iam = {
    "gke-admin@example.com" = [
      "roles/container.admin"
    ]
  }
  iam = {
    "roles/container.clusterAdmin" = [
      "cicd@my-cicd-project.iam.gserviceaccount.com"
    ]
  }
  clusters = {
    cluster-0 = {
      location               = "europe-west1"
      private_cluster_config = local.cluster_defaults.private_cluster_config
      vpc_config = {
        subnetwork             = local.subnet_self_links.ew1
        master_ipv4_cidr_block = "172.16.10.0/28"
      }
    }
    cluster-1 = {
      location               = "europe-west3"
      private_cluster_config = local.cluster_defaults.private_cluster_config
      vpc_config = {
        subnetwork             = local.subnet_self_links.ew3
        master_ipv4_cidr_block = "172.16.20.0/28"
      }
    }
  }
  nodepools = {
    cluster-0 = {
      nodepool-0 = {
        node_config = {
          disk_type    = "pd-balanced"
          machine_type = "n2-standard-4"
          spot         = true
        }
      }
    }
    cluster-1 = {
      nodepool-0 = {
        node_config = {
          disk_type    = "pd-balanced"
          machine_type = "n2-standard-4"
        }
      }
    }
  }
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/prj-host/global/networks/prod-0"
  }
}
# tftest modules=7 resources=27
```
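When the blueprint is consumed from another Terraform configuration, its documented outputs (see the Outputs table below) can be re-exported for later stages. A minimal sketch, assuming the `module "gke-fleet"` block from the example above; the output names chosen here are arbitrary:

```hcl
# Sketch: surface the blueprint's documented outputs so that later
# stages (e.g. workload deployment pipelines) can consume them.
output "cluster_ids" {
  description = "IDs of the GKE clusters managed by the blueprint."
  value       = module.gke-fleet.cluster_ids
}

output "gke_project_id" {
  description = "ID of the project hosting the clusters."
  value       = module.gke-fleet.project_id
}
```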

## GKE Fleet

This example deploys two clusters and configures several GKE Fleet features:

- Enables [multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress) and sets `cluster-0` as the configuration cluster.
- Enables [multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services) and assigns the [required roles](https://cloud.google.com/kubernetes-engine/docs/how-to/multi-cluster-services#authenticating) to its service accounts.
- Creates a `default` Config Management template with binary authorization, Config Sync backed by a git repository, Hierarchy Controller, and Policy Controller enabled.
- Configures the two clusters to use the `default` Config Management template.

```hcl
locals {
  subnet_self_links = {
    ew1 = "projects/prj-host/regions/europe-west1/subnetworks/gke-0"
    ew3 = "projects/prj-host/regions/europe-west3/subnetworks/gke-0"
  }
}

module "gke" {
  source             = "./fabric/blueprints/gke/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  clusters = {
    cluster-0 = {
      location = "europe-west1"
      vpc_config = {
        subnetwork = local.subnet_self_links.ew1
      }
    }
    cluster-1 = {
      location = "europe-west3"
      vpc_config = {
        subnetwork = local.subnet_self_links.ew3
      }
    }
  }
  nodepools = {
    cluster-0 = {
      nodepool-0 = {
        node_config = {
          disk_type    = "pd-balanced"
          machine_type = "n2-standard-4"
          spot         = true
        }
      }
    }
    cluster-1 = {
      nodepool-0 = {
        node_config = {
          disk_type    = "pd-balanced"
          machine_type = "n2-standard-4"
        }
      }
    }
  }
  fleet_features = {
    appdevexperience             = false
    configmanagement             = true
    identityservice              = true
    multiclusteringress          = "cluster-0"
    multiclusterservicediscovery = true
    servicemesh                  = true
  }
  fleet_workload_identity = true
  fleet_configmanagement_templates = {
    default = {
      binauthz = true
      config_sync = {
        git = {
          gcp_service_account_email = null
          https_proxy               = null
          policy_dir                = "configsync"
          secret_type               = "none"
          source_format             = "hierarchy"
          sync_branch               = "main"
          sync_repo                 = "https://github.com/myorg/myrepo"
          sync_rev                  = null
          sync_wait_secs            = null
        }
        prevent_drift = true
        source_format = "hierarchy"
      }
      hierarchy_controller = {
        enable_hierarchical_resource_quota = true
        enable_pod_tree_labels             = true
      }
      policy_controller = {
        audit_interval_seconds     = 30
        exemptable_namespaces      = ["kube-system"]
        log_denies_enabled         = true
        referential_rules_enabled  = true
        template_library_installed = true
      }
      version = "1.10.2"
    }
  }
  fleet_configmanagement_clusters = {
    default = ["cluster-0", "cluster-1"]
  }
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/prj-host/global/networks/prod-0"
  }
}
# tftest modules=8 resources=38
```
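For multitenant setups, project-level bindings can be combined with [Groups for GKE](https://cloud.google.com/kubernetes-engine/docs/how-to/google-groups-rbac) so that in-cluster RBAC references the same group principals. A partial sketch using the module's `group_iam` variable; the group addresses are hypothetical and the other module arguments are elided:

```hcl
module "gke" {
  source = "./fabric/blueprints/gke/multitenant-fleet/"
  # ...same arguments as in the examples above...
  group_iam = {
    # Platform administrators manage the clusters themselves.
    "gke-admin@example.com" = ["roles/container.admin"]
    # Tenant teams only get cluster visibility at the project level;
    # namespace-scoped permissions are then granted inside the clusters
    # with RBAC rules referencing these same Google Groups.
    "team-a@example.com" = ["roles/container.clusterViewer"]
    "team-b@example.com" = ["roles/container.clusterViewer"]
  }
}
```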

<!-- TFDOC OPTS files:1 -->
<!-- BEGIN TFDOC -->

## Files

| name | description | modules |
|---|---|---|
| [gke-clusters.tf](./gke-clusters.tf) | GKE clusters. | <code>gke-cluster-standard</code> |
| [gke-hub.tf](./gke-hub.tf) | GKE hub configuration. | <code>gke-hub</code> |
| [gke-nodepools.tf](./gke-nodepools.tf) | GKE nodepools. | <code>gke-nodepool</code> |
| [main.tf](./main.tf) | Project and usage dataset. | <code>bigquery-dataset</code> · <code>project</code> |
| [outputs.tf](./outputs.tf) | Output variables. | |
| [variables.tf](./variables.tf) | Module variables. | |

## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [billing_account_id](variables.tf#L17) | Billing account id. | <code>string</code> | ✓ | |
| [folder_id](variables.tf#L132) | Folder used for the GKE project in folders/nnnnnnnnnnn format. | <code>string</code> | ✓ | |
| [prefix](variables.tf#L183) | Prefix used for resource names. | <code>string</code> | ✓ | |
| [project_id](variables.tf#L192) | ID of the project that will contain all the clusters. | <code>string</code> | ✓ | |
| [vpc_config](variables.tf#L204) | Shared VPC project and VPC details. | <code title="object({ host_project_id = string vpc_self_link = string })">object({…})</code> | ✓ | |
| [clusters](variables.tf#L22) | Clusters configuration. Refer to the gke-cluster module for type details. | <code title="map(object({ cluster_autoscaling = optional(any) description = optional(string) enable_addons = optional(any, { horizontal_pod_autoscaling = true, http_load_balancing = true }) enable_features = optional(any, { workload_identity = true }) issue_client_certificate = optional(bool, false) labels = optional(map(string)) location = string logging_config = optional(list(string), ["SYSTEM_COMPONENTS"]) maintenance_config = optional(any, { daily_window_start_time = "03:00" recurring_window = null maintenance_exclusion = [] }) max_pods_per_node = optional(number, 110) min_master_version = optional(string) monitoring_config = optional(object({ enable_components = optional(list(string), ["SYSTEM_COMPONENTS"]) managed_prometheus = optional(bool) })) node_locations = optional(list(string)) private_cluster_config = optional(any) release_channel = optional(string) vpc_config = object({ subnetwork = string network = optional(string) secondary_range_blocks = optional(object({ pods = string services = string })) secondary_range_names = optional(object({ pods = string services = string }), { pods = "pods", services = "services" }) master_authorized_ranges = optional(map(string)) master_ipv4_cidr_block = optional(string) }) }))">map(object({…}))</code> | | <code>{}</code> |
| [fleet_configmanagement_clusters](variables.tf#L70) | Config management features enabled on specific sets of member clusters, in config name => [cluster name] format. | <code>map(list(string))</code> | | <code>{}</code> |
| [fleet_configmanagement_templates](variables.tf#L77) | Sets of config management configurations that can be applied to member clusters, in config name => {options} format. | <code title="map(object({ binauthz = bool config_sync = object({ git = object({ gcp_service_account_email = string https_proxy = string policy_dir = string secret_type = string sync_branch = string sync_repo = string sync_rev = string sync_wait_secs = number }) prevent_drift = string source_format = string }) hierarchy_controller = object({ enable_hierarchical_resource_quota = bool enable_pod_tree_labels = bool }) policy_controller = object({ audit_interval_seconds = number exemptable_namespaces = list(string) log_denies_enabled = bool referential_rules_enabled = bool template_library_installed = bool }) version = string }))">map(object({…}))</code> | | <code>{}</code> |
| [fleet_features](variables.tf#L112) | Enable and configure fleet features. Set to null to disable GKE Hub if fleet workload identity is not used. | <code title="object({ appdevexperience = bool configmanagement = bool identityservice = bool multiclusteringress = string multiclusterservicediscovery = bool servicemesh = bool })">object({…})</code> | | <code>null</code> |
| [fleet_workload_identity](variables.tf#L125) | Use Fleet Workload Identity for clusters. Enables GKE Hub if set to true. | <code>bool</code> | | <code>false</code> |
| [group_iam](variables.tf#L137) | Project-level IAM bindings for groups. Use group emails as keys, list of roles as values. | <code>map(list(string))</code> | | <code>{}</code> |
| [iam](variables.tf#L144) | Project-level authoritative IAM bindings for users and service accounts in {ROLE => [MEMBERS]} format. | <code>map(list(string))</code> | | <code>{}</code> |
| [labels](variables.tf#L151) | Project-level labels. | <code>map(string)</code> | | <code>{}</code> |
| [nodepools](variables.tf#L157) | Nodepools configuration. Refer to the gke-nodepool module for type details. | <code title="map(map(object({ gke_version = optional(string) labels = optional(map(string), {}) max_pods_per_node = optional(number) name = optional(string) node_config = optional(any, { disk_type = "pd-balanced" }) node_count = optional(map(number), { initial = 1 }) node_locations = optional(list(string)) nodepool_config = optional(any) pod_range = optional(any) reservation_affinity = optional(any) service_account = optional(any) sole_tenant_nodegroup = optional(string) tags = optional(list(string)) taints = optional(list(object({ key = string value = string effect = string }))) })))">map(map(object({…})))</code> | | <code>{}</code> |
| [project_services](variables.tf#L197) | Additional project services to enable. | <code>list(string)</code> | | <code>[]</code> |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| [cluster_ids](outputs.tf#L17) | Cluster ids. | |
| [clusters](outputs.tf#L24) | Cluster resources. | |
| [project_id](outputs.tf#L29) | GKE project id. | |
<!-- END TFDOC -->