# GKE Multitenant Blueprint

This blueprint presents an opinionated architecture to handle multiple homogeneous GKE clusters. The general idea behind this blueprint is to deploy a single project hosting multiple clusters leveraging several useful GKE features.

The pattern used in this design is useful, for example, in cases where multiple clusters host/support the same workloads, such as in the case of a multi-regional deployment. Furthermore, combined with Anthos Config Sync and proper RBAC, this architecture can be used to host multiple tenants (e.g. teams, applications) sharing the clusters.

This blueprint is used as part of the [FAST GKE stage](../../../fast/stages/03-gke-multitenant/) but it can also be used independently if desired.

*GKE multitenant architecture diagram*

The overall architecture is based on the following design decisions:

- All clusters are assumed to be [private](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters), therefore only [VPC-native clusters](https://cloud.google.com/kubernetes-engine/docs/concepts/alias-ips) are supported.
- Logging and monitoring are configured to use Cloud Operations for system components and user workloads.
- [GKE metering](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-usage-metering) is enabled by default and stored in a BigQuery dataset created within the project.
- Optional [GKE Fleet](https://cloud.google.com/kubernetes-engine/docs/fleets-overview) support, with the possibility to enable any of the following features:
  - [Fleet workload identity](https://cloud.google.com/anthos/fleet-management/docs/use-workload-identity)
  - [Anthos Config Management](https://cloud.google.com/anthos-config-management/docs/overview)
  - [Anthos Service Mesh](https://cloud.google.com/service-mesh/docs/overview)
  - [Anthos Identity Service](https://cloud.google.com/anthos/identity/setup/fleet)
  - [Multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services)
  - [Multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress)
- Support for [Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview), [Hierarchy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/hierarchy-controller), and [Policy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/policy-controller) when using Anthos Config Management.
- [Groups for GKE](https://cloud.google.com/kubernetes-engine/docs/how-to/google-groups-rbac) can be enabled to facilitate the creation of flexible RBAC policies referencing group principals.
- Support for [application layer secret encryption](https://cloud.google.com/kubernetes-engine/docs/how-to/encrypting-secrets).
- Support for customizing the peering configuration of the control plane VPC (e.g. to import/export routes to the peered network), as sketched after this list.
- Some features are enabled by default in all clusters:
  - [Intranode visibility](https://cloud.google.com/kubernetes-engine/docs/how-to/intranode-visibility)
  - [Dataplane v2](https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2)
  - [Shielded GKE nodes](https://cloud.google.com/kubernetes-engine/docs/how-to/shielded-gke-nodes)
  - [Workload identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity)
  - [Node local DNS cache](https://cloud.google.com/kubernetes-engine/docs/how-to/nodelocal-dns-cache)
  - [Use of the GCE persistent disk CSI driver](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver)
  - Node [auto-upgrade](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades) and [auto-repair](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-repair) for all node pools
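As a minimal, illustrative sketch of the peering customization, the `peering_config` variable (documented in the Variables section below) can be set on the module call; the attribute names used here are assumptions made for illustration, so check `variables.tf` for the authoritative schema:

```hcl
# Illustrative sketch only: customize peering with the control plane VPC via
# the peering_config variable. The attribute names below are assumptions, not
# the authoritative schema (see variables.tf). This argument would be set
# inside the module "gke" calls shown in the examples below.
peering_config = {
  export_routes = true
  import_routes = false
}
```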
## Basic usage

The following example shows how to deploy a single cluster and a single node pool:

```hcl
module "gke" {
  source             = "./fabric/blueprints/gke/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }
  authenticator_security_group = "gke-rbac-base@example.com"
  group_iam = {
    "gke-admin@example.com" = [
      "roles/container.admin"
    ]
  }
  iam = {
    "roles/container.clusterAdmin" = [
      "cicd@my-cicd-project.iam.gserviceaccount.com"
    ]
  }
  clusters = {
    mycluster = {
      cluster_autoscaling = null
      description         = "My cluster"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/mycluster-subnet"
      }
      overrides = null
    }
  }
  nodepools = {
    mycluster = {
      mynodepool = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
  }
}
# tftest modules=5 resources=26
```

## Creating Multiple Clusters

The following example shows how to deploy two clusters with different configurations.

The first cluster `cluster-euw1` defines the mandatory configuration parameters (description, location, network setup) and inherits some defaults from the `cluster_defaults` and `nodepool_defaults` variables. These two variables are used whenever the `overrides` key of the `clusters` and `nodepools` variables is set to `null` (a sketch of a possible `cluster_defaults` value is shown after the example below).

On the other hand, the second cluster (`cluster-euw3`) defines its own configuration by providing a value to the `overrides` key.
```hcl
module "gke" {
  source             = "./fabric/blueprints/gke/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }
  clusters = {
    cluster-euw1 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west1"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/euw1-subnet"
      }
      overrides = null
    }
    cluster-euw3 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west3"
      dns_domain          = null
      location            = "europe-west3"
      labels              = {}
      net = {
        master_range = "172.17.17.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west3/subnetworks/euw3-subnet"
      }
      overrides = {
        cloudrun_config                 = false
        database_encryption_key         = null
        gcp_filestore_csi_driver_config = true
        master_authorized_ranges = {
          rfc1918_1 = "10.0.0.0/8"
        }
        max_pods_per_node        = 64
        pod_security_policy      = true
        release_channel          = "STABLE"
        vertical_pod_autoscaling = false
      }
    }
  }
  nodepools = {
    cluster-euw1 = {
      pool-euw1 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
    cluster-euw3 = {
      pool-euw3 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides = {
          image_type        = "UBUNTU_CONTAINERD"
          max_pods_per_node = 64
          node_locations    = []
          node_tags         = []
          node_taints       = []
        }
        spot = true
      }
    }
  }
}
# tftest modules=7 resources=28
```
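As a sketch of the defaults side of this mechanism, a project-wide `cluster_defaults` value could be set on the module call itself; the attributes below simply mirror the per-cluster `overrides` block shown above and are assumed to follow the same schema (check `variables.tf` for the authoritative definition):

```hcl
# Sketch only: project-wide defaults applied to every cluster whose overrides
# key is set to null. This assignment goes inside the module "gke" block; the
# attributes are assumed to mirror the per-cluster overrides shown above
# (see variables.tf for the actual schema).
cluster_defaults = {
  cloudrun_config                 = false
  database_encryption_key         = null
  gcp_filestore_csi_driver_config = false
  master_authorized_ranges = {
    rfc1918_1 = "10.0.0.0/8"
  }
  max_pods_per_node        = 110
  pod_security_policy      = false
  release_channel          = "STABLE"
  vertical_pod_autoscaling = false
}
```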
## Multiple clusters with GKE Fleet

This example deploys two clusters and configures several GKE Fleet features:

- Enables [multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress) and sets the configuration cluster to be `cluster-euw1`.
- Enables [multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services) and assigns the [required roles](https://cloud.google.com/kubernetes-engine/docs/how-to/multi-cluster-services#authenticating) to its service accounts.
- A `default` Config Management template is created with binary authorization, Config Sync enabled with a git repository, Hierarchy Controller, and Policy Controller.
- The two clusters are configured to use the `default` Config Management template.

```hcl
module "gke" {
  source             = "./fabric/blueprints/gke/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }
  clusters = {
    cluster-euw1 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west1"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/euw1-subnet"
      }
      overrides = null
    }
    cluster-euw3 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west3"
      dns_domain          = null
      location            = "europe-west3"
      labels              = {}
      net = {
        master_range = "172.17.17.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west3/subnetworks/euw3-subnet"
      }
      overrides = null
    }
  }
  nodepools = {
    cluster-euw1 = {
      pool-euw1 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
    cluster-euw3 = {
      pool-euw3 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = true
      }
    }
  }
  fleet_features = {
    appdevexperience             = false
    configmanagement             = true
    identityservice              = true
    multiclusteringress          = "cluster-euw1"
    multiclusterservicediscovery = true
    servicemesh                  = true
  }
  fleet_workload_identity = true
  fleet_configmanagement_templates = {
    default = {
      binauthz = true
      config_sync = {
        git = {
          gcp_service_account_email = null
          https_proxy               = null
          policy_dir                = "configsync"
          secret_type               = "none"
          source_format             = "hierarchy"
          sync_branch               = "main"
          sync_repo                 = "https://github.com/myorg/myrepo"
          sync_rev                  = null
          sync_wait_secs            = null
        }
        prevent_drift = true
        source_format = "hierarchy"
      }
      hierarchy_controller = {
        enable_hierarchical_resource_quota = true
        enable_pod_tree_labels             = true
      }
      policy_controller = {
        audit_interval_seconds     = 30
        exemptable_namespaces      = ["kube-system"]
        log_denies_enabled         = true
        referential_rules_enabled  = true
        template_library_installed = true
      }
      version = "1.10.2"
    }
  }
  fleet_configmanagement_clusters = {
    default = ["cluster-euw1", "cluster-euw3"]
  }
}
# tftest modules=8 resources=39
```

## Files

| name | description | modules |
|---|---|---|
| [gke-clusters.tf](./gke-clusters.tf) | None | gke-cluster |
| [gke-hub.tf](./gke-hub.tf) | None | gke-hub |
| [gke-nodepools.tf](./gke-nodepools.tf) | None | gke-nodepool |
| [main.tf](./main.tf) | Module-level locals and resources. | bigquery-dataset · project |
| [outputs.tf](./outputs.tf) | Output variables. | |
| [variables.tf](./variables.tf) | Module variables. | |

## Variables

| name | description | type | required | default | producer |
|---|---|:---:|:---:|:---:|:---:|
| [billing_account_id](variables.tf#L23) | Billing account id. | string | ✓ | | |
| [clusters](variables.tf#L57) | | map(object({…})) | ✓ | | |
| [folder_id](variables.tf#L158) | Folder used for the GKE project in folders/nnnnnnnnnnn format. | string | ✓ | | |
| [nodepools](variables.tf#L201) | | map(map(object({…}))) | ✓ | | |
| [prefix](variables.tf#L231) | Prefix used for resources that need unique names. | string | ✓ | | |
| [project_id](variables.tf#L236) | ID of the project that will contain all the clusters. | string | ✓ | | |
| [vpc_config](variables.tf#L248) | Shared VPC project and VPC details. | object({…}) | ✓ | | |
| [authenticator_security_group](variables.tf#L17) | Optional group used for Groups for GKE. | string | | null | |
| [cluster_defaults](variables.tf#L28) | Default values for optional cluster configurations. | object({…}) | | {…} | |
| [dns_domain](variables.tf#L90) | Domain name used for clusters, prefixed by each cluster name. Leave null to disable Cloud DNS for GKE. | string | | null | |
| [fleet_configmanagement_clusters](variables.tf#L96) | Config management features enabled on specific sets of member clusters, in config name => [cluster name] format. | map(list(string)) | | {} | |
| [fleet_configmanagement_templates](variables.tf#L103) | Sets of config management configurations that can be applied to member clusters, in config name => {options} format. | map(object({…})) | | {} | |
| [fleet_features](variables.tf#L138) | Enable and configure fleet features. Set to null to disable GKE Hub if fleet workload identity is not used. | object({…}) | | null | |
| [fleet_workload_identity](variables.tf#L151) | Use Fleet Workload Identity for clusters. Enables GKE Hub if set to true. | bool | | false | |
| [group_iam](variables.tf#L163) | Project-level IAM bindings for groups. Use group emails as keys, list of roles as values. | map(list(string)) | | {} | |
| [iam](variables.tf#L170) | Project-level authoritative IAM bindings for users and service accounts in {ROLE => [MEMBERS]} format. | map(list(string)) | | {} | |
| [labels](variables.tf#L177) | Project-level labels. | map(string) | | {} | |
| [nodepool_defaults](variables.tf#L183) | | object({…}) | | {…} | |
| [peering_config](variables.tf#L218) | Configure peering with the control plane VPC. Requires compute.networks.updatePeering. Set to null if you don't want to update the default peering configuration. | object({…}) | | {…} | |
| [project_services](variables.tf#L241) | Additional project services to enable. | list(string) | | [] | |

## Outputs

| name | description | sensitive | consumers |
|---|---|:---:|---|
| [cluster_ids](outputs.tf#L22) | Cluster ids. | | |
| [clusters](outputs.tf#L17) | Cluster resources. | | |
| [project_id](outputs.tf#L29) | GKE project id. | | |
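The outputs above can be consumed from the configuration that calls this blueprint; a minimal sketch, assuming a `module "gke"` call like the ones in the examples above:

```hcl
# Minimal sketch: re-exporting the blueprint outputs from the calling root
# module, assuming a module "gke" call as shown in the examples above.
output "gke_project_id" {
  description = "Id of the project hosting the clusters."
  value       = module.gke.project_id
}

output "gke_cluster_ids" {
  description = "Ids of the clusters managed by the blueprint."
  value       = module.gke.cluster_ids
}
```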