# GKE Multitenant Example

This example presents an opinionated architecture to handle multiple homogeneous GKE clusters. The general idea behind this example is to deploy a single project hosting multiple clusters leveraging several useful GKE features.

The pattern used in this design is useful, for example, in cases where multiple clusters host/support the same workloads, such as in the case of a multi-regional deployment. Furthermore, combined with Anthos Config Sync and proper RBAC, this architecture can be used to host multiple tenants (e.g. teams, applications) sharing the clusters.

This example is used as part of the [FAST GKE stage](../../../fast/stages/03-gke-multitenant/) but it can also be used independently if desired.

<p align="center">
  <img src="diagram.png" alt="GKE multitenant">
</p>

The overall architecture is based on the following design decisions:

- All clusters are assumed to be [private](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters), therefore only [VPC-native clusters](https://cloud.google.com/kubernetes-engine/docs/concepts/alias-ips) are supported.
- Logging and monitoring are configured to use Cloud Operations for system components and user workloads.
- [GKE metering](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-usage-metering) is enabled by default, with usage data stored in a BigQuery dataset created within the project.
- Optional [GKE Fleet](https://cloud.google.com/kubernetes-engine/docs/fleets-overview) support with the possibility to enable any of the following features:
  - [Fleet workload identity](https://cloud.google.com/anthos/fleet-management/docs/use-workload-identity)
  - [Anthos Config Management](https://cloud.google.com/anthos-config-management/docs/overview)
  - [Anthos Service Mesh](https://cloud.google.com/service-mesh/docs/overview)
  - [Anthos Identity Service](https://cloud.google.com/anthos/identity/setup/fleet)
  - [Multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services)
  - [Multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress)
- Support for [Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview), [Hierarchy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/hierarchy-controller), and [Policy Controller](https://cloud.google.com/anthos-config-management/docs/concepts/policy-controller) when using Anthos Config Management.
- [Groups for GKE](https://cloud.google.com/kubernetes-engine/docs/how-to/google-groups-rbac) can be enabled to facilitate the creation of flexible RBAC policies referencing group principals.
- Support for [application layer secret encryption](https://cloud.google.com/kubernetes-engine/docs/how-to/encrypting-secrets).
- Support to customize the peering configuration of the control plane VPC (e.g. to import/export routes to the peered network).
- Some features are enabled by default in all clusters:
  - [Intranode visibility](https://cloud.google.com/kubernetes-engine/docs/how-to/intranode-visibility)
  - [Dataplane v2](https://cloud.google.com/kubernetes-engine/docs/concepts/dataplane-v2)
  - [Shielded GKE nodes](https://cloud.google.com/kubernetes-engine/docs/how-to/shielded-gke-nodes)
  - [Workload identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity)
  - [Node local DNS cache](https://cloud.google.com/kubernetes-engine/docs/how-to/nodelocal-dns-cache)
  - [Use of the GCE persistent disk CSI driver](https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/gce-pd-csi-driver)
  - Node [auto-upgrade](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades) and [auto-repair](https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-repair) for all node pools

<!--
- [GKE subsetting for L4 internal load balancers](https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer#subsetting) enabled by default in all clusters
-->

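Two of the optional settings in the list above, Groups for GKE and the control plane peering configuration, correspond to the `authenticator_security_group` and `peering_config` variables documented in the Variables table. The following is a minimal sketch rather than a tested configuration: the local names `gke_rbac_group` and `gke_peering` are hypothetical, and the values are placeholders.

```hcl
# Hypothetical locals holding values for two optional module inputs.
locals {
  # Group used for Groups for GKE (placeholder address).
  gke_rbac_group = "gke-rbac-base@example.com"

  # Peering options for the control plane VPC. Both attributes are part of
  # the documented object type, so both are set explicitly here.
  gke_peering = {
    export_routes = true
    import_routes = true
  }
}
```

Inside the `module "gke"` block these would be passed as `authenticator_security_group = local.gke_rbac_group` and `peering_config = local.gke_peering`. According to the variable description, setting `peering_config` to `null` leaves the default peering configuration untouched.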
## Basic usage

The following example shows how to deploy a single cluster and a single node pool.

```hcl
module "gke" {
  source             = "./fabric/blueprints/gke-serverless/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }

  authenticator_security_group = "gke-rbac-base@example.com"
  group_iam = {
    "gke-admin@example.com" = [
      "roles/container.admin"
    ]
  }
  iam = {
    "roles/container.clusterAdmin" = [
      # IAM members need a type prefix such as "serviceAccount:"
      "serviceAccount:cicd@my-cicd-project.iam.gserviceaccount.com"
    ]
  }

  clusters = {
    mycluster = {
      cluster_autoscaling = null
      description         = "My cluster"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/mycluster-subnet"
      }
      overrides = null
    }
  }
  nodepools = {
    mycluster = {
      mynodepool = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
  }
}
# tftest modules=1 resources=0
```

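The example above reads `project_id`, `billing_account_id` and `folder_id` from root-module variables that it does not declare. A minimal sketch of those declarations, assuming you keep the same names in your own root module (the descriptions are taken from the Variables table):

```hcl
# Root-module inputs referenced as var.* in the example above.
variable "billing_account_id" {
  description = "Billing account id."
  type        = string
}

variable "folder_id" {
  description = "Folder used for the GKE project in folders/nnnnnnnnnnn format."
  type        = string
}

variable "project_id" {
  description = "ID of the project that will contain all the clusters."
  type        = string
}
```

Values can then be supplied through a `terraform.tfvars` file or `-var` flags, as with any other Terraform variable.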
## Creating multiple clusters

The following example shows how to deploy two clusters with different configurations.

The first cluster, `cluster-euw1`, defines the mandatory configuration parameters (description, location, network setup) and inherits its remaining settings from the `cluster_defaults` and `nodepool_defaults` variables. These two variables are used whenever the `overrides` key of an entry in the `clusters` or `nodepools` variables is set to `null`.

The second cluster, `cluster-euw3`, defines its own configuration instead by providing a value for the `overrides` key.

```hcl
module "gke" {
  source             = "./fabric/blueprints/gke-serverless/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }
  clusters = {
    cluster-euw1 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west1"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/euw1-subnet"
      }
      overrides = null
    }
    cluster-euw3 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west3"
      dns_domain          = null
      location            = "europe-west3"
      labels              = {}
      net = {
        master_range = "172.17.17.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west3/subnetworks/euw3-subnet"
      }
      overrides = {
        cloudrun_config                 = false
        database_encryption_key         = null
        gcp_filestore_csi_driver_config = true
        master_authorized_ranges = {
          rfc1918_1 = "10.0.0.0/8"
        }
        max_pods_per_node        = 64
        pod_security_policy      = true
        release_channel          = "STABLE"
        vertical_pod_autoscaling = false
      }
    }
  }
  nodepools = {
    cluster-euw1 = {
      pool-euw1 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
    cluster-euw3 = {
      pool-euw3 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides = {
          image_type        = "UBUNTU_CONTAINERD"
          max_pods_per_node = 64
          node_locations    = []
          node_tags         = []
          node_taints       = []
        }
        spot = true
      }
    }
  }
}
# tftest modules=1 resources=0
```

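Clusters and node pools whose `overrides` key is `null` fall back to `cluster_defaults` and `nodepool_defaults`, so changing those two variables is one way to adjust every inheriting cluster at once. The sketch below is illustrative only: the local names are hypothetical, the attribute names come from the Variables table, and the values are example choices rather than recommendations. Because the documented types are plain objects, it assumes every attribute has to be set.

```hcl
# Hypothetical locals with custom defaults for clusters and node pools
# that leave `overrides = null`.
locals {
  gke_cluster_defaults = {
    cloudrun_config                 = false
    database_encryption_key         = null
    gcp_filestore_csi_driver_config = false
    master_authorized_ranges = {
      rfc1918_1 = "10.0.0.0/8"
    }
    max_pods_per_node        = 64
    pod_security_policy      = false
    release_channel          = "REGULAR"
    vertical_pod_autoscaling = true
  }

  gke_nodepool_defaults = {
    image_type        = "COS_CONTAINERD"
    max_pods_per_node = 64
    node_locations    = null
    node_tags         = null
    node_taints       = []
  }
}
```

These would be passed to the module as `cluster_defaults = local.gke_cluster_defaults` and `nodepool_defaults = local.gke_nodepool_defaults`; entries that set their own `overrides` are unaffected.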
## Multiple clusters with GKE Fleet

This example deploys two clusters and configures several GKE Fleet features:

- Enables [multi-cluster ingress](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-ingress) and sets the configuration cluster to be `cluster-euw1`.
- Enables [multi-cluster services](https://cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-services) and assigns the [required roles](https://cloud.google.com/kubernetes-engine/docs/how-to/multi-cluster-services#authenticating) to its service accounts.
- Creates a `default` Config Management template with binary authorization, Config Sync enabled with a git repository, Hierarchy Controller, and Policy Controller.
- Configures the two clusters to use the `default` Config Management template.

```hcl
module "gke" {
  source             = "./fabric/blueprints/gke-serverless/multitenant-fleet/"
  project_id         = var.project_id
  billing_account_id = var.billing_account_id
  folder_id          = var.folder_id
  prefix             = "myprefix"
  vpc_config = {
    host_project_id = "my-host-project-id"
    vpc_self_link   = "projects/my-host-project-id/global/networks/my-network"
  }
  clusters = {
    cluster-euw1 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west1"
      dns_domain          = null
      location            = "europe-west1"
      labels              = {}
      net = {
        master_range = "172.17.16.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west1/subnetworks/euw1-subnet"
      }
      overrides = null
    }
    cluster-euw3 = {
      cluster_autoscaling = null
      description         = "Cluster for europe-west3"
      dns_domain          = null
      location            = "europe-west3"
      labels              = {}
      net = {
        master_range = "172.17.17.0/28"
        pods         = "pods"
        services     = "services"
        subnet       = "projects/my-host-project-id/regions/europe-west3/subnetworks/euw3-subnet"
      }
      overrides = null
    }
  }
  nodepools = {
    cluster-euw1 = {
      pool-euw1 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = false
      }
    }
    cluster-euw3 = {
      pool-euw3 = {
        initial_node_count = 1
        node_count         = 1
        node_type          = "n2-standard-4"
        overrides          = null
        spot               = true
      }
    }
  }

  fleet_features = {
    appdevexperience             = false
    configmanagement             = true
    identityservice              = true
    multiclusteringress          = "cluster-euw1"
    multiclusterservicediscovery = true
    servicemesh                  = true
  }
  fleet_workload_identity = true
  fleet_configmanagement_templates = {
    default = {
      binauthz = true
      config_sync = {
        git = {
          gcp_service_account_email = null
          https_proxy               = null
          policy_dir                = "configsync"
          secret_type               = "none"
          source_format             = "hierarchy"
          sync_branch               = "main"
          sync_repo                 = "https://github.com/myorg/myrepo"
          sync_rev                  = null
          sync_wait_secs            = null
        }
        prevent_drift = true
        source_format = "hierarchy"
      }
      hierarchy_controller = {
        enable_hierarchical_resource_quota = true
        enable_pod_tree_labels             = true
      }
      policy_controller = {
        audit_interval_seconds     = 30
        exemptable_namespaces      = ["kube-system"]
        log_denies_enabled         = true
        referential_rules_enabled  = true
        template_library_installed = true
      }
      version = "1.10.2"
    }
  }
  fleet_configmanagement_clusters = {
    default = ["cluster-euw1", "cluster-euw3"]
  }
}

# tftest modules=1 resources=0
```

<!-- TFDOC OPTS files:1 show_extra:1 -->
<!-- BEGIN TFDOC -->

## Files

| name | description | modules |
|---|---|---|
| [gke-clusters.tf](./gke-clusters.tf) | None | <code>gke-cluster</code> |
| [gke-hub.tf](./gke-hub.tf) | None | <code>gke-hub</code> |
| [gke-nodepools.tf](./gke-nodepools.tf) | None | <code>gke-nodepool</code> |
| [main.tf](./main.tf) | Module-level locals and resources. | <code>bigquery-dataset</code> · <code>project</code> |
| [outputs.tf](./outputs.tf) | Output variables. |  |
| [variables.tf](./variables.tf) | Module variables. |  |

## Variables

| name | description | type | required | default | producer |
|---|---|:---:|:---:|:---:|:---:|
| [billing_account_id](variables.tf#L23) | Billing account id. | <code>string</code> | ✓ |  |  |
| [clusters](variables.tf#L57) |  | <code title="map(object({ cluster_autoscaling = object({ cpu_min = number cpu_max = number memory_min = number memory_max = number }) description = string dns_domain = string labels = map(string) location = string net = object({ master_range = string pods = string services = string subnet = string }) overrides = object({ cloudrun_config = bool database_encryption_key = string master_authorized_ranges = map(string) max_pods_per_node = number pod_security_policy = bool release_channel = string vertical_pod_autoscaling = bool gcp_filestore_csi_driver_config = bool }) }))">map(object({…}))</code> | ✓ |  |  |
| [folder_id](variables.tf#L158) | Folder used for the GKE project in folders/nnnnnnnnnnn format. | <code>string</code> | ✓ |  |  |
| [nodepools](variables.tf#L201) |  | <code title="map(map(object({ node_count = number node_type = string initial_node_count = number overrides = object({ image_type = string max_pods_per_node = number node_locations = list(string) node_tags = list(string) node_taints = list(string) }) spot = bool })))">map(map(object({…})))</code> | ✓ |  |  |
| [prefix](variables.tf#L231) | Prefix used for resources that need unique names. | <code>string</code> | ✓ |  |  |
| [project_id](variables.tf#L236) | ID of the project that will contain all the clusters. | <code>string</code> | ✓ |  |  |
| [vpc_config](variables.tf#L248) | Shared VPC project and VPC details. | <code title="object({ host_project_id = string vpc_self_link = string })">object({…})</code> | ✓ |  |  |
| [authenticator_security_group](variables.tf#L17) | Optional group used for Groups for GKE. | <code>string</code> |  | <code>null</code> |  |
| [cluster_defaults](variables.tf#L28) | Default values for optional cluster configurations. | <code title="object({ cloudrun_config = bool database_encryption_key = string master_authorized_ranges = map(string) max_pods_per_node = number pod_security_policy = bool release_channel = string vertical_pod_autoscaling = bool gcp_filestore_csi_driver_config = bool })">object({…})</code> |  | <code title="{ cloudrun_config = false database_encryption_key = null master_authorized_ranges = { rfc1918_1 = &quot;10.0.0.0/8&quot; rfc1918_2 = &quot;172.16.0.0/12&quot; rfc1918_3 = &quot;192.168.0.0/16&quot; } max_pods_per_node = 110 pod_security_policy = false release_channel = &quot;STABLE&quot; vertical_pod_autoscaling = false gcp_filestore_csi_driver_config = false }">{…}</code> |  |
| [dns_domain](variables.tf#L90) | Domain name used for clusters, prefixed by each cluster name. Leave null to disable Cloud DNS for GKE. | <code>string</code> |  | <code>null</code> |  |
| [fleet_configmanagement_clusters](variables.tf#L96) | Config management features enabled on specific sets of member clusters, in config name => [cluster name] format. | <code>map(list(string))</code> |  | <code>{}</code> |  |
| [fleet_configmanagement_templates](variables.tf#L103) | Sets of config management configurations that can be applied to member clusters, in config name => {options} format. | <code title="map(object({ binauthz = bool config_sync = object({ git = object({ gcp_service_account_email = string https_proxy = string policy_dir = string secret_type = string sync_branch = string sync_repo = string sync_rev = string sync_wait_secs = number }) prevent_drift = string source_format = string }) hierarchy_controller = object({ enable_hierarchical_resource_quota = bool enable_pod_tree_labels = bool }) policy_controller = object({ audit_interval_seconds = number exemptable_namespaces = list(string) log_denies_enabled = bool referential_rules_enabled = bool template_library_installed = bool }) version = string }))">map(object({…}))</code> |  | <code>{}</code> |  |
| [fleet_features](variables.tf#L138) | Enable and configure fleet features. Set to null to disable GKE Hub if fleet workload identity is not used. | <code title="object({ appdevexperience = bool configmanagement = bool identityservice = bool multiclusteringress = string multiclusterservicediscovery = bool servicemesh = bool })">object({…})</code> |  | <code>null</code> |  |
| [fleet_workload_identity](variables.tf#L151) | Use Fleet Workload Identity for clusters. Enables GKE Hub if set to true. | <code>bool</code> |  | <code>false</code> |  |
| [group_iam](variables.tf#L163) | Project-level IAM bindings for groups. Use group emails as keys, list of roles as values. | <code>map(list(string))</code> |  | <code>{}</code> |  |
| [iam](variables.tf#L170) | Project-level authoritative IAM bindings for users and service accounts in {ROLE => [MEMBERS]} format. | <code>map(list(string))</code> |  | <code>{}</code> |  |
| [labels](variables.tf#L177) | Project-level labels. | <code>map(string)</code> |  | <code>{}</code> |  |
| [nodepool_defaults](variables.tf#L183) |  | <code title="object({ image_type = string max_pods_per_node = number node_locations = list(string) node_tags = list(string) node_taints = list(string) })">object({…})</code> |  | <code title="{ image_type = &quot;COS_CONTAINERD&quot; max_pods_per_node = 110 node_locations = null node_tags = null node_taints = [] }">{…}</code> |  |
| [peering_config](variables.tf#L218) | Configure peering with the control plane VPC. Requires compute.networks.updatePeering. Set to null if you don't want to update the default peering configuration. | <code title="object({ export_routes = bool import_routes = bool })">object({…})</code> |  | <code title="{ export_routes = true // TODO(jccb) is there any situation where the control plane VPC would export any routes? import_routes = false }">{…}</code> |  |
| [project_services](variables.tf#L241) | Additional project services to enable. | <code>list(string)</code> |  | <code>[]</code> |  |

## Outputs

| name | description | sensitive | consumers |
|---|---|:---:|---|
| [cluster_ids](outputs.tf#L22) | Cluster ids. |  |  |
| [clusters](outputs.tf#L17) | Cluster resources. |  |  |
| [project_id](outputs.tf#L29) | GKE project id. |  |  |

<!-- END TFDOC -->
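
The outputs listed above can be re-exported from the calling root module so that other configurations or tooling can consume them. A minimal sketch, assuming the module is instantiated as `module "gke"` as in the examples; the output names `gke_cluster_ids` and `gke_project_id` are hypothetical:

```hcl
# Re-export selected blueprint outputs from the root module.
output "gke_cluster_ids" {
  description = "Cluster ids returned by the multitenant blueprint."
  value       = module.gke.cluster_ids
}

output "gke_project_id" {
  description = "ID of the project hosting the clusters."
  value       = module.gke.project_id
}
```

After `terraform apply`, the values can be inspected with `terraform output gke_cluster_ids`.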