cloud-foundation-fabric/blueprints/gke/patterns/batch
Ludovico Magnocavallo 6941313c7d
Factories refactor (#1843)
Co-authored-by: Simone Ruffilli <sruffilli@google.com>
2024-02-26 10:16:52 +00:00
manifest-templates GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
README.md GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
create_jobs.sh GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
job-team-a.yaml GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
job-team-b.yaml GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
main.tf GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
providers.tf GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
tutorial.md GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
variables.tf GKE stateful blueprints (#2059) 2024-02-08 18:28:41 +00:00
versions.tf Factories refactor (#1843) 2024-02-26 10:16:52 +00:00

README.md

Batch Processing on GKE with Kueue

Introduction

This blueprint shows how to use Terraform to deploy a batch system on Google Kubernetes Engine (GKE), with Kueue handling job queueing.

Kueue is a Cloud Native Job scheduler that works with the default Kubernetes scheduler, the Job controller, and the cluster autoscaler to provide an end-to-end batch system. Kueue implements Job queueing, deciding when Jobs should wait and when they should start, based on quotas and a hierarchy for sharing resources fairly among teams.

Requirements

This blueprint assumes the GKE cluster already exists. We recommend using the accompanying Autopilot Cluster Pattern to deploy a cluster according to Google's best practices. Once the cluster is up and running, you can use this blueprint to deploy Kueue in it.

The Kueue manifests use container images hosted by registry.k8s.io, which means that the subnet where the GKE cluster is deployed needs to have Internet connectivity to download the images. If you're using the provided Autopilot Cluster Pattern, you can set the enable_cloud_nat option of the vpc_create variable.
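As a minimal sketch of that option, the fragment below would go in the terraform.tfvars of the Autopilot Cluster Pattern (not of this blueprint). It assumes enable_cloud_nat is a boolean attribute of the vpc_create object and that the remaining attributes can be left at their defaults; check that pattern's variables.tf before relying on it.

```hcl
# terraform.tfvars of the Autopilot Cluster Pattern (sketch, not verified against its variables.tf)
vpc_create = {
  # Assumption: boolean flag that provisions Cloud NAT so nodes can pull images from registry.k8s.io
  enable_cloud_nat = true
}
```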

Cluster authentication

Once you have a cluster with Internet connectivity, create a terraform.tfvars file and set up the credentials_config variable. We recommend using Anthos Fleet to simplify access to the control plane.

Kueue Configuration

Only two variables are available to control Kueue's configuration:

  • team_namespaces, which controls the namespaces used by different teams to run jobs.
  • kueue_namespace, which controls the namespace where Kueue's own resources are deployed.

Any other configuration can be applied by directly modifying the YAML manifests under the manifest-templates directory.
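If you would rather not edit the bundled manifests in place, the templates_path variable listed in the Variables table can point Terraform at a different directory. The fragment below is a sketch only: the directory name is hypothetical, and it assumes your copy keeps the same file names as the bundled manifest-templates directory.

```hcl
# terraform.tfvars (sketch)

# Namespace for Kueue's own resources; "kueue-system" is the documented default.
kueue_namespace = "kueue-system"

# Hypothetical directory holding a modified copy of the bundled manifest templates.
# Leave unset (null) to use the default manifests.
templates_path = "./my-manifest-templates"
```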

Sample Configuration

Use the following template as a starting point for your terraform.tfvars:

credentials_config = {
  kubeconfig = {
    path = "~/.kube/config"
  }
}
team_namespaces = [
  "team-a",
  "team-b"
]

Variables

| name | description | type | required | default |
|---|---|---|---|---|
| credentials_config | Configure how Terraform authenticates to the cluster. | object({…}) | ✓ | |
| kueue_namespace | Namespace where Kueue's own resources are deployed. | string | | "kueue-system" |
| team_namespaces | Namespaces of the teams running jobs in the clusters. | list(string) | | […] |
| templates_path | Path where manifest templates will be read from. Set to null to use the default manifests. | string | | null |