Google Cloud BigQuery Module

This module allows managing a single BigQuery dataset, including access configuration, tables and views.

Examples

Simple dataset with access configuration

Access configuration defaults to incremental mode, where the accesses you declare are added to the default ones set at dataset creation. Set the access_authoritative variable to true to switch to authoritative mode and take full control over dataset-level access. Always keep at least one OWNER access and avoid duplicate accesses, or terraform apply will fail.

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  access = {
    "OWNER" = [
      { identity_type = "group_by_email", identity = "dataset-owners@example.com" }
    ]
  }
}
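
As a sketch of the authoritative mode described above, the following variant sets access_authoritative to true and declares every role explicitly. The object shape follows the example above; the READER role and the example.com domain are illustrative assumptions, not values taken from this module's documentation.

```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  # switch from the default incremental mode to authoritative mode
  access_authoritative = true
  access = {
    # keep at least one OWNER, or terraform apply will fail
    "OWNER" = [
      { identity_type = "group_by_email", identity = "dataset-owners@example.com" }
    ]
    # illustrative extra role; list each identity only once
    "READER" = [
      { identity_type = "domain", identity = "example.com" }
    ]
  }
}
```

In authoritative mode, accesses that are not declared here are expected to be removed from the dataset, so list every identity that still needs access.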

Dataset options

Dataset options are set via the options variable. All options must be specified; set an option to null to use its default.

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  options = {
    default_table_expiration_ms     = 3600000
    default_partition_expiration_ms = null
    delete_contents_on_destroy      = false
  }
}

Tables and views

Tables are created via the tables variable, and views via the views variable. Support for external tables will be added in a future release.

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning  = null
      schema        = file("table-a.json")
    }
  }
}

If partitioning is needed, populate the table's partitioning attribute using either time or range, and set the unused one to null.

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning = {
        field = null
        range = null # use start/end/interval for range
        time  = { type = "DAY", expiration_ms = null }
      }
      schema = file("table-a.json")
    }
  }
}
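
For range partitioning, here is a minimal sketch. It assumes that range is an object with start, end and interval attributes (as the inline comment above suggests) and that field names the integer column to partition on. The customer_id column and the bounds are hypothetical.

```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning = {
        # hypothetical integer column; it must exist in table-a.json
        field = "customer_id"
        # assumed attribute names, taken from the comment in the example above
        range = { start = 0, end = 1000, interval = 10 }
        # only one of range or time can be used, so the other is null
        time = null
      }
      schema = file("table-a.json")
    }
  }
}
```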

To create views, use the views variable. If a view queries a table created by the same module, terraform apply will fail on the first run and succeed on a later run, once the underlying table has been created. It may also be possible to reference the module's tables output in the view's query, so that Terraform creates the table before the view.

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning = {
        field = null
        range = null # use start/end/interval for range
        time  = { type = "DAY", expiration_ms = null }
      }
      schema = file("table-a.json")
    }
  }
  views = {
    view_a = {
      friendly_name  = "View a"
      labels         = {}
      query          = "SELECT * from `my-project.my-dataset.table_a`"
      use_legacy_sql = false
    }
  }
}

Variables

| name | description | type | required | default |
|---|---|---|---|---|
| id | Dataset id. | string | ✓ | |
| project_id | Id of the project where datasets will be created. | string | ✓ | |
| access | Dataset access rules keyed by role, valid identity types are domain, group_by_email, special_group and user_by_email. Mode can be controlled via the access_authoritative variable. | map(list(object({...}))) | | {} |
| access_authoritative | Use authoritative access instead of additive. | bool | | false |
| access_views | Dataset access rules for views. Mode can be controlled via the access_authoritative variable. | list(object({...})) | | [] |
| encryption_key | Self link of the KMS key that will be used to protect destination table. | string | | null |
| friendly_name | Dataset friendly name. | string | | null |
| labels | Dataset labels. | map(string) | | {} |
| location | Dataset location. | string | | EU |
| options | Dataset options. | object({...}) | | ... |
| tables | Table definitions. Options and partitioning default to null. Partitioning can only use range or time, set the unused one to null. | map(object({...})) | | {} |
| views | View definitions. | map(object({...})) | | {} |
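
A short sketch of how some of the optional variables listed above could be set together. The values are placeholders chosen for illustration; only the variable names and types come from the table.

```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my-dataset"
  # optional variables; the values below are illustrative placeholders
  friendly_name = "My dataset"
  location      = "US" # overrides the EU default
  labels        = { environment = "dev" }
}
```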

Outputs

| name | description | sensitive |
|---|---|---|
| dataset | Dataset resource. | |
| dataset_id | Dataset full id. | |
| id | Dataset id. | |
| self_link | Dataset self link. | |
| tables | Table resources. | |
| views | View resources. | |
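
The outputs above can be read from the calling configuration with the usual module reference syntax. This is a minimal sketch that assumes the module was declared as module "bigquery-dataset", as in the examples; the root-level output names are hypothetical.

```hcl
# Re-export selected module outputs from the root module.
output "bq_dataset_id" {
  description = "Full id of the dataset managed by the module."
  value       = module.bigquery-dataset.dataset_id
}

output "bq_dataset_self_link" {
  description = "Self link of the dataset managed by the module."
  value       = module.bigquery-dataset.self_link
}
```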

TODO

  • add support for external tables