# Google Cloud Bigquery Module This module allows managing a single BigQuery dataset, including access configuration, tables and views. ## TODO - [ ] check for dynamic values in tables and views - [ ] add support for external tables ## Examples ### Simple dataset with access configuration Access configuration defaults to using the separate `google_bigquery_dataset_access` resource, so as to leave the default dataset access rules untouched. You can choose to manage the `google_bigquery_dataset` access rules instead via the `dataset_access` variable, but be sure to always have at least one `OWNER` access and to avoid duplicating accesses, or `terraform apply` will fail. The access variables are split into `access` and `access_identities` variables, so that dynamic values can be passed in for identities (eg a service account email generated by a different module or resource). ```hcl module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my-dataset" access = { reader-group = { role = "READER", type = "group" } owner = { role = "OWNER", type = "user" } project_owners = { role = "OWNER", type = "special_group" } view_1 = { role = "READER", type = "view" } } access_identities = { reader-group = "playground-test@ludomagno.net" owner = "ludo@ludomagno.net" project_owners = "projectOwners" view_1 = "my-project|my-dataset|my-table" } } # tftest modules=1 resources=5 inventory=simple.yaml ``` ### IAM roles Access configuration can also be specified via IAM instead of basic roles via the `iam` variable. When using IAM, basic roles cannot be used via the `access` family variables. ```hcl module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my-dataset" iam = { "roles/bigquery.dataOwner" = ["user:user1@example.org"] } } # tftest modules=1 resources=2 inventory=iam.yaml ``` ### Authorized Views, Datasets, and Routines You can specify authorized [views](https://cloud.google.com/bigquery/docs/authorized-views), [datasets](https://cloud.google.com/bigquery/docs/authorized-datasets?hl=en), and [routines](https://cloud.google.com/bigquery/docs/authorized-routines) via the `authorized_views`, `authorized_datasets` and `authorized_routines` variables, respectively. ```hcl // Create private BigQuery dataset that will not be publicly accessible, except via the authorized BigQuery resources module "bigquery-dataset-private" { source = "./fabric/modules/bigquery-dataset" project_id = "private_project" id = "private_dataset" authorized_views = [ { project_id = "auth_view_project" dataset_id = "auth_view_dataset" table_id = "auth_view" } ] authorized_datasets = [ { project_id = "auth_dataset_project" dataset_id = "auth_dataset" } ] authorized_routines = [ { project_id = "auth_routine_project" dataset_id = "auth_routine_dataset" routine_id = "auth_routine" } ] } // Create authorized view in a public dataset module "bigquery-authorized-views-dataset-public" { source = "./fabric/modules/bigquery-dataset" project_id = "auth_view_project" id = "auth_view_dataset" views = { auth_view = { friendly_name = "Public" labels = {} query = "SELECT * FROM `private_project.private_dataset.private_table`" use_legacy_sql = false deletion_protection = true } } } // Create public authorized dataset module "bigquery-authorized-dataset-public" { source = "./fabric/modules/bigquery-dataset" project_id = "auth_dataset_project" id = "auth_dataset" } // Create public authorized routine module "bigquery-authorized-authorized-routine-dataset-public" { source = "./fabric/modules/bigquery-dataset" project_id = "auth_routine_project" id = "auth_routine_dataset" } resource "google_bigquery_routine" "public-routine" { project = "private_project" dataset_id = module.bigquery-authorized-authorized-routine-dataset-public.dataset_id routine_id = "auth_routine" routine_type = "TABLE_VALUED_FUNCTION" language = "SQL" definition_body = <<-EOS SELECT 1 + value AS value EOS arguments { name = "value" argument_kind = "FIXED_TYPE" data_type = jsonencode({ "typeKind" = "INT64" }) } return_table_type = jsonencode({ "columns" = [ { "name" = "value", "type" = { "typeKind" = "INT64" } }, ] }) } # tftest modules=4 resources=9 inventory=authorized_resources.yaml ``` Authorized views can be specified both using the standard `access` options and the `authorized_views` blocks. The example configuration below uses both blocks, and will create a dataset with three authorized views `view_id_1`, `view_id_2`, and `view_id_3`. ```hcl module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my-dataset" authorized_views = [ { project_id = "view_project" dataset_id = "view_dataset" table_id = "view_id_1" }, { project_id = "view_project" dataset_id = "view_dataset" table_id = "view_id_2" } ] access = { view_2 = { role = "READER", type = "view" } view_3 = { role = "READER", type = "view" } } access_identities = { view_2 = "view_project|view_dataset|view_id_2" view_3 = "view_project|view_dataset|view_id_3" } } # tftest modules=1 resources=4 inventory=authorized_resources_views.yaml ``` ### Dataset options Dataset options are set via the `options` variable. all options must be specified, but a `null` value can be set to options that need to use defaults. ```hcl module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my-dataset" options = { default_table_expiration_ms = 3600000 default_partition_expiration_ms = null delete_contents_on_destroy = false max_time_travel_hours = 168 } } # tftest modules=1 resources=1 inventory=options.yaml ``` ### Tables and views Tables are created via the `tables` variable, or the `view` variable for views. Support for external tables will be added in a future release. ```hcl locals { countries_schema = jsonencode([ { name = "country", type = "STRING" }, { name = "population", type = "INT64" }, ]) } module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my_dataset" tables = { countries = { friendly_name = "Countries" schema = local.countries_schema deletion_protection = true } } } # tftest modules=1 resources=2 inventory=tables.yaml ``` If partitioning is needed, populate the `partitioning` variable using either the `time` or `range` attribute. ```hcl locals { countries_schema = jsonencode([ { name = "country", type = "STRING" }, { name = "population", type = "INT64" }, ]) } module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my-dataset" tables = { table_a = { deletion_protection = true friendly_name = "Table a" schema = local.countries_schema partitioning = { time = { type = "DAY", expiration_ms = null } } } } } # tftest modules=1 resources=2 inventory=partitioning.yaml ``` To create views use the `view` variable. If you're querying a table created by the same module `terraform apply` will initially fail and eventually succeed once the underlying table has been created. You can probably also use the module's output in the view's query to create a dependency on the table. ```hcl locals { countries_schema = jsonencode([ { name = "country", type = "STRING" }, { name = "population", type = "INT64" }, ]) } module "bigquery-dataset" { source = "./fabric/modules/bigquery-dataset" project_id = "my-project" id = "my_dataset" tables = { countries = { friendly_name = "Countries" schema = local.countries_schema deletion_protection = true } } views = { population = { friendly_name = "Population" query = "SELECT SUM(population) FROM my_dataset.countries" use_legacy_sql = false deletion_protection = true } } } # tftest modules=1 resources=3 inventory=views.yaml ``` ## Variables | name | description | type | required | default | |---|---|:---:|:---:|:---:| | [id](variables.tf#L98) | Dataset id. | string | ✓ | | | [project_id](variables.tf#L162) | Id of the project where datasets will be created. | string | ✓ | | | [access](variables.tf#L17) | Map of access rules with role and identity type. Keys are arbitrary and must match those in the `access_identities` variable, types are `domain`, `group`, `special_group`, `user`, `view`. | map(object({…})) | | {} | | [access_identities](variables.tf#L33) | Map of access identities used for basic access roles. View identities have the format 'project_id\|dataset_id\|table_id'. | map(string) | | {} | | [authorized_datasets](variables.tf#L39) | An array of datasets to be authorized on the dataset. | list(object({…})) | | [] | | [authorized_routines](variables.tf#L48) | An array of authorized routine to be authorized on the dataset. | list(object({…})) | | [] | | [authorized_views](variables.tf#L58) | An array of views to be authorized on the dataset. | list(object({…})) | | [] | | [dataset_access](variables.tf#L68) | Set access in the dataset resource instead of using separate resources. | bool | | false | | [description](variables.tf#L74) | Optional description. | string | | "Terraform managed." | | [encryption_key](variables.tf#L80) | Self link of the KMS key that will be used to protect destination table. | string | | null | | [friendly_name](variables.tf#L86) | Dataset friendly name. | string | | null | | [iam](variables.tf#L92) | IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles. | map(list(string)) | | {} | | [labels](variables.tf#L103) | Dataset labels. | map(string) | | {} | | [location](variables.tf#L109) | Dataset location. | string | | "EU" | | [materialized_views](variables.tf#L115) | Materialized views definitions. | map(object({…})) | | {} | | [options](variables.tf#L148) | Dataset options. | object({…}) | | {} | | [tables](variables.tf#L167) | Table definitions. Options and partitioning default to null. Partitioning can only use `range` or `time`, set the unused one to null. | map(object({…})) | | {} | | [views](variables.tf#L252) | View definitions. | map(object({…})) | | {} | ## Outputs | name | description | sensitive | |---|---|:---:| | [dataset](outputs.tf#L17) | Dataset resource. | | | [dataset_id](outputs.tf#L22) | Dataset id. | | | [id](outputs.tf#L36) | Fully qualified dataset id. | | | [materialized_view_ids](outputs.tf#L50) | Map of fully qualified materialized view ids keyed by view ids. | | | [materialized_views](outputs.tf#L55) | Materialized view resources. | | | [self_link](outputs.tf#L60) | Dataset self link. | | | [table_ids](outputs.tf#L74) | Map of fully qualified table ids keyed by table ids. | | | [tables](outputs.tf#L79) | Table resources. | | | [view_ids](outputs.tf#L84) | Map of fully qualified view ids keyed by view ids. | | | [views](outputs.tf#L89) | View resources. | |