# Google Cloud BigQuery Module
This module allows managing a single BigQuery dataset, including access configuration, tables and views.
## TODO
- [ ] check for dynamic values in tables and views
- [ ] add support for external tables
## Examples
### Simple dataset with access configuration
Access configuration defaults to using the separate `google_bigquery_dataset_access` resource, so as to leave the default dataset access rules untouched.
You can choose to manage the `google_bigquery_dataset` access rules instead via the `dataset_access` variable, but make sure there is always at least one `OWNER` access rule and that no rule is duplicated, or `terraform apply` will fail.
Access rules are split across the `access` and `access_identities` variables, so that dynamic values can be passed in for identities (e.g. a service account email generated by a different module or resource). Keys are arbitrary but must match between the two maps. View identities use the `view` type and the `project_id|dataset_id|table_id` format.
```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  access = {
    reader-group   = { role = "READER", type = "group" }
    owner          = { role = "OWNER", type = "user" }
    project_owners = { role = "OWNER", type = "special_group" }
    view_1         = { role = "READER", type = "view" }
  }
  access_identities = {
    reader-group   = "playground-test@ludomagno.net"
    owner          = "ludo@ludomagno.net"
    project_owners = "projectOwners"
    view_1         = "my-project|my_dataset|my-table"
  }
}
# tftest:modules=1:resources=5
```
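As a rough sketch of the alternative described above, the following sets `dataset_access = true` so that rules are managed on the `google_bigquery_dataset` resource itself, and passes in a service account email that is only known after apply. The `google_service_account.reader` resource and the email addresses are hypothetical, and treating a service account as a `user` type identity is an assumption worth checking against the module code.

```hcl
# Hypothetical service account, used only to illustrate a dynamic identity.
resource "google_service_account" "reader" {
  project      = "my-project"
  account_id   = "bq-reader"
  display_name = "BigQuery reader"
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  # Manage access rules in the dataset resource itself.
  dataset_access = true
  access = {
    # Keep at least one OWNER rule and avoid duplicates, or the apply fails.
    owner  = { role = "OWNER", type = "user" }
    reader = { role = "READER", type = "user" }
  }
  access_identities = {
    owner  = "owner@example.org"
    reader = google_service_account.reader.email
  }
}
```

The keys in `access` are static strings, so only the values in `access_identities` depend on other resources.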
### IAM roles
Access can also be configured through IAM roles instead of basic roles, using the `iam` variable. The two approaches are mutually exclusive: when `iam` is used, basic roles cannot be set through the `access` and `access_identities` variables.
```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  iam = {
    "roles/bigquery.dataOwner" = ["user:user1@example.org"]
  }
}
# tftest:modules=1:resources=2
```
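Since `iam` is a map of role to member list, several roles and member types can be combined in one call. The sketch below is illustrative only: the group and service account addresses are made up.

```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  iam = {
    "roles/bigquery.dataOwner" = ["user:user1@example.org"]
    # Hypothetical members, shown to illustrate the {ROLE => [MEMBERS]} format.
    "roles/bigquery.dataViewer" = [
      "group:analysts@example.org",
      "serviceAccount:reporting@my-project.iam.gserviceaccount.com",
    ]
  }
}
```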
### Dataset options
Dataset options are set via the `options` variable. All options must be specified, but any option can be set to `null` to use its default.
```hcl
module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  options = {
    default_table_expiration_ms     = 3600000
    default_partition_expiration_ms = null
    delete_contents_on_destroy      = false
  }
}
# tftest:modules=1:resources=1
```
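Other dataset attributes are plain top-level variables and can be combined with `options`. A minimal sketch, with all values chosen purely for illustration:

```hcl
module "bigquery-dataset" {
  source        = "./modules/bigquery-dataset"
  project_id    = "my-project"
  id            = "my_dataset"
  friendly_name = "My dataset"
  description   = "Country statistics."
  location      = "US" # overrides the "EU" default
  labels        = { environment = "dev" }
  options = {
    default_table_expiration_ms     = null       # use the default
    default_partition_expiration_ms = 7776000000 # 90 days in milliseconds
    delete_contents_on_destroy      = true
  }
}
```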
### Tables and views
Tables are created via the `tables` variable, and views via the `views` variable. Support for external tables will be added in a future release.
```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    countries = {
      friendly_name       = "Countries"
      labels              = {}
      options             = null
      partitioning        = null
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
# tftest:modules=1:resources=2
```
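The per-table `options` attribute accepts `clustering`, `encryption_key` and `expiration_time`, as listed in the type of the `tables` variable. The following sketch clusters the table on the `country` column and leaves the other two options unset. The label value is illustrative, and the assumption that `null` fields fall back to provider defaults should be checked against the module code.

```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    countries = {
      friendly_name = "Countries"
      labels        = { team = "data" }
      options = {
        clustering      = ["country"] # column names from the schema
        encryption_key  = null
        expiration_time = null
      }
      partitioning        = null
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
```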
If partitioning is needed, populate the table's `partitioning` attribute using either `time` or `range`, and set the unused one to `null`.
```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    table_a = {
      friendly_name = "Table a"
      labels        = {}
      options       = null
      partitioning = {
        field = null
        range = null # use start/end/interval for range
        time  = { type = "DAY", expiration_ms = null }
      }
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
# tftest:modules=1:resources=2
```
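For comparison, an integer range partitioning might look like the sketch below. It assumes `field` names an `INT64` column from the schema, and the bounds and interval are arbitrary illustrative values.

```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    table_b = {
      friendly_name = "Table b"
      labels        = {}
      options       = null
      partitioning = {
        field = "population" # INT64 column used for range partitioning
        range = { start = 0, end = 1000000000, interval = 10000000 }
        time  = null # unused when range is set
      }
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
}
```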
To create views, use the `views` variable. If a view queries a table created by the same module, `terraform apply` will initially fail, and will succeed on a later run once the underlying table has been created. It is probably also possible to use the module's output in the view's query to create a dependency on the table.
```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    countries = {
      friendly_name       = "Countries"
      labels              = {}
      options             = null
      partitioning        = null
      schema              = local.countries_schema
      deletion_protection = true
    }
  }
  views = {
    population = {
      friendly_name       = "Population"
      labels              = {}
      query               = "SELECT SUM(population) FROM my_dataset.countries"
      use_legacy_sql      = false
      deletion_protection = true
    }
  }
}
# tftest:modules=1:resources=3
```
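One way to get an explicit dependency on the table is to define the view outside the module and build its query from the module's outputs. This is only a sketch of that idea, not something the module does itself: the `google_bigquery_table.population` resource is hypothetical, and it assumes the `tables` output is a map of table resources keyed by the same keys used in the `tables` variable.

```hcl
# Assumes a module "bigquery-dataset" that creates the "countries" table,
# as in the example above, is defined in the same configuration.
resource "google_bigquery_table" "population" {
  project    = "my-project"
  dataset_id = module.bigquery-dataset.dataset_id
  table_id   = "population"

  view {
    # Referencing module outputs makes Terraform create the table first.
    query          = "SELECT SUM(population) FROM ${module.bigquery-dataset.dataset_id}.${module.bigquery-dataset.tables["countries"].table_id}"
    use_legacy_sql = false
  }
}
```

Because the query string interpolates module outputs, Terraform orders the view after the table, so a second `terraform apply` should not be needed.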
<!-- BEGIN TFDOC -->
## Variables
| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| id | Dataset id. | <code>string</code> | ✓ | |
| project_id | Id of the project where datasets will be created. | <code>string</code> | ✓ | |
| access | Map of access rules with role and identity type. Keys are arbitrary and must match those in the `access_identities` variable, types are `domain`, `group`, `special_group`, `user`, `view`. | <code title="map(object({ role = string type = string }))">map(object({…}))</code> | | <code>{}</code> |
| access_identities | Map of access identities used for basic access roles. View identities have the format 'project_id\|dataset_id\|table_id'. | <code>map(string)</code> | | <code>{}</code> |
| dataset_access | Set access in the dataset resource instead of using separate resources. | <code>bool</code> | | <code>false</code> |
| description | Optional description. | <code>string</code> | | <code>"Terraform managed."</code> |
| encryption_key | Self link of the KMS key that will be used to protect destination table. | <code>string</code> | | <code>null</code> |
| friendly_name | Dataset friendly name. | <code>string</code> | | <code>null</code> |
| iam | IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles. | <code>map(list(string))</code> | | <code>{}</code> |
| labels | Dataset labels. | <code>map(string)</code> | | <code>{}</code> |
| location | Dataset location. | <code>string</code> | | <code>"EU"</code> |
| options | Dataset options. | <code title="object({ default_table_expiration_ms = number default_partition_expiration_ms = number delete_contents_on_destroy = bool })">object({…})</code> | | <code title="{ default_table_expiration_ms = null default_partition_expiration_ms = null delete_contents_on_destroy = false }">{…}</code> |
| tables | Table definitions. Options and partitioning default to null. Partitioning can only use `range` or `time`, set the unused one to null. | <code title="map(object({ friendly_name = string labels = map(string) options = object({ clustering = list(string) encryption_key = string expiration_time = number }) partitioning = object({ field = string range = object({ end = number interval = number start = number }) time = object({ expiration_ms = number type = string }) }) schema = string deletion_protection = bool }))">map(object({…}))</code> | | <code>{}</code> |
| views | View definitions. | <code title="map(object({ friendly_name = string labels = map(string) query = string use_legacy_sql = bool deletion_protection = bool }))">map(object({…}))</code> | | <code>{}</code> |
## Outputs
| name | description | sensitive |
|---|---|:---:|
| dataset | Dataset resource. | |
| dataset_id | Dataset id. | |
| id | Fully qualified dataset id. | |
| self_link | Dataset self link. | |
| table_ids | Map of fully qualified table ids keyed by table ids. | |
| tables | Table resources. | |
| view_ids | Map of fully qualified view ids keyed by view ids. | |
| views | View resources. | |
<!-- END TFDOC -->