# Apigee X Analytics in BigQuery

The following blueprint shows how to create an Apigee X trial organization with an environment group, an environment attached to that environment group, and an instance attached to that environment. It creates a NEG that exposes the Apigee service attachment. The NEG is added as a backend to a GLB, so API proxy requests pass through the GLB.

![Analytics northbound networking](diagram1.png)

In addition, it creates the setup depicted in the diagram below to export the Apigee analytics of an organization daily to a BigQuery table.

![Apigee analytics in BigQuery](diagram2.png)

The analytics export to BigQuery works as follows:

1. A Cloud Scheduler job runs daily at a selected time and publishes a message to a Pub/Sub topic.
2. The published message triggers a function that calls the Apigee Analytics Export API to export the analytics data available for the previous day.
3. The export function receives the Apigee organization, the environments and the datastore name as environment variables. The service account used to run the function needs to be granted the _Apigee Admin_ role on the project. The Apigee Analytics engine asynchronously exports the analytics data to a GCS bucket. This requires the _Apigee Service Agent_ service account to be granted the _Storage Admin_ role on the project.
4. A notification of the files created on GCS is received in a Pub/Sub topic, which triggers the Cloud Function in charge of loading the data from GCS into the right BigQuery table partition. This function receives the name of the BigQuery dataset, its location and the name of the table inside that dataset as environment variables. The service account used to run the function needs to be granted the _Storage Object Viewer_ role on the GCS bucket, the _BigQuery Job User_ role on the project and the _BigQuery Data Editor_ role on the table.
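
The two triggers described in steps 1 and 4 can be sketched in plain Terraform as shown below. This is a minimal illustration and not the blueprint's own code: all resource names, the region, the bucket name and the message payload are assumptions made for the example, and the blueprint may wire these pieces differently (for instance through modules). The `0 4 * * *` schedule matches the 4am (UTC) run mentioned later in this document.

```hcl
# Hypothetical sketch: names, region, bucket and payload are assumptions.
variable "project_id" {
  type = string
}

# Step 1: topic that receives the daily "start export" message.
resource "google_pubsub_topic" "export" {
  project = var.project_id
  name    = "analytics-export" # assumed name
}

# Step 1: Cloud Scheduler job publishing to the topic every day at 04:00 UTC.
resource "google_cloud_scheduler_job" "export" {
  project   = var.project_id
  region    = "europe-west1"           # assumed region
  name      = "analytics-export-daily" # assumed name
  schedule  = "0 4 * * *"
  time_zone = "Etc/UTC"

  pubsub_target {
    topic_name = google_pubsub_topic.export.id
    data       = base64encode("export") # placeholder payload
  }
}

# Step 4: topic that receives notifications for files written to the bucket.
resource "google_pubsub_topic" "load" {
  project = var.project_id
  name    = "analytics-load" # assumed name
}

# Step 4: notify the topic whenever an exported file is finalized on GCS.
resource "google_storage_notification" "load" {
  bucket         = "my-analytics-bucket" # assumed, pre-existing bucket
  payload_format = "JSON_API_V1"
  topic          = google_pubsub_topic.load.id
  event_types    = ["OBJECT_FINALIZE"]
}
```

For the bucket notification to be delivered, the Cloud Storage service agent of the project also needs permission to publish to the target topic (for example the Pub/Sub Publisher role on that topic); that grant is left out of the sketch for brevity.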

Note: This setup only works if you are not using custom analytics.

## Running the blueprint

1. Clone this repository or [open it in Cloud Shell](https://ssh.cloud.google.com/cloudshell/editor?cloudshell_git_repo=https%3A%2F%2Fgithub.com%2Fterraform-google-modules%2Fcloud-foundation-fabric&cloudshell_print=cloud-shell-readme.txt&cloudshell_working_dir=blueprints%2Fapigee%2Fbigquery-analytics), then go through the following steps to create resources:

2. Copy the file [terraform.tfvars.sample](./terraform.tfvars.sample) to a file called ```terraform.tfvars``` and update the values if required.

3. Initialize the Terraform configuration

    ```terraform init```

4. Apply the Terraform configuration

    ```terraform apply```

Once the resources have been created, do the following:

Create an A record in your DNS registrar to point the environment group hostname to the public IP address returned after the Terraform configuration was applied. You might need to wait some time until the certificate is provisioned.

## Testing the blueprint

Do the following to verify that everything works as expected.

1. Create an Apigee datastore

    ```./create-datastore.sh```

2. Deploy an API proxy

    ```./deploy-apiproxy.sh test```

3. Send some traffic to the proxy

    ```./send-requests.sh test.my-domain.com 1000```

4. At 4am (UTC) every day the Cloud Scheduler job runs and exports the analytics to the BigQuery table. Check that the exported data is there.

## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [envgroups](variables.tf#L24) | Environment groups (NAME => [HOSTNAMES]). | map(list(string)) | ✓ | |
| [environments](variables.tf#L30) | Environments. | map(object({…})) | ✓ | |
| [instances](variables.tf#L46) | Instance. | map(object({…})) | ✓ | |
| [project_id](variables.tf#L91) | Project ID. | string | ✓ | |
| [psc_config](variables.tf#L97) | PSC configuration. | map(string) | ✓ | |
| [datastore_name](variables.tf#L17) | Datastore. | string | | "gcs" |
| [organization](variables.tf#L59) | Apigee organization. | object({…}) | | {…} |
| [path](variables.tf#L75) | Bucket path. | string | | "/analytics" |
| [project_create](variables.tf#L82) | Parameters for the creation of the new project. | object({…}) | | null |
| [vpc_create](variables.tf#L103) | Boolean flag indicating whether the VPC should be created or not. | bool | | true |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| [ip_address](outputs.tf#L17) | IP address. | |

## Test

```hcl
module "test" {
  source = "./fabric/blueprints/apigee/bigquery-analytics"
  project_create = {
    billing_account_id = "12345-12345-12345"
    parent             = "folders/123456789"
  }
  project_id = "my-project"
  envgroups = {
    test = ["test.cool-demos.space"]
  }
  environments = {
    apis-test = {
      envgroups = ["test"]
      regions   = ["europe-west1"]
    }
  }
  instances = {
    europe-west1 = {
      runtime_ip_cidr_range         = "10.0.4.0/22"
      troubleshooting_ip_cidr_range = "10.1.0.0/28"
    }
  }
  psc_config = {
    europe-west1 = "10.0.0.0/28"
  }
}
# tftest modules=10 resources=65
```