Apigee X Analytics in BigQuery

The following blueprint shows how to create an Apigee X trial organization with an environment group, an environment attached to that environment group and an instance attached to that environment. It creates a NEG that exposes the Apigee service attachment. The NEG is added as a backend to a GLB, so API proxy requests pass through the GLB.

[Diagram: Analytics northbound networking]

In addition, it creates the setup depicted in the diagram below, which exports the Apigee analytics of an organization to a BigQuery table every day.

[Diagram: Apigee analytics in BigQuery]

The analytics export to BigQuery works as follows:

  1. A Cloud Scheduler Job runs daily at a selected time, publishing a message to a Pub/Sub topic.
  2. The published message triggers a function that calls the Apigee Analytics Export API to export the analytics data available for the previous day.
  3. The export function receives the Apigee organization, environments and datastore name as environment variables. The service account that runs the function needs the Apigee Admin role on the project. The Apigee Analytics engine then exports the analytics data asynchronously to a GCS bucket, which requires the Apigee Service Agent service account to have the Storage Admin role on the project.
  4. A notification of the files created on GCS is received in a Pub/Sub topic that triggers the execution of the cloud function in charge of loading the data from GCS to the right BigQuery table partition. This function is passed the name of the BigQuery dataset, its location and the name of the table inside that dataset as environment variables. The service account used to run the function needs to be granted the Storage Object Viewer role on the GCS bucket, the BigQuery Job User role on the project and the BigQuery Data Editor role on the table.
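
The export request in steps 2 and 3 above can be sketched in shell as follows. This is a minimal illustration, not the blueprint's own function, which may differ. It assumes GNU date, an exclusive end date in the export date range, a hypothetical export name of the form export-DAY, and csv as output format; ORG and ENV in the commented call are placeholders.

```shell
# export_body DAY DATASTORE: print the JSON body of a one-day export request.
# DAY is the day to export (YYYY-MM-DD). Requires GNU date.
# Assumptions: the end date is exclusive, the "export-DAY" name and the
# csv output format are illustrative choices.
export_body() (
  day="$1"
  datastore="$2"
  next="$(date -u -d "${day} +1 day" +%F)"
  printf '{"name":"export-%s","dateRange":{"start":"%s","end":"%s"},"outputFormat":"csv","datastoreName":"%s"}\n' \
    "$day" "$day" "$next" "$datastore"
)

# Example call for the previous day (ORG and ENV are placeholders):
#   day="$(date -u -d yesterday +%F)"
#   curl -s -X POST \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d "$(export_body "$day" gcs)" \
#     "https://apigee.googleapis.com/v1/organizations/ORG/environments/ENV/analytics/exports"
```

The datastore name gcs in the example is the default value of the datastore_name variable.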

Note: This setup only works if you are not using custom analytics.

Running the blueprint

  1. Clone this repository or open it in Cloud Shell, then go through the following steps to create the resources.

  2. Copy the file terraform.tfvars.sample to a file called terraform.tfvars and update the values if required.

  3. Initialize the Terraform configuration:

    terraform init

  4. Apply the Terraform configuration:

    terraform apply

Once the resources have been created, do the following:

Create an A record in your DNS registrar that points the environment group hostname to the public IP address returned by terraform apply (the ip_address output). You might need to wait some time until the certificate is provisioned.
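
One way to check that the A record is in place is sketched below. It assumes dig is installed and that the commands run from the blueprint directory after terraform apply. The helper names resolve_a and dns_points_to are hypothetical, and test.my-domain.com stands in for your environment group hostname.

```shell
# resolve_a HOST: print the A records of HOST, one per line (requires dig).
resolve_a() {
  dig +short "$1" A
}

# dns_points_to HOST IP: succeed if IP is one of the A records of HOST.
dns_points_to() {
  resolve_a "$1" | grep -Fxq -- "$2"
}

# Example:
#   ip="$(terraform output -raw ip_address)"
#   dns_points_to test.my-domain.com "$ip" && echo "DNS record is in place"
```

DNS changes can take a while to propagate, so a failed check right after creating the record does not necessarily mean the record is wrong.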

Testing the blueprint

Do the following to verify that everything works as expected.

  1. Create an Apigee datastore

    ./create-datastore.sh

  2. Deploy an API proxy

    ./deploy-apiproxy.sh test

  3. Send some traffic to the proxy

    ./send-requests.sh test.my-domain.com 1000

  4. Cloud Scheduler runs every day at 4am (UTC) and triggers the export of the analytics to the BigQuery table. After that run, check that the data is in the table.
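
A quick way to do the check in step 4 is a row count with the bq command line tool, sketched below. The helper name count_query is hypothetical, and the project, dataset and table names in the example are placeholders; use the ones from your deployment.

```shell
# count_query PROJECT DATASET TABLE: print a standard SQL row-count query.
count_query() {
  printf 'SELECT COUNT(*) AS row_count FROM `%s.%s.%s`\n' "$1" "$2" "$3"
}

# Example (placeholder names):
#   bq query --use_legacy_sql=false "$(count_query my-project my_dataset my_table)"
```

A count greater than zero after the scheduled run suggests that the export and the load both completed.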

Variables

| name | description | type | required | default |
|---|---|---|---|---|
| envgroups | Environment groups (NAME => [HOSTNAMES]). | map(list(string)) | ✓ | |
| environments | Environments. | map(object({…})) | ✓ | |
| instances | Instance. | map(object({…})) | ✓ | |
| project_id | Project ID. | string | ✓ | |
| psc_config | PSC configuration. | map(string) | ✓ | |
| datastore_name | Datastore. | string | | "gcs" |
| organization | Apigee organization. | object({…}) | | {…} |
| path | Bucket path. | string | | "/analytics" |
| project_create | Parameters for the creation of the new project. | object({…}) | | null |
| vpc_create | Boolean flag indicating whether the VPC should be created or not. | bool | | true |

Outputs

| name | description | sensitive |
|---|---|---|
| ip_address | IP address. | |

Test

    module "test" {
      source = "./fabric/blueprints/apigee/bigquery-analytics"
      project_create = {
        billing_account_id = "12345-12345-12345"
        parent             = "folders/123456789"
      }
      project_id = "my-project"
      envgroups = {
        test = ["test.cool-demos.space"]
      }
      environments = {
        apis-test = {
          envgroups = ["test"]
        }
      }
      instances = {
        europe-west1 = {
          runtime_ip_cidr_range         = "10.0.4.0/22"
          troubleshooting_ip_cidr_range = "10.1.0.0/28"
          environments                  = ["apis-test"]
        }
      }
      psc_config = {
        europe-west1 = "10.0.0.0/28"
      }
    }
    # tftest modules=10 resources=64