Merge pull request #1224 from GoogleCloudPlatform/lcaggio/project-notebook

Fix JIT notebook service account.
2023-03-08 16:33:39 +01:00 · 2023-03-08 16:33:39 +01:00 · 4b108e8993
parent fd07c444cb e213f156ad
4 changed files with 24 additions and 13 deletions
--- a/blueprints/data-solutions/bq-ml/README.md
+++ b/blueprints/data-solutions/bq-ml/README.md
@@ -98,5 +98,5 @@ module "test" {
  prefix     = "prefix"
 }

-# tftest modules=9 resources=46
+# tftest modules=9 resources=47
 ```
--- a/blueprints/data-solutions/data-playground/README.md
+++ b/blueprints/data-solutions/data-playground/README.md
@@ -17,30 +17,35 @@ This sample creates several distinct groups of resources:
 - One BigQuery dataset

 ## Virtual Private Cloud (VPC) design
+
 As is often the case in real-world configurations, this blueprint accepts as input an existing Shared-VPC via the network_config variable. Make sure that 'container.googleapis.com', 'notebooks.googleapis.com' and 'servicenetworking.googleapis.com' are enabled in the VPC host project.

 If the network_config variable is not provided, one VPC will be created in each project that supports network resources (load, transformation and orchestration).

 ## Deploy your environment
+
 We assume the identity running the following steps has the following role:

 - resourcemanager.projectCreator in case a new project will be created.
 - owner on the project in case you use an existing project.

 Run Terraform init:
+
 ```
-$ terraform init
+terraform init
 ```

 Configure the Terraform variables in your terraform.tfvars file. You need to specify at least the following variables:
+
 ```
 prefix = "prefix"
 project_id      = "data-001"
 ```

 You can run now:
+
 ```
-$ terraform apply
+terraform apply
 ```

 You can now connect to the Vertex AI notebook to perform your data analysis.
@@ -81,5 +86,5 @@ module "test" {
    parent             = "folders/467898377"
  }
 }
-# tftest modules=8 resources=39
+# tftest modules=8 resources=40
 ```
--- a/blueprints/data-solutions/vertex-mlops/README.md
+++ b/blueprints/data-solutions/vertex-mlops/README.md
@@ -1,20 +1,23 @@
 # MLOps with Vertex AI

 ## Introduction
-This example implements the infrastructure required to deploy an end-to-end [MLOps process](https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf) using [Vertex AI](https://cloud.google.com/vertex-ai) platform. 

-##  GCP resources
+This example implements the infrastructure required to deploy an end-to-end [MLOps process](https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf) using [Vertex AI](https://cloud.google.com/vertex-ai) platform.
+
+## GCP resources
+
 The blueprint will deploy all the required resources to have a fully functional MLOPs environment containing:
+
 - Vertex Workbench (for the experimentation environment)
 - GCP Project (optional) to host all the resources
- Isolated VPC network and a subnet to be used by Vertex and Dataflow. Alternatively, an external Shared VPC can be configured using the `network_config`variable. 
+- Isolated VPC network and a subnet to be used by Vertex and Dataflow. Alternatively, an external Shared VPC can be configured using the `network_config`variable.
 - Firewall rule to allow the internal subnet communication required by Dataflow
 - Cloud NAT required to reach the internet from the different computing resources (Vertex and Dataflow)
 - GCS buckets to host Vertex AI and Cloud Build Artifacts. By default the buckets will be regional and should match the Vertex AI region for the different resources (i.e. Vertex Managed Dataset) and processes (i.e. Vertex training)
 - BigQuery Dataset where the training data will be stored. This is optional, since the training data could be already hosted in an existing BigQuery dataset.
 - Artifact Registry Docker repository to host the custom images.
 - Service account (`mlops-[env]@`) with the minimum permissions required by Vertex AI and Dataflow (if this service is used inside of the Vertex AI Pipeline).
- Service account (`github@`) to be used by Workload Identity Federation, to federate Github identity (Optional). 
+- Service account (`github@`) to be used by Workload Identity Federation, to federate Github identity (Optional).
 - Secret to store the Github SSH key to get access the CICD code repo.

 ![MLOps project description](./images/mlops_projects.png "MLOps project description")
@@ -28,13 +31,14 @@ Assign roles relying on User groups is a way to decouple the final set of permis
 We use the following groups to control access to resources:

 - *Data Scientists* (gcp-ml-ds@<company.org>). They manage notebooks and create ML pipelines.
- *ML Engineers* (gcp-ml-eng@<company.org>). They manage the different Vertex resources. 
- *ML Viewer* (gcp-ml-eng@<company.org>). Group with wiewer permission for the different resources. 
+- *ML Engineers* (gcp-ml-eng@<company.org>). They manage the different Vertex resources.
+- *ML Viewer* (gcp-ml-eng@<company.org>). Group with wiewer permission for the different resources.

 Please note that these groups are not suitable for production grade environments. Roles can be customized in the `main.tf` file.

-##  Instructions
-###  Deploy the experimentation environment
+## Instructions
+
+### Deploy the experimentation environment

 - Create a `terraform.tfvars` file and specify the variables to match your desired configuration. You can use the provided `terraform.tfvars.sample`  as reference.
 - Run `terraform init` and `terraform apply`
@@ -76,6 +80,7 @@ This blueprint can be used as a building block for setting up an end2end ML Ops
 <!-- END TFDOC -->

 ## TODO
+
 - Add support for User Managed Notebooks, SA permission option and non default SA for Single User mode.
 - Improve default naming for local VPC and Cloud NAT

@@ -105,5 +110,5 @@ module "test" {
    parent             = "folders/111111111111"
  }
 }
-# tftest modules=12 resources=56
+# tftest modules=12 resources=57
 ```
--- a/modules/project/service-accounts.tf
+++ b/modules/project/service-accounts.tf
@@ -83,6 +83,7 @@ locals {
    "multiclusteringress.googleapis.com", # grant roles/multiclusteringress.serviceAgent to multicluster-ingress
    "pubsub.googleapis.com",              # grant roles/pubsub.serviceAgent to pubsub
    "meshconfig.googleapis.com",          # grant roles/anthosservicemesh.serviceAgent to meshconfig
+    "notebooks.googleapis.com",           # no grants needed
    "secretmanager.googleapis.com",       # no grants needed
    "sqladmin.googleapis.com",            # grant roles/cloudsql.serviceAgent to sqladmin (TODO: verify)
  ]