In this tutorial we will also see how to make explainable predictions, in order to understand what are the most important features that most influence the algorithm outputs.
This tutorial uses a fictitious e-commerce dataset collecting programmatically generated data from the fictitious e-commerce store called The Look. The dataset is publicy available on BigQuery at this location `bigquery-public-data.thelook_ecommerce`.
# Goal
The goal of this tutorial is to train a classification ML model using BigQuery ML and predict if a new web session is going to convert.
The tutorial focuses more on how to combine Vertex AI and BigQuery ML to create a model that can be used both for near-real time and batch predictions rather than the design of the model itself.
# Main components
In this tutorial we will make use of the following main components:
- Big Query:
- standard: to create a view which contains the model features and the target variable
- ML: to train, evaluate and make batch predictions
- Vertex AI:
- Pipeline: to define a configurable and re-usable set of steps to train and evaluate a BQML model
- Experiment: to keep track of all the trainings done via the Pipeline
- Model Registry: to keep track of the trained versions of a specific model
- Endpoint: to serve the model via API
- Workbench: to run this demo
# How to get started
1. Access the Vertex AI Workbench
2. clone this repository
2. run the [`bmql_pipeline.ipynb`](bmql_pipeline.ipynb) Jupyter Notebook