
GCP AI/ML and Vertex AI Complete Guide: From Model Training to Production Deployment

16 min read
#Vertex AI #GCP AI #Machine Learning #AutoML #Gemini API #MLOps #BigQuery ML #TensorFlow #Generative AI #Enterprise AI


Want to adopt AI in your company but don't know where to start?

Training your own model is too complex, but using ready-made APIs might not be flexible enough?

GCP's AI services offer solutions ranging from "no-code" to "fully customized." This article will introduce you to GCP's AI ecosystem, from the Vertex AI platform to Gemini API, helping you find the best entry point.

Want to understand GCP's core services first? Please refer to "GCP Complete Guide: From Beginner Concepts to Enterprise Practice."


GCP AI/ML Service Ecosystem Overview

GCP's AI services aren't just one product—they're an entire ecosystem.

Google Cloud AI Market Position and Advantages

What advantages does Google have in AI?

Technical Foundation:

  • TensorFlow is open-sourced by Google
  • TPU (Tensor Processing Unit) is developed by Google
  • Transformer architecture (the basis of GPT, BERT) was invented by Google

Practical Experience:

  • Google Search, YouTube recommendations, Gmail spam filtering all use ML
  • These experiences are reflected in GCP's AI service design

Unique Advantages:

  • The most powerful data analytics platform (BigQuery)
  • Native AI infrastructure (TPU)
  • Complete MLOps toolchain

Choosing Between Pre-trained APIs and Custom Models

GCP AI services fall into two categories:

Pre-trained APIs (Ready-made):

  • Call the API directly to use
  • No training data needed
  • No ML knowledge required
  • Suitable for: common tasks, quick validation

Custom Models (Train your own):

  • Train with your data
  • Can optimize for specific needs
  • Requires ML knowledge or using AutoML
  • Suitable for: special requirements, seeking best results

How to Choose?

| Scenario | Choice | Reason |
|---|---|---|
| Recognize common objects | Vision API | Already trained |
| Detect product defects | AutoML Vision | Need your own data |
| Translate common languages | Translation API | Quality is already good |
| Translate technical terms | Custom model | Requires domain knowledge |
| Quick prototype validation | Pre-trained API | Get results quickly |
| Seeking best results | Custom model | Targeted optimization |
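The table above can be condensed into a rough rule of thumb. As a sketch (a hypothetical helper, not a GCP tool — the function name and logic are my own simplification):

```python
def choose_approach(standard_task: bool, has_labeled_data: bool,
                    needs_best_accuracy: bool) -> str:
    """Rough heuristic encoding the decision table above."""
    if standard_task and not needs_best_accuracy:
        return "pre-trained API"   # e.g. Vision API, Translation API
    if has_labeled_data and not needs_best_accuracy:
        return "AutoML"            # your own data, no ML team needed
    if has_labeled_data:
        return "custom model"      # targeted optimization
    return "pre-trained API"       # no data yet: start with what exists

print(choose_approach(standard_task=True, has_labeled_data=False,
                      needs_best_accuracy=False))
```

Real projects often combine approaches: validate with a pre-trained API first, then move to AutoML or custom training once the business case is proven.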

AI Service Architecture Diagram

GCP AI Service Layers:

┌─────────────────────────────────────────────────┐
│        Application Layer: Gemini API, Agent Builder │
├─────────────────────────────────────────────────┤
│              Platform Layer: Vertex AI              │
│  ┌──────────┬──────────┬──────────┬──────────┐ │
│  │ Workbench │ AutoML   │ Pipelines │ Model    │ │
│  │          │          │           │ Garden   │ │
│  └──────────┴──────────┴──────────┴──────────┘ │
├─────────────────────────────────────────────────┤
│          Data Layer: BigQuery, Cloud Storage        │
├─────────────────────────────────────────────────┤
│      Infrastructure: GPU, TPU, Compute Engine       │
└─────────────────────────────────────────────────┘

Vertex AI Platform Deep Dive

Vertex AI is GCP's unified AI platform. All ML work can be completed here.

Vertex AI Core Features

What does Vertex AI integrate?

| Feature | Description | Previous Service |
|---|---|---|
| Workbench | Jupyter Notebook environment | AI Platform Notebooks |
| Training | Model training service | AI Platform Training |
| Prediction | Model deployment service | AI Platform Prediction |
| AutoML | Automated machine learning | AutoML Vision/NL/Tables |
| Pipelines | ML workflow | Kubeflow Pipelines |
| Feature Store | Feature management | New feature |
| Model Registry | Model version management | New feature |
| Model Garden | Pre-trained model library | New feature |

Benefits:

  • One interface to manage all ML work
  • Seamless integration between tools
  • Unified permissions and billing management

Workbench (Jupyter Notebook Environment)

The first step in ML is usually opening a Notebook to explore data.

Workbench Types:

| Type | Features | Suitable For |
|---|---|---|
| Managed Notebooks | Fully managed, quick start | Most users |
| User-Managed Notebooks | More control | Need custom configuration |

Create Workbench Instance:

gcloud workbench instances create my-notebook \
  --location=asia-east1-b \
  --machine-type=n1-standard-4

Pre-installed Tools:

  • JupyterLab
  • TensorFlow, PyTorch
  • Pandas, Scikit-learn
  • BigQuery connector
  • Git integration

Model Registry Management

Trained models need version management.

Features:

  • Model version tracking
  • Model metadata management
  • Deployment status tracking
  • A/B testing support

Upload Model to Registry:

from google.cloud import aiplatform

aiplatform.init(project='my-project', location='asia-east1')

model = aiplatform.Model.upload(
    display_name='my-model',
    artifact_uri='gs://my-bucket/model/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest'
)

Pipelines Workflow Automation

Automate the entire ML workflow.

What a Pipeline includes:

  1. Data loading
  2. Data preprocessing
  3. Model training
  4. Model evaluation
  5. Model deployment

Using Kubeflow Pipelines SDK:

from kfp import dsl, compiler

@dsl.pipeline(name='my-pipeline')
def my_pipeline():
    # Each step is a KFP component (load_data_component etc. are
    # placeholder components you would define with @dsl.component)
    data_op = load_data_component()
    train_op = train_model_component(data=data_op.output)
    deploy_op = deploy_model_component(model=train_op.output)

# Compile to a pipeline spec that Vertex AI Pipelines can execute
compiler.Compiler().compile(my_pipeline, package_path='pipeline.json')

Feature Store Engineering

Features are the core of ML. Feature Store helps you manage them.

What problems does it solve?

  • Training and inference use the same features
  • Features can be shared across teams
  • Feature version management
  • Point-in-time correctness

Use Cases:

  • User features (age, preferences, behavior)
  • Product features (category, price, rating)
  • Real-time features (recent clicks, cart status)

AutoML: No-Code AI Modeling

Can you train ML models without writing code? AutoML makes this possible.

How AutoML Works

AutoML automatically handles:

  1. Data exploration and cleaning
  2. Feature engineering
  3. Model architecture search
  4. Hyperparameter tuning
  5. Model training
  6. Model evaluation

You only need to:

  1. Prepare labeled data
  2. Upload to Vertex AI
  3. Click "Train"
  4. Wait for completion

AutoML Vision (Image Recognition)

Supported Tasks:

  • Single-label classification (What is this?)
  • Multi-label classification (What things are there?)
  • Object detection (Where is it?)

Data Requirements:

  • Minimum 100 images per category
  • Recommended 1,000+ images for better results
  • Supports JPG, PNG, BMP, GIF
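Before uploading, it is worth checking label counts against these minimums. A quick pre-upload sanity check (a hypothetical helper; the thresholds come from the requirements listed above):

```python
from collections import Counter

MIN_PER_LABEL = 100          # AutoML Vision minimum per category
RECOMMENDED_PER_LABEL = 1000 # recommended for better results

def check_dataset(labels: list[str]) -> dict:
    """Report which labels fall below the AutoML Vision thresholds."""
    counts = Counter(labels)
    return {
        "too_small": sorted(l for l, n in counts.items()
                            if n < MIN_PER_LABEL),
        "below_recommended": sorted(l for l, n in counts.items()
                                    if MIN_PER_LABEL <= n < RECOMMENDED_PER_LABEL),
    }

sample = ["defect"] * 120 + ["ok"] * 40
print(check_dataset(sample))
```

Fixing undersized categories before training saves both time and training-hour charges.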

Use Cases:

  • Manufacturing: Defect detection
  • Retail: Product classification
  • Healthcare: Medical imaging assistance

AutoML Natural Language (Text Analysis)

Supported Tasks:

  • Text classification (sentiment analysis, topic classification)
  • Entity extraction (find names, places, organizations)
  • Sentiment analysis (positive, negative, neutral)

Data Requirements:

  • Minimum 1,000 documents
  • At least 100 per category
  • Supports plain text or CSV

Use Cases:

  • Customer service: Auto-classify complaints
  • Media: News topic classification
  • Social: Sentiment analysis

AutoML Tables (Structured Data)

Supported Tasks:

  • Classification (Will this customer churn?)
  • Regression (How many will this product sell?)

Data Requirements:

  • Minimum 1,000 rows of data
  • At least 2 feature columns
  • Supports CSV or BigQuery tables

Use Cases:

  • Finance: Credit risk assessment
  • Retail: Sales forecasting
  • Marketing: Customer churn prediction

AutoML Use Cases and Limitations

Good for AutoML:

  • No ML team
  • Want to quickly validate ideas
  • Task is a standard type
  • Data volume is not particularly large

Not suitable for AutoML:

  • Need cutting-edge model performance
  • Have complex custom requirements
  • Extremely large data volume (custom training more cost-effective)
  • Need special architectures (like GAN, reinforcement learning)

Cost Considerations:

  • AutoML charges by training hour
  • Training an image model costs about $3-20/hour
  • Complex tasks may require tens of hours of training
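Using the $3-20/hour range above, a back-of-envelope budget check looks like this (a sketch; the rates are the article's rough figures, not a price list):

```python
def automl_training_cost(hours: float, rate_low: float = 3.0,
                         rate_high: float = 20.0) -> tuple[float, float]:
    """Return a (low, high) training-cost range in USD,
    using the rough $3-20/hour range quoted above."""
    return (hours * rate_low, hours * rate_high)

# A complex task needing ~24 node hours of training
low, high = automl_training_cost(24)
print(f"${low:.0f} - ${high:.0f}")
```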

Gemini API and Generative AI

The hottest AI technology in 2024-2025: Generative AI.

Gemini Model Version Comparison (Pro / Flash / Ultra)

| Model | Features | Suitable For | Price |
|---|---|---|---|
| Gemini 2.0 Flash | Ultra-fast, low cost | Real-time apps, high-volume requests | Lowest |
| Gemini 1.5 Pro | Balanced performance and cost | General business apps | Medium |
| Gemini 1.5 Flash | Fast response | Conversation systems, lightweight tasks | Lower |
| Gemini Ultra | Best performance | Complex reasoning, professional tasks | Highest |

Selection Recommendations:

  • Start with Flash for prototyping
  • Evaluate Pro after confirming feasibility
  • Only use Ultra when truly needed

API Calls and Billing

Basic Call Example:

import google.generativeai as genai

genai.configure(api_key='YOUR_API_KEY')

model = genai.GenerativeModel('gemini-1.5-pro')
response = model.generate_content('Explain what machine learning is')

print(response.text)

Calling from Vertex AI:

import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize the SDK before creating a model handle
vertexai.init(project='my-project', location='asia-east1')

model = GenerativeModel('gemini-1.5-pro')
response = model.generate_content('Write a product description')

print(response.text)

Billing Method:

  • Charged by tokens (input + output)
  • 1,000 English words ≈ 1,300 tokens (roughly 0.75 words per token)
  • Different models have different prices
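A back-of-envelope cost estimate helps before committing to a model. The sketch below assumes roughly 1.3 tokens per English word (a common rule of thumb); the per-million-token prices are placeholders, not current Gemini list prices — check the official price list. (For exact counts, the google-generativeai SDK also offers `model.count_tokens`.)

```python
TOKENS_PER_WORD = 1.3  # rough rule of thumb for English text

# Placeholder per-1M-token prices -- substitute current Gemini pricing
PRICE_PER_M_INPUT = 1.25
PRICE_PER_M_OUTPUT = 5.00

def estimate_cost(input_words: int, output_words: int) -> float:
    """Back-of-envelope USD cost for one request."""
    in_tokens = input_words * TOKENS_PER_WORD
    out_tokens = output_words * TOKENS_PER_WORD
    return (in_tokens * PRICE_PER_M_INPUT
            + out_tokens * PRICE_PER_M_OUTPUT) / 1_000_000

print(f"${estimate_cost(2000, 500):.4f} per request")
```

Multiplying by expected daily request volume turns this into a monthly budget figure.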

Prompt Engineering Best Practices

A good prompt looks like this:

You are a professional product copywriter.

Task: Write a 50-word promotional copy for the following product.

Product Information:
- Name: Ultra-lightweight Laptop
- Weight: 900g
- Features: 16-hour battery life, military-grade durability

Requirements:
1. Use clear, professional English
2. Tone is lively but professional
3. Emphasize lightweight and battery advantages

Prompt Techniques:

  • Role Setting: Tell the model what role it is
  • Clear Task: Clearly state what to do
  • Provide Examples: Give one or two expected output examples
  • Specify Format: JSON? Bullet points? Paragraphs?
  • Set Constraints: Word count, language, tone
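In application code, prompts like the example above are usually assembled from templates rather than hand-written. A minimal sketch (the helper and its parameters are my own illustration of the techniques listed):

```python
def build_prompt(role: str, task: str, facts: dict[str, str],
                 requirements: list[str]) -> str:
    """Assemble a prompt using the techniques above:
    role setting, clear task, structured facts, explicit constraints."""
    lines = [f"You are {role}.", "", f"Task: {task}", "",
             "Product Information:"]
    lines += [f"- {k}: {v}" for k, v in facts.items()]
    lines += ["", "Requirements:"]
    lines += [f"{i}. {req}" for i, req in enumerate(requirements, 1)]
    return "\n".join(lines)

prompt = build_prompt(
    role="a professional product copywriter",
    task="Write a 50-word promotional copy for the following product.",
    facts={"Name": "Ultra-lightweight Laptop", "Weight": "900g"},
    requirements=["Use clear, professional English",
                  "Emphasize the light weight"],
)
print(prompt)
```

Templating keeps role, facts, and constraints in separate fields, so each can be tuned and versioned independently.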

Enterprise Application Cases

Case 1: Customer Service Auto-Reply

  • Use Gemini to understand customer questions
  • Find answers from knowledge base
  • Generate natural language responses

Case 2: Document Summarization

  • Upload lengthy reports
  • Auto-generate key summaries
  • Extract key data

Case 3: Code Assistance

  • Explain existing code
  • Generate test cases
  • Suggest refactoring directions

Case 4: Content Generation

  • Product descriptions
  • Marketing copy
  • Technical documentation

BigQuery ML: SQL-Driven Machine Learning

Can data analysts do ML? They can with SQL.

BQML Supported Model Types

| Model Type | SQL Command | Suitable Tasks |
|---|---|---|
| Linear Regression | LINEAR_REG | Predict values |
| Logistic Regression | LOGISTIC_REG | Binary classification |
| K-Means | KMEANS | Customer segmentation |
| Time Series | ARIMA_PLUS | Trend forecasting |
| XGBoost | BOOSTED_TREE_CLASSIFIER | Complex classification |
| DNN | DNN_CLASSIFIER | Deep learning |
| AutoML Tables | AUTOML_CLASSIFIER | Automated ML |

Create and Train Model Syntax

Create Model:

CREATE OR REPLACE MODEL `my_dataset.sales_forecast`
OPTIONS(
  model_type='ARIMA_PLUS',
  time_series_timestamp_col='date',
  time_series_data_col='sales',
  time_series_id_col='product_id'
) AS
SELECT
  date,
  product_id,
  sales
FROM
  `my_dataset.sales_data`
WHERE
  date < '2024-01-01'

Forecast:

SELECT *
FROM ML.FORECAST(
  MODEL `my_dataset.sales_forecast`,
  STRUCT(30 AS horizon, 0.95 AS confidence_level)
)

Evaluate Model:

SELECT *
FROM ML.EVALUATE(MODEL `my_dataset.my_model`)

Use Cases and Performance Considerations

Good for BQML:

  • Data is already in BigQuery
  • Team is familiar with SQL
  • Want to quickly validate ideas
  • Task is standard classification/regression

Not suitable for BQML:

  • Need cutting-edge performance
  • Task requires custom architecture
  • Image, audio, and other unstructured data

Cost Tips:

  • Training costs calculated by data processed
  • Complex models take longer to train
  • Can set training budget limits

AI/ML Cost Planning and Optimization

AI projects can easily go over budget. Good cost planning is important.

Training vs Inference Cost Structure

Training Costs:

  • One-time cost
  • Charged by compute time
  • GPU/TPU costs are high
  • Can use Spot VMs to save money

Inference Costs:

  • Ongoing cost
  • Charged by predictions or time
  • Need to consider 24/7 running costs
  • Batch inference is cheaper than real-time

Cost Comparison Example:

| Item | Training Cost | Inference Cost (Monthly) |
|---|---|---|
| Small Model | $50-200 | $100-300 |
| Medium Model | $500-2,000 | $500-1,500 |
| Large Model | $5,000-20,000 | $2,000-10,000 |

GPU/TPU Selection and Cost Comparison

GPU Options:

| GPU | Memory | Suitable For | Hourly Cost |
|---|---|---|---|
| T4 | 16GB | Inference, small training | ~$0.35 |
| L4 | 24GB | Balanced | ~$0.70 |
| A100 40GB | 40GB | Large training | ~$3.00 |
| A100 80GB | 80GB | Very large models | ~$4.00 |
| H100 | 80GB | Latest and most powerful | ~$8.00 |

TPU Options:

| TPU | Suitable For | Hourly Cost |
|---|---|---|
| v2-8 | Medium training | ~$4.50 |
| v3-8 | Large training | ~$8.00 |
| v5e | Inference optimized | ~$1.20 |

Selection Recommendations:

  • Development phase → T4 or L4
  • Production training → A100
  • TensorFlow large models → TPU
  • Inference service → T4 or v5e
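The recommendations above can be encoded as a small chooser (a simplified rule of thumb, not an official sizing tool — real sizing also depends on model size and batch size):

```python
def pick_accelerator(phase: str, framework: str = "pytorch") -> str:
    """Encode the selection rules of thumb above (simplified)."""
    if phase == "development":
        return "T4 or L4"
    if phase == "training":
        # TPUs shine for large TensorFlow/JAX models
        return "TPU" if framework == "tensorflow" else "A100"
    if phase == "inference":
        return "T4 or TPU v5e"
    raise ValueError(f"unknown phase: {phase}")

print(pick_accelerator("training", framework="tensorflow"))
```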

Batch Inference Cost Reduction

Real-time vs Batch Inference:

| Type | Latency | Cost | Suitable For |
|---|---|---|---|
| Real-time (Online) | Milliseconds | Higher | Real-time apps |
| Batch | Minutes to hours | Lower | High-volume processing |

Batch Inference Use Cases:

  • Daily customer score updates
  • Product recommendation pre-calculation
  • Report data analysis
  • Historical data backfill

Cost Difference: Batch inference can be 60-80% cheaper than real-time inference.
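Applying the 60-80% figure above to a monthly bill (a back-of-envelope sketch using the midpoint of that range as the default):

```python
def monthly_batch_savings(online_monthly_cost: float,
                          savings_rate: float = 0.7) -> float:
    """Estimated monthly savings from moving eligible traffic to batch,
    using the 60-80% range above (midpoint 70% by default)."""
    return online_monthly_cost * savings_rate

print(f"${monthly_batch_savings(1500):.0f} saved per month")
```

Only traffic that can tolerate minutes-to-hours latency is eligible, so apply the rate to that fraction of your bill, not the whole thing.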


Enterprise AI Adoption Best Practices

From POC to production—how do enterprise AI projects progress?

Path from POC to Production

Phase 1: Exploration and Definition (2-4 weeks)

  • Confirm business problem
  • Assess data availability
  • Define success metrics
  • Evaluate technical feasibility

Phase 2: POC (4-8 weeks)

  • Small-scale data validation
  • Quickly build prototype
  • Verify if results meet targets
  • Estimate production environment costs

Phase 3: Development (8-16 weeks)

  • Complete data processing pipeline
  • Model tuning
  • Build MLOps processes
  • Integrate with existing systems

Phase 4: Launch (4-8 weeks)

  • Performance testing
  • Gradual rollout
  • Monitoring and alerting setup
  • Documentation and knowledge transfer

Common Failure Reasons:

  • Skipping POC and going straight to development
  • Underestimating data cleaning work
  • No clear success metrics
  • No MLOps leading to maintenance difficulties

MLOps and Model Monitoring

What MLOps includes:

  • Version control (data, code, models)
  • Automated training pipeline
  • Automated model deployment
  • Continuous monitoring and retraining

Model Monitoring Metrics:

  • Prediction performance (accuracy, recall)
  • Data drift
  • Concept drift
  • Latency and throughput

Vertex AI Model Monitoring:

from google.cloud import aiplatform

aiplatform.init(project='my-project', location='asia-east1')

# Monitoring runs as a job attached to an endpoint; the config objects
# below are placeholders (see the Vertex AI Model Monitoring docs)
job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name='my-monitoring-job',
    endpoint='endpoint-id',
    logging_sampling_strategy=sampling_strategy,  # fraction of requests logged
    schedule_config=schedule_config,              # how often logs are analyzed
    objective_configs=objective_configs,          # drift/skew thresholds
    alert_config=alert_config,                    # e.g. email [email protected]
)

Data Governance and Compliance

Data Privacy:

  • PII de-identification
  • Data minimization principle
  • Access control
  • Usage logging and tracking

Model Compliance:

  • Model explainability
  • Bias detection and mitigation
  • Decision transparency
  • Human review mechanism

GCP Compliance Tools:

  • Data Loss Prevention (DLP): Automatically detect and mask sensitive data
  • Cloud Audit Logs: Record all operations
  • VPC Service Controls: Network-level isolation
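As a sketch of how DLP de-identification is typically configured, the request pairs an `inspect_config` naming the infoTypes to find with a `deidentify_config` saying how to transform them (request shape follows the DLP API; the project name is a placeholder):

```python
def build_deidentify_request(project: str, text: str) -> dict:
    """Assemble a DLP deidentify_content request that replaces
    common PII with its infoType name (e.g. EMAIL_ADDRESS)."""
    return {
        "parent": f"projects/{project}",
        "inspect_config": {
            "info_types": [{"name": "EMAIL_ADDRESS"},
                           {"name": "PHONE_NUMBER"},
                           {"name": "PERSON_NAME"}],
        },
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [{
                    "primitive_transformation": {
                        "replace_with_info_type_config": {}
                    }
                }]
            }
        },
        "item": {"value": text},
    }

req = build_deidentify_request("my-project", "Contact [email protected]")
print(req["parent"])
```

The resulting dict can be passed to `google.cloud.dlp_v2.DlpServiceClient().deidentify_content(request=req)`; the response contains the masked text.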

For security details, see "GCP Security and Cloud Armor Protection Complete Guide."


Want to Adopt AI in Your Enterprise?

From Gemini to building your own LLM, there are many choices but also many pitfalls.

Schedule AI Adoption Consultation and let experienced professionals help you avoid pitfalls.

CloudInsight's AI Adoption Services:

  • Requirements Assessment: Clarify business needs, confirm if AI is the best solution
  • Technology Selection: Use ready-made APIs or train your own?
  • POC Planning: Quickly validate feasibility and effectiveness
  • Cost Estimation: Complete cost estimation for training, inference, and maintenance
  • Architecture Design: Complete solution from data to deployment

Conclusion: Building Your GCP AI Strategy

GCP's AI services are comprehensive. The key is finding the right entry point for you.

Selection Recommendations:

| Your Situation | Recommended Solution |
|---|---|
| Want to quickly try AI | Gemini API |
| Have data but no ML team | AutoML |
| Data is in BigQuery | BigQuery ML |
| Have ML team wanting more control | Vertex AI Custom Training |
| Need complete MLOps | Vertex AI Pipelines |

Recommendations for Different Roles:

For Business Executives:

  • Start with Gemini for internal efficiency tools
  • Accumulate experience from small projects
  • Expand investment after success

For Engineers:

  • Get familiar with the Vertex AI platform
  • Practice AutoML and custom training
  • Understand MLOps best practices

For Data Analysts:

  • Start with BigQuery ML
  • Gradually learn AutoML
  • Collaborate with engineering teams

AI adoption is a journey, not a single project. Start small, keep learning, and gradually scale up.


Image Descriptions

Illustration: GCP AI Services Layered Architecture Diagram

Scene Description: Four-layer architecture diagram, from bottom to top: "Infrastructure Layer" (GPU, TPU), "Data Layer" (BigQuery, Cloud Storage), "Platform Layer" (Vertex AI), "Application Layer" (Gemini API, Agent Builder). Each layer uses different shades of blue, with connecting lines between layers.

Visual Focus:

  • Main content clearly presented

Required Elements:

  • Per description key elements

Chinese Text to Display: None

Color Tone: Professional, clear

Elements to Avoid: Abstract graphics, gears, glowing effects

Slug: gcp-ai-services-layered-architecture


Illustration: Vertex AI Features Overview

Scene Description: Hexagon-style feature block diagram with Vertex AI logo in the center, surrounded by six hexagonal blocks labeled Workbench, Training, Prediction, AutoML, Pipelines, Feature Store. Each block uses different colors to distinguish feature types.

Slug: vertex-ai-features-hexagon-overview


Illustration: AutoML vs Custom Training Decision Matrix

Scene Description: Two-dimensional matrix diagram, X-axis is "ML Expertise Level" (low to high), Y-axis is "Customization Needs" (low to high). Lower-left quadrant labeled AutoML (green), upper-right quadrant labeled Custom Training (blue), lower-right quadrant labeled BigQuery ML (orange), upper-left quadrant labeled Pre-trained API (gray).

Slug: automl-custom-training-decision-matrix


Illustration: AI Project Lifecycle Flowchart

Scene Description: Circular flowchart showing AI project lifecycle. Starting from "Business Problem," moving clockwise through "Data Collection," "Model Development," "Model Deployment," "Monitoring & Maintenance," then back to "Business Problem" forming a cycle. Each phase uses different colored arcs, with "Continuous Improvement" labeled in the center.

Slug: ai-project-lifecycle-circular-diagram

