Thursday, June 12, 2025

GCP - Model Lifecycle

 

Google Cloud Data Model Lifecycle


┌────────────┐
│ 1. DESIGN  │
└────┬───────┘
     │
     ▼
┌──────────────────────────────────────────┐
│ • Understand business needs              │
│ • Choose schema (Star, 3NF)              │
│ • Identify data sources                  │
│ • Plan storage (BigQuery, GCS)           │
└────┬─────────────────────────────────────┘
     │
     ▼
┌────────────┐
│ 2. BUILD   │
└────┬───────┘
     │
     ▼
┌──────────────────────────────────────────┐
│ • Ingest data                            │
│   - Batch: Dataflow, Data Fusion         │
│   - Real-time: Pub/Sub + Dataflow        │
│ • Transform using SQL, dbt               │
│ • Store in BigQuery                      │
│ • Model (Facts, Dimensions)              │
└────┬─────────────────────────────────────┘
     │
     ▼
┌────────────┐
│ 3. TEST    │
└────┬───────┘
     │
     ▼
┌──────────────────────────────────────────┐
│ • Unit test SQL models                   │
│ • Data quality: dbt/Great Expectations   │
│ • Performance: Partitioning, query plan  │
│ • Integration test entire pipeline       │
└────┬─────────────────────────────────────┘
     │
     ▼
┌────────────┐
│ 4. DEPLOY  │
└────┬───────┘
     │
     ▼
┌──────────────────────────────────────────┐
│ • Git + Cloud Build for CI/CD            │
│ • Deploy dbt models to BigQuery          │
│ • Schedule with Cloud Composer           │
│ • Monitor with Cloud Logging             │
│   (formerly Stackdriver Logging)         │
└──────────────────────────────────────────┘
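
The BUILD and TEST steps above mention modeling facts and dimensions and then running data-quality checks on them. Here is a minimal sketch of that idea in Python. It rests on some loud assumptions: SQLite (from the standard library) stands in for BigQuery so the example runs locally, and the table names, column names, and check names (dim_customer, fact_orders, order_id_unique, and so on) are hypothetical, not taken from any real project. Each check is a query that returns the offending rows, so an empty result means the check passes, which is the same convention dbt tests follow.

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery so this sketch runs locally.
# All table, column, and check names below are hypothetical.


def build_star_schema(conn):
    """Create one dimension table and one fact table, then load sample rows."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_id   INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        );
        CREATE TABLE fact_orders (
            order_id    INTEGER,
            customer_id INTEGER,
            order_date  TEXT,
            amount      REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
        [
            (100, 1, "2025-06-01", 25.0),
            (101, 2, "2025-06-02", 40.0),
            (102, 1, "2025-06-02", 15.5),
        ],
    )


# Each check selects the rows that violate a rule; zero rows means "pass".
CHECKS = {
    "order_id_not_null": "SELECT * FROM fact_orders WHERE order_id IS NULL",
    "order_id_unique": (
        "SELECT order_id FROM fact_orders "
        "GROUP BY order_id HAVING COUNT(*) > 1"
    ),
    "customer_fk_valid": (
        "SELECT f.order_id FROM fact_orders AS f "
        "LEFT JOIN dim_customer AS d ON f.customer_id = d.customer_id "
        "WHERE d.customer_id IS NULL"
    ),
}


def run_checks(conn):
    """Return a dict mapping each check name to its number of failing rows."""
    return {
        name: len(conn.execute(sql).fetchall())
        for name, sql in CHECKS.items()
    }


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
results = run_checks(conn)

for name, failures in results.items():
    status = "PASS" if failures == 0 else "FAIL"
    print(f"{status}  {name}  ({failures} failing rows)")
```

In an actual GCP pipeline, checks like these would more likely live in dbt tests or a Great Expectations suite and run against BigQuery. The sketch only shows the shape of the idea: rules expressed as SQL, evaluated after the model is built and before it is deployed.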

