Saturday, June 21, 2025

Snowflake Table Types

 "In my recent banking project on Snowflake, I used different table types for optimized performance and cost.

I used External Tables to read raw customer files directly from S3.
Then I staged them in Transient Tables for cleansing.
Cleaned data was loaded into Permanent Tables like customer_master.
For analytics, I followed a star schema using fact and dimension tables.
To speed up dashboards, I created Materialized Views for monthly summaries and also used Clone Tables during testing and validation.
This structure allowed me to build a robust, cost-efficient, and scalable banking analytics pipeline."

Snowflake Table Types - Interview Summary

1. Table Types & Use Cases

  • Permanent Table – For storing validated business data (e.g., customer_master)

  • Transient Table – For intermediate data (e.g., raw_txn_data_trans); no Fail-safe

  • Temporary Table – Used within session only

  • External Table – Reads raw files from cloud (e.g., S3)

  • Materialized View – Stores pre-aggregated data for performance

  • Clone Table – Fast, zero-copy clone for testing/validation


2. Layered Table Usage in Retail Banking

  • πŸ”Ή Raw Zoneraw_customer_data_ext, raw_txn_data_trans

  • πŸ”Ή Clean Zonecustomer_master, account_master

  • πŸ”Ή Analytics Layertxn_fact, product_dim, customer_dim

  • πŸ”Ή Reporting Layermonthly_txn_summary_mv, top_branches_mv

Raw Zone

-- External Table
CREATE OR REPLACE EXTERNAL TABLE raw_customer_data_ext ( customer_id STRING, name STRING, email STRING, phone STRING, raw_load_date DATE ) WITH LOCATION = @external_stage/customer_data FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"'); -- Transient Table CREATE OR REPLACE TRANSIENT TABLE raw_txn_data_trans ( txn_id STRING, customer_id STRING, txn_amount NUMBER(10,2), txn_type STRING, txn_date DATE ); -- Temporary Table CREATE OR REPLACE TEMPORARY TABLE raw_loan_apps_temp ( app_id STRING, customer_id STRING, loan_amount NUMBER(10,2), status STRING, submission_date DATE );

Clean Zone

CREATE OR REPLACE TABLE customer_master (
customer_id STRING PRIMARY KEY, name STRING, email STRING, phone STRING, dob DATE, kyc_status STRING, created_at TIMESTAMP ); CREATE OR REPLACE TABLE account_master ( account_id STRING PRIMARY KEY, customer_id STRING, account_type STRING, opened_date DATE, branch_id STRING, balance NUMBER(12,2) ); CREATE OR REPLACE TABLE branch_master ( branch_id STRING PRIMARY KEY, branch_name STRING, city STRING, region STRING );

Analytics Layer

CREATE OR REPLACE TABLE txn_fact (
txn_id STRING PRIMARY KEY, account_id STRING, product_id STRING, txn_date DATE, txn_amount NUMBER(10,2), txn_type STRING ); CREATE OR REPLACE TABLE customer_dim ( customer_id STRING PRIMARY KEY, name STRING, email STRING, phone STRING, kyc_status STRING, city STRING, region STRING ); CREATE OR REPLACE TABLE product_dim ( product_id STRING PRIMARY KEY, product_name STRING, product_category STRING, interest_rate NUMBER(5,2) ); CREATE OR REPLACE TABLE date_dim ( date_id DATE PRIMARY KEY, day INT, month INT, quarter INT, year INT, weekday STRING );

Reporting & Derived Tables

CREATE OR REPLACE MATERIALIZED VIEW monthly_txn_summary_mv AS SELECT customer_id, DATE_TRUNC('MONTH', txn_date) AS month, COUNT(*) AS txn_count, SUM(txn_amount) AS total_amount FROM txn_fact GROUP BY customer_id, month; CREATE OR REPLACE TABLE churn_risk_clone CLONE customer_dim; CREATE OR REPLACE MATERIALIZED VIEW top_branches_mv AS SELECT b.branch_id, b.branch_name, SUM(t.txn_amount) AS total_txn FROM txn_fact t JOIN account_master a ON t.account_id = a.account_id JOIN branch_master b ON a.branch_id = b.branch_id GROUP BY b.branch_id, b.branch_name ORDER BY total_txn DESC;

No comments:

Post a Comment

Python using AI

  Python using AI - Prompts & Codes Tools useful for Python + AI ChatGPT - https://chatgpt.com/ Claude AI - https://claude.ai/new ...