Thursday, June 19, 2025

Snowflake full-stack

 Database & warehouse setup
 Data loading (CSV, JSON, Parquet)
 Query optimization (clustering, caching, materialized views)
 Security (RLS, data sharing, grants)
 Time Travel & cloning
 Cost monitoring & administration


Snowflake Operations Guide

1. Creating Databases, Tables, Views, and Virtual Warehouses

Databases:

CREATE DATABASE my_database;


Tables:

sql
Copy
Download
CREATE TABLE my_table (
    id INT,
    name STRING,
    created_at TIMESTAMP
);

Views:

sql
Copy
Download
CREATE VIEW my_view AS
SELECT id, name 
FROM my_table
WHERE created_at > CURRENT_DATE() - 30;

Virtual Warehouses:

sql
Copy
Download
CREATE WAREHOUSE my_warehouse
WITH 
    WAREHOUSE_SIZE = 'X-SMALL'
    AUTO_SUSPEND = 300  -- in seconds
    AUTO_RESUME = TRUE;

2. Loading and Querying Structured and Semi-Structured Data

Loading structured data (CSV):

sql
Copy
Download
-- Create stage
CREATE OR REPLACE STAGE my_csv_stage
    FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

-- Put file to stage (using SnowSQL or web interface)
-- COPY INTO command
COPY INTO my_table
FROM @my_csv_stage/data.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

Loading semi-structured data (JSON):

sql
Copy
Download
-- Create table with VARIANT column
CREATE TABLE my_json_data (
    raw_data VARIANT
);

-- Load JSON data
COPY INTO my_json_data
FROM @my_stage/data.json
FILE_FORMAT = (TYPE = 'JSON');

Querying semi-structured data:

sql
Copy
Download
SELECT 
    raw_data:customer_id::INT AS customer_id,
    raw_data:order_details[0].product_id::STRING AS first_product
FROM my_json_data
WHERE raw_data:order_date::DATE > '2023-01-01';

3. Performing Analytical Queries and Joins

Basic analytical query:

sql
Copy
Download
SELECT 
    product_id,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue,
    AVG(amount) AS average_order_value
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY product_id
ORDER BY total_revenue DESC;

Join example:

sql
Copy
Download
SELECT 
    o.order_id,
    o.order_date,
    c.customer_name,
    p.product_name,
    o.quantity,
    o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date > DATEADD(month, -3, CURRENT_DATE());

4. Cloning Objects

Clone a table:

sql
Copy
Download
CREATE TABLE orders_clone CLONE orders;

Clone a schema (with all its objects):

sql
Copy
Download
CREATE SCHEMA analytics_dev CLONE analytics_prod;

Clone a database:

sql
Copy
Download
CREATE DATABASE dev_environment CLONE prod_environment;

5. Undoing User Errors Using Time Travel

Querying historical data:

sql
Copy
Download
-- See data as it was 1 hour ago
SELECT * FROM my_table AT(OFFSET => -60*60);

Restoring a dropped table:

sql
Copy
Download
-- First check when it was dropped
SHOW TABLES HISTORY LIKE 'my_dropped_table';

-- Then restore it
UNDROP TABLE my_dropped_table;

Restoring data to a previous point:

sql
Copy
Download
-- Create new table with data from before the error
CREATE TABLE my_table_restored AS
SELECT * FROM my_table BEFORE(TIMESTAMP => '2023-11-15 14:00:00'::TIMESTAMP);

-- Or use time travel to update existing table
INSERT OVERWRITE INTO my_table
SELECT * FROM my_table BEFORE(STATEMENT => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');

Time Travel retention periods:

  • Standard Edition: 1 day

  • Enterprise Edition: Up to 90 days (default is 1 day, configurable per table)

Note: The exact Time Travel duration depends on your Snowflake edition and configuration.

No comments:

Post a Comment

Python using AI

  Python using AI - Prompts & Codes Tools useful for Python + AI ChatGPT - https://chatgpt.com/ Claude AI - https://claude.ai/new ...