✅ Database & warehouse setup
✅ Data loading (CSV, JSON, Parquet)
✅ Query optimization (clustering, caching, materialized views)
✅ Security (RLS, data sharing, grants)
✅ Time Travel & cloning
✅ Cost monitoring & administration
Snowflake Operations Guide
1. Creating Databases, Tables, Views, and Virtual Warehouses
Databases:
CREATE DATABASE my_database;
Tables:
CREATE TABLE my_table ( id INT, name STRING, created_at TIMESTAMP );
Views:
CREATE VIEW my_view AS SELECT id, name FROM my_table WHERE created_at > CURRENT_DATE() - 30;
Virtual Warehouses:
CREATE WAREHOUSE my_warehouse WITH
  WAREHOUSE_SIZE = 'X-SMALL'
  AUTO_SUSPEND = 300  -- in seconds
  AUTO_RESUME = TRUE;
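A warehouse can be resized or paused at any time without affecting stored data. As a sketch, using the my_warehouse created above:
ALTER WAREHOUSE my_warehouse SET WAREHOUSE_SIZE = 'MEDIUM';
ALTER WAREHOUSE my_warehouse SUSPEND;
ALTER WAREHOUSE my_warehouse RESUME;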
2. Loading and Querying Structured and Semi-Structured Data
Loading structured data (CSV):
-- Create a stage
CREATE OR REPLACE STAGE my_csv_stage
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');

-- Put the file to the stage (using SnowSQL or the web interface)

-- Copy the staged file into the table
COPY INTO my_table
FROM @my_csv_stage/data.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');
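Repeating the inline FILE_FORMAT clause in both the stage and the COPY command is error-prone; a reusable named file format is one way to avoid that. A sketch (the name my_csv_format is illustrative, and SKIP_HEADER = 1 assumes the file has a header row):
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

COPY INTO my_table
FROM @my_csv_stage/data.csv
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');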
Loading semi-structured data (JSON):
-- Create a table with a VARIANT column
CREATE TABLE my_json_data (
  raw_data VARIANT
);

-- Load the JSON data
COPY INTO my_json_data
FROM @my_stage/data.json
FILE_FORMAT = (TYPE = 'JSON');
Querying semi-structured data:
SELECT
  raw_data:customer_id::INT AS customer_id,
  raw_data:order_details[0].product_id::STRING AS first_product
FROM my_json_data
WHERE raw_data:order_date::DATE > '2023-01-01';
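Indexing order_details[0] only reads the first array element. To expand every element into its own row, Snowflake's LATERAL FLATTEN can be used; a sketch against the same hypothetical JSON shape as above:
SELECT
  raw_data:customer_id::INT AS customer_id,
  f.value:product_id::STRING AS product_id,
  f.value:quantity::INT AS quantity
FROM my_json_data,
  LATERAL FLATTEN(input => raw_data:order_details) f;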
3. Performing Analytical Queries and Joins
Basic analytical query:
SELECT
  product_id,
  COUNT(*) AS order_count,
  SUM(amount) AS total_revenue,
  AVG(amount) AS average_order_value
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY product_id
ORDER BY total_revenue DESC;
Join example:
SELECT
  o.order_id,
  o.order_date,
  c.customer_name,
  p.product_name,
  o.quantity,
  o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date > DATEADD(month, -3, CURRENT_DATE());
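For per-group rankings on top of queries like these, Snowflake's QUALIFY clause filters directly on a window function without a subquery. A sketch against the same hypothetical orders table:
-- Largest order per customer
SELECT
  customer_id,
  order_id,
  amount
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) = 1;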
4. Cloning Objects
Clone a table:
CREATE TABLE orders_clone CLONE orders;
Clone a schema (with all its objects):
CREATE SCHEMA analytics_dev CLONE analytics_prod;
Clone a database:
CREATE DATABASE dev_environment CLONE prod_environment;
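Clones are zero-copy: they share storage with the source until either side is modified. Cloning also combines with Time Travel (section 5), so an object can be cloned as of an earlier moment, which is handy for reproducing a data state before an error; a sketch with an illustrative timestamp:
CREATE TABLE orders_before_bug CLONE orders
  AT(TIMESTAMP => '2023-11-15 14:00:00'::TIMESTAMP);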
5. Undoing User Errors Using Time Travel
Querying historical data:
-- See the data as it was 1 hour ago (OFFSET is in seconds)
SELECT * FROM my_table AT(OFFSET => -60*60);
Restoring a dropped table:
-- First check when it was dropped
SHOW TABLES HISTORY LIKE 'my_dropped_table';

-- Then restore it
UNDROP TABLE my_dropped_table;
Restoring data to a previous point:
-- Create a new table with data from before the error
CREATE TABLE my_table_restored AS
SELECT * FROM my_table
BEFORE(TIMESTAMP => '2023-11-15 14:00:00'::TIMESTAMP);

-- Or use Time Travel to overwrite the existing table
INSERT OVERWRITE INTO my_table
SELECT * FROM my_table
BEFORE(STATEMENT => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');
Time Travel retention periods:
Standard Edition: up to 1 day
Enterprise Edition and above: up to 90 days (default is 1 day, configurable per object)
Note: The exact Time Travel duration depends on your Snowflake edition and configuration.
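On Enterprise Edition, the retention window is set per object with the DATA_RETENTION_TIME_IN_DAYS parameter; for example:
ALTER TABLE my_table SET DATA_RETENTION_TIME_IN_DAYS = 30;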