Snowflake Configuration

Connect to Snowflake for analytics, large-scale queries, and business intelligence workloads.

Connection Parameters

Required Fields

Field     | Description                        | Example
----------|------------------------------------|------------------
Account   | Snowflake account identifier       | xy12345.us-east-1
Warehouse | Compute warehouse name             | COMPUTE_WH
Database  | Database name                      | ANALYTICS_DB
Username  | Snowflake username                 | app_user
Password  | User password (encrypted at rest)  |

There are no optional fields for Snowflake. All five fields above are required.

Connection String Format

snowflake://user:***@account/database/warehouse
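
The connection string follows standard URL syntax, so it can be taken apart with Python's standard library if a workflow only receives the string form (the credentials below are placeholders):

```python
from urllib.parse import urlsplit

def parse_snowflake_url(url):
    """Split a snowflake:// connection string into its parts."""
    parts = urlsplit(url)
    database, warehouse = parts.path.strip('/').split('/')
    return {
        'account': parts.hostname,
        'username': parts.username,
        'password': parts.password,
        'database': database,
        'warehouse': warehouse,
    }

cfg = parse_snowflake_url(
    'snowflake://app_user:secret@xy12345.us-east-1/ANALYTICS_DB/COMPUTE_WH'
)
print(cfg['account'])    # xy12345.us-east-1
print(cfg['warehouse'])  # COMPUTE_WH
```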

Configuration Example

When creating a Snowflake data source, provide the following information:

Field             | Example Value     | Notes
------------------|-------------------|------
Data source label | prod-snowflake    | Kebab-case unique identifier (used as both name and label)
Account           | xy12345.us-east-1 | Account identifier
Warehouse         | COMPUTE_WH        | Warehouse name
Database          | ANALYTICS_DB      | Database name
Username          | app_user          | Snowflake user
Password          | password          | Encrypted at rest

Finding Your Account Identifier

Your Snowflake account identifier can be found in several places:

  1. From URL: Look at your Snowflake URL: https://<account_identifier>.snowflakecomputing.com
  2. From Snowflake UI: Account -> Account Name (shown in bottom left)
  3. Format Examples:
    • Legacy: xy12345
    • Preferred: xy12345.us-east-1
    • With organization: orgname-xy12345

Account Identifier Format

Use the full account identifier including region for best compatibility: xy12345.us-east-1
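
As a rough sanity check before saving a data source, the three identifier shapes above can be matched with a regular expression. This is only an illustrative pattern, not Snowflake's official grammar:

```python
import re

# Matches the three shapes listed above: legacy (xy12345),
# preferred (xy12345.us-east-1), and organization (orgname-xy12345).
ACCOUNT_RE = re.compile(
    r'^[a-z][a-z0-9]*'      # account locator or organization name
    r'(-[a-z][a-z0-9]*)?'   # optional -account part (organization format)
    r'(\.[a-z0-9-]+)*$'     # optional .region (and cloud) suffix
)

def looks_like_account(identifier):
    """Heuristic check that a string resembles an account identifier."""
    return bool(ACCOUNT_RE.match(identifier.strip().lower()))

print(looks_like_account('xy12345'))            # True
print(looks_like_account('xy12345.us-east-1'))  # True
print(looks_like_account('orgname-xy12345'))    # True
```

Note that the full hostname (`xy12345.snowflakecomputing.com`) is not an account identifier; strip the `.snowflakecomputing.com` suffix first.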

Test Connection

When you create or test a Snowflake data source, the platform connects using the snowflake-sdk Node.js driver with the provided account (including region), username, password, warehouse, and database. The connection uses a 15-second timeout. On success, the connection is destroyed and "Connection successful" is returned.

Schema Discovery

Snowflake has full native schema discovery support. Clicking Refresh Metadata returns:

  • Tables: All tables in the current database (via SHOW TABLES)
  • Schemas: All schemas in the current database (via SHOW SCHEMAS)
  • Databases: All accessible databases (via SHOW DATABASES)
  • Size: Total bytes from INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
  • Row count: Total row count from INFORMATION_SCHEMA.TABLE_STORAGE_METRICS
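
The same information can be inspected manually from a Snowflake worksheet. A sketch of the equivalent queries, based on the commands and views named above:

```sql
-- Tables, schemas, and databases visible to the current role
SHOW TABLES;
SHOW SCHEMAS;
SHOW DATABASES;

-- Per-table storage metrics in the current database
SELECT *
FROM INFORMATION_SCHEMA.TABLE_STORAGE_METRICS;
```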

Column-Level Metadata

You can fetch column details for individual tables, which returns:

  • Column name, data type, character maximum length, numeric precision and scale
  • Nullability
  • Default values
  • Row count and bytes from INFORMATION_SCHEMA.TABLES

Column queries use the PUBLIC schema by default if no schema is specified.
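
The column details above map onto Snowflake's standard INFORMATION_SCHEMA views. A hedged sketch of an equivalent manual query (the table name `USERS` is illustrative):

```sql
-- Column-level metadata for a single table
SELECT column_name,
       data_type,
       character_maximum_length,
       numeric_precision,
       numeric_scale,
       is_nullable,
       column_default
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'PUBLIC'
  AND table_name   = 'USERS'
ORDER BY ordinal_position;

-- Row count and size for the same table
SELECT row_count, bytes
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'PUBLIC'
  AND table_name   = 'USERS';
```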

Usage in Workflows (STRONGLY_SERVICES)

When a Snowflake data source is attached to a workflow, credentials are injected via STRONGLY_SERVICES with all fields at the top level (not nested under a credentials key). A connectionString is included:

{
  "datasources": {
    "prod_snowflake": {
      "type": "snowflake",
      "name": "prod-snowflake",
      "account": "xy12345.us-east-1",
      "warehouse": "COMPUTE_WH",
      "database": "ANALYTICS_DB",
      "username": "app_user",
      "password": "actual_password",
      "connectionString": "snowflake://app_user:actual_password@xy12345.us-east-1/ANALYTICS_DB/COMPUTE_WH"
    }
  }
}
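
Note that the lookup key (`prod_snowflake`) differs from the label (`prod-snowflake`): hyphens become underscores. A minimal helper, assuming hyphen-to-underscore is the only transformation applied:

```python
import json
import os

def datasource_key(label):
    """Map a data source label to its STRONGLY_SERVICES key."""
    return label.replace('-', '_')

def get_datasource(label):
    """Look up a data source from STRONGLY_SERVICES by its label."""
    services = json.loads(os.getenv('STRONGLY_SERVICES', '{}'))
    return services.get('datasources', {}).get(datasource_key(label))

print(datasource_key('prod-snowflake'))  # prod_snowflake
```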

Python Example

import os, json
import snowflake.connector

# Parse STRONGLY_SERVICES environment variable
services = json.loads(os.getenv('STRONGLY_SERVICES', '{}'))
datasources = services.get('datasources', {})

# Get Snowflake data source (key is sanitized name: hyphens become underscores)
sf = datasources['prod_snowflake']

# Connect using top-level credential fields
conn = snowflake.connector.connect(
    account=sf['account'],
    user=sf['username'],
    password=sf['password'],
    warehouse=sf['warehouse'],
    database=sf['database']
)

# Execute queries
cursor = conn.cursor()
cursor.execute("SELECT * FROM users WHERE active = true")
users = cursor.fetchall()

# Release the cursor and connection when finished
cursor.close()
conn.close()

Python with Pandas

import os, json
import snowflake.connector
import pandas as pd

# Parse STRONGLY_SERVICES environment variable
services = json.loads(os.getenv('STRONGLY_SERVICES', '{}'))
datasources = services.get('datasources', {})

# Get Snowflake data source
sf = datasources['prod_snowflake']

# Connect using top-level fields
conn = snowflake.connector.connect(
    account=sf['account'],
    user=sf['username'],
    password=sf['password'],
    warehouse=sf['warehouse'],
    database=sf['database']
)

# Query into DataFrame (pandas warns when given a raw DB-API
# connection rather than SQLAlchemy, but the query still runs)
df = pd.read_sql("SELECT * FROM users WHERE active = true", conn)
print(df.head())

conn.close()

Node.js Example

const snowflake = require('snowflake-sdk');

// Parse STRONGLY_SERVICES environment variable
const services = JSON.parse(process.env.STRONGLY_SERVICES || '{}');
const datasources = services.datasources || {};

// Get Snowflake data source (key is sanitized name)
const sf = datasources['prod_snowflake'];

// Create connection using top-level fields
const connection = snowflake.createConnection({
  account: sf.account,
  username: sf.username,
  password: sf.password,
  warehouse: sf.warehouse,
  database: sf.database
});

// Connect
connection.connect((err, conn) => {
  if (err) {
    console.error('Unable to connect: ' + err.message);
  } else {
    console.log('Successfully connected to Snowflake.');

    // Execute query
    connection.execute({
      sqlText: 'SELECT * FROM users WHERE active = true',
      complete: (err, stmt, rows) => {
        if (err) {
          console.error('Failed to execute statement: ' + err.message);
        } else {
          console.log('Number of rows produced: ' + rows.length);
        }
      }
    });
  }
});

Warehouse Configuration

Snowflake warehouses control compute resources:

Warehouse Sizes

Size    | Servers | Credits/Hour | Use Case
--------|---------|--------------|---------
X-Small | 1       | 1            | Development, testing
Small   | 2       | 2            | Light production workloads
Medium  | 4       | 4            | Standard production
Large   | 8       | 8            | Heavy analytics
X-Large | 16      | 16           | Very large queries

Auto-suspend and Auto-resume

Configure warehouses to automatically suspend when idle:

ALTER WAREHOUSE COMPUTE_WH SET
  AUTO_SUSPEND = 300   -- suspend after 5 minutes (value is in seconds)
  AUTO_RESUME = TRUE;  -- resume automatically when a query arrives

Cost Optimization

Use auto-suspend to minimize costs. Warehouses incur charges only when running.
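
Since charges accrue only while a warehouse runs, a back-of-the-envelope monthly estimate is just credits/hour × active hours × price per credit. The $3.00/credit price below is an assumption for illustration; actual rates depend on your Snowflake edition, cloud, and region:

```python
def monthly_warehouse_cost(credits_per_hour, active_hours_per_day,
                           days_per_month=22, price_per_credit=3.00):
    """Rough monthly cost estimate for a single warehouse."""
    return credits_per_hour * active_hours_per_day * days_per_month * price_per_credit

# X-Small warehouse (1 credit/hour), active 8 hours per workday
print(monthly_warehouse_cost(1, 8))  # 528.0
```

Auto-suspend directly shrinks `active_hours_per_day`, which is why it is the first lever to pull for cost control.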

Common Issues

Invalid Account Identifier

  • Verify the account identifier format
  • Include the region: xy12345.us-east-1
  • Check for typos or extra spaces
  • Try the legacy format if the preferred format doesn't work

Authentication Failed

  • Verify username and password are correct
  • Check if user account is locked or expired
  • Ensure user has necessary privileges
  • Verify IP whitelist if configured

Warehouse Not Found

  • Ensure warehouse name is spelled correctly
  • Check if warehouse exists: SHOW WAREHOUSES;
  • Verify the role in use has USAGE privilege on the warehouse (Snowflake grants object privileges to roles, which are then granted to users), e.g.:
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE app_role;
GRANT ROLE app_role TO USER app_user;

Database Not Found

  • Verify the database name is correct (case-sensitive)
  • Check if database exists: SHOW DATABASES;
  • Grant access to the application's role if needed:
GRANT USAGE ON DATABASE ANALYTICS_DB TO ROLE app_role;

Insufficient Privileges

Snowflake grants object privileges to roles rather than directly to users. Grant the necessary privileges to a role (here, an example role named app_role) and then assign that role to your application user:

-- Grant warehouse usage
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE app_role;

-- Grant database access
GRANT USAGE ON DATABASE ANALYTICS_DB TO ROLE app_role;

-- Grant schema access
GRANT USAGE ON SCHEMA ANALYTICS_DB.PUBLIC TO ROLE app_role;

-- Grant table access (read-only)
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS_DB.PUBLIC TO ROLE app_role;

-- Grant future tables access
GRANT SELECT ON FUTURE TABLES IN SCHEMA ANALYTICS_DB.PUBLIC TO ROLE app_role;

-- Assign the role to the application user
GRANT ROLE app_role TO USER app_user;
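
To confirm what was actually granted, Snowflake's SHOW GRANTS commands are useful (app_role here is an example role name):

```sql
-- Privileges held by the role
SHOW GRANTS TO ROLE app_role;

-- Roles assigned to the application user
SHOW GRANTS TO USER app_user;
```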

Best Practices

  1. Use Appropriate Warehouse Size: Match warehouse size to workload requirements
  2. Enable Auto-suspend: Configure warehouses to auto-suspend when idle
  3. Least Privilege: Grant minimal required permissions to application users
  4. Query Optimization: Use clustering keys and optimize queries for better performance
  5. Result Caching: Leverage Snowflake's result cache for repeated queries
  6. Multi-cluster Warehouses: Use for high concurrency workloads
  7. Monitor Costs: Track warehouse usage and optimize query patterns

Query Performance Tips

Use Query Tags

Track queries from your application:

cursor.execute("ALTER SESSION SET QUERY_TAG = 'my-app-users-query'")
cursor.execute("SELECT * FROM users WHERE active = true")

Optimize with Clustering

For large tables, use clustering keys:

ALTER TABLE users CLUSTER BY (created_date, region);

Use Materialized Views

Cache complex query results:

CREATE MATERIALIZED VIEW active_users AS
SELECT * FROM users WHERE active = true;

Note: materialized views require Snowflake Enterprise Edition or higher.