How to Connect a Data Source
This guide walks you through the process of creating and configuring a data source connection.
Step 1: Navigate to Data Sources
- Click Data Sources in the main navigation
- Click Create Data Source button
Step 2: Select Data Source Type
Browse over 60 supported data source types organized across ten categories:
- Relational -- MySQL, PostgreSQL, MSSQL, Oracle, Redshift, Snowflake, BigQuery, CockroachDB, CrateDB, TimescaleDB, QuestDB, ClickHouse, SingleStore, Greenplum
- Document/NoSQL -- MongoDB, Elasticsearch, DynamoDB, Firestore, Supabase, CouchDB, Couchbase
- Key-Value/Cache -- Redis, Memcached
- Graph -- Neo4j, Amazon Neptune, ArangoDB, TigerGraph
- Vector -- Milvus, Pinecone, Weaviate, Qdrant, Chroma, pgvector, LanceDB, Vespa, Marqo
- Multi-Model -- SurrealDB, FaunaDB
- Spreadsheet -- Airtable, Google Sheets, Baserow, NocoDB, SeaTable, Grist
- Cloud Storage -- Amazon S3, Google Cloud Storage, MinIO, Azure Blob Storage
- Message Queue -- RabbitMQ, Apache Kafka, Amazon SQS, Apache Pulsar
- Generic -- JDBC, ODBC
Use the category dropdown filter or search box to quickly find your data source type. Types are displayed in a paginated card grid (20 per page), sorted alphabetically.
Step 3: Configure Basic Information
After selecting a type, you are taken to the configuration form.
Configuration Fields
- Data source label: A unique identifier in kebab-case format (e.g., prod-mysql-db). This is auto-formatted as you type -- only lowercase letters, numbers, and hyphens are allowed. This value is used for both the internal name and label fields.
- Description: Optional free-text description of the data source's purpose (max 1000 characters).
- Tags: Optional labels for organization (e.g., production, analytics, customer-data). Type a tag and press Enter or click Add.
Use descriptive kebab-case names like prod-postgres or analytics-snowflake to easily identify your data sources. The name must match the pattern ^[a-z0-9]+(-[a-z0-9]+)*$. Names must be unique per user within their organization scope.
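The naming rule above can be checked against the stated pattern. Below is a minimal sketch; the pattern is taken from this guide, but the auto-formatting helper is an illustrative approximation of the form's behavior, not the platform's exact implementation:

```python
import re

# Pattern from the guide: kebab-case, lowercase letters, numbers, hyphens.
LABEL_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_label(label: str) -> bool:
    """Return True if the label matches the required kebab-case pattern."""
    return bool(LABEL_PATTERN.match(label))

def auto_format(raw: str) -> str:
    """Approximate the form's auto-formatting: lowercase the input,
    collapse runs of disallowed characters into hyphens, trim stray hyphens."""
    cleaned = re.sub(r"[^a-z0-9]+", "-", raw.lower())
    return cleaned.strip("-")

print(is_valid_label("prod-mysql-db"))  # True
print(is_valid_label("Prod_MySQL"))     # False
print(auto_format("Prod MySQL DB"))     # prod-mysql-db
```

Uniqueness (per user, within the organization scope) is enforced server-side and is not captured by the pattern alone.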
Step 4: Enter Connection Credentials
Credentials vary by type. The form dynamically renders the correct fields based on the selected type. See the specific configuration guides for detailed information:
- PostgreSQL Configuration
- MySQL Configuration
- MongoDB Configuration
- Snowflake Configuration
- BigQuery Configuration
- Amazon S3 Configuration
- Redis Configuration
Step 5: Test Connection (Optional)
Before saving, you can test your connection:
- Click the Test Connection button
- The platform creates a temporary data source, tests it, then removes it
- If successful, you will see a "Connection test successful!" message
- If it fails, the error message indicates the issue (authentication, network, etc.)
Note that clicking Create Data Source also runs the connection test automatically. The data source's status is set to connected if the test succeeds, or to error if it fails, but the data source is saved either way.
Connection Testing by Type
The platform uses different strategies depending on the data source type:
| Test Strategy | Types | What It Does |
|---|---|---|
| Native driver test | MySQL, PostgreSQL, MongoDB, Redis, S3/MinIO, Snowflake, BigQuery, Oracle, MSSQL, Neo4j, Elasticsearch, Redshift | Connects with the actual database driver and executes a test query (e.g., SELECT 1, ping, ListBuckets) |
| PostgreSQL wire protocol | ClickHouse, QuestDB, CockroachDB, Greenplum, pgvector, Supabase, TimescaleDB, CrateDB | Uses the pg driver since these databases support the PostgreSQL wire protocol |
| MySQL wire protocol | SingleStore | Uses the mysql2 driver since it supports the MySQL wire protocol |
| AWS STS validation | DynamoDB, SQS | Validates AWS credentials using GetCallerIdentity |
| GCS validation | GCS | Lists buckets to verify Google Cloud credentials |
| Credentials saved | Pinecone, Weaviate, Qdrant, Chroma, Firestore, Kafka, RabbitMQ, Pulsar, and others | Saves credentials with message "Credentials saved - use Test Connection to verify" |
| Connection string check | JDBC, ODBC | Validates that a connection string is provided |
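The table above amounts to a lookup from data source type to test strategy. A minimal sketch, with the type lists abridged from the table and the strategy names chosen for illustration:

```python
# Abridged from the table above; the real mapping covers every supported type.
TEST_STRATEGY = {
    "mysql": "native", "postgresql": "native", "mongodb": "native",
    "clickhouse": "pg-wire", "questdb": "pg-wire", "supabase": "pg-wire",
    "singlestore": "mysql-wire",
    "dynamodb": "aws-sts", "sqs": "aws-sts",
    "gcs": "gcs-list",
    "pinecone": "credentials-saved", "kafka": "credentials-saved",
    "jdbc": "connection-string", "odbc": "connection-string",
}

def strategy_for(ds_type: str) -> str:
    """Return the test strategy for a type; unlisted types fall back to
    simply saving credentials (the table's catch-all row)."""
    return TEST_STRATEGY.get(ds_type.lower(), "credentials-saved")
```

The fallback mirrors the "Credentials saved" row: types without a live test strategy store their credentials for later verification.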
Step 6: Save
Click Create Data Source to save. After creation, you are redirected to the data source details page, and the connection is available for use in workflows.
Post-Creation: Permissions
After creating a data source, you can configure who can access it:
Access Control Options
- Private (default): Only you can access this data source
- Allow All Users: All users in your organization can use this connection
- Specific Users: Share with individual users by adding them to the allowed users list
Permissions are managed on the data source details page using the datasources.updatePermissions, datasources.share, and datasources.unshare methods. In multi-tenant mode, you can only share with users in your own organization.
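The method names above come from this guide, but the request shapes below are assumptions; this is a hedged sketch of how a client might build those calls, not the platform's actual API contract:

```python
def build_share_request(datasource_id: str, user_ids: list) -> dict:
    """Build a request body for datasources.share (shape is an assumption)."""
    return {
        "method": "datasources.share",
        "params": {"datasourceId": datasource_id, "userIds": user_ids},
    }

def build_permissions_request(datasource_id: str, mode: str) -> dict:
    """Build a request body for datasources.updatePermissions.

    mode is illustrative: 'private', 'all_users', or 'specific_users',
    mirroring the three access control options listed above."""
    assert mode in {"private", "all_users", "specific_users"}
    return {
        "method": "datasources.updatePermissions",
        "params": {"datasourceId": datasource_id, "mode": mode},
    }
```

In multi-tenant mode the server would additionally reject user IDs outside your organization, so a client-side check here is only a convenience.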
Be careful when granting access to production data sources. Always follow your organization's security policies.
Post-Creation: Schema Discovery
After creating a data source, you can discover its schema/metadata:
- Navigate to the data source details page
- Click Refresh Metadata to fetch tables, schemas, databases, collections, or buckets
- For supported databases, you can also fetch column-level details for individual tables
Schema Discovery Support
| Support Level | Types | What It Returns |
|---|---|---|
| Full native connector | MySQL, PostgreSQL, MongoDB, Redis, S3, MinIO, Snowflake, BigQuery, Oracle, Redshift, Neo4j | Tables, schemas, databases, size, row counts |
| PostgreSQL-compatible | CockroachDB, CrateDB, TimescaleDB, Greenplum, pgvector, QuestDB | Uses PostgreSQL metadata queries |
| MySQL-compatible | SingleStore | Uses MySQL metadata queries |
| Placeholder/HTTP | Elasticsearch, DynamoDB, Firestore, Kafka, and others | Returns empty metadata (not yet implemented) |
Column-Level Metadata
Column-level metadata (column names, types, nullability, keys, defaults) can be fetched for individual tables for the following types: MySQL, PostgreSQL, Oracle, Redshift, Snowflake, BigQuery, and MongoDB.
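For the relational types in this list, column-level metadata is typically read from information_schema. The sketch below shows a standard query of that kind; it is not necessarily the platform's exact query, and the placeholder style varies by driver:

```python
# Standard information_schema query for column-level metadata; most of the
# relational types above (MySQL, PostgreSQL, Redshift, Snowflake) expose
# this view. Oracle and MongoDB use their own metadata mechanisms.
COLUMN_METADATA_SQL = """
SELECT column_name,
       data_type,
       is_nullable,
       column_default
FROM information_schema.columns
WHERE table_schema = %(schema)s
  AND table_name = %(table)s
ORDER BY ordinal_position
"""

def column_metadata_query(schema: str, table: str) -> tuple:
    """Return the query text plus its bound parameters."""
    return COLUMN_METADATA_SQL, {"schema": schema, "table": table}
```

Passing schema and table as bound parameters, rather than interpolating them into the SQL string, avoids injection through user-controlled table names.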
Common Connection Issues
Authentication Failures
- Verify username and password are correct
- Check if user has necessary database permissions
- Ensure IP whitelist includes the platform's IP addresses
Network Errors
- Verify hostname and port are correct
- Check firewall rules allow connections from the platform
- Ensure VPN or network connectivity is established
SSL/TLS Issues
- Enable SSL/TLS if required by your database
- Verify certificate validity
- Check if self-signed certificates need special configuration
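For self-signed or private-CA certificates, the usual fix is to point the client at the CA certificate rather than disabling verification. A Python-stdlib sketch under that assumption (the CA file path is yours to supply):

```python
import ssl
from typing import Optional

def make_ssl_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context suitable for a database connection.

    Pass ca_file for a self-signed or private CA; with no argument,
    the system's default CA bundle is used. Verification stays on
    either way -- avoid ssl.CERT_NONE in production.
    """
    context = ssl.create_default_context(cafile=ca_file)
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context
```

Most database drivers accept an `ssl` or equivalent option that can consume a context like this, though the exact parameter name varies by driver.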
Next Steps
- Learn how to use data sources in workflows
- Explore specific database configurations
- Set up workflows that use your data sources