
How to Connect a Data Source

This guide walks you through the process of creating and configuring a data source connection.

Step 1: Navigate to Data Sources

  1. Click Data Sources in the main navigation
  2. Click the Create Data Source button

Step 2: Select Data Source Type

Browse over 60 supported data source types organized into 10 categories:

  • Relational -- MySQL, PostgreSQL, MSSQL, Oracle, Redshift, Snowflake, BigQuery, CockroachDB, CrateDB, TimescaleDB, QuestDB, ClickHouse, SingleStore, Greenplum
  • Document/NoSQL -- MongoDB, Elasticsearch, DynamoDB, Firestore, Supabase, CouchDB, Couchbase
  • Key-Value/Cache -- Redis, Memcached
  • Graph -- Neo4j, Amazon Neptune, ArangoDB, TigerGraph
  • Vector -- Milvus, Pinecone, Weaviate, Qdrant, Chroma, pgvector, LanceDB, Vespa, Marqo
  • Multi-Model -- SurrealDB, FaunaDB
  • Spreadsheet -- Airtable, Google Sheets, Baserow, NocoDB, SeaTable, Grist
  • Cloud Storage -- Amazon S3, Google Cloud Storage, MinIO, Azure Blob Storage
  • Message Queue -- RabbitMQ, Apache Kafka, Amazon SQS, Apache Pulsar
  • Generic -- JDBC, ODBC

Use the category dropdown filter or search box to quickly find your data source type. Types are displayed in a paginated card grid (20 per page), sorted alphabetically.

Step 3: Configure Basic Information

After selecting a type, you are taken to the configuration form.

Basic Information Fields

  • Data source label (required): A unique identifier in kebab-case format (e.g., prod-mysql-db). The value is auto-formatted as you type -- only lowercase letters, numbers, and hyphens are allowed. It is stored as both the internal name and the display label.
  • Description: An optional free-text description of the data source's purpose (max 1000 characters).
  • Tags: Optional labels for organization (e.g., production, analytics, customer-data). Type a tag and press Enter or click Add.

Naming Convention

Use descriptive kebab-case names like prod-postgres or analytics-snowflake to easily identify your data sources. The name must match the pattern ^[a-z0-9]+(-[a-z0-9]+)*$. Names must be unique per user within their organization scope.
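
The label rules above can be sketched as follows; this is a minimal illustration, and isValidLabel / formatLabel are hypothetical helper names, not the platform's API:

```typescript
// The pattern quoted in the naming convention above.
const LABEL_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function isValidLabel(label: string): boolean {
  return LABEL_PATTERN.test(label);
}

// Approximates the "auto-formatted as you type" behavior: lowercase,
// replace runs of disallowed characters with a single hyphen, and
// trim hyphens from both ends.
function formatLabel(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

console.log(isValidLabel("prod-mysql-db")); // true
console.log(isValidLabel("Prod_MySQL"));    // false
console.log(formatLabel("Prod MySQL #1"));  // "prod-mysql-1"
```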

Step 4: Enter Connection Credentials

Credentials vary by data source type, and the form dynamically renders the correct fields for the type you selected. See the type-specific configuration guides for detailed field descriptions.

Step 5: Test Connection (Optional)

Before saving, you can test your connection:

  1. Click the Test Connection button
  2. The platform creates a temporary data source, tests it, then removes it
  3. If successful, you will see a "Connection test successful!" message
  4. If it fails, the error message indicates the issue (authentication, network, etc.)

Note that clicking Create Data Source also automatically tests the connection. The data source status will be set to connected if the test succeeds, or error if it fails. The data source is still saved either way.
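
The save-and-test behavior can be sketched like this; the function and parameter names are illustrative, and only the status values connected/error and the "saved either way" rule come from this guide:

```typescript
type Status = "connected" | "error";

// Illustrative sketch: creation always saves the data source, and the
// status is set from the automatic connection test.
async function createDataSource(
  config: Record<string, unknown>,
  testConnection: (c: Record<string, unknown>) => Promise<boolean>,
): Promise<{ config: Record<string, unknown>; status: Status }> {
  let status: Status = "error";
  try {
    if (await testConnection(config)) status = "connected";
  } catch {
    // A failed test does not block creation; the data source is still
    // saved, just with status "error".
  }
  return { config, status };
}

// Usage with a stub test that always succeeds.
createDataSource({ host: "db.example.com" }, async () => true)
  .then((ds) => console.log(ds.status)); // "connected"
```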

Connection Testing by Type

The platform uses different strategies depending on the data source type:

  • Native driver test (MySQL, PostgreSQL, MongoDB, Redis, S3/MinIO, Snowflake, BigQuery, Oracle, MSSQL, Neo4j, Elasticsearch, Redshift) -- connects with the actual database driver and executes a test query (e.g., SELECT 1, ping, ListBuckets)
  • PostgreSQL wire protocol (ClickHouse, QuestDB, CockroachDB, Greenplum, pgvector, Supabase) -- uses the pg driver, since these databases support the PostgreSQL wire protocol
  • MySQL wire protocol (SingleStore, TimescaleDB, CrateDB) -- uses the mysql2 driver, since these databases support the MySQL wire protocol
  • AWS STS validation (DynamoDB, SQS) -- validates AWS credentials using GetCallerIdentity
  • GCS validation (GCS) -- lists buckets to verify Google Cloud credentials
  • Credentials saved (Pinecone, Weaviate, Qdrant, Chroma, Firestore, Kafka, RabbitMQ, Pulsar, and others) -- saves the credentials and shows the message "Credentials saved - use Test Connection to verify"
  • Connection string check (JDBC, ODBC) -- validates that a connection string is provided
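
As a sketch, this dispatch could be modeled as a simple lookup; the type keys and strategy names below are illustrative spellings, and only the groupings come from this guide:

```typescript
type TestStrategy =
  | "native-driver"
  | "pg-wire"
  | "mysql-wire"
  | "aws-sts"
  | "gcs-list"
  | "credentials-only"
  | "connection-string";

// Partial mapping from data source type to test strategy, following
// the groupings described above.
const TEST_STRATEGY: Record<string, TestStrategy> = {
  mysql: "native-driver",
  postgresql: "native-driver",
  mongodb: "native-driver",
  clickhouse: "pg-wire",     // speaks the PostgreSQL wire protocol
  questdb: "pg-wire",
  singlestore: "mysql-wire", // speaks the MySQL wire protocol
  dynamodb: "aws-sts",       // validated via STS GetCallerIdentity
  gcs: "gcs-list",
  jdbc: "connection-string",
};

// Types without a dedicated tester fall back to just saving credentials.
function strategyFor(type: string): TestStrategy {
  return TEST_STRATEGY[type] ?? "credentials-only";
}
```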

Step 6: Save

Click Create Data Source to save. The connection is now available for use in workflows. After creation, you will be redirected to the data source details page.

Post-Creation: Permissions

After creating a data source, you can configure who can access it:

Access Control Options

  • Private (default): Only you can access this data source
  • Allow All Users: All users in your organization can use this connection
  • Specific Users: Share with individual users by adding them to the allowed users list

Permissions are managed on the data source details page using the datasources.updatePermissions, datasources.share, and datasources.unshare methods. In multi-tenant mode, you can only share with users in your own organization.
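
The three access-control options can be modeled roughly as follows; this is a sketch of the semantics described above, and the types and function are hypothetical, not the datasources.* API:

```typescript
type AccessMode = "private" | "all_users" | "specific_users";

interface DataSourcePermissions {
  ownerId: string;
  mode: AccessMode;
  allowedUsers: string[]; // consulted only when mode is "specific_users"
}

// Returns whether a user in the same organization may use the data source.
function canAccess(p: DataSourcePermissions, userId: string): boolean {
  if (userId === p.ownerId) return true; // owners always have access
  switch (p.mode) {
    case "private":
      return false;
    case "all_users":
      return true;
    case "specific_users":
      return p.allowedUsers.includes(userId);
  }
}
```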

Access Control

Be careful when granting access to production data sources. Always follow your organization's security policies.

Post-Creation: Schema Discovery

After creating a data source, you can discover its schema/metadata:

  1. Navigate to the data source details page
  2. Click Refresh Metadata to fetch tables, schemas, databases, collections, or buckets
  3. For supported databases, you can also fetch column-level details for individual tables

Schema Discovery Support

  • Full native connector (MySQL, PostgreSQL, MongoDB, Redis, S3, MinIO, Snowflake, BigQuery, Oracle, Redshift, Neo4j) -- returns tables, schemas, databases, sizes, and row counts
  • PostgreSQL-compatible (CockroachDB, CrateDB, TimescaleDB, Greenplum, pgvector, QuestDB) -- uses PostgreSQL metadata queries
  • MySQL-compatible (SingleStore) -- uses MySQL metadata queries
  • Placeholder/HTTP (Elasticsearch, DynamoDB, Firestore, Kafka, and others) -- returns empty metadata (not yet implemented)

Column-Level Metadata

Column-level metadata (column names, types, nullability, keys, defaults) can be fetched for individual tables for the following types: MySQL, PostgreSQL, Oracle, Redshift, Snowflake, BigQuery, and MongoDB.
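
For the PostgreSQL family, this kind of column detail is typically available from the standard information_schema; the query below is a representative illustration, not necessarily the one the platform runs:

```typescript
// Representative query for column-level metadata on PostgreSQL-family
// databases, using the standard information_schema (an assumption for
// illustration). $1 and $2 are the schema and table name parameters.
const COLUMN_METADATA_QUERY = `
  SELECT column_name, data_type, is_nullable, column_default
  FROM information_schema.columns
  WHERE table_schema = $1 AND table_name = $2
  ORDER BY ordinal_position
`;

console.log(COLUMN_METADATA_QUERY.includes("information_schema.columns")); // true
```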

Common Connection Issues

Authentication Failures

  • Verify username and password are correct
  • Check if user has necessary database permissions
  • Ensure IP whitelist includes the platform's IP addresses

Network Errors

  • Verify hostname and port are correct
  • Check firewall rules allow connections from the platform
  • Ensure VPN or network connectivity is established

SSL/TLS Issues

  • Enable SSL/TLS if required by your database
  • Verify certificate validity
  • Check if self-signed certificates need special configuration

Next Steps