
Amazon S3 Configuration

Connect to Amazon S3 for file storage, data lakes, backups, and object storage needs.

Connection Parameters

Required Fields

| Field | Description | Example |
| --- | --- | --- |
| Bucket Name | S3 bucket name | my-data-bucket |
| Region | AWS region | us-east-1 |
| Access Key ID | AWS access key | AKIA... |
| Secret Access Key | AWS secret key (encrypted at rest) | |

Optional Fields

| Field | Description | Example |
| --- | --- | --- |
| Custom Endpoint | For S3-compatible services (MinIO, DigitalOcean Spaces, etc.) | https://minio.example.com:9000 |

Credential Field Names

The credential fields use camelCase names: bucketName, region, accessKeyId, secretAccessKey, and endpoint. These are the exact field names you will see in STRONGLY_SERVICES.

Configuration Example

When creating an S3 data source, provide the following information:

| Field | Example Value | Notes |
| --- | --- | --- |
| Data source label | prod-s3-storage | Kebab-case unique identifier (used as both name and label) |
| Bucket Name | my-data-bucket | S3 bucket name |
| Region | us-east-1 | AWS region |
| Access Key ID | AKIA... | AWS access key |
| Secret Access Key | wJalr... | Encrypted at rest |
| Custom Endpoint | - | Only for S3-compatible services |

AWS IAM Configuration

Creating an IAM User

  1. Go to AWS IAM Console
  2. Navigate to Users -> Add Users
  3. Select Access key - Programmatic access
  4. Attach policies (see Required Permissions)
  5. Complete user creation
  6. Save the Access Key ID and Secret Access Key

Security

Save your secret access key immediately. AWS won't show it again after creation.

Required Permissions

Grant the following permissions to your IAM user:

Read-Only Access

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-bucket",
        "arn:aws:s3:::my-data-bucket/*"
      ]
    }
  ]
}

Read-Write Access

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-bucket",
        "arn:aws:s3:::my-data-bucket/*"
      ]
    }
  ]
}

Full Access (Including Bucket Management)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-data-bucket",
        "arn:aws:s3:::my-data-bucket/*"
      ]
    }
  ]
}

Least Privilege

Use the minimum required permissions. For most applications, read-only or read-write access is sufficient.
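
Permissions can be tightened further by scoping access to a single key prefix. The policy below is an illustrative sketch (the app-data/ prefix is a placeholder, not something the platform requires); it uses the standard s3:prefix condition key to restrict listing:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-data-bucket/app-data/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-data-bucket",
      "Condition": {"StringLike": {"s3:prefix": "app-data/*"}}
    }
  ]
}
```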

Test Connection

When you create or test an S3 data source, the platform uses the @aws-sdk/client-s3 library to create an S3 client with the provided region, accessKeyId, and secretAccessKey. If a custom endpoint is provided, forcePathStyle is enabled for S3-compatible services. The test executes a ListBuckets command to verify credentials. On success, it returns "Bucket accessible."

MinIO data sources use the same test connection logic.
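
The client-configuration behavior described above can be sketched as a small helper. This is an illustrative reconstruction, not the platform's actual code; the input field names match the STRONGLY_SERVICES payload shown later on this page.

```python
def build_s3_client_config(ds):
    """Build an S3 client config dict from a data source record.

    Mirrors the test-connection behavior described above: when a
    custom endpoint is present, path-style addressing is enabled
    for S3-compatible services.
    """
    config = {
        'region': ds['region'],
        'credentials': {
            'accessKeyId': ds['accessKeyId'],
            'secretAccessKey': ds['secretAccessKey'],
        },
    }
    if ds.get('endpoint'):
        config['endpoint'] = ds['endpoint']
        config['forcePathStyle'] = True
    return config

# Plain AWS S3: no endpoint, so no path-style flag
aws_cfg = build_s3_client_config({
    'region': 'us-east-1', 'accessKeyId': 'AKIA...', 'secretAccessKey': 'wJalr...'
})

# MinIO: endpoint present, forcePathStyle enabled
minio_cfg = build_s3_client_config({
    'region': 'us-east-1', 'accessKeyId': 'minio', 'secretAccessKey': 'secret',
    'endpoint': 'https://minio.example.com:9000'
})
print(minio_cfg['forcePathStyle'])  # True
```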

Schema Discovery

S3 has full native schema discovery support. Clicking Refresh Metadata returns:

  • Buckets: If a bucketName is configured, returns that single bucket. Otherwise, lists all accessible buckets via ListBuckets.
  • Size: Total size of all objects in the configured bucket (in bytes)
  • Object count: Total number of objects in the bucket (returned in the rowCount field)

The bucket contents are enumerated using ListObjectsV2 with pagination (1000 objects per page).
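
The size and object-count aggregation can be sketched with a pure helper over ListObjectsV2 pages. The page shape (Contents/Size) is the standard S3 API response shape; the helper name is illustrative:

```python
def summarize_bucket_pages(pages):
    """Aggregate total size (bytes) and object count across
    ListObjectsV2 response pages (up to 1000 objects each)."""
    total_size = 0
    object_count = 0
    for page in pages:
        for obj in page.get('Contents', []):
            total_size += obj['Size']
            object_count += 1
    return {'size': total_size, 'rowCount': object_count}

# Two fake pages standing in for paginated API responses
pages = [
    {'Contents': [{'Key': 'a.txt', 'Size': 100}, {'Key': 'b.txt', 'Size': 250}]},
    {'Contents': [{'Key': 'c.txt', 'Size': 50}]},
]
print(summarize_bucket_pages(pages))  # {'size': 400, 'rowCount': 3}
```

With boto3, the pages would come from `s3.get_paginator('list_objects_v2').paginate(Bucket=...)` instead of a list of dicts.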

Browsing Bucket Contents

The platform also provides a dedicated method (datasources.getS3BucketContents) to browse S3 bucket contents with folder navigation. This returns files and folders (using the / delimiter) with their keys, sizes, and last modified dates. This works for S3, MinIO, GCS, and Azure Blob Storage type data sources.
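
Delimiter-based folder navigation can be sketched as follows. The response shape (CommonPrefixes/Contents) is the standard ListObjectsV2 shape when a Delimiter is set; the helper name is illustrative, not part of the platform's API:

```python
def split_listing(response):
    """Split a delimiter-based ListObjectsV2 response into folders
    (CommonPrefixes) and files (Contents)."""
    folders = [p['Prefix'] for p in response.get('CommonPrefixes', [])]
    files = [
        {'key': o['Key'], 'size': o['Size'], 'lastModified': o.get('LastModified')}
        for o in response.get('Contents', [])
    ]
    return folders, files

# Simulated response for listing the bucket root with Delimiter='/'
response = {
    'CommonPrefixes': [{'Prefix': 'logs/'}, {'Prefix': 'exports/'}],
    'Contents': [{'Key': 'readme.txt', 'Size': 12, 'LastModified': None}],
}
folders, files = split_listing(response)
print(folders)  # ['logs/', 'exports/']
```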

Usage in Workflows (STRONGLY_SERVICES)

When an S3 data source is attached to a workflow, its credentials are injected via the STRONGLY_SERVICES environment variable with all fields at the top level (not nested under a credentials key). No connectionString is generated for S3; the s3://bucket format is used for display only:

{
  "datasources": {
    "prod_s3_storage": {
      "type": "s3",
      "name": "prod-s3-storage",
      "bucketName": "my-data-bucket",
      "region": "us-east-1",
      "accessKeyId": "AKIA...",
      "secretAccessKey": "wJalr...",
      "endpoint": null
    }
  }
}

Python Example (boto3)

import json
import os

import boto3

# Parse STRONGLY_SERVICES environment variable
services = json.loads(os.getenv('STRONGLY_SERVICES', '{}'))
datasources = services.get('datasources', {})

# Get S3 data source (key is sanitized name: hyphens become underscores)
s3_ds = datasources['prod_s3_storage']

# Create S3 client using top-level camelCase fields
s3_config = {
    'aws_access_key_id': s3_ds['accessKeyId'],
    'aws_secret_access_key': s3_ds['secretAccessKey'],
    'region_name': s3_ds['region']
}

# Add custom endpoint if provided (for S3-compatible services)
if s3_ds.get('endpoint'):
    s3_config['endpoint_url'] = s3_ds['endpoint']

s3 = boto3.client('s3', **s3_config)

# List objects in bucket
response = s3.list_objects_v2(Bucket=s3_ds['bucketName'])
for obj in response.get('Contents', []):
    print(f"File: {obj['Key']}, Size: {obj['Size']} bytes")

# Upload a file
s3.upload_file('local-file.txt', s3_ds['bucketName'], 'remote-file.txt')

# Download a file
s3.download_file(s3_ds['bucketName'], 'remote-file.txt', 'downloaded-file.txt')

# Read file content directly
obj = s3.get_object(Bucket=s3_ds['bucketName'], Key='data.json')
content = obj['Body'].read().decode('utf-8')
print(content)

Python with Custom Endpoint (MinIO)

import boto3

# For S3-compatible services like MinIO
s3 = boto3.client(
    's3',
    endpoint_url=s3_ds.get('endpoint'),  # e.g., 'https://minio.example.com'
    aws_access_key_id=s3_ds['accessKeyId'],
    aws_secret_access_key=s3_ds['secretAccessKey'],
    region_name=s3_ds['region']
)

Node.js Example (AWS SDK v3)

const { S3Client, ListObjectsV2Command, GetObjectCommand, PutObjectCommand } = require('@aws-sdk/client-s3');
const { createReadStream } = require('fs');

// Parse STRONGLY_SERVICES environment variable
const services = JSON.parse(process.env.STRONGLY_SERVICES || '{}');
const datasources = services.datasources || {};

// Get S3 data source (key is sanitized name)
const s3ds = datasources['prod_s3_storage'];

// Create S3 client using top-level camelCase fields
const s3Config = {
  region: s3ds.region,
  credentials: {
    accessKeyId: s3ds.accessKeyId,
    secretAccessKey: s3ds.secretAccessKey
  }
};

// Add custom endpoint if provided
if (s3ds.endpoint) {
  s3Config.endpoint = s3ds.endpoint;
  s3Config.forcePathStyle = true;
}

const s3Client = new S3Client(s3Config);

// await is only valid inside an async function in CommonJS modules
async function main() {
  // List objects
  const listResponse = await s3Client.send(new ListObjectsV2Command({
    Bucket: s3ds.bucketName
  }));
  console.log('Objects:', listResponse.Contents);

  // Upload a file
  await s3Client.send(new PutObjectCommand({
    Bucket: s3ds.bucketName,
    Key: 'remote-file.txt',
    Body: createReadStream('local-file.txt')
  }));

  // Download a file
  const downloadResponse = await s3Client.send(new GetObjectCommand({
    Bucket: s3ds.bucketName,
    Key: 'remote-file.txt'
  }));
  const body = await downloadResponse.Body.transformToString();
  console.log(body);
}

main();

Common Operations

Upload File with Metadata

s3.upload_file(
    'local-file.txt',
    s3_ds['bucketName'],
    'remote-file.txt',
    ExtraArgs={
        'Metadata': {
            'uploaded-by': 'my-app',
            'content-type': 'text/plain'
        },
        'ContentType': 'text/plain'
    }
)

Generate Presigned URL

Create temporary URLs for file access:

# Generate presigned URL (valid for 1 hour)
url = s3.generate_presigned_url(
    'get_object',
    Params={
        'Bucket': s3_ds['bucketName'],
        'Key': 'private-file.txt'
    },
    ExpiresIn=3600
)
print(f"Temporary URL: {url}")

Copy Objects Between Buckets

copy_source = {
    'Bucket': 'source-bucket',
    'Key': 'source-file.txt'
}

s3.copy_object(
    CopySource=copy_source,
    Bucket='destination-bucket',
    Key='destination-file.txt'
)

Delete Objects

# Delete single object
s3.delete_object(Bucket=s3_ds['bucketName'], Key='file-to-delete.txt')

# Delete multiple objects
s3.delete_objects(
    Bucket=s3_ds['bucketName'],
    Delete={
        'Objects': [
            {'Key': 'file1.txt'},
            {'Key': 'file2.txt'},
            {'Key': 'file3.txt'}
        ]
    }
)

S3-Compatible Services

This configuration also works with S3-compatible services:

MinIO

s3 = boto3.client(
    's3',
    endpoint_url='https://minio.example.com',
    aws_access_key_id=s3_ds['accessKeyId'],
    aws_secret_access_key=s3_ds['secretAccessKey'],
    region_name='us-east-1'
)

DigitalOcean Spaces

s3 = boto3.client(
    's3',
    endpoint_url='https://nyc3.digitaloceanspaces.com',
    aws_access_key_id=s3_ds['accessKeyId'],
    aws_secret_access_key=s3_ds['secretAccessKey'],
    region_name='nyc3'
)

Wasabi

s3 = boto3.client(
    's3',
    endpoint_url='https://s3.wasabisys.com',
    aws_access_key_id=s3_ds['accessKeyId'],
    aws_secret_access_key=s3_ds['secretAccessKey'],
    region_name='us-east-1'
)

Common Issues

Access Denied

  • Verify IAM permissions include required S3 actions
  • Check bucket policy allows access from IAM user
  • Ensure bucket exists and name is correct
  • Verify access keys are correct and not expired

Invalid Access Key ID

  • Check access key ID is correct
  • Verify IAM user still exists
  • Ensure access key hasn't been deleted or deactivated
  • Regenerate keys if necessary

Bucket Not Found

  • Verify bucket name is correct (case-sensitive)
  • Check bucket exists in correct region
  • Ensure IAM user has permission to access bucket

Region Mismatch

  • Verify region matches bucket location
  • Use us-east-1 for buckets created without an explicit region
  • Check region when creating S3 client
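
A small helper can map common botocore error codes onto the issues above. The selection of codes and hint text is an assumption about what you'll most often see, not an exhaustive or platform-defined list:

```python
# Map common S3 error codes to the troubleshooting sections above.
ERROR_HINTS = {
    'AccessDenied': 'Access Denied: check IAM permissions and bucket policy',
    'InvalidAccessKeyId': 'Invalid Access Key ID: verify the key exists and is active',
    'SignatureDoesNotMatch': 'Invalid secret key: re-check secretAccessKey',
    'NoSuchBucket': 'Bucket Not Found: verify name (case-sensitive) and region',
    'PermanentRedirect': 'Region Mismatch: recreate the client with the bucket region',
}

def classify_s3_error(error_code):
    """Return a troubleshooting hint for a botocore error code."""
    return ERROR_HINTS.get(error_code, f'Unrecognized error code: {error_code}')

print(classify_s3_error('NoSuchBucket'))
```

With boto3, the code comes from `err.response['Error']['Code']` inside an `except botocore.exceptions.ClientError as err:` block.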

Best Practices

  1. Use IAM Roles: For applications running on AWS, use IAM roles instead of access keys
  2. Least Privilege: Grant minimal required permissions
  3. Enable Versioning: Use S3 versioning for important data
  4. Server-Side Encryption: Enable encryption at rest for sensitive data
  5. Lifecycle Policies: Configure lifecycle policies to manage storage costs
  6. Access Logging: Enable S3 access logging for audit trails
  7. Secure Keys: Never commit access keys to version control
  8. Use HTTPS: Always use HTTPS endpoints to encrypt data in transit
  9. Multipart Upload: Use multipart upload for large files (>100MB)
  10. Monitor Costs: Set up billing alerts and monitor S3 usage

Performance Optimization

Multipart Upload for Large Files

import boto3
from boto3.s3.transfer import TransferConfig

# Configure multipart upload thresholds
# (S3 requires multipart parts of at least 5 MB)
config = TransferConfig(
    multipart_threshold=25 * 1024 * 1024,  # 25 MB
    max_concurrency=10,
    multipart_chunksize=25 * 1024 * 1024,
    use_threads=True
)

s3.upload_file(
    'large-file.zip',
    s3_ds['bucketName'],
    'remote-large-file.zip',
    Config=config
)

Parallel Downloads

# Download multiple files in parallel
from concurrent.futures import ThreadPoolExecutor

def download_file(key):
    s3.download_file(s3_ds['bucketName'], key, f"local-{key}")

files = ['file1.txt', 'file2.txt', 'file3.txt']
with ThreadPoolExecutor(max_workers=5) as executor:
    executor.map(download_file, files)