FinOps & Cost Management

Track infrastructure costs, manage budgets with threshold enforcement, schedule resource auto-stop/start, organize resources into groups, and analyze cost trends with forecasting and anomaly detection.

Overview

The FinOps module provides comprehensive cost management for the Strongly AI platform. It combines AWS cost data with platform-level resource tracking to give administrators full visibility into infrastructure spending, execution costs, and optimization opportunities.

Core Capabilities

  • Cost Dashboard -- Real-time cost metrics with monthly trends, daily breakdowns, service-level analysis, and cost forecasting
  • Budgets -- Create budgets at platform, user, or resource group scope with threshold alerts and enforcement actions
  • Resource Groups -- Organize platform resources (apps, workflows, add-ons, models) into logical groups for budget and schedule management
  • Resource Schedules -- Time-based auto-stop/start schedules to reduce costs during off-hours
  • Cost Forecasting -- ML-based predictions of future costs based on historical spending patterns
  • Anomaly Detection -- Automatic identification of unusual cost spikes
  • Execution Costs -- Track costs associated with workflow executions
  • Cache Management -- Backend caching layer for cost data with manual refresh capability

Architecture

FinOps data flows through several layers:

  1. Backend API -- Fetches cost data from AWS Cost Explorer, calculates execution costs, and generates forecasts
  2. Cache Layer -- Stores fetched cost data so that reads are fast and repeated AWS API calls are avoided
  3. Frontend Methods -- Proxy requests to the backend API or serve cached data
  4. Budget/Schedule/Group Management -- Managed via the platform's data layer

Backend API Endpoints

The platform communicates with the backend service for cost data:

| Endpoint | Description |
|---|---|
| GET /api/v1/finops/monthly-costs | Monthly cost breakdown by service |
| GET /api/v1/finops/daily-costs | Daily cost breakdown |
| GET /api/v1/finops/cost-forecast | ML-based cost predictions |
| GET /api/v1/finops/service-breakdown | Cost breakdown by AWS service for a date range |
| GET /api/v1/finops/savings-recommendations | AWS optimization recommendations |
| GET /api/v1/finops/cost-anomalies | Detected cost anomalies |
| GET /api/v1/finops/execution-costs | Workflow execution cost tracking |
| GET /api/v1/finops/combined-costs | Combined infrastructure + execution costs |
| POST /api/v1/finops/cache/trigger-refresh | Trigger background cache refresh |
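
As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) an HTTP request for a named endpoint. The paths and HTTP verbs come from the table above; the base URL, the bearer-token header, and the query-parameter names are assumptions made for the example and are not documented here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder host, not a documented value.
BASE_URL = "https://platform.example.com"

# Verbs and paths copied from the endpoint table.
ENDPOINTS = {
    "monthly-costs": ("GET", "/api/v1/finops/monthly-costs"),
    "daily-costs": ("GET", "/api/v1/finops/daily-costs"),
    "cost-forecast": ("GET", "/api/v1/finops/cost-forecast"),
    "service-breakdown": ("GET", "/api/v1/finops/service-breakdown"),
    "savings-recommendations": ("GET", "/api/v1/finops/savings-recommendations"),
    "cost-anomalies": ("GET", "/api/v1/finops/cost-anomalies"),
    "execution-costs": ("GET", "/api/v1/finops/execution-costs"),
    "combined-costs": ("GET", "/api/v1/finops/combined-costs"),
    "cache-refresh": ("POST", "/api/v1/finops/cache/trigger-refresh"),
}


def build_request(name, token, params=None):
    """Build a urllib Request for one FinOps endpoint without sending it."""
    method, path = ENDPOINTS[name]
    url = BASE_URL + path
    if params:
        # Assumption: parameters are passed as a query string.
        url += "?" + urlencode(params)
    headers = {
        # Assumption: bearer-token authentication.
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return Request(url, method=method, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the response format is not specified in this document.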

Cost Dashboard

Accessing the Dashboard

  1. Click FinOps in the main navigation
  2. The dashboard displays key metrics:
    • Current Month Spend -- Real-time cost accumulation with trend vs. previous month
    • Average Monthly Cost -- Baseline for comparison over selected time range
    • Projected Next Month -- ML-based forecast of upcoming costs
    • Potential Savings -- Total from optimization recommendations

Data Sources

The dashboard aggregates data from multiple sources:

  • Monthly costs -- Cached from AWS Cost Explorer, broken down by service
  • Daily costs -- Granular daily spending for the last 30 days
  • Predictions -- ML-based cost forecasts
  • Service breakdown -- Cost distribution across AWS services
  • Anomalies -- Statistically detected cost spikes

All data is served from cache for fast access. When no cached data exists, the system falls back to legacy cache entries for backward compatibility.
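
The fallback order described above can be sketched as a two-step lookup. This is a minimal illustration, assuming both caches behave like key-value maps; the function name, the key names, and the dictionary representation are hypothetical and do not describe the platform's actual storage.

```python
from typing import Any, Optional


def read_cost_data(key: str, cache: dict, legacy_cache: dict) -> Optional[Any]:
    """Return the current cache entry if present, else the legacy entry, else None."""
    if key in cache:
        return cache[key]
    # No current entry: fall back to the legacy cache for backward compatibility.
    return legacy_cache.get(key)
```

Checking `key in cache` rather than truthiness means an empty but present entry (for example an empty list of anomalies) is still served from the current cache instead of triggering the fallback.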

Cost Explorer Methods

| Method | Parameters | Description |
|---|---|---|
| costExplorer.getAllData | months (default: 12) | Get all cached cost data in a single call |
| costExplorer.getMonthlyCosts | months (default: 12) | Monthly cost breakdown |
| costExplorer.getDailyCosts | days (default: 30) | Daily cost breakdown |
| costExplorer.getCostForecast | monthsAhead (default: 3) | Cost predictions |
| costExplorer.getServiceBreakdown | startDate, endDate | Service-level cost breakdown for a date range |
| costExplorer.getSavingsRecommendations | -- | AWS optimization recommendations |
| costExplorer.getCostAnomalies | -- | Detected cost anomalies |
| costExplorer.getExecutionCosts | months, year | Workflow execution costs |
| costExplorer.getCombinedCosts | months (default: 12) | Infrastructure + execution costs combined |
| costExplorer.triggerCacheRefresh | -- | Trigger background cache refresh |

All cost methods require admin role access.
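
The sketch below shows one way a caller could combine the documented defaults with the admin requirement before invoking a method. The default values are taken from the table above; the `prepare_call` helper, the `"admin"` role string, and the use of `PermissionError` are assumptions for illustration only.

```python
# Defaults copied from the Cost Explorer Methods table.
DEFAULTS = {
    "costExplorer.getAllData": {"months": 12},
    "costExplorer.getMonthlyCosts": {"months": 12},
    "costExplorer.getDailyCosts": {"days": 30},
    "costExplorer.getCostForecast": {"monthsAhead": 3},
    "costExplorer.getCombinedCosts": {"months": 12},
}


def prepare_call(method, user_roles, **overrides):
    """Check the admin role, then merge caller overrides over the defaults."""
    # Assumption: the role is identified by the string "admin".
    if "admin" not in user_roles:
        raise PermissionError(f"{method} requires admin role access")
    params = dict(DEFAULTS.get(method, {}))  # copy so DEFAULTS is never mutated
    params.update(overrides)
    return method, params
```

For example, `prepare_call("costExplorer.getDailyCosts", {"admin"}, days=7)` yields the method name together with `{"days": 7}`, while omitting `days` keeps the default of 30.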

Cache Refresh

Cost data is cached to minimize AWS API calls. To refresh the cache:

  1. Navigate to the FinOps dashboard
  2. Click the Refresh button
  3. The system triggers a background cache refresh via the backend API
  4. New data appears on the dashboard once the refresh completes

The refresh is fire-and-forget -- the UI returns immediately while the backend processes the refresh asynchronously.
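
The fire-and-forget pattern can be illustrated in-process with a background thread. This is only a sketch of the pattern: in the platform the work is started through the backend API, and the `trigger_refresh` helper and its `refresh_fn` argument are hypothetical.

```python
import threading


def trigger_refresh(refresh_fn):
    """Start refresh_fn in the background and return without waiting for it."""
    worker = threading.Thread(target=refresh_fn, daemon=True)
    worker.start()
    # The thread handle is returned only so a caller may join it if desired;
    # a pure fire-and-forget caller would ignore it.
    return worker
```

Because the caller does not wait for the result, it cannot learn from this call whether the refresh succeeded; the dashboard simply shows new data once the refresh completes.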

Resource Groups

Resource Groups allow you to organize platform resources into logical groupings for budget and schedule management.

Supported Resource Types

| Type | Description |
|---|---|
| app | Deployed applications |
| workflow | Workflow definitions |
| addon | Platform add-ons |
| workspace | Workspaces |
| project | Projects |
| fine_tuning | Fine-tuning jobs |
| self_hosted_model | Self-hosted model deployments |
| ml_model | ML model registry entries |
| automl | AutoML jobs |

Creating a Resource Group

  1. Navigate to FinOps, then open Resource Groups
  2. Click Create Resource Group
  3. Enter a name and description
  4. Optionally add initial resources
  5. Click Create

Managing Resources in Groups

| Operation | Method | Description |
|---|---|---|
| List groups | resourceGroups.list | List groups with optional status, search, and pagination filters |
| Get group | resourceGroups.get | Get a single group by ID |
| Create group | resourceGroups.create | Create a new group |
| Update group | resourceGroups.update | Update name, description, or status |
| Delete group | resourceGroups.delete | Delete a group |
| Add resource | resourceGroups.addResource | Add a resource (type, ID, name) to a group |
| Remove resource | resourceGroups.removeResource | Remove a resource from a group |
| Find by resource | resourceGroups.findByResource | Find all groups containing a specific resource |

Access Control

  • Administrators can see all resource groups
  • Non-admin users can only see groups they own (ownerId matches their user ID)
  • Only the owner or an admin can modify or delete a group
  • Duplicate resources (same type + ID) are prevented within a group
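
The access rules above can be expressed as a small model. The following is a sketch under stated assumptions: the `ResourceGroup` class and helper functions are hypothetical, and rejecting a duplicate with `ValueError` is one possible reading of "prevented" (the platform might instead ignore the duplicate silently).

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    name: str
    owner_id: str
    # Each resource is stored as a (type, id, name) tuple.
    resources: list = field(default_factory=list)


def can_view(group, user_id, is_admin):
    """Admins see every group; other users see only groups they own."""
    return is_admin or group.owner_id == user_id


def can_modify(group, user_id, is_admin):
    """Only the owner or an admin may modify or delete a group."""
    return is_admin or group.owner_id == user_id


def add_resource(group, user_id, is_admin, rtype, rid, rname):
    """Add a resource, enforcing ownership and (type, id) uniqueness."""
    if not can_modify(group, user_id, is_admin):
        raise PermissionError("only the owner or an admin can modify this group")
    if any(t == rtype and i == rid for t, i, _ in group.resources):
        # Assumption: duplicates are rejected with an error.
        raise ValueError(f"{rtype}/{rid} is already in the group")
    group.resources.append((rtype, rid, rname))
```

Uniqueness is checked on type and ID together, so an app and a workflow that happen to share an ID can both belong to the same group.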

Monthly Cost Trends

  1. Go to the Overview tab
  2. Review the monthly cost trend chart showing:
    • Total cost per month
    • Month-over-month percentage change
    • Top AWS services contributing to costs
  3. Hover over any month to see a detailed breakdown
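
The month-over-month percentage change shown on the chart is the difference between consecutive months divided by the earlier month. A minimal sketch, assuming monthly totals arrive as ordered (label, cost) pairs; the function name and input shape are illustrative, not the platform's data format.

```python
def month_over_month(totals):
    """Return (label, pct_change) for each month after the first.

    totals is an ordered list of (month_label, cost) pairs.
    pct_change is None when the previous month's cost is zero,
    because the ratio is undefined in that case.
    """
    changes = []
    for (_, prev), (label, cur) in zip(totals, totals[1:]):
        pct = None if prev == 0 else (cur - prev) / prev * 100
        changes.append((label, pct))
    return changes
```

With totals of 1000, 1100, and 990 for three consecutive months, the second month is up about 10% and the third is down about 10% relative to the month before it.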

Service Breakdown

  1. Click the Service Breakdown tab
  2. View cost distribution across AWS services (EC2, RDS, S3, EKS, etc.)
  3. Review the detailed table with:
    • Service name
    • Current month cost
    • Previous month cost
    • Percentage change
    • Percentage of total spend
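
The columns of the detailed table can be derived from two per-service cost maps. The sketch below assumes current and previous month costs are available as dictionaries keyed by service name; the function and field names are hypothetical.

```python
def service_rows(current, previous):
    """Build table rows, largest current cost first.

    current and previous map service name -> cost for the month.
    """
    total = sum(current.values())
    rows = []
    for service, cost in sorted(current.items(), key=lambda kv: kv[1], reverse=True):
        prev = previous.get(service, 0.0)
        rows.append({
            "service": service,
            "current": cost,
            "previous": prev,
            # Undefined when the service had no cost last month.
            "change_pct": None if prev == 0 else (cost - prev) / prev * 100,
            "share_pct": 0.0 if total == 0 else cost / total * 100,
        })
    return rows
```

A service that appears only in the current month gets a previous cost of zero and no percentage change, which keeps new services visible without dividing by zero.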

Cost Forecast

  1. Navigate to the Forecast tab
  2. View 3-month cost projections based on historical patterns
  3. Review confidence intervals (upper and lower bounds)
  4. Use the forecast to plan budgets and identify potential overruns
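
One way to use the confidence bounds when planning is to compare them with a monthly budget. This is a sketch, not the platform's budget enforcement logic: the field names (`month`, `lower`, `upper`) and the three status labels are assumptions chosen for the example.

```python
def flag_overruns(forecast, monthly_budget):
    """Classify each forecast month against a budget using its bounds.

    forecast is a list of dicts with "month", "lower", and "upper" keys.
    """
    flags = []
    for point in forecast:
        if point["lower"] > monthly_budget:
            # Even the optimistic bound exceeds the budget.
            status = "likely overrun"
        elif point["upper"] > monthly_budget:
            # Only the pessimistic bound exceeds the budget.
            status = "possible overrun"
        else:
            status = "within budget"
        flags.append((point["month"], status))
    return flags
```

Using the bounds rather than the point prediction alone separates months that merely carry risk from months where an overrun is expected under every projected scenario.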

Cost Anomalies

  1. Click the Anomalies tab
  2. Review detected cost anomalies:
    • Date and duration of the anomaly
    • Which service caused the anomaly
    • Expected vs. actual cost
    • Dollar amount of the anomaly
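
This document does not specify the detection method, so the sketch below uses a simple trailing-window z-score as a stand-in to show how "expected vs. actual" and the dollar amount of an anomaly can be derived. The window size, the threshold, and the function name are illustrative assumptions.

```python
from statistics import mean, pstdev


def detect_anomalies(daily_costs, window=7, threshold=3.0):
    """Flag days whose cost is far above the trailing-window average.

    daily_costs is an ordered list of (date_label, cost) pairs.
    A day is flagged when it exceeds the mean of the previous `window`
    days by more than `threshold` standard deviations.
    """
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = [cost for _, cost in daily_costs[i - window:i]]
        expected = mean(history)
        spread = pstdev(history)
        date, actual = daily_costs[i]
        # A perfectly flat history (spread == 0) is skipped to avoid dividing by zero.
        if spread > 0 and (actual - expected) / spread > threshold:
            anomalies.append({
                "date": date,
                "expected": expected,
                "actual": actual,
                "excess": actual - expected,  # dollar amount of the anomaly
            })
    return anomalies
```

Only upward deviations are flagged, matching the focus on cost spikes; a sudden drop in spending is ignored by this sketch.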

Best Practices

Organize Resources

  • Create resource groups aligned with teams, projects, or environments
  • Use consistent naming conventions for groups
  • Regularly review group membership as resources are added or removed

Monitor Proactively

  • Review the cost dashboard weekly
  • Set up budgets with threshold alerts before costs become a problem
  • Use resource schedules to automatically control non-production costs
  • Track execution costs alongside infrastructure costs for full visibility

Optimize Costs

  • Apply resource schedules to stop development resources during off-hours (potential 60-70% savings)
  • Review optimization recommendations regularly
  • Use resource groups to track costs by team or project
  • Set budgets at multiple scope levels for layered cost control

Tip: The most impactful quick win is setting up resource schedules for development environments. Stopping resources during nights and weekends can reduce development compute costs by 60-70%.
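
The 60-70% range follows from simple arithmetic on a 168-hour week. The sketch below assumes compute cost is proportional to running hours and that resources are fully stopped on weekends; the function name and the example working hours are illustrative.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def schedule_savings_pct(hours_per_weekday, weekdays=5):
    """Percentage of weekly running hours avoided by an off-hours schedule."""
    running_hours = hours_per_weekday * weekdays
    return (1 - running_hours / HOURS_PER_WEEK) * 100
```

Running 10 hours on each of 5 weekdays means 50 of 168 hours, a reduction of roughly 70%; running 12 hours per weekday means 60 of 168 hours, a reduction of roughly 64%. Costs that do not stop with the resource, such as storage, are outside this estimate.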