SREGym

Quick Start

Get up and running with SREGym quickly: set up your cluster and run your first agent.

Make sure you've completed the Installation steps before proceeding with Quick Start.

Set up your cluster

You need a Kubernetes cluster to run SREGym. For detailed setup instructions, see the Cluster Setup guide.

Running an Agent

Quick Start

To get started with the included Stratus agent:

  1. Create your .env file:

cp .env.example .env

  2. Open the .env file and configure your model and API key.

  3. Run the benchmark:

python main.py
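Before running the benchmark, it can help to confirm that the .env file from step 2 actually contains the values you expect. The following is a minimal, standard-library-only sketch; the function names `parse_env_text` and `load_env_file` are hypothetical helpers for illustration and are not part of SREGym, which may load the file differently. It handles only simple KEY=VALUE lines with optional surrounding quotes.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, as in KEY="value".
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path: str = ".env") -> dict:
    """Read a .env file; return an empty dict if the file does not exist."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))
```

Calling `load_env_file()` from the repository root returns a dictionary of the configured variables, so an empty result suggests that step 1 or step 2 was skipped.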

Model Selection

SREGym supports multiple LLM providers. Specify your model using the --model flag:

python main.py --model <model-id>

Available Models

| Model ID | Provider | Model Name | Required Environment Variables |
|---|---|---|---|
| gpt-4o | OpenAI | GPT-4o | OPENAI_API_KEY |
| gemini-2.5-pro | Google | Gemini 2.5 Pro | GEMINI_API_KEY |
| claude-sonnet-4 | Anthropic | Claude Sonnet 4 | ANTHROPIC_API_KEY |
| bedrock-claude-sonnet-4.5 | AWS Bedrock | Claude Sonnet 4.5 | AWS_PROFILE, AWS_DEFAULT_REGION |
| moonshot | Moonshot | Moonshot | MOONSHOT_API_KEY |
| watsonx-llama | IBM watsonx | Llama 3.3 70B | WATSONX_API_KEY, WX_PROJECT_ID |
| glm-4 | GLM | GLM-4 | GLM_API_KEY |
| azure-openai-gpt-4o | Azure OpenAI | GPT-4o | AZURE_API_KEY, AZURE_API_BASE |

Default: If no model is specified, gpt-4o is used.
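As a pre-flight check, the model table can be expressed as a lookup from model ID to required environment variables. This is an illustrative sketch only: `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names, not SREGym APIs, and the sketch assumes the variables are visible in the process environment (or in whatever mapping you pass in), which may not match how SREGym itself reads .env.

```python
import os

# Required variables per model ID, copied from the Available Models table.
REQUIRED_ENV_VARS = {
    "gpt-4o": ["OPENAI_API_KEY"],
    "gemini-2.5-pro": ["GEMINI_API_KEY"],
    "claude-sonnet-4": ["ANTHROPIC_API_KEY"],
    "bedrock-claude-sonnet-4.5": ["AWS_PROFILE", "AWS_DEFAULT_REGION"],
    "moonshot": ["MOONSHOT_API_KEY"],
    "watsonx-llama": ["WATSONX_API_KEY", "WX_PROJECT_ID"],
    "glm-4": ["GLM_API_KEY"],
    "azure-openai-gpt-4o": ["AZURE_API_KEY", "AZURE_API_BASE"],
}

# Documented default when --model is omitted.
DEFAULT_MODEL = "gpt-4o"


def missing_env_vars(model_id=None, env=None):
    """Return the required variables that are unset or empty for a model."""
    model_id = model_id or DEFAULT_MODEL
    env = os.environ if env is None else env
    if model_id not in REQUIRED_ENV_VARS:
        raise ValueError(f"Unknown model ID: {model_id!r}")
    return [name for name in REQUIRED_ENV_VARS[model_id] if not env.get(name)]
```

An empty list means every variable listed for that model is set; any names returned are the ones still to be added to .env.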

Examples

OpenAI:

# In .env file
OPENAI_API_KEY="sk-proj-..."

# Run with GPT-4o
python main.py --model gpt-4o

Anthropic:

# In .env file
ANTHROPIC_API_KEY="sk-ant-api03-..."

# Run with Claude Sonnet 4
python main.py --model claude-sonnet-4

AWS Bedrock:

# In .env file
AWS_PROFILE="bedrock"
AWS_DEFAULT_REGION=us-east-2

# Run with Claude Sonnet 4.5 on Bedrock
python main.py --model bedrock-claude-sonnet-4.5

Note: For AWS Bedrock, ensure your AWS credentials are configured via ~/.aws/credentials and your profile has permissions to access Bedrock.
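One way to confirm that the profile named in AWS_PROFILE exists in the credentials file is sketched below. It uses only the Python standard library and relies on the INI layout of ~/.aws/credentials, where each profile is a `[name]` section. The helper `aws_profile_exists` is a hypothetical name, not part of SREGym or the AWS SDK. It does not verify Bedrock permissions, and it does not cover profiles defined only in ~/.aws/config.

```python
import configparser
from pathlib import Path


def aws_profile_exists(profile, credentials_path=None):
    """Return True if the credentials file has a [profile] section."""
    if credentials_path is None:
        path = Path.home() / ".aws" / "credentials"
    else:
        path = Path(credentials_path)
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips files that do not exist.
    parser.read(path)
    return parser.has_section(profile)
```

For the example above, `aws_profile_exists("bedrock")` returning False suggests the profile name in .env does not match any section in the credentials file.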

Monitoring with Dashboard

SREGym provides a dashboard to monitor the status of your evaluation. The dashboard runs automatically when you start the benchmark with python main.py and can be accessed at http://localhost:11451 in your web browser.
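If the page does not load, a quick TCP check can tell you whether anything is listening on the dashboard port. This is a minimal sketch using the standard library; `dashboard_is_reachable` is a hypothetical helper, and a successful connection only shows that the port is open, not that the dashboard itself is healthy.

```python
import socket

# Host and port taken from the dashboard URL above.
DASHBOARD_HOST = "localhost"
DASHBOARD_PORT = 11451


def dashboard_is_reachable(host=DASHBOARD_HOST, port=DASHBOARD_PORT, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run `dashboard_is_reachable()` in a separate terminal while python main.py is running; False usually means the benchmark has not started yet or has already exited.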

Next Steps

Now that you've run your first agent, you can: