Running Your Own Agent
Guide to registering and running your custom agent in SREGym, including understanding evaluation phases and configuring task lists
Agent Registration
SREGym uses agents.yaml to register agents for execution. This is how SREGym knows which agent to run when you start the benchmark. The Stratus agent is already registered:
agents:
  - name: stratus
    kickoff_command: python -m clients.stratus.stratus_agent.driver.driver --server http://localhost:8000
    kickoff_workdir: .
    kickoff_env: null

To register your own agent:
- name: A unique identifier for your agent
- kickoff_command: The command SREGym will execute to start your agent
- kickoff_workdir: The working directory from which to run the command
- kickoff_env: Optional environment variables (use null if none needed)
Add a new entry to agents.yaml following this format to register your custom agent.
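As a rough sketch, a custom entry placed next to the existing Stratus one might look like the following. The agent name my-agent and the module path my_agent.main are hypothetical placeholders, not names SREGym ships with; substitute whatever command actually starts your agent.

```yaml
agents:
  # Existing entry, unchanged.
  - name: stratus
    kickoff_command: python -m clients.stratus.stratus_agent.driver.driver --server http://localhost:8000
    kickoff_workdir: .
    kickoff_env: null
  # Hypothetical custom agent: the name and command below are placeholders.
  - name: my-agent
    kickoff_command: python -m my_agent.main
    kickoff_workdir: .
    kickoff_env: null
```

The new entry reuses the same four fields as the Stratus entry. kickoff_env is left as null here because the format for supplying environment variables is not shown above.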
Understanding Evaluation Phases
Each SREGym problem has at most two phases:
- Fault Diagnosis: The agent should localize where the incident originates and explain the root cause.
  Expected submission: The root cause of the incident in natural language.
- Incident Mitigation: The agent should try to mitigate the incident and bring the cluster back online.
  Expected submission: No arguments for mitigation problems. NOTE: Not all problems are evaluated for mitigation.
Configuring Task Lists
By default, SREGym runs the common evaluation with all available problems and tasks. If you want to run a custom evaluation with a specific subset of problems or tasks, you can configure this using tasklist.yaml.
The task list follows this format for each problem:
k8s_target_port-misconfig:
  - diagnosis
  - mitigation

If no entry exists for a problem in tasklist.yaml, all tasks will run by default. Additionally, diagnosis and mitigation may be skipped if the problem does not have a corresponding oracle attached.
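To evaluate only one phase of a problem, a plausible configuration is to list just that task. The sketch below assumes that tasks omitted from a problem's entry are not run, which follows from the format above but is not stated explicitly.

```yaml
# tasklist.yaml: run only the diagnosis task for this problem
# (assumes tasks left out of an entry are not run).
k8s_target_port-misconfig:
  - diagnosis
```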
Monitoring Your Agent
SREGym provides a dashboard to monitor the status of your evaluation. The dashboard runs automatically when you start the benchmark with python main.py and can be accessed at http://localhost:11451 in your web browser.
Next Steps
- Learn about the MCP Tools your agent can use
- Review Troubleshooting if you encounter problems
- Check out the Stratus agent for example implementations
