This task can be performed using Autoagent
Autoagent: autonomous harness engineering for smarter, faster testing
Best product for this task
Autoagent
oss
AutoAgent is an open-source framework for autonomous harness engineering where a meta-agent rewrites an LLM agent’s harness, runs Harbor benchmarks, and hill-climbs on scores. You define the loop in program.md and let it iteratively optimize prompts, tools, and orchestration.

What to expect from an ideal product
- Define your optimization loop in a simple program.md file that tells the meta-agent what to improve and how to measure success
- Let the meta-agent automatically rewrite your LLM agent's harness code while you focus on other tasks instead of manual tweaking
- Run Harbor benchmarks continuously to get real performance data that shows exactly how well your agent handles different scenarios
- Use hill-climbing algorithms to systematically improve scores by testing small changes and keeping what works best
- Iterate through multiple rounds of prompt refinement, tool selection, and workflow orchestration until you reach your target performance levels
