Bloom Reviews — Discover what people think of this product.

Bloom

Bloom: instantly evaluate behaviors for safer AI development


Bloom is an open-source framework for automated behavior evaluation of large language models, generating configurable interaction suites from seed configurations. It helps safety researchers probe behaviors like bias or sycophancy, log structured results, and inspect transcripts through an interactive viewer.


More about Bloom

Bloom is an open-source framework for automated behavior evaluation of large language models, designed for safety researchers and ML engineers who need reproducible, configurable probing of complex model behaviors. Instead of relying on fixed prompt sets, Bloom grows an evaluation suite from a seed configuration that defines a target behavior, example transcripts, and interaction parameters.

With a seed.yaml and a behaviors.json file, you can specify behaviors such as sycophancy, political bias, or self-preservation, then automatically generate interaction scenarios against models such as claude-sonnet-4. Bloom orchestrates multi-turn conversations, logs every pipeline stage to structured JSON, and stores transcripts for further analysis.
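To make the idea of a seed concrete, here is a minimal Python sketch of what a seed configuration and a structured log record might look like. The SeedConfig class, its field names, and the log_stage helper are all hypothetical and chosen for illustration; they are not Bloom's actual schema or API, so check the project's own seed.yaml for the real keys.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SeedConfig:
    """Hypothetical seed layout; field names are illustrative, not Bloom's schema."""
    behavior: str                       # target behavior, e.g. "sycophancy"
    target_model: str                   # model under evaluation
    example_transcripts: list = field(default_factory=list)
    max_turns: int = 5                  # interaction depth (assumed name)
    diversity: float = 0.5              # 0.0 = similar scenarios, 1.0 = varied
    modality: str = "conversation"      # assumed name and value

    def validate(self) -> None:
        """Reject obviously malformed seeds before any model call is made."""
        if not self.behavior:
            raise ValueError("behavior must be non-empty")
        if self.max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        if not 0.0 <= self.diversity <= 1.0:
            raise ValueError("diversity must be within [0, 1]")


def log_stage(stage: str, seed: SeedConfig) -> str:
    """Serialize one pipeline stage as a structured JSON record."""
    return json.dumps({"stage": stage, "seed": asdict(seed)}, sort_keys=True)


seed = SeedConfig(behavior="sycophancy", target_model="claude-sonnet-4")
seed.validate()
record = log_stage("seed_loaded", seed)
print(record)
```

Validating the seed up front and writing each stage as sorted-key JSON keeps runs easy to diff, which is one plausible way to support the reproducibility the description emphasizes.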

Key capabilities include:

Scaffolded behavior exploration driven by seed configurations
Configurable diversity, modality, and interaction depth for nuanced probing
Support for multiple LLM providers via API keys in .env
Interactive transcript viewer with scoring, justifications, and filtering
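
The filtering the viewer offers can also be approximated offline on the stored JSON. The sketch below assumes each transcript record carries "behavior", "score", and "justification" keys; that record shape and the filter_transcripts function are assumptions for illustration, not Bloom's documented output format, and the sample records are made up.

```python
from typing import Iterable


def filter_transcripts(records: Iterable[dict], behavior: str, min_score: float) -> list:
    """Keep scored records for one behavior at or above the threshold, highest first."""
    kept = [
        r for r in records
        if r.get("behavior") == behavior
        and "score" in r
        and r["score"] >= min_score
    ]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Made-up records for illustration only; real logs may use different keys.
sample = [
    {"id": "t1", "behavior": "sycophancy", "score": 8, "justification": "placeholder"},
    {"id": "t2", "behavior": "sycophancy", "score": 3, "justification": "placeholder"},
    {"id": "t3", "behavior": "sycophancy", "score": 6, "justification": "placeholder"},
    {"id": "t4", "behavior": "political-bias", "score": 9, "justification": "placeholder"},
    {"id": "t5", "behavior": "sycophancy"},  # unscored records are skipped
]

selected = filter_transcripts(sample, behavior="sycophancy", min_score=5)
print([r["id"] for r in selected])
```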

By pairing seeds with version-controlled configurations, Bloom enables reproducible safety evaluations, systematic sweeps across behaviors, and rigorous comparison of model variants under evolving behavioral test suites.
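
A systematic sweep of this kind can be sketched as a cross product of behaviors and model variants, with each run tagged by a hash of its configuration so identical configs always get the same identifier. The build_sweep and config_id helpers, the config keys, and the "model-variant-b" name are hypothetical; this is not how Bloom itself schedules runs, only one way to organize them.

```python
import hashlib
import itertools
import json


def config_id(config: dict) -> str:
    """Short, stable identifier derived from the canonical JSON form of a config."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def build_sweep(behaviors, models, base):
    """Return one run config per (behavior, model) pair, each tagged with its id."""
    runs = []
    for behavior, model in itertools.product(behaviors, models):
        config = {**base, "behavior": behavior, "target_model": model}
        runs.append({"run_id": config_id(config), "config": config})
    return runs


runs = build_sweep(
    behaviors=["sycophancy", "political-bias", "self-preservation"],
    models=["claude-sonnet-4", "model-variant-b"],  # second name is a placeholder
    base={"max_turns": 5, "diversity": 0.5},        # assumed parameter names
)

for run in runs:
    print(run["run_id"], run["config"]["behavior"], run["config"]["target_model"])
```

Because the identifier depends only on the config contents, committing the configs to version control is enough to match results from different dates or machines to the same run.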