Date: 2025-08-28

OpenServ's BRAID framework has outperformed OpenAI's latest GPT models on reasoning benchmarks, while also making AI decision-making more transparent and auditable.

According to results shared by the company, BRAID achieved higher accuracy across multiple GPT model classes when tested on the widely used GSM8K benchmark. GPT-5, for instance, scored 64.34 with BRAID, compared with 54.41 without it, a gain of 9.93 points.

Similar improvements were seen across GPT-4o, GPT-5 mini, and GPT-5 nano.
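To put the GPT-5 figures in context, the short sketch below works out the absolute and relative gain from the two reported scores. The function name `score_gain` is illustrative only, and the scores for the other models are not included because the company's summary does not give them.

```python
def score_gain(with_framework: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = with_framework - baseline
    relative = absolute / baseline * 100
    return absolute, relative


# GPT-5 scores on GSM8K as reported by OpenServ.
abs_gain, rel_gain = score_gain(64.34, 54.41)
print(f"Absolute gain: {abs_gain:.2f} points")
print(f"Relative gain: {rel_gain:.1f}%")
```

The relative figure is simply the point gain divided by the baseline score, so the same point gain counts for more on a model that starts from a lower baseline.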

"BRAID boosts performance across every model class, from the largest to the smallest, making strong reasoning affordable and available to more developers and more use cases," said Armağan Amcalar, CTO of OpenServ.

Unlike free-form reasoning, BRAID introduces a structured two-stage process that reduces errors and produces flowcharts documenting each step of the model's logic.

This makes outputs auditable, which Amcalar said is particularly valuable for industries like finance and healthcare where verification is critical.

In an interview with Benzinga, CEO Tim Hafner explained that the gains extend beyond benchmarks.

"In a financial workflow with steps such as pricing, allocation, and risk balancing, BRAID maintained consistency in reasoning where standard models diverged," he said.

He also noted that the framework cut the effective cost per correct answer by 25% to 40% in tests.
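OpenServ has not published how it computes "effective cost per correct answer." A common reading, assumed here, is the cost of a query divided by the fraction of queries answered correctly. The sketch below applies that assumed definition to the reported GPT-5 accuracy figures with a hypothetical, identical per-query cost for both setups, so it isolates the effect of accuracy alone and is not a reconstruction of the company's tests.

```python
def cost_per_correct(cost_per_query: float, accuracy: float) -> float:
    """Assumed definition: cost of one query divided by the share answered correctly."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be in (0, 1]")
    return cost_per_query / accuracy


def reduction_pct(baseline: float, improved: float) -> float:
    """Percentage drop from the baseline cost to the improved cost."""
    return (1 - improved / baseline) * 100


# Hypothetical per-query cost in dollars; not a figure from OpenServ or OpenAI.
COST_PER_QUERY = 0.01

baseline = cost_per_correct(COST_PER_QUERY, 0.5441)  # reported GPT-5 score without BRAID
improved = cost_per_correct(COST_PER_QUERY, 0.6434)  # reported GPT-5 score with BRAID
print(f"Reduction from accuracy alone: {reduction_pct(baseline, improved):.1f}%")
```

Under this definition the result does not depend on the per-query price as long as it is the same in both runs. Any difference in token usage or model pricing between the two setups would change the outcome.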

The approach has been independently verified by Dr. Eyup Cinar, a researcher and instructor at NVIDIA's Deep Learning Institute.

Full results will be published in a peer-reviewed journal, according to OpenServ.

Hafner acknowledged that other labs are exploring structured reasoning but argued that BRAID goes further by separating planning from execution and embedding the process into OpenServ's platform, where every agent can generate a "proof of reasoning" by default.

BRAID is now being rolled out across OpenServ's platform, which supports developers building AI agents for finance, governance, and other workflows where reliability and auditability are essential.

