In Depth

Mixture of Agents (MoA) is an approach where multiple language models work together in layers, with each layer's models refining the outputs of the previous layer. Unlike Mixture of Experts (which routes within a single model), MoA orchestrates across separate, complete models. Research has shown that even weaker models can improve upon stronger models' outputs when used as refiners in a multi-layer setup.

The MoA architecture typically works in rounds: in the first round, multiple models independently generate responses to a prompt. In subsequent rounds, models receive both the original prompt and the previous round's outputs, producing refined responses that synthesize the best elements. A final aggregator model then selects from, or combines, the last round's outputs to produce the final response.
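The round structure above can be sketched in a few lines. This is a minimal illustration, not a real implementation: the `call_model` function is a deterministic stub standing in for an actual LLM API call, and names like `mixture_of_agents` and `aggregator` are hypothetical.

```python
def call_model(name, prompt, prior_outputs=None):
    """Stub for an LLM call. A real implementation would query a model API,
    passing the prompt plus the previous round's outputs as extra context."""
    if prior_outputs:
        context = " | ".join(prior_outputs)
        return f"{name} refines [{context}] for '{prompt}'"
    return f"{name} answers '{prompt}'"

def mixture_of_agents(prompt, proposers, aggregator, rounds=2):
    """Run `rounds` rounds over the proposer models, then have the
    aggregator model synthesize a final answer from the last round."""
    # Round 1: each proposer drafts an independent response.
    outputs = [call_model(m, prompt) for m in proposers]
    # Later rounds: each proposer sees the prompt plus all prior outputs.
    for _ in range(rounds - 1):
        outputs = [call_model(m, prompt, outputs) for m in proposers]
    # Final aggregation over the last round's outputs.
    return call_model(aggregator, prompt, outputs)

result = mixture_of_agents("Explain MoA", ["model-A", "model-B"], "aggregator")
print(result)
```

In practice the stub would be replaced by calls to hosted or local models, and the proposer list per round is where the "weaker models refining stronger outputs" effect comes into play.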

MoA demonstrates an important principle: collaboration between models can exceed any individual model's capability. This has practical implications for organizations that want frontier-level quality without frontier-level costs, as combining several smaller, cheaper models through MoA can approach or match the quality of much larger models. The approach also provides natural redundancy and can reduce the impact of any single model's weaknesses.