Public Preview

Self-Consistency and Consensus

Abstract

Generate multiple candidate outputs for the same prompt, then aggregate them via voting/consensus or reranking to improve factuality and robustness over a single sample.

Motivation

  • Reduce single-sample randomness and hallucinations
  • Improve reliability for reasoning and structured tasks

Architectures

  • Sample N generations with diverse decoding seeds → vote/majority
  • Rerank with secondary model/scorer (retrieval consistency, citations)
  • Committee-of-models or Mixture-of-Agents aggregation
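The first architecture above (sample N generations, then take a majority vote) can be sketched as follows; the list of pre-sampled answers stands in for N real model calls with different seeds:

```python
from collections import Counter

def majority_vote(candidates):
    """Return the most frequent answer and its vote share among candidates."""
    counts = Counter(candidates)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(candidates)

# Five hypothetical sampled answers to the same question.
samples = ["42", "42", "41", "42", "40"]
answer, agreement = majority_vote(samples)
print(answer, agreement)  # → 42 0.6
```

Note that majority voting assumes answers can be compared for exact equality; free-form text usually needs normalization (or a reranker) before voting.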

Design Choices

  • Number of samples vs. latency/cost
  • Voting rules (majority, confidence-weighted)
  • Reranker selection and features (BM25, cross-encoder)
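A confidence-weighted voting rule, one of the options listed above, can be sketched like this: each candidate carries a confidence score (e.g. from log-probabilities or a verifier), and votes are summed per answer rather than counted:

```python
from collections import defaultdict

def weighted_vote(candidates):
    """Aggregate (answer, confidence) pairs by summing confidence per answer."""
    scores = defaultdict(float)
    for answer, confidence in candidates:
        scores[answer] += confidence
    return max(scores, key=scores.get)

# Hypothetical scored samples: Lyon has two votes but lower total confidence.
votes = [("Paris", 0.9), ("Lyon", 0.4), ("Paris", 0.7), ("Lyon", 0.8)]
print(weighted_vote(votes))  # → Paris (1.6 vs 1.2)
```

Compared to plain majority voting, this lets a few high-confidence samples outweigh many low-confidence ones, at the cost of requiring a calibrated confidence signal.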

Pros/Cons

  • Pros: Better reliability; simple to implement
  • Cons: Higher cost/latency; risk of correlated errors

Evaluation Metrics

  • Accuracy vs. baseline; error rate reduction
  • Agreement rate among samples; calibration
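The agreement-rate metric above can be measured in several ways; one simple choice (an assumption here, not a standard definition) is the fraction of sample pairs that produced identical answers:

```python
from itertools import combinations

def agreement_rate(samples):
    """Fraction of sample pairs that produced identical answers."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 1.0  # a single sample trivially agrees with itself
    return sum(a == b for a, b in pairs) / len(pairs)

# 6 pairs, 3 of which match ("A" with "A").
print(agreement_rate(["A", "A", "B", "A"]))  # → 0.5
```

Low agreement flags inputs where the model is uncertain; those cases are natural candidates for more samples or human review.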

Vendor/Tooling

  • Orchestration frameworks (LangChain, LlamaIndex)
  • Retrieval/rerankers from Hugging Face ecosystem

Design Checklist

  • Set budget for N and stopping criteria
  • Ensure diversity across samples (seeds, temperatures, prompt variants)
  • Validate with task-specific rubrics

References

  • Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv preprint, 2022-03. https://arxiv.org/abs/2203.11171 (accessed 2025-08-14)
  • Mixture-of-Agents (MoA) style approaches (survey). arXiv preprint, 2024-02. https://arxiv.org/abs/2402.05120 (accessed 2025-08-14)