Case study · 14 min read

Smarter RFPs: Structured comparisons from unstructured specs

Most RFPs fail before the first supplier responds. The root cause is almost never supplier quality. It is the translation of messy internal debate into requirements that no one can objectively evaluate.


RFPs are the primary instrument procurement teams use to select suppliers. Yet in most organizations, the evaluation that follows is neither structured nor genuinely comparable. Requirements are gathered informally, criteria are defined after responses arrive, and incumbent suppliers benefit from familiarity rather than demonstrated performance.

The gap between a good sourcing event and a poor one is not the supplier pool. It is almost always what happened in the four to six weeks before the RFP was sent. Teams that produce defensible, high-quality sourcing decisions share one discipline: they convert unstructured internal opinions into measurable criteria before a single supplier is invited to respond. Everything else follows from that.

- 60% of sourcing decisions later disputed internally cite unclear evaluation criteria as the primary cause.
- Teams are 3x more likely to switch suppliers unnecessarily when the incumbent is not evaluated on the same scale as challengers.
- 40% of RFP requirements are typically rewritten or dropped when teams conduct a friction audit before drafting.

The three failure modes

Weak RFPs collapse in three predictable ways, and in most cases all three are present simultaneously. Naming them makes them avoidable.

Exhibit 1

How RFP processes break down, and what each failure costs

| Failure mode | What it looks like | Consequence |
| --- | --- | --- |
| Vague requirements | "Strong SLA performance," "competitive pricing," "responsive support" | Suppliers write to impress, not to commit. Responses cannot be compared on any consistent basis. |
| Post-hoc scoring | Criteria and weights defined after supplier responses are read by the evaluation team | Evaluators reverse-engineer scores to match a preferred outcome. The process looks rigorous, but the decision was made earlier. |
| Incumbent asymmetry | Current supplier judged on relationship history; new entrants judged on proposal quality alone | Structural bias prevents genuine competition and keeps underperformance invisible until it becomes a crisis. |

Each failure mode compounds the others. Vague requirements make post-hoc scoring easier to justify. Post-hoc scoring makes incumbent asymmetry easy to obscure. Together, they produce decisions that cannot be explained, repeated, or defended when challenged.

From friction to requirement: the translation workflow

Requirements gathering almost always produces the wrong output. Teams collect wishlists, not requirements. Senior stakeholders articulate preferences shaped by past supplier relationships. Junior team members add items to avoid appearing disengaged. The result is a requirements list that is simultaneously too long and too vague to be useful.

The most effective alternative is to begin with operational friction: specific, real, attributable problems that exist because of how the current supplier operates. Friction is concrete where preferences are abstract. It is specific where wishlists are general. And it is directly actionable because it points to a gap between what the supplier does and what the business actually needs.

  1. Run a friction audit before any workshop. Talk to the people who interact with the supplier daily, not executive sponsors. Ask what takes longer than it should, what requires manual intervention, and what creates recurring rework. Pattern-match across respondents.
  2. Categorize every input into four buckets: functional (what the supplier must do), technical (how they must do it), commercial (pricing structure and terms), and risk (stability, certification, and continuity). Inputs that do not fit any category are preferences, not requirements, and should be set aside.
  3. Convert each requirement into a testable statement. Every requirement must answer: "how would you know if this was met or not met?" If the team cannot answer that question, the requirement is not ready for the RFP.
  4. Set minimum thresholds before scoring begins. Identify which requirements are pass-fail and eliminate any supplier that cannot clear them before formal evaluation starts. This prevents the team from being seduced by a strong proposal from a supplier who fundamentally cannot serve the business (see the sketch after this list).
  5. Lock the scoring matrix before reading any responses. Weights, criteria, and thresholds must be finalized and approved by the relevant stakeholders before the evaluation team opens a single supplier document.
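
A minimal sketch of how steps 2 through 4 can be encoded, assuming a simple in-house Python representation. The Requirement structure, field names, supplier names, and sample responses below are illustrative assumptions, not part of any specific procurement tool.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"    # what the supplier must do
    TECHNICAL = "technical"      # how they must do it
    COMMERCIAL = "commercial"    # pricing structure and terms
    RISK = "risk"                # stability, certification, continuity


@dataclass
class Requirement:
    name: str
    category: Category
    test: str          # the "how would you know it was met?" statement
    pass_fail: bool    # True = minimum threshold that eliminates suppliers


def gate(responses: dict[str, dict[str, bool]],
         requirements: list[Requirement]) -> list[str]:
    """Drop any supplier that fails a pass-fail requirement
    before formal scoring begins (step 4)."""
    gates = [r.name for r in requirements if r.pass_fail]
    return [supplier for supplier, results in responses.items()
            if all(results.get(g, False) for g in gates)]


requirements = [
    Requirement("uptime", Category.FUNCTIONAL,
                "99.5% uptime over trailing 12 months", pass_fail=True),
    Requirement("itemized_pricing", Category.COMMERCIAL,
                "Itemized unit pricing by SKU category", pass_fail=False),
]

# Hypothetical pre-screening results; supplier names are illustrative.
responses = {
    "Incumbent":    {"uptime": True, "itemized_pricing": True},
    "Challenger A": {"uptime": True, "itemized_pricing": True},
    "Challenger B": {"uptime": False, "itemized_pricing": True},
}

print(gate(responses, requirements))  # ['Incumbent', 'Challenger A']
```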

A consumer goods company retendering a $4 million logistics contract applied this workflow and found that nearly a third of their draft requirements were preferences tied to one stakeholder's prior experience with a different supplier. Removing them shortened the RFP significantly, improved supplier response quality, and made the final evaluation straightforward rather than contentious.

Turning language into criteria

The single highest-impact action a sourcing team can take is rewriting vague requirements into measurable ones. The version that goes into the RFP must be answerable with data, not with reassurance.

Exhibit 2

Requirement translation: from assertion to measurement

| Category | Original language | Measurable requirement |
| --- | --- | --- |
| Service level | Strong SLA performance | 99.5% uptime over trailing 12 months; documented root cause analysis within 48 hours of any incident exceeding 4 hours |
| Cost | Competitive pricing | Itemized unit pricing by SKU category; total cost of ownership within 10% of current spend at equivalent volume, with open-book cost breakdown on request |
| Delivery | Reliable on-time delivery | On-time and in-full rate above 97%, measured monthly; variance report delivered by the 5th of each month with line-level detail |
| Support | Responsive customer service | Named account contact available during business hours; maximum 2-hour response time for priority issues as defined in Schedule B; monthly review cadence |
| Risk | Financially stable | Audited financials for the prior two fiscal years; no material adverse events in the past 18 months; Dun & Bradstreet score above 75 |
| Technology | Modern digital capability | API-based integration with ERP within 90 days of contract execution; real-time data access via self-service portal; ISO 27001 certified |

The rewritten requirements share a common structure: they specify a threshold, a measurement method, and a timeframe. This gives evaluators something to verify rather than something to interpret. It also signals to suppliers that vague commitments will not be sufficient, which tends to improve the quality of responses considerably.
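
The threshold-measurement-timeframe structure can be made mechanical by expressing each rewritten requirement as a predicate over supplier-reported data. A minimal Python sketch; every field name and figure below is an illustrative assumption, not a standard.

```python
# Each measurable requirement from Exhibit 2 becomes a predicate:
# a threshold applied to a named measurement over a defined timeframe.
checks = {
    "uptime_pct_trailing_12m":     lambda v: v >= 99.5,  # service level
    "otif_pct_monthly":            lambda v: v > 97.0,   # delivery
    "priority_response_hours_max": lambda v: v <= 2.0,   # support
}

# Hypothetical data reported by one supplier.
reported = {
    "uptime_pct_trailing_12m": 99.7,
    "otif_pct_monthly": 96.4,
    "priority_response_hours_max": 1.5,
}

results = {name: check(reported[name]) for name, check in checks.items()}
print(results)
# {'uptime_pct_trailing_12m': True,
#  'otif_pct_monthly': False,
#  'priority_response_hours_max': True}
```

Either the reported number clears the threshold or it does not; there is nothing for an evaluator to interpret.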

Building the decision matrix

The scoring matrix must be created before any supplier response is reviewed. Building it in advance keeps scoring fair and prevents evaluators from shaping scores around a preferred winner. Weights should reflect what matters most for this specific sourcing event and be approved by all key stakeholders before evaluation begins.

Weights are not permanent. In a year when cost pressure dominates, commercial criteria might carry 35% of the total. In a year when the category experienced a supply disruption, risk criteria might carry equal weight. The key discipline is that the weighting decision is made consciously and recorded, not reverse-engineered after evaluation.

Exhibit 3

Sample supplier scoring matrix: incumbent vs. challenger (illustrative)

| Criterion | Weight | Incumbent | Challenger A |
| --- | --- | --- | --- |
| Cost and commercial terms | 30% | 68 | 81 |
| Quality and delivery | 25% | 84 | 76 |
| Service and support | 25% | 71 | 88 |
| Risk and compliance | 20% | 79 | 72 |
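
With the matrix locked, each supplier's weighted total is just the sum of score times weight. A short sketch using the illustrative Exhibit 3 numbers:

```python
# Weights and scores from Exhibit 3 (illustrative). The weights are
# locked before any response is read; only the scores are filled in
# during evaluation.
weights = {"cost_commercial": 0.30, "quality_delivery": 0.25,
           "service_support": 0.25, "risk_compliance": 0.20}

scores = {
    "Incumbent":    {"cost_commercial": 68, "quality_delivery": 84,
                     "service_support": 71, "risk_compliance": 79},
    "Challenger A": {"cost_commercial": 81, "quality_delivery": 76,
                     "service_support": 88, "risk_compliance": 72},
}

for supplier, s in scores.items():
    total = sum(weights[c] * s[c] for c in weights)
    print(f"{supplier}: {total:.2f}")
# Incumbent: 74.95
# Challenger A: 79.70
```

On these illustrative numbers, Challenger A leads 79.70 to 74.95 even though the incumbent scores higher on quality and risk. Making that tradeoff visible, rather than arguing it after the fact, is exactly what the weights are for.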

Applying the same standard to all suppliers

The most corrosive bias in supplier evaluation is not intentional favoritism. It is the asymmetric standard applied when incumbents are judged on demonstrated history while challengers are judged on proposed performance. Both are imperfect signals. Both deserve the same scrutiny.

Eliminating this bias requires one structural rule: every supplier answers the same questions, on the same form, scored against the same thresholds. Where an incumbent has actual performance data and a challenger has only commitments, that distinction is captured as a confidence weighting in the scoring, not used as a reason to exempt the incumbent from evaluation.
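
One way to implement that confidence weighting is a fixed multiplier per evidence type, applied identically to every supplier. The multiplier values below are illustrative assumptions, not an industry standard; a minimal sketch:

```python
# Confidence multipliers by evidence strength. The exact values are
# an assumption made for illustration; the discipline is that they
# are set before evaluation and applied to all suppliers alike.
CONFIDENCE = {
    "measured":   1.00,  # verified performance data (often the incumbent)
    "referenced": 0.90,  # validated third-party reference or audit
    "committed":  0.80,  # contractual commitment only
}

def adjusted_score(raw: float, evidence: str) -> float:
    """Same question, same scale; only the evidence strength differs."""
    return raw * CONFIDENCE[evidence]

# An incumbent's measured 84 vs. a challenger's committed 92:
print(adjusted_score(84, "measured"))   # 84.0
print(adjusted_score(92, "committed"))  # 73.6
```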

This approach also surfaces something many organizations find uncomfortable: incumbents who score poorly under objective criteria. That score is information. It either means the business's requirements have grown beyond the supplier's capability, or it means the relationship has been shielding underperformance that would not have been tolerated from any other supplier.

What this looks like as a repeatable capability

The compounding benefit of structured RFP discipline comes from repetition. Each sourcing event that uses a consistent framework produces data that improves the next one. Teams learn which criteria actually discriminate between suppliers and which generate friction without adding information. They learn which requirements correlate with post-contract performance and which are aspirational language that suppliers agree to and then quietly ignore.

Exhibit 4

Building sourcing maturity over time

| Maturity level | Characteristics | Typical output |
| --- | --- | --- |
| Ad hoc | Requirements gathered informally; scoring done after responses arrive; no reuse across events | Decisions difficult to explain; frequent internal disputes; high rework on the next event |
| Structured | Four-category requirement framework in use; matrix built before evaluation; same form for all suppliers | Comparable responses; reduced evaluation time; decisions defensible to stakeholders |
| Systematic | Requirement library reused and refined; supplier scores tracked across multiple events; friction audit standard practice | Faster sourcing cycles; institutional knowledge retained across personnel changes; post-contract performance predictable from RFP scores |

Most procurement teams operate at the ad hoc level for most of their sourcing events, even when they believe they are operating at a higher standard. The gap between perceived and actual maturity is usually visible in one place: whether the scoring matrix existed before the team read the first supplier response.

What changes when this is done well

When requirements are structured and scoring is objective, the organization can see exactly where each supplier leads or lags.

The downstream effects extend beyond the immediate decision. Suppliers who understand they will be held to measurable standards tend to respond more honestly. Stakeholders who see their input reflected in weighted criteria are more likely to accept an outcome they did not personally prefer. And procurement teams that can point to a consistent, documented process are far better positioned when decisions are challenged.

Over time, the scoring matrix and requirement library become institutional assets. Each new sourcing event builds on the last. The organization accumulates knowledge about which criteria differentiate suppliers in practice, not just in theory, and which requirements generate paperwork without adding value.

That is when an RFP stops being a procurement formality and becomes a genuine instrument for competitive advantage in the supply base.

Scores and statistics in Exhibits 1 through 4 are illustrative. Scoring matrix weights should reflect current business priorities and must be finalized before any supplier responses are reviewed by the evaluation team.
