ScaleX Labs | Insights Blog

M&E Foundations | Measurement Framework Design | April 2026
The Measurement Framework Crisis:
Seven Pathologies That Are Costing the Development Sector Its Credibility
Dr. Alfred Latigo | Founder & Chief Innovation Architect, ScaleX Labs LLC
Senior Advisor, Gender Transformative & Rights-Based Approaches, African Development Bank
“The development sector has spent decades building increasingly sophisticated theories of change, logical frameworks, and results measurement systems — and then populating them with errors so systematic, so widely shared, and so rarely challenged that they have become the norm. This article names them.”

Introduction: A Field With a Measurement Problem
Across five decades of designing and reviewing measurement frameworks for institutions spanning the United Nations system, the African Development Bank, bilateral donors, and international NGOs, one observation has proven consistently true: the confusion surrounding measurement frameworks is not a beginner’s problem. It is a sector-wide problem.

Project documents submitted by experienced consultants, approved by seasoned program officers, and audited by external evaluators routinely contain the same categories of error. Outputs are reported as results. Theories of change are produced as diagrams without assumptions. Indicators measure activities rather than change. Logic models are confused with logical frameworks. And performance measurement frameworks are built without any coherent connection to the results architecture they are supposed to operationalise.

These are not isolated lapses. They are systemic pathologies — patterns of error reproduced across institutions, across funding windows, and across decades, because they are never systematically named and corrected.
This article names seven of them. Each is illustrated with the kind of concrete examples that professionals will recognise from their own practice. The intention is constructive: a sector that cannot diagnose its own measurement failures cannot learn from its investments, cannot account to its beneficiaries, and cannot make credible claims about what development finance actually achieves.

First: Know Your Instruments
Before cataloguing the pathologies, it is necessary to establish a baseline. The six major measurement frameworks in development practice are distinct instruments. They are not interchangeable synonyms for the same tool, and they are not arranged in a simple hierarchy where one is more sophisticated than another. Each answers a different question, is designed for a different purpose, and carries different institutional obligations.

| Framework | Core Architecture | Institutional Home | Key Question | Illustrative Example |
| --- | --- | --- | --- | --- |
| Theory of Change | Causal pathway + assumptions | Strategic planning, advocacy | Why and how will change happen? | IF women gain secure land rights THEN they invest in soil conservation BECAUSE tenure security enables long-horizon decisions |
| Logical Framework | Objective hierarchy with indicators, MOVs, assumptions | OECD-DAC, EU, bilateral donors | What are we committing to, and under what conditions? | Goal > Purpose > Outputs > Activities; each row carries indicators, means of verification, and critical assumptions |
| Logic Model | Inputs > Activities > Outputs > Outcomes > Impact | USAID, CDC, United Way | What goes in and what comes out? | Resources + staff time produce workshops that build skills that change practices that improve household nutrition |
| Results Framework | DO + Intermediate Results hierarchy | World Bank, USAID | How do results levels connect to the development goal? | DO: Improved climate resilience / IR1: Enhanced adaptive capacity / IR2: Strengthened institutional systems |
| Performance Measurement Framework | Indicator registry with baselines and targets | AfDB, DFID, Global Fund | How will we measure and track our commitments? | Indicator name, definition, baseline, target, source, frequency, disaggregation, responsible party |
| RBM Framework | Integrated planning-to-accountability system | UN system, multilateral banks | Are we managing for results or managing activities? | Annual planning cycles linked to Results Chains, with adaptive management protocols and accountability reporting |

A practitioner who cannot articulate these distinctions is not yet equipped to design or review a project’s measurement architecture. What follows assumes this baseline. The pathologies described below occur when practitioners work without it.

The Seven Pathologies

P1 Calling Outputs ‘Results’ — The Accountability Evasion

This is the single most damaging and most common error in development programme reporting. It occurs when a project’s completion report, annual review, or donor communication describes outputs — the products and services the project delivered — as ‘results achieved.’

In rigorous results-based management, ‘results’ is not a level in the results hierarchy. It is the umbrella term for the changes the programme produces — outcomes and impact. Outputs are the pre-conditions for results: they are necessary but they are not, in themselves, results. A project that delivered 50 training workshops has done something. Whether it has achieved anything depends on whether those workshops changed anything.
“Reporting outputs as results is not merely a labelling error. It is an accountability evasion. It substitutes evidence of delivery for evidence of change, and in doing so, it answers a question nobody asked.”
What This Looks Like in Practice
A project funded to improve smallholder farmers’ resilience to climate variability reports the following ‘results’ at year-end:
• 12 community workshops conducted
• 340 farmers trained in climate-smart agriculture techniques
• 18 demonstration plots established
• 3 farmer cooperatives registered

Every one of these is an output. Not one of them is a result. The report says nothing about whether knowledge improved, whether practices changed, whether yields stabilised, whether women’s participation in cooperative governance increased, or whether household food security improved. The project has accounted for its activities. It has evaded its accountability.
The Correct Architecture
The results hierarchy is not ambiguous:

| As Commonly Written (Problematic) | Professionally Correct Formulation |
| --- | --- |
| ✘ 340 farmers trained (Output reported as Result) | ✔ 340 farmers trained (Output) → leading to the Outcome: 68% of trained farmers adopted at least two climate-smart practices within six months (verified by follow-up survey) |
| ✘ 18 demonstration plots established (Output reported as Result) | ✔ 18 demonstration plots established (Output) → leading to the Outcome: neighbouring farmers who observed demonstration plots were 2.4x more likely to adopt soil conservation techniques |
| ✘ 3 cooperatives registered (Output reported as Result) | ✔ 3 cooperatives registered (Output) → leading to the Outcome: female membership in cooperative governance structures increased from 12% to 41% within 18 months of registration |
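
The distinction can be made mechanical by tagging every reported line with its level and declining to call a report "results" when nothing in it sits at outcome or impact level. The following is a minimal Python sketch of that idea, not a ScaleX Labs tool: the names `Level`, `ReportedItem`, and `reports_results` are hypothetical, and the sample statements are taken from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    ACTIVITY = "activity"
    OUTPUT = "output"
    OUTCOME = "outcome"
    IMPACT = "impact"


# Following the article's usage: only outcomes and impact count as results.
RESULT_LEVELS = {Level.OUTCOME, Level.IMPACT}


@dataclass(frozen=True)
class ReportedItem:
    statement: str
    level: Level


def reports_results(items):
    """True only if at least one item sits at outcome or impact level."""
    return any(item.level in RESULT_LEVELS for item in items)


# Statements taken from the table above.
outputs_only = [
    ReportedItem("340 farmers trained", Level.OUTPUT),
    ReportedItem("18 demonstration plots established", Level.OUTPUT),
    ReportedItem("3 cooperatives registered", Level.OUTPUT),
]

with_outcome = outputs_only + [
    ReportedItem(
        "68% of trained farmers adopted at least two climate-smart "
        "practices within six months",
        Level.OUTCOME,
    ),
]

print(reports_results(outputs_only))   # the outputs-only report
print(reports_results(with_outcome))   # the report with an outcome added
```

The check is deliberately crude: it cannot tell whether an item has been tagged honestly, only whether the report contains anything above output level.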

P2 The Theory of Change Without Assumptions — A Diagram Is Not an Argument

The Theory of Change has become the most required and least understood instrument in contemporary development programming. Virtually every funding proposal submitted to a major donor now includes one. The majority of them are not theories of change. They are flow diagrams — boxes connected by arrows, tracing a path from activities to impact, with no argument inside them.
A Theory of Change is a causal argument. Its defining feature is not the diagram — it is the explicit articulation of the assumptions on which the causal chain depends. Remove the assumptions and you remove the theory. What remains is a schematic of what the project plans to do, which is an operational plan, not a theory of change.
“If your Theory of Change does not state, at every causal step, why the next step will follow from the previous one — and under what conditions it might not — then you have a diagram, not a theory.”

The Five Elements a Theory of Change Must Contain
• Explicit if-then logic at every causal step: not ‘A leads to B’ but ‘IF A, THEN B, BECAUSE of the following mechanism, AND ONLY IF the following condition holds’
• Pre-conditions for change: what must already be true in the context for the causal chain to function, that the project itself will not produce
• A theory of the problem: why does this problem persist? Who benefits from its persistence? What has been tried? Why has it not worked?
• Stakeholder and power analysis: whose behaviour must change, who has influence over that behaviour, and whose resistance is a credible obstacle
• Gender and social norms analysis: how do prevailing norms shape whether change is possible for women, girls, and marginalised groups — as a structural variable, not a checkbox

What a Genuine Assumption Statement Looks Like

| As Commonly Written (Problematic) | Professionally Correct Formulation |
| --- | --- |
| ✘ Training builds skills [no assumption stated] | ✔ IF training builds skills THEN farmers will apply them, BECAUSE the training methodology is practice-based and contextually appropriate, AND ONLY IF farmers have access to the inputs required to apply new techniques, which requires that input markets function within 10km of target villages |
| ✘ Cooperatives will strengthen women’s voice [no condition stated] | ✔ IF cooperatives are registered with gender-equitable governance rules THEN women’s voice in household economic decisions will increase, BECAUSE cooperative membership provides women with a legitimate institutional platform. HOWEVER, this assumption may not hold where husbands control cooperative membership decisions, requiring a parallel social norms component |
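
The IF / THEN / BECAUSE / AND ONLY IF pattern can be captured as a small data structure, which makes a missing mechanism or assumption visible rather than implicit. This is an illustrative Python sketch only: `CausalStep` and `missing_parts` are hypothetical names, and the example text is drawn from the first row of the table above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CausalStep:
    if_clause: str
    then_clause: str
    because: str = ""                                  # the causal mechanism
    only_if: List[str] = field(default_factory=list)   # stated assumptions

    def missing_parts(self):
        """List the elements of a causal argument this step leaves unstated."""
        missing = []
        if not self.because.strip():
            missing.append("mechanism (BECAUSE)")
        if not self.only_if:
            missing.append("assumption (AND ONLY IF)")
        return missing


# An arrow between two boxes: no mechanism, no assumption.
diagram_step = CausalStep(
    if_clause="training builds skills",
    then_clause="farmers will apply them",
)

# The same step written as an argument.
argued_step = CausalStep(
    if_clause="training builds skills",
    then_clause="farmers will apply them",
    because="the training methodology is practice-based and contextually appropriate",
    only_if=["farmers have access to the inputs required to apply new techniques"],
)

print(diagram_step.missing_parts())
print(argued_step.missing_parts())
```

A structure like this checks only that something has been written in each slot. Whether the stated mechanism is plausible remains a matter for human review.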

P3 The Logic Model Confused With the Logical Framework — Two Different Instruments

These two instruments share similar names and similar purposes — both organise the relationship between activities and objectives — but they have different architectures, different institutional origins, and different analytical demands. Confusing them produces documents that satisfy neither purpose.

The Logic Model
The Logic Model, popularised by the W.K. Kellogg Foundation and widely adopted by USAID and CDC, is a linear sequence: Inputs → Activities → Outputs → Outcomes → Impact. It is a useful planning tool for describing how resources will be converted into activities and how activities will produce changes. Its limitations are well-documented: it is linear where change is often non-linear, it does not accommodate feedback loops, and it does not require explicit assumption-stating.

The Logical Framework (Logframe)
The Logical Framework, developed in the 1960s for USAID and subsequently adopted by the European Commission, DFID, the AfDB, and most bilateral donors, is a matrix: a four-by-four table organising objectives into a vertical logic (Goal, Purpose, Outputs, Activities) and a horizontal logic (Narrative Summary, Objectively Verifiable Indicators, Means of Verification, Assumptions). The assumptions column — which has no equivalent in a Logic Model — is the analytical core of the Logframe. It forces the designer to state the external conditions on which the causal chain depends.
A project submitted to the African Development Bank, the European Union, or DFAT that presents a Logic Model where a Logframe is required will be returned for revision. The converse is equally problematic: a team that builds a Logframe without understanding its assumptions architecture will produce a document that looks like a Logframe but functions like a list.

| As Commonly Written (Problematic) | Professionally Correct Formulation |
| --- | --- |
| ✘ Our Logframe shows: Output 1: Training delivered / Outcome 1: Skills improved / Impact: Food security enhanced [no indicators, no MOVs, no assumptions] | ✔ A properly structured Logframe row: Output 1.1: 340 smallholder farmers (60% women) trained in CSA practices by Q2 Year 1; Indicator: Number trained, disaggregated by sex; MOV: Training registers, pre/post knowledge assessments; Assumption: Farmers are available during planting off-season and transport to training sites is accessible |
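
The horizontal logic of a Logframe row lends itself to a simple completeness check: a row whose indicator, means-of-verification, or assumptions column is empty is a list entry, not a Logframe row. The sketch below is a hypothetical Python illustration (the names `LogframeRow` and `empty_columns` are assumptions, not a donor template), populated with the two formulations from the table above.

```python
from dataclasses import dataclass, fields


@dataclass
class LogframeRow:
    # The four columns of the Logframe's horizontal logic.
    narrative_summary: str
    indicators: str = ""
    means_of_verification: str = ""
    assumptions: str = ""


def empty_columns(row):
    """Return the names of columns left blank, in column order."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


bare_row = LogframeRow(narrative_summary="Output 1: Training delivered")

full_row = LogframeRow(
    narrative_summary=(
        "Output 1.1: 340 smallholder farmers (60% women) trained in CSA "
        "practices by Q2 Year 1"
    ),
    indicators="Number trained, disaggregated by sex",
    means_of_verification="Training registers, pre/post knowledge assessments",
    assumptions=(
        "Farmers are available during planting off-season and transport "
        "to training sites is accessible"
    ),
)

print(empty_columns(bare_row))
print(empty_columns(full_row))
```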

P4 Indicators That Measure Activities, Not Change — The SMART Deficit

The Performance Measurement Framework is the instrument that operationalises the results architecture into a measurable tracking system. It is only as good as the indicators it contains. And across the development sector, indicator quality is poor in ways that are remarkably consistent.

The most common failure is the activity-level indicator presented at the output or outcome level. This occurs when a project measures what it did — number of meetings held, workshops conducted, reports produced, policies drafted — rather than what those activities produced or changed.

A secondary failure, almost equally common, is the non-SMART indicator: one that is not Specific, Measurable, Achievable, Relevant, and Time-bound. Such indicators are vague, unmeasurable, undisaggregated, and untethered to a baseline or timeframe. They create the appearance of a measurement system while making genuine measurement impossible.

Indicator Quality: A Diagnostic Table

| Indicator as Written | Results Level Claimed | Actual Level | The Problem |
| --- | --- | --- | --- |
| Number of training sessions conducted | Output | Activity | Counts what the project did, not what it produced. An activity-level measure misrepresented as an output. |
| Number of farmers trained | Output | Output | ✔ This one is correct. People reached through training is a legitimate output indicator. |
| Number of farmers who adopted improved practices | Output | Outcome | Adoption is a behavioural change: a result. Labelling it as an output conflates delivery with impact and understates programme achievement. |
| % of women with secure land tenure | Outcome | Outcome | ✔ Correctly stated. A change in legal/institutional status affecting women’s lives is a genuine outcome indicator. |
| Improved food security of project communities | Impact | Unmeasurable as stated | No unit of measure, no baseline reference, no disaggregation, no timeframe. This is an aspiration, not an indicator. |
| Household dietary diversity score (HDDS) ≥4.5 for 70% of female-headed households by Year 3 | Outcome | Outcome | ✔ SMART, disaggregated by sex and household type, with a measurable threshold and timeframe. This is professional-grade. |
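
The diagnosis in the last two rows of the table above can be expressed as a field-by-field check. The Python sketch below is illustrative only: `Indicator` and `gaps` are hypothetical names, and the check covers unit, target, timeframe, and disaggregation. Baseline values are left out because the article's examples do not state any; in practice they would be recorded in the PMF alongside each indicator.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Indicator:
    statement: str
    unit: Optional[str] = None
    target: Optional[str] = None
    timeframe: Optional[str] = None
    disaggregation: Tuple[str, ...] = ()


def gaps(indicator):
    """Return the names of the elements the indicator fails to specify."""
    checks = [
        ("unit", indicator.unit),
        ("target", indicator.target),
        ("timeframe", indicator.timeframe),
        ("disaggregation", indicator.disaggregation),
    ]
    return [name for name, value in checks if not value]


# An aspiration: nothing measurable is specified.
vague = Indicator("Improved food security of project communities")

# The HDDS indicator from the table, split into its components.
hdds = Indicator(
    statement=(
        "Household dietary diversity score (HDDS) >=4.5 for 70% of "
        "female-headed households by Year 3"
    ),
    unit="HDDS score",
    target=">=4.5 for 70% of households",
    timeframe="by Year 3",
    disaggregation=("sex", "household type"),
)

print(gaps(vague))
print(gaps(hdds))
```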

The Disaggregation Failure
A specific dimension of indicator weakness deserves separate attention: the failure to disaggregate. Development programmes that report aggregate beneficiary numbers — ‘340 farmers trained’ — without sex, age, disability status, or wealth quintile disaggregation are producing data that cannot answer the programme’s equity questions. They cannot tell us whether women were reached proportionally, whether benefits accrued to the poorest households, or whether the programme reproduced existing inequalities under the appearance of universal service.
In a Gender Transformative programme — which is the non-negotiable minimum design standard for ScaleX Labs — disaggregation is not optional. It is the primary evidence base for whether the programme is achieving its equity mandate.
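
A short worked example shows what disaggregation adds to a headline number. The Python sketch below assumes a participant register with a `sex` field and compares the share of women reached with a 60% target (the figure used in the Logframe example earlier). The 150/190 split is hypothetical and invented purely for illustration, as is the function name `share_by`.

```python
from collections import Counter


def share_by(records, key):
    """Return each group's share of the total, keyed by group label."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# HYPOTHETICAL register: 340 farmers trained, split invented for illustration.
register = [{"sex": "F"}] * 150 + [{"sex": "M"}] * 190

shares = share_by(register, "sex")
target_women = 0.60
shortfall = target_women - shares["F"]

print(f"Farmers trained: {len(register)}")
print(f"Women reached: {shares['F']:.1%} against a target of {target_women:.0%}")
print(f"Shortfall: {shortfall:.1%}")
```

With this invented split, the aggregate figure of 340 farmers trained looks identical to a report that met its target. Only the disaggregated view shows that it did not.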

P5 The Results Framework With No Results Chain — Structure Without Logic

A Results Framework is a hierarchical display of the development objectives and intermediate results a programme commits to achieving. At its most basic, it shows a Development Objective at the top, supported by a set of Intermediate Results, each supported by sub-results or outputs below it. The World Bank and USAID have developed the most widely used variants.
The pathology that afflicts Results Frameworks in practice is structural: they display results levels without explaining how those levels connect. A Development Objective is stated. Below it, Intermediate Results are listed. The implicit claim is that achieving the Intermediate Results will produce the Development Objective. But the logic by which this occurs — the causal mechanism, the sequencing, the conditions under which it holds — is nowhere stated.
This produces a document that looks like a measurement architecture but functions as a list of aspirations. It satisfies the formatting requirement without satisfying the analytical requirement.
“A Results Framework without a Results Chain is a destination without a map. It tells you where you want to arrive. It tells you nothing about how you will get there, why the road leads there, or what might block the path.”
The Results Chain: The Missing Link
The Results Chain — what some institutions call the Development Hypothesis — is the if-then narrative that explains how Outputs produce Immediate Outcomes, how Immediate Outcomes accumulate into Intermediate Outcomes, and how Intermediate Outcomes contribute to the Development Objective. It is the analytical engine inside the results architecture. Without it, the Results Framework is a shell.
A professionally constructed Results Chain for a climate adaptation programme might read:

IF → THEN
IF women farmers receive training in climate-smart agriculture AND gain access to drought-tolerant seed varieties (Outputs)
THEN their agricultural knowledge and adaptive practices will improve (Immediate Outcome)
BECAUSE the training methodology is practice-based and locally validated
AND IF community-level demonstration plots reinforce peer learning (Intermediate Outcome)
THEN household food security and income resilience will improve measurably within three cropping seasons (Development Objective)
PROVIDED THAT input markets remain functional and no extreme climatic event disrupts the cropping calendar during the programme period (Critical Assumption)

This is not a prose elaboration of a diagram. It is a testable causal argument. An evaluator reading it knows exactly what to verify, what to question, and what to monitor.
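
The difference between a Results Framework and a Results Chain can be stated as a check: for every pair of adjacent levels, is there a link that says why the lower level produces the higher one? The Python sketch below is a hypothetical illustration (`LEVELS`, `unexplained_links`, and the dictionary keys are assumed names). For brevity it encodes only the first link of the example chain above.

```python
# Results levels, lowest to highest, as used in the example chain.
LEVELS = [
    "Outputs",
    "Immediate Outcome",
    "Intermediate Outcome",
    "Development Objective",
]


def unexplained_links(levels, chain):
    """Return adjacent level pairs that have no stated causal rationale."""
    explained = {
        (link["from"], link["to"]) for link in chain if link.get("because")
    }
    return [
        (lower, upper)
        for lower, upper in zip(levels, levels[1:])
        if (lower, upper) not in explained
    ]


# A Results Framework on its own: levels listed, nothing connecting them.
framework_only = []

# The first link of the example chain, with its stated mechanism.
partial_chain = [
    {
        "from": "Outputs",
        "to": "Immediate Outcome",
        "because": "the training methodology is practice-based and locally validated",
    },
]

print(unexplained_links(LEVELS, framework_only))
print(unexplained_links(LEVELS, partial_chain))
```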

P6 Gender as a Cross-Cutting Theme — The Integration That Does Not Integrate

No area of measurement framework design is more consistently mishandled than gender. Across thousands of project documents reviewed over five decades of practice, a single pattern repeats: gender appears as a cross-cutting theme, described in a dedicated paragraph or column, and is then systematically absent from every other element of the measurement architecture.
Gender indicators are listed separately from the main performance indicator table. The Theory of Change says nothing about how gender norms shape the causal pathways. The Logframe’s assumptions column makes no reference to the social conditions affecting women’s participation. The beneficiary table reports total numbers without sex disaggregation. And the completion report concludes that ‘gender was mainstreamed throughout’ without a single piece of evidence to support the claim.
This pattern has a name in professional M&E practice: gender washing. It satisfies the formal requirement — there is a gender section — without satisfying the substantive requirement, which is that gender analysis shapes the design, monitoring, and evaluation of the programme itself.

The Distinction That Matters: Responsive Versus Transformative
There is a critical distinction between Gender Responsive and Gender Transformative programming that most measurement frameworks collapse or ignore entirely. The difference is not one of degree but of purpose:

| As Commonly Written (Problematic) | Professionally Correct Formulation |
| --- | --- |
| ✘ Gender Responsive: The project ensures equal numbers of women and men attend training [counts participation, does not analyse power] | ✔ Gender Transformative: The project actively challenges the social norms that restrict women’s land rights, with indicators tracking changes in community attitudes, legal tenure status, and women’s decision-making authority within households and cooperatives |
| ✘ Gender Mainstreamed: All project activities are designed to be accessible to women [process commitment, no outcome measurement] | ✔ Gender Transformative: Norms change is an explicit programme outcome, with a dedicated indicator tracking changes in the Gender Equality Index score for target communities over the programme period |
| ✘ Cross-cutting theme: Gender [listed in a column; absent from the results chain] | ✔ Structural integration: Gender norms analysis embedded in the Theory of Change, the Results Chain, the assumptions column of the Logframe, and every outcome-level indicator in the PMF |

P7 The Evaluation Framework Added at the End — Measurement as Afterthought

The seventh pathology is structural and temporal: the measurement framework is designed after the project, not with it. This occurs when a programme team completes its design — activities, budget, implementation plan — and then adds a monitoring and evaluation section to satisfy the donor requirement. The result is a measurement framework that monitors the activities already planned, rather than the results the programme is committed to achieving.

This sequencing error has cascading consequences. Baselines are not established before implementation begins, making it impossible to demonstrate change. Indicators are selected to match activities rather than outcomes, ensuring that what is measured is what was done, not what changed. Evaluation questions are not formulated until the mid-term review, by which point the data needed to answer them has gone uncollected.

“Monitoring and evaluation is not a reporting obligation appended to a project. It is a design discipline embedded in a project from the first day of conceptualisation. A project whose measurement framework was written after its activities were designed has not designed an M&E system. It has designed a documentation system.”

What Integrated M&E Design Requires
• Baseline data collection before implementation begins — not during Year 1 as a project activity
• Indicators selected to measure the changes the Theory of Change predicts, not the activities the project plans to conduct
• Evaluation questions formulated at design stage and used to drive data collection from day one
• A learning protocol — not just a reporting schedule — that specifies how monitoring data will be used to adapt implementation
• Gender-disaggregated baseline data as a non-negotiable prerequisite for any programme making equity claims
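
Several of the requirements above are sequencing checks that can be run at design stage. The Python sketch below is a hypothetical illustration: `MEDesign`, `design_gaps`, and all dates are invented for the example, and the learning protocol is not modelled. It flags a baseline collected after implementation starts, a baseline that is not gender-disaggregated, and a design with no evaluation questions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class MEDesign:
    implementation_start: date
    baseline_completed: Optional[date] = None
    baseline_gender_disaggregated: bool = False
    evaluation_questions: Tuple[str, ...] = ()


def design_gaps(design):
    """Return the integrated-design requirements this design fails."""
    found = []
    if design.baseline_completed is None:
        found.append("no baseline")
    elif design.baseline_completed >= design.implementation_start:
        found.append("baseline collected after implementation began")
    if not design.baseline_gender_disaggregated:
        found.append("baseline not gender-disaggregated")
    if not design.evaluation_questions:
        found.append("no evaluation questions at design stage")
    return found


# HYPOTHETICAL dates: M&E added after the project was designed.
afterthought = MEDesign(
    implementation_start=date(2026, 1, 1),
    baseline_completed=date(2026, 6, 1),
)

# HYPOTHETICAL dates: M&E designed with the project.
integrated = MEDesign(
    implementation_start=date(2026, 1, 1),
    baseline_completed=date(2025, 10, 1),
    baseline_gender_disaggregated=True,
    evaluation_questions=("What changed, for whom, and how do we know?",),
)

print(design_gaps(afterthought))
print(design_gaps(integrated))
```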

Synthesis: What These Seven Pathologies Have in Common
Across the seven pathologies described above, a single underlying condition explains their persistence: the development sector has confused procedural compliance with analytical rigour. Projects produce Theories of Change because donors require them. They include gender sections because guidelines mandate them. They submit Performance Measurement Frameworks because completion reports demand them. But the institutional pressure is toward the production of documents, not toward the quality of thinking those documents are supposed to contain.
The result is a sector that is well-documented and poorly measured — that generates vast quantities of monitoring data from which very little is learned, that produces programme reports that describe activities with precision and results with vagueness, and that repeats the same design errors across generations of practitioners because those errors are never systematically named.

This is not a critique of individuals. The practitioners who produce these documents are, in most cases, technically capable people working under time pressure, with inadequate methodological training, and without institutional incentives to prioritise quality over compliance. The problem is systemic. Addressing it requires systemic tools.

THE SCALEX LABS COMMITMENT
Every platform ScaleX Labs builds, from the AI-Powered MEL System to GrantIQ and GenTraQ, is designed to make rigorous measurement practice the path of least resistance, not the path of maximum effort. We are building tools that embed professional-grade framework architecture into the workflow of every practitioner who uses them, so that the baseline standard of the sector rises, one project document at a time.

Closing: The Standard We Are Holding
The seven pathologies described in this article are correctable. They are not the result of a lack of intelligence or commitment in the development sector. They are the result of a lack of shared, institutionally enforced standards for measurement framework quality — and the absence of tools that make those standards accessible.

ScaleX Labs is building those tools. But tools alone are insufficient. The sector also needs a culture of measurement literacy that treats framework design as a professional discipline, that names errors when it finds them, and that holds completion reports, evaluation designs, and funding proposals to the same standard that financial accounting is held to in other sectors.
That is the standard we are holding. We invite every development practitioner, programme officer, evaluator, and institutional leader who reads this article to hold it too.

“The question a development programme must answer is not ‘What did we do?’ It is ‘What changed, for whom, because of what we did, and how do we know?’ Every element of a measurement framework should exist to answer that question. Nothing else justifies its presence.”

About the Author
Dr. Alfred Latigo is the Founder and Chief Innovation Architect of ScaleX Labs LLC, a Delaware-incorporated AI innovation company building responsible AI tools for global development with a focus on the Global South. He is concurrently Senior Advisor for Gender Transformative and Rights-Based Approaches at the African Development Bank’s African Climate Change Fund (ACCF), leading monitoring, evaluation, and learning across a 25-country portfolio spanning three Climate Finance windows. His 50-year career spans FAO, the Harvard Institute for International Development, the UN Economic Commission for Africa, the AfDB, and the World Bank. He is the designer of the Gender Transformative Monitoring Tool (GTMT) and the Scalability and Replicability Assessment Matrix (SRAM), applied across 25+ African countries.
Contact: info@scalexlabs.io | www.scalexlabs.io
