RealOutcomes

Scoring Methodology

RealOutcomes combines a deterministic Watchdog Score (28 website signals), rule-based accountability flags from IRS filings, and verified federal/state data—with full source citations on every org profile.

Current Version: v1.4

28 Watchdog Signals

10 Verified Sources

4 Evaluation Layers

Our Scoring Philosophy

Our scoring system is designed to be transparent, consistent, defensible, and actionable. We believe that:

Every signal should be observable and verifiable from public sources
The methodology should be publicly documented and versioned
Organizations should be able to understand and improve their scores
Scores should reflect capacity for demonstrating impact, not the impact itself
Gaming should be detectable and penalized through cap rules

Important: Our scores measure an organization's ability to demonstrate and communicate outcomes, not the actual social impact. High scores indicate strong outcome measurement practices and transparency. A low score is not a judgment of mission value.

How Evaluation Works (Four Layers)

Not everything on an org profile is the same kind of score—we separate them deliberately

An org hub may show a Watchdog gauge, FWA severity badges, money-flow charts, and an AI investigation card. Each layer uses different inputs and rules. We document all of them here so nothing is implicit.

Watchdog Score

Deterministic · config v1.0

The primary 0–100 score on org profiles. Computed from 28 observable website signals across Transparency, Outcomes Maturity, and Evidence Strength. Config-driven, versioned, and reproducible.

Inputs

Website crawl (Firecrawl)
Public pages & PDFs linked from the site

Explicitly not mixed in

FEC, USASpending, or LDA data do not change this score today
IRS BMF status does not directly adjust dimension weights (see FWA flags)

Accountability Flags (FWA)

Separate from Watchdog Score

Rule-based risk signals derived from IRS filings and master-file status. Shown as severity badges on directory and org hub. These are not added into or subtracted from the Watchdog Score formula.

Inputs

IRS BMF / Auto-Revocation List (tax status)
Form 990 ratios (overhead, fundraising, exec comp)
Board size & filing gaps from 990 XML

Explicitly not mixed in

Website marketing language (except via separate AI investigation)

Verified Source Context

Cited on every org hub

Federal awards, grants, lobbying, state charity registrations, and political connections are ingested from authoritative APIs, stored with provenance, and linked to source URLs. Used for investigation and money-flow views—not blended into the 28-signal formula yet.

Inputs

FEC · USASpending · LDA · State charity registries
IRS 990 XML (Schedules I/J/R, Part XV grants)
Person-identity normalization for board interlocks

Explicitly not mixed in

Does not silently alter scores without a documented signal rule

AI Investigation (Promise vs Reality)

Supplementary · optional

When enabled, compares website claims against 990 filings and produces a separate credibility assessment. Displayed in the AI Investigation card—not written into score_runs or the Watchdog Score gauge.

Inputs

990 financials + website crawl text
OpenAI structured analysis

Explicitly not mixed in

Not a substitute for the deterministic Watchdog Score

Watchdog Score Calculation

The overall score is a weighted average of three dimensions

Transparency

30%

Outcomes Maturity

40%

Evidence Strength

30%

Overall = (Transparency × 0.30) + (Outcomes Maturity × 0.40) + (Evidence Strength × 0.30)

Outcomes Maturity is weighted highest (40%) because identifying and measuring actual change is the core differentiator from "overhead culture" metrics.
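The weighted average above can be sketched in a few lines. This is a minimal illustration, assuming each dimension score is already on a 0–100 scale; the dictionary keys and the function name `overall_score` are hypothetical and are not taken from the scoring config.

```python
# Dimension weights as stated in the formula above.
# Key names are illustrative, not the real config keys.
WEIGHTS = {
    "transparency": 0.30,
    "outcomes_maturity": 0.40,
    "evidence_strength": 0.30,
}


def overall_score(dimensions: dict[str, float]) -> float:
    """Return the weighted average of the three 0-100 dimension scores."""
    missing = WEIGHTS.keys() - dimensions.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(dimensions[name] * weight for name, weight in WEIGHTS.items())


# Made-up dimension scores for illustration only.
example = {"transparency": 80, "outcomes_maturity": 50, "evidence_strength": 60}
print(round(overall_score(example), 1))  # 24 + 20 + 18 = 62.0
```

Because the weights sum to 1.0, the result stays on the same 0–100 scale as the inputs.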

Scoring Dimensions

Transparency

30% of overall

How publicly legible and accountable the organization is.

Signals (9)

Mission statement is clearly stated · 8%

Homepage or about page contains clear mission language

Programs/services are publicly described · 12%

Dedicated pages describing programs/services exist

Leadership/team is listed · 12%

Leadership, team, or board information is publicly available

Clear contact info is available · 10%

Email, phone, or address is easily found

Annual/impact reports are accessible · 15%

Annual report or impact report links are available

Financial docs (990/audit) are accessible · 18%

Form 990, audited financials, or budget information posted

Governance details are visible · 10%

Board, bylaws, or governance policies are referenced

Recent updates indicate active operations · 8%

News, blog, or content updates within the last 12 months

Donation pathway and purpose are clear · 7%

Donate page with clear purpose for funds exists

Total weights: 100% (normalized to 100 for scoring)
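As a rough illustration of how the nine Transparency weights could combine into a dimension score, the sketch below assumes each signal is simply detected or not detected. The methodology does not state whether partial credit exists, so that binary treatment is an assumption, and the signal keys are hypothetical names rather than real config identifiers.

```python
# Weights copied from the Transparency signal list above.
# Keys are hypothetical labels; binary detection is an assumption.
TRANSPARENCY_WEIGHTS = {
    "mission_statement": 8,
    "programs_described": 12,
    "leadership_listed": 12,
    "contact_info": 10,
    "annual_reports": 15,
    "financial_docs": 18,
    "governance_details": 10,
    "recent_updates": 8,
    "donation_pathway": 7,
}


def transparency_score(detected: set[str]) -> float:
    """Sum the weights of detected signals, normalized to a 0-100 scale."""
    unknown = detected - TRANSPARENCY_WEIGHTS.keys()
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    total = sum(TRANSPARENCY_WEIGHTS.values())
    earned = sum(TRANSPARENCY_WEIGHTS[name] for name in detected)
    return 100 * earned / total


# An organization with only a mission statement, contact info, and a posted 990.
print(transparency_score({"mission_statement", "contact_info", "financial_docs"}))  # 36.0
```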

Confidence Scoring

How much data was available to make the assessment

Confidence indicates how much data we were able to analyze. A low confidence score doesn't mean the organization is bad—it means we had limited public information to work with.

High

25+ pages, 2+ PDFs, impact page found

Medium

8-24 pages, 1+ PDF

Low

<8 pages, no PDFs
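The three tiers can be read as a simple decision rule. The sketch below is one possible reading, not the production logic: the tiers as listed do not say how to treat in-between cases (for example, 30 pages with no PDFs), so this version assumes any crawl that misses the High and Medium criteria falls through to Low. The function and parameter names are hypothetical.

```python
def confidence_level(pages: int, pdfs: int, impact_page_found: bool) -> str:
    """Classify crawl coverage into High / Medium / Low confidence."""
    if pages >= 25 and pdfs >= 2 and impact_page_found:
        return "High"
    if pages >= 8 and pdfs >= 1:
        return "Medium"
    # Assumption: anything not meeting the tiers above is treated as Low.
    return "Low"


print(confidence_level(pages=40, pdfs=3, impact_page_found=True))   # High
print(confidence_level(pages=12, pdfs=1, impact_page_found=False))  # Medium
print(confidence_level(pages=5, pdfs=0, impact_page_found=False))   # Low
```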

Accountability Flags (FWA)

Rule-based risk signals from IRS data—separate from the Watchdog Score formula

FWA flags are generated when IRS master-file status or parsed Form 990 data crosses documented thresholds. They appear on directory rows and the org hub but are not added to or subtracted from Transparency / Outcomes / Evidence dimension scores.

Flag · Label · Severity · Source

irs_auto_revoked · IRS Tax-Exempt Status Revoked · critical · IRS BMF / ARL
revocation_history · History of IRS Revocation · high · Status change log
high_overhead · High Administrative Overhead (>40%) · medium–high · Form 990
fundraising_ratio · High Fundraising Expense Ratio (>35%) · medium · Form 990
exec_comp_anomaly · Executive Compensation Anomaly (>15% of expenses) · medium–high · Form 990
board_size · Governance: Board < 3 members · medium · Form 990 / XML
reporting_gap · Form 990 Reporting Gap · high · Form 990 filing history
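The four flags with numeric thresholds lend themselves to a small rule function. The sketch below is illustrative only. It assumes all three ratios use total expenses as the denominator; the source states this only for executive compensation. The `Form990Summary` fields are hypothetical names, not parsed 990 XML element names. The revocation and reporting-gap flags are omitted because their exact rules are not given here.

```python
from dataclasses import dataclass


@dataclass
class Form990Summary:
    """Hypothetical, simplified view of parsed Form 990 figures."""
    total_expenses: float
    admin_expenses: float
    fundraising_expenses: float
    executive_compensation: float
    board_members: int


def financial_fwa_flags(filing: Form990Summary) -> list[str]:
    """Return the threshold-based FWA flag codes triggered by one filing."""
    flags: list[str] = []
    if filing.total_expenses > 0:  # skip ratio rules when there is no denominator
        if filing.admin_expenses / filing.total_expenses > 0.40:
            flags.append("high_overhead")
        if filing.fundraising_expenses / filing.total_expenses > 0.35:
            flags.append("fundraising_ratio")
        if filing.executive_compensation / filing.total_expenses > 0.15:
            flags.append("exec_comp_anomaly")
    if filing.board_members < 3:
        flags.append("board_size")
    return flags


# Made-up figures: 45% overhead and a two-person board.
sample = Form990Summary(1_000_000, 450_000, 100_000, 120_000, 2)
print(financial_fwa_flags(sample))  # ['high_overhead', 'board_size']
```

Note that the function returns flag codes only; consistent with the text above, nothing here touches the Watchdog dimension scores.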

Watchdog Flags (T / O / E)

Missing-signal warnings attached to a Watchdog Score run—not FWA accountability flags

T_FLAG_01

No Contact Channel

No email, phone, or contact form was detected on the website.

T_FLAG_02

No Program Description

No dedicated page describing programs or services was found.

T_FLAG_03

No Financial Documents

No Form 990, audit report, or financial statements were linked.

O_FLAG_01

No Outcomes Stated

The organization does not publicly state specific outcome goals.

O_FLAG_02

No Outcome Metrics

No measurable outcome metrics (beyond output counts) were found.

E_FLAG_01

No Evidence Artifacts

No evaluation reports, research PDFs, or evidence documents found.

E_FLAG_02

No Method Described

No measurement methodology (survey, pre/post, etc.) described.

E_FLAG_03

Possible Inflation

Large outcome claims were detected without supporting evidence artifacts.
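One way to picture how these warnings attach to a score run is as a list of (flag code, condition) pairs evaluated against the detected signals. This is a sketch under assumptions: the signal keys are hypothetical, missing keys are treated as "not detected", and the real detection logic behind each signal is not shown.

```python
def watchdog_flags(signals: dict[str, bool]) -> list[str]:
    """Map detected/undetected website signals to T / O / E flag codes."""
    has = lambda key: signals.get(key, False)  # absent keys count as not detected
    rules = [
        ("T_FLAG_01", not has("contact_channel")),
        ("T_FLAG_02", not has("program_description")),
        ("T_FLAG_03", not has("financial_documents")),
        ("O_FLAG_01", not has("outcomes_stated")),
        ("O_FLAG_02", not has("outcome_metrics")),
        ("E_FLAG_01", not has("evidence_artifacts")),
        ("E_FLAG_02", not has("method_described")),
        # Possible inflation: large claims present, but no evidence artifacts.
        ("E_FLAG_03", has("large_outcome_claims") and not has("evidence_artifacts")),
    ]
    return [code for code, fired in rules if fired]


crawl = {
    "contact_channel": True,
    "program_description": True,
    "financial_documents": False,
    "outcomes_stated": True,
    "outcome_metrics": True,
    "evidence_artifacts": False,
    "method_described": True,
    "large_outcome_claims": True,
}
print(watchdog_flags(crawl))  # ['T_FLAG_03', 'E_FLAG_01', 'E_FLAG_03']
```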

Verified Source Integrations

What we ingest, why, and whether it feeds the Watchdog Score today

IRS_BMF

IRS Business Master File

Context / FWA only

Why: Authoritative registry of ~1.5M tax-exempt organizations—EIN, name, subsection, ruling year, assets/income bands.

Used for: Directory backbone, org identity, BMF-synced FWA batch rules

Source API / dataset

IRS_ARL

IRS Auto-Revocation List

Context / FWA only

Why: Monthly list of organizations whose tax-exempt status was revoked for failing to file required returns.

Used for: Verified tax-status citations, critical FWA flag (irs_auto_revoked)

Source API / dataset

IRS_990_XML

IRS Form 990 XML

Partial — T6 still checks website links to 990/audit PDFs, not parsed XML fields

Why: Line-item grants (Schedule I), compensation (J), lobbying (C/R), and private-foundation grants (990-PF Part XV).

Used for: Money flows, board roster, FWA financial ratios, grant-matching

Source API / dataset

PROPUBLICA

ProPublica Nonprofit Explorer

Context / FWA only

Why: Structured 990 summaries when XML is unavailable; financial history baseline.

Used for: Financial history, pipeline hydration before XML parse

Source API / dataset

FEC

FEC (api.open.fec.gov)

Context / FWA only

Why: PAC receipts, committee disbursements, and employer-linked contributions for political exposure.

Used for: Network graph, political connections hub, employer matching on board members

Source API / dataset

USASPENDING

USASpending.gov

Context / FWA only

Why: Federal contract and grant awards to nonprofits, retrieved from the USASpending API.

Used for: Federal awards table, money-flow Sankey federal lane

Source API / dataset

LDA

Senate LDA (Lobbying Disclosure)

Context / FWA only

Why: Quarterly lobbying registrations and issue codes tied to filer names/EINs where available.

Used for: Lobbying disclosures panel on org hub

Source API / dataset

STATE_CHARITY

State Charity Registrations

Context / FWA only

Why: Solicitation and registration status varies by state; adapter framework covers all 50 states incrementally.

Used for: Compliance grid on org hub with state-level citations

WEBSITE

Website crawl (Firecrawl)

Feeds Watchdog Score

Why: Only source that feeds the 28 Watchdog signals—mission, programs, outcomes language, PDFs, governance pages.

Used for: Watchdog Score (primary), AI Promise vs Reality cross-check

LAST1_CERT

Last1.app Certification Registry

Context / FWA only

Why: Third-party certification when JSON API is live; gated until registry endpoint returns structured data.

Used for: Certification badge + provenance record (no automatic Watchdog score change today)

Provenance & Source Citations

Every integrated fact is traceable on the org hub Sources panel

When data is ingested—whether from IRS BMF, FEC, USASpending, or a website crawl—we record it in org_data_provenance with the source system, retrieval time, and canonical URL. Findings that affect risk or money-flow views are stored in org_findings with confidence (verified = direct API/file match; inferred = heuristic link).

On each org profile, the Sources section lists citations you can click through to the original government dataset. We do not blend provenance-backed facts into Watchdog dimension scores unless a matching signal rule exists in config/scoring.v1.0.json.
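The provenance model described above can be pictured with two small record types. This is a hedged sketch: `org_data_provenance` and `org_findings` are the table names given in the text, but every field name, the `citation` helper, the example URL, and the example finding are hypothetical and do not reflect the actual schema or any real organization.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Literal


@dataclass(frozen=True)
class ProvenanceRecord:
    """One row in the spirit of org_data_provenance (field names assumed)."""
    source_system: str      # e.g. "IRS_BMF", "FEC", "WEBSITE"
    retrieved_at: datetime  # retrieval time
    canonical_url: str      # link back to the original dataset or page


@dataclass(frozen=True)
class Finding:
    """One row in the spirit of org_findings (field names assumed)."""
    summary: str
    confidence: Literal["verified", "inferred"]  # direct match vs heuristic link
    provenance: ProvenanceRecord


def citation(finding: Finding) -> str:
    """Render a finding as a one-line citation for a Sources panel."""
    p = finding.provenance
    return (
        f"{finding.summary} [{p.source_system}, {finding.confidence}, "
        f"retrieved {p.retrieved_at:%Y-%m-%d}] {p.canonical_url}"
    )


record = ProvenanceRecord(
    source_system="IRS_BMF",
    retrieved_at=datetime(2026, 6, 1, tzinfo=timezone.utc),
    canonical_url="https://example.org/irs-bmf/record",  # placeholder URL
)
finding = Finding("Tax-exempt status listed as active", "verified", record)
print(citation(finding))
```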

Investigation Context (Not in Watchdog Formula)

These features help researchers and funders investigate an organization. They use verified APIs but do not change the 0–100 Watchdog Score unless we add an explicit, documented signal rule in a future version.

FEC political connections — PAC donations and employer-linked contributions with source URLs
USASpending federal awards — contracts and grants by recipient EIN/name
LDA lobbying disclosures — registrant filings and issue codes
990 Schedule I grants — line-item grants made and received, dual-source matched where possible
State charity registrations — solicitation compliance by state (50-state adapter, rolled out incrementally)
Board interlocks — person-entity normalization with confidence labels on shared board seats
Network explorer — graph of board, political, and financial edges for Investigator-tier users

Score Interpretation

What the scores mean

70-100

Strong

Excellent outcome measurement and transparency

50-69

Developing

Good foundation with room for improvement

30-49

Emerging

Basic transparency with significant gaps

0-29

Limited

Minimal public information available
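The four bands map directly onto a lookup function. The sketch below assumes scores outside 0–100 are invalid and that a fractional score such as 69.5 belongs to the lower band, since the bands are listed as whole-number ranges; the function name is hypothetical.

```python
def score_band(score: float) -> str:
    """Return the interpretation band for a 0-100 Watchdog Score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "Strong"
    if score >= 50:
        return "Developing"
    if score >= 30:
        return "Emerging"
    return "Limited"


for value in (85, 62, 30, 12):
    print(value, score_band(value))
```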

Note on low scores: A low score often indicates limited public information rather than poor organizational practices. Many effective organizations simply haven't published detailed outcome data publicly. Claiming a profile on Last1.app can help improve scores.

Anti-Gaming Controls

How we prevent score manipulation

Our scoring system includes several controls to prevent organizations from gaming their scores:

Cap Rules

Dimension scores are capped when critical signals are missing. For example, Outcomes Maturity is capped at 40 if no specific outcomes are stated, regardless of other signals.
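The Outcomes Maturity example can be expressed as a clamp on the raw dimension score. This is a minimal sketch of that single rule, assuming the cap is applied to the dimension score before it enters the weighted average; the constant and function names are hypothetical, and other cap rules are not shown.

```python
# Cap value taken from the example above; the name is illustrative.
OUTCOMES_CAP_WITHOUT_STATED_OUTCOMES = 40.0


def capped_outcomes_maturity(raw_score: float, outcomes_stated: bool) -> float:
    """Limit Outcomes Maturity to 40 when no specific outcomes are stated."""
    if outcomes_stated:
        return raw_score
    return min(raw_score, OUTCOMES_CAP_WITHOUT_STATED_OUTCOMES)


print(capped_outcomes_maturity(75.0, outcomes_stated=False))  # 40.0
print(capped_outcomes_maturity(75.0, outcomes_stated=True))   # 75.0
print(capped_outcomes_maturity(30.0, outcomes_stated=False))  # 30.0
```

A cap only ever lowers a score, so an organization that already scores below 40 is unaffected.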

Inflation Detection

Large claims (e.g., "100% success rate") without supporting evidence artifacts are flagged (E_FLAG_03).

Consistency Check

Metrics that appear with different values across pages are noted as potential inconsistencies (E8).
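A consistency check of this kind can be sketched by grouping extracted metric values by name and reporting any metric seen with more than one distinct value. How metrics are actually extracted from pages is not described here, so the input format (page path, metric name, value) and all names below are assumptions, and the figures are made up.

```python
from collections import defaultdict
from collections.abc import Iterable


def inconsistent_metrics(
    observations: Iterable[tuple[str, str, float]],
) -> dict[str, list[float]]:
    """Return metrics that appear with more than one distinct value.

    Each observation is (page_path, metric_name, value).
    """
    values_by_metric: dict[str, set[float]] = defaultdict(set)
    for _page, metric, value in observations:
        values_by_metric[metric].add(value)
    return {
        metric: sorted(values)
        for metric, values in values_by_metric.items()
        if len(values) > 1
    }


observations = [
    ("/impact", "meals_served", 12000),
    ("/annual-report", "meals_served", 15000),
    ("/about", "volunteers", 300),
    ("/impact", "volunteers", 300),
]
print(inconsistent_metrics(observations))  # {'meals_served': [12000, 15000]}
```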

Roadmap: IRS 990 XML fields and state charity registration status may become explicit Transparency signals in a future config version—only after rules are added to the versioned JSON and documented here.

Version History

Methodology changes are versioned and documented

Documented four evaluation layers: Watchdog Score, FWA flags, verified source context, and AI investigation
Full verified-source registry: IRS BMF (~1.5M orgs), ARL, 990 XML, FEC, USASpending, LDA, state charity, provenance
Provenance model: every ingested fact links to source system, URL, and retrieval timestamp on org hub
Clarified what feeds the 28-signal formula vs. accountability flags vs. investigation context
Last1 certification: badge + citation when live; does not alter Watchdog dimension math today
Monthly IRS sync cron (BMF + ARL) on Railway; BMF asset/income columns widened to BIGINT

Standards Reference

Our methodology is informed by and aligned with established nonprofit accountability standards:

Last1.org Standards · GuideStar Transparency Criteria · NTEN Outcome Measurement Frameworks · ISSB Impact Measurement Standards

How RealOutcomes and Last1.app Work Together

RealOutcomes Rates

Public watchdog scoring based on IRS data, website transparency, outcome evidence, and accountability flags.

Last1 Helps Improve

Organizations claim their profile on Last1.app to get improvement roadmaps, respond to accountability flags, and demonstrate progress.

Last1 Certification

When the registry API is live, certified orgs show a verification badge and provenance citation. This does not automatically change Watchdog dimension math today—improvement comes from published evidence on the website.

RealOutcomes and Last1.app are both operated by Last 1 Enterprises, LLC. RealOutcomes serves as the public-facing accountability layer; Last1.app is the improvement and certification platform.

Want to improve your organization's score?

Claim your profile on Last1.app to get personalized recommendations, respond to accountability flags, and access outcome measurement tools.

Get Started on Last1.app

Believe there's an error in an organization's score? Submit a correction request


v1.4 · June 2026 · Current
v1.3 · March 2026
v1.2 · March 2026
v1.1 · February 2026
v1.0 · February 2026
