    RealOutcomes

    Scoring Methodology

    RealOutcomes combines a deterministic Watchdog Score (28 website signals), rule-based accountability flags from IRS filings, and verified federal/state data—with full source citations on every org profile.

    Current Version: v1.4
    28 Watchdog Signals
    10 Verified Sources
    4 Evaluation Layers
    Our Scoring Philosophy

    Our scoring system is designed to be transparent, consistent, defensible, and actionable. We believe that:

    • Every signal should be observable and verifiable from public sources
    • The methodology should be publicly documented and versioned
    • Organizations should be able to understand and improve their scores
    • Scores should reflect capacity for demonstrating impact, not the impact itself
    • Gaming should be detectable and penalized through cap rules

    Important: Our scores measure an organization's ability to demonstrate and communicate outcomes, not the actual social impact. High scores indicate strong outcome measurement practices and transparency. A low score is not a judgment of mission value.

    How Evaluation Works (Four Layers)
    Not everything on an org profile is the same kind of score—we separate them deliberately

    An org hub may show a Watchdog gauge, FWA severity badges, money-flow charts, and an AI investigation card. Each layer uses different inputs and rules. We document all of them here so nothing is implicit.

    Watchdog Score

    Deterministic · config v1.0

    The primary 0–100 score on org profiles. Computed from 28 observable website signals across Transparency, Outcomes Maturity, and Evidence Strength. Config-driven, versioned, and reproducible.

    Inputs

    • Website crawl (Firecrawl)
    • Public pages & PDFs linked from the site

    Explicitly not mixed in

    • FEC, USASpending, or LDA data do not change this score today
    • IRS BMF status does not directly adjust dimension weights (see FWA flags)

    Accountability Flags (FWA)

    Separate from Watchdog Score

    Rule-based risk signals derived from IRS filings and master-file status. Shown as severity badges on directory and org hub. These are not added into or subtracted from the Watchdog Score formula.

    Inputs

    • IRS BMF / Auto-Revocation List (tax status)
    • Form 990 ratios (overhead, fundraising, exec comp)
    • Board size & filing gaps from 990 XML

    Explicitly not mixed in

    • Website marketing language (except via separate AI investigation)

    Verified Source Context

    Cited on every org hub

    Federal awards, grants, lobbying, state charity registrations, and political connections are ingested from authoritative APIs, stored with provenance, and linked to source URLs. Used for investigation and money-flow views—not blended into the 28-signal formula yet.

    Inputs

    • FEC · USASpending · LDA · State charity registries
    • IRS 990 XML (Schedules I/J/R, Part XV grants)
    • Person-identity normalization for board interlocks

    Explicitly not mixed in

    • Does not silently alter scores without a documented signal rule

    AI Investigation (Promise vs Reality)

    Supplementary · optional

    When enabled, compares website claims against 990 filings and produces a separate credibility assessment. Displayed in the AI Investigation card—not written into score_runs or the Watchdog Score gauge.

    Inputs

    • 990 financials + website crawl text
    • OpenAI structured analysis

    Explicitly not mixed in

    • Not a substitute for the deterministic Watchdog Score
    Watchdog Score Calculation
    The overall score is a weighted average of three dimensions

    Transparency

    30%

    Outcomes Maturity

    40%

    Evidence Strength

    30%

    Overall = (Transparency × 0.30) + (Outcomes Maturity × 0.40) + (Evidence Strength × 0.30)

    Outcomes Maturity is weighted highest (40%) because identifying and measuring actual change is the core differentiator from "overhead culture" metrics.
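The weighted average above can be sketched in a few lines of Python. This is an illustration only, not the production scorer: the weights come from the published formula, while the function name `overall_score` and the dictionary keys are hypothetical.

```python
# Dimension weights as published in the formula above.
# The key names are hypothetical; only the numbers come from the methodology.
WEIGHTS = {
    "transparency": 0.30,
    "outcomes_maturity": 0.40,
    "evidence_strength": 0.30,
}


def overall_score(dimensions: dict[str, float]) -> float:
    """Return the weighted average of the three 0-100 dimension scores."""
    missing = WEIGHTS.keys() - dimensions.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(dimensions[name] * weight for name, weight in WEIGHTS.items())


example = {"transparency": 80.0, "outcomes_maturity": 50.0, "evidence_strength": 60.0}
# 80 * 0.30 + 50 * 0.40 + 60 * 0.30 = 24 + 20 + 18
print(round(overall_score(example), 1))
```

Because the weights sum to 1.0, the result stays on the same 0-100 scale as the dimension scores.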

    Scoring Dimensions

    Transparency
    30% of overall
    How publicly legible and accountable the organization is.

    Signals (9)

    T1
    Mission statement is clearly stated
    8%

    Homepage or about page contains clear mission language

    T2
    Programs/services are publicly described
    12%

    Dedicated pages describing programs/services exist

    T3
    Leadership/team is listed
    12%

    Leadership, team, or board information is publicly available

    T4
    Clear contact info is available
    10%

    Email, phone, or address is easily found

    T5
    Annual/impact reports are accessible
    15%

    Annual report or impact report links are available

    T6
    Financial docs (990/audit) are accessible
    18%

    Form 990, audited financials, or budget information posted

    T7
    Governance details are visible
    10%

    Board, bylaws, or governance policies are referenced

    T8
    Recent updates indicate active operations
    8%

    News, blog, or content updates within the last 12 months

    T9
    Donation pathway and purpose are clear
    7%

    Donate page with clear purpose for funds exists

    Total weights: 100% (normalized to 100 for scoring)
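As a rough sketch of how the nine weights could combine into a 0-100 Transparency score: the weights below are the published ones, but this page does not say whether each signal is scored as pass/fail or on a graded scale, so the 0.0-1.0 per-signal score used here is an assumption, as are the function and variable names.

```python
# Published Transparency signal weights (percent of the dimension).
TRANSPARENCY_WEIGHTS = {
    "T1": 8, "T2": 12, "T3": 12, "T4": 10, "T5": 15,
    "T6": 18, "T7": 10, "T8": 8, "T9": 7,
}


def dimension_score(signal_scores: dict[str, float], weights: dict[str, int]) -> float:
    """Weighted sum of per-signal scores, normalized to a 0-100 scale.

    Assumption: each signal score is a fraction between 0.0 (absent)
    and 1.0 (fully present). Signals not in signal_scores count as 0.0.
    """
    for signal, value in signal_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{signal} score must be between 0.0 and 1.0")
    total_weight = sum(weights.values())
    earned = sum(weights[s] * signal_scores.get(s, 0.0) for s in weights)
    return 100.0 * earned / total_weight


# Hypothetical site: mission, programs, leadership, and contact info found; nothing else.
observed = {"T1": 1.0, "T2": 1.0, "T3": 1.0, "T4": 1.0}
print(dimension_score(observed, TRANSPARENCY_WEIGHTS))  # 8 + 12 + 12 + 10 = 42 of 100
```

Dividing by the total weight keeps the result on a 0-100 scale even if a future config version changes individual weights.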

    Confidence Scoring
    How much data was available to make the assessment

    Confidence indicates how much data we were able to analyze. A low confidence score doesn't mean the organization is bad—it means we had limited public information to work with.

    High

    25+ pages, 2+ PDFs, impact page found

    Medium

    8-24 pages, 1+ PDF

    Low

    <8 pages, no PDFs
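The three tiers can be read as a simple classifier over crawl statistics. The sketch below uses the published thresholds; how mixed cases are resolved (for example, many pages but no PDFs) is not stated on this page, so the fall-through order here is an assumption, and the function name is hypothetical.

```python
def confidence_level(pages: int, pdfs: int, impact_page_found: bool) -> str:
    """Classify crawl coverage into high / medium / low confidence.

    Assumption: a crawl that misses any High criterion is checked against
    the Medium criteria, and anything that misses those is Low.
    """
    if pages >= 25 and pdfs >= 2 and impact_page_found:
        return "high"
    if pages >= 8 and pdfs >= 1:
        return "medium"
    return "low"


print(confidence_level(pages=40, pdfs=3, impact_page_found=True))
print(confidence_level(pages=12, pdfs=1, impact_page_found=False))
print(confidence_level(pages=5, pdfs=0, impact_page_found=False))
```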

    Accountability Flags (FWA)
    Rule-based risk signals from IRS data—separate from the Watchdog Score formula

    FWA flags are generated when IRS master-file status or parsed Form 990 data crosses documented thresholds. They appear on directory rows and the org hub but are not added to or subtracted from Transparency / Outcomes / Evidence dimension scores.

    Flag · Label · Severity · Source
    irs_auto_revoked · IRS Tax-Exempt Status Revoked · critical · IRS BMF / ARL
    revocation_history · History of IRS Revocation · high · Status change log
    high_overhead · High Administrative Overhead (>40%) · medium–high · Form 990
    fundraising_ratio · High Fundraising Expense Ratio (>35%) · medium · Form 990
    exec_comp_anomaly · Executive Compensation Anomaly (>15% of expenses) · medium–high · Form 990
    board_size · Governance: Board < 3 members · medium · Form 990 / XML
    reporting_gap · Form 990 Reporting Gap · high · Form 990 filing history
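The ratio-based flags above lend themselves to a small rule function. The following is a minimal sketch, not the actual FWA batch job: the thresholds are the published ones, but the denominator for the overhead and fundraising ratios is not specified on this page, so total expenses is assumed for all three. `Filing990Summary` and its field names are hypothetical, and the status-based flags (irs_auto_revoked, revocation_history, reporting_gap) are left out because their exact rules are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Filing990Summary:
    """Hypothetical subset of parsed Form 990 values."""
    total_expenses: float
    admin_expenses: float
    fundraising_expenses: float
    exec_compensation: float
    board_members: int


def fwa_ratio_flags(filing: Filing990Summary) -> list[str]:
    """Return the flag IDs whose published thresholds are crossed."""
    flags: list[str] = []
    if filing.total_expenses > 0:  # skip ratios when there is nothing to divide by
        if filing.admin_expenses / filing.total_expenses > 0.40:
            flags.append("high_overhead")
        if filing.fundraising_expenses / filing.total_expenses > 0.35:
            flags.append("fundraising_ratio")
        if filing.exec_compensation / filing.total_expenses > 0.15:
            flags.append("exec_comp_anomaly")
    if filing.board_members < 3:
        flags.append("board_size")
    return flags


sample = Filing990Summary(
    total_expenses=1_000_000,
    admin_expenses=450_000,       # 45% of expenses
    fundraising_expenses=100_000, # 10% of expenses
    exec_compensation=200_000,    # 20% of expenses
    board_members=2,
)
print(fwa_ratio_flags(sample))
```

The function returns flag IDs only. Consistent with the separation described above, nothing here touches the Watchdog dimension scores.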
    Watchdog Flags (T / O / E)
    Missing-signal warnings attached to a Watchdog Score run—not FWA accountability flags
    T_FLAG_01

    No Contact Channel

    No email, phone, or contact form was detected on the website.

    T_FLAG_02

    No Program Description

    No dedicated page describing programs or services was found.

    T_FLAG_03

    No Financial Documents

    No Form 990, audit report, or financial statements were linked.

    O_FLAG_01

    No Outcomes Stated

    The organization does not publicly state specific outcome goals.

    O_FLAG_02

    No Outcome Metrics

    No measurable outcome metrics (beyond output counts) were found.

    E_FLAG_01

    No Evidence Artifacts

    No evaluation reports, research PDFs, or evidence documents found.

    E_FLAG_02

    No Method Described

    No measurement methodology (survey, pre/post, etc.) described.

    E_FLAG_03

    Possible Inflation

    Large outcome claims were detected without supporting evidence artifacts.

    Verified Source Integrations
    What we ingest, why, and whether it feeds the Watchdog Score today
    IRS_BMF

    IRS Business Master File

    Context / FWA only

    Why: Authoritative registry of ~1.5M tax-exempt organizations—EIN, name, subsection, ruling year, assets/income bands.

    Used for: Directory backbone, org identity, BMF-synced FWA batch rules

    IRS_ARL

    IRS Auto-Revocation List

    Context / FWA only

    Why: Monthly list of organizations whose tax-exempt status was revoked for failing to file required returns.

    Used for: Verified tax-status citations, critical FWA flag (irs_auto_revoked)

    IRS_990_XML

    IRS Form 990 XML

    Partial — T6 still checks website links to 990/audit PDFs, not parsed XML fields

    Why: Line-item grants (Schedule I), compensation (J), lobbying (C/R), and private-foundation grants (990-PF Part XV).

    Used for: Money flows, board roster, FWA financial ratios, grant-matching

    PROPUBLICA

    ProPublica Nonprofit Explorer

    Context / FWA only

    Why: Structured 990 summaries when XML is unavailable; financial history baseline.

    Used for: Financial history, pipeline hydration before XML parse

    FEC

    FEC (api.open.fec.gov)

    Context / FWA only

    Why: PAC receipts, committee disbursements, and employer-linked contributions for political exposure.

    Used for: Network graph, political connections hub, employer matching on board members

    USASPENDING

    USASpending.gov

    Context / FWA only

    Why: Federal contract and grant awards to nonprofits from the USAspending API.

    Used for: Federal awards table, money-flow Sankey federal lane

    LDA

    Senate LDA (Lobbying Disclosure)

    Context / FWA only

    Why: Quarterly lobbying registrations and issue codes tied to filer names/EINs where available.

    Used for: Lobbying disclosures panel on org hub

    STATE_CHARITY

    State Charity Registrations

    Context / FWA only

    Why: Solicitation and registration status vary by state; the adapter framework targets all 50 states and is being rolled out incrementally.

    Used for: Compliance grid on org hub with state-level citations

    WEBSITE

    Website crawl (Firecrawl)

    Feeds Watchdog Score

    Why: Only source that feeds the 28 Watchdog signals—mission, programs, outcomes language, PDFs, governance pages.

    Used for: Watchdog Score (primary), AI Promise vs Reality cross-check

    LAST1_CERT

    Last1.app Certification Registry

    Context / FWA only

    Why: Third-party certification when JSON API is live; gated until registry endpoint returns structured data.

    Used for: Certification badge + provenance record (no automatic Watchdog score change today)

    Provenance & Source Citations
    Every integrated fact is traceable on the org hub Sources panel

    When data is ingested—whether from IRS BMF, FEC, USASpending, or a website crawl—we record it in org_data_provenance with the source system, retrieval time, and canonical URL. Findings that affect risk or money-flow views are stored in org_findings with confidence (verified = direct API/file match; inferred = heuristic link).

    On each org profile, the Sources section lists citations you can click through to the original government dataset. We do not blend provenance-backed facts into Watchdog dimension scores unless a matching signal rule exists in config/scoring.v1.0.json.
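To make the provenance model concrete, here is a minimal sketch of what a provenance record and a finding might look like as Python data classes. The table names org_data_provenance and org_findings and the verified/inferred labels come from the text above; every field name, the `make_finding` helper, and the placeholder EIN and URL are assumptions for illustration and do not describe the actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Literal


@dataclass(frozen=True)
class ProvenanceRecord:
    """One row in org_data_provenance (field names are hypothetical)."""
    ein: str
    source_system: str      # e.g. "IRS_BMF", "FEC", "WEBSITE"
    canonical_url: str
    retrieved_at: datetime


@dataclass(frozen=True)
class Finding:
    """One row in org_findings (field names are hypothetical)."""
    ein: str
    summary: str
    confidence: Literal["verified", "inferred"]
    provenance: ProvenanceRecord


def make_finding(summary: str, provenance: ProvenanceRecord, direct_match: bool) -> Finding:
    """Label a finding 'verified' for a direct API/file match, else 'inferred'."""
    confidence = "verified" if direct_match else "inferred"
    return Finding(provenance.ein, summary, confidence, provenance)


record = ProvenanceRecord(
    ein="00-0000000",                             # placeholder EIN
    source_system="IRS_ARL",
    canonical_url="https://example.org/dataset",  # placeholder URL
    retrieved_at=datetime.now(timezone.utc),
)
finding = make_finding("Tax-exempt status revoked", record, direct_match=True)
print(finding.confidence, finding.provenance.source_system)
```

Keeping the provenance record attached to each finding is what lets a Sources panel link every displayed fact back to its original dataset.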

    Investigation Context (Not in Watchdog Formula)

    These features help researchers and funders investigate an organization. They use verified APIs but do not change the 0–100 Watchdog Score unless we add an explicit, documented signal rule in a future version.

    • FEC political connections — PAC donations and employer-linked contributions with source URLs
    • USASpending federal awards — contracts and grants by recipient EIN/name
    • LDA lobbying disclosures — registrant filings and issue codes
    • 990 Schedule I grants — line-item grants made and received, dual-source matched where possible
    • State charity registrations — solicitation compliance by state (50-state adapter, rolled out incrementally)
    • Board interlocks — person-entity normalization with confidence labels on shared board seats
    • Network explorer — graph of board, political, and financial edges for Investigator-tier users
    Score Interpretation
    What the scores mean
    70-100

    Strong

    Excellent outcome measurement and transparency

    50-69

    Developing

    Good foundation with room for improvement

    30-49

    Emerging

    Basic transparency with significant gaps

    0-29

    Limited

    Minimal public information available
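The four bands map directly onto a lookup function. This sketch uses the published cut-offs; the page lists whole-number ranges, so placing a fractional score such as 69.5 in the lower band is an assumption, and the function name is hypothetical.

```python
def score_band(score: float) -> str:
    """Map a 0-100 Watchdog Score to its published interpretation band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "Strong"
    if score >= 50:
        return "Developing"
    if score >= 30:
        return "Emerging"
    return "Limited"


for value in (85, 62, 30, 12):
    print(value, score_band(value))
```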

    Note on low scores: A low score often indicates limited public information rather than poor organizational practices. Many effective organizations simply haven't published detailed outcome data publicly. Claiming a profile on Last1.app can help improve scores.

    Anti-Gaming Controls
    How we prevent score manipulation

    Our scoring system includes several controls to prevent organizations from gaming their scores:

    Cap Rules

    Dimension scores are capped when critical signals are missing. For example, Outcomes Maturity is capped at 40 if no specific outcomes are stated, regardless of other signals.
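A cap rule of this kind can be sketched as a small config-driven function. Only the Outcomes Maturity example (cap of 40 when no outcomes are stated) comes from the text above; the `CAP_RULES` structure, the signal key `outcomes_stated`, and the function name are hypothetical and do not reflect the actual scoring config.

```python
# dimension -> list of (required signal, cap applied when that signal is missing)
# Only the outcomes_maturity example is taken from the published methodology.
CAP_RULES = {
    "outcomes_maturity": [("outcomes_stated", 40.0)],
}


def apply_caps(dimension: str, raw_score: float, signals_present: dict[str, bool]) -> float:
    """Limit a dimension score when a critical signal is missing."""
    capped = raw_score
    for required_signal, cap in CAP_RULES.get(dimension, []):
        if not signals_present.get(required_signal, False):
            capped = min(capped, cap)
    return capped


# Other signals earned 72 points, but no specific outcomes were stated.
print(apply_caps("outcomes_maturity", 72.0, {"outcomes_stated": False}))
# With outcomes stated, the raw score passes through unchanged.
print(apply_caps("outcomes_maturity", 72.0, {"outcomes_stated": True}))
```

Using `min` means a cap never raises a score: a raw score already below the cap is left as it is.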

    Inflation Detection

    Large claims (e.g., "100% success rate") without supporting evidence artifacts are flagged (E_FLAG_03).

    Consistency Check

    Metrics that appear with different values across pages are noted as potential inconsistencies (E8).
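One way to picture the consistency check: group every observed value of a metric by the page it appeared on, then report metrics that carry more than one distinct value. The sketch below assumes metric extraction from the crawled text has already happened and produced (metric, page, value) tuples; that input shape, the metric names, and the function name are all hypothetical.

```python
from collections import defaultdict


def find_inconsistent_metrics(observations):
    """Return metrics reported with more than one distinct value.

    observations: iterable of (metric_name, page_path, value) tuples.
    Result: {metric_name: {value: [pages where that value appeared]}}.
    """
    pages_by_value = defaultdict(lambda: defaultdict(list))
    for metric, page, value in observations:
        pages_by_value[metric][value].append(page)
    return {
        metric: dict(values)
        for metric, values in pages_by_value.items()
        if len(values) > 1
    }


crawl_observations = [
    ("youth_served", "/impact", 1200),
    ("youth_served", "/about", 1500),   # conflicts with the /impact figure
    ("meals_delivered", "/impact", 300),
    ("meals_delivered", "/", 300),      # same value on both pages
]
print(find_inconsistent_metrics(crawl_observations))
```

A real check would likely need to tolerate legitimate differences, such as figures for different reporting years, which this sketch does not attempt.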

    Roadmap: IRS 990 XML fields and state charity registration status may become explicit Transparency signals in a future config version—only after rules are added to the versioned JSON and documented here.

    Version History
    Methodology changes are versioned and documented

    • Documented four evaluation layers: Watchdog Score, FWA flags, verified source context, and AI investigation
    • Full verified-source registry: IRS BMF (~1.5M orgs), ARL, 990 XML, FEC, USASpending, LDA, state charity, provenance
    • Provenance model: every ingested fact links to source system, URL, and retrieval timestamp on org hub
    • Clarified what feeds the 28-signal formula vs. accountability flags vs. investigation context
    • Last1 certification: badge + citation when live; does not alter Watchdog dimension math today
    • Monthly IRS sync cron (BMF + ARL) on Railway; BMF asset/income columns widened to BIGINT

    Standards Reference

    Our methodology is informed by and aligned with established nonprofit accountability standards:

    • Last1.org Standards
    • GuideStar Transparency Criteria
    • NTEN Outcome Measurement Frameworks
    • ISSB Impact Measurement Standards

    How RealOutcomes and Last1.app Work Together

    1

    RealOutcomes Rates

    Public watchdog scoring based on IRS data, website transparency, outcome evidence, and accountability flags.

    2

    Last1 Helps Improve

    Organizations claim their profile on Last1.app to get improvement roadmaps, respond to accountability flags, and demonstrate progress.

    3

    Last1 Certification

    When the registry API is live, certified orgs show a verification badge and provenance citation. This does not automatically change Watchdog dimension math today—improvement comes from published evidence on the website.

    RealOutcomes and Last1.app are both operated by Last 1 Enterprises, LLC. RealOutcomes serves as the public-facing accountability layer; Last1.app is the improvement and certification platform.

    Want to improve your organization's score?

    Claim your profile on Last1.app to get personalized recommendations, respond to accountability flags, and access outcome measurement tools.


    Believe there's an error in an organization's score? Submit a correction request