
Case Study
Distressed Asset Intelligence Platform
Automated scoring and prioritization across 335,000+ NYC commercial properties
The Problem
Manual Research Cannot Scale to Market Size
New York City contains over one million commercial and residential properties spread across five boroughs. Within that universe, a meaningful but unknown number of assets are under financial or physical distress at any given moment: behind on taxes, accumulating code violations, facing litigation, or carrying unsustainable debt loads.
Identifying these properties has traditionally been an analyst-driven process. A researcher manually pulls records from city databases, cross-references court filings, checks violation histories, and builds a picture of each property one at a time. A skilled analyst might evaluate 20 to 30 properties per day with any depth. At that rate, covering the full market would take decades.
The result is that most distressed asset sourcing operates on fragments of the available data. Firms rely on personal networks, one-off tip sheets, or narrowly scoped searches. Opportunities surface late, inconsistently, or not at all. The data exists to do better: NYC publishes millions of records across housing violations, building permits, property transactions, and court filings. But no one had assembled it into a single, scorable view of the entire market.
Why This Matters Now
According to McKinsey's 2025 research on AI adoption, 80% of companies cite data limitations as the primary roadblock to scaling agentic AI systems. The challenge is not building models; it is assembling the structured data pipelines those models need to operate on. In commercial real estate, this data problem is particularly acute: the information is public but scattered across dozens of agencies with incompatible formats and identifiers.
Source: McKinsey Global Survey on AI, 2025
The Approach
Build the Data Infrastructure First
Rather than starting with a model and hoping the data would follow, FGP took the opposite approach: build a complete data ingestion layer that normalizes every relevant NYC public dataset around a single universal property key, then design scoring logic on top of that foundation.
The system ingests data from the NYC Department of Finance, Housing Preservation and Development, Department of Buildings, Automated City Register Information System (ACRIS), the County Clerk, and HMDA census tract lending data. Every record is normalized to a 10-digit Borough Block Lot (BBL) identifier, the universal key that links a tax lien sale record to the same property's building permits, court filings, and mortgage history.
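The platform's internal schema is not public, but the BBL convention itself is standard: one borough digit followed by a zero-padded 5-digit block and 4-digit lot. A minimal sketch of that key construction (field names here are illustrative, not the platform's):

```python
def to_bbl(borough: int, block: int, lot: int) -> str:
    """Format a borough/block/lot triple as a 10-digit BBL key.

    Layout: 1 borough digit + 5-digit zero-padded block + 4-digit
    zero-padded lot, e.g. (1, 829, 7) -> "1008290007".
    """
    if not 1 <= borough <= 5:
        raise ValueError(f"invalid borough code: {borough}")
    return f"{borough}{block:05d}{lot:04d}"

# Records from different agencies carry the key in different shapes;
# routing them all through one normalizer lets a tax lien record, a
# DOB permit, and an ACRIS filing join on the same identifier.
print(to_bbl(1, 829, 7))  # -> 1008290007
```

Zero-padding matters: a block stored as the integer 829 and a block stored as the string "00829" must produce the identical key, or the join silently drops records.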
Each property is then scored across 13 distinct distress signals. Signals are weighted by severity, adjusted for building size through per-unit normalization, decayed by recency, and amplified through combo multipliers when multiple signals converge. The output is a single, rank-orderable distress score that reflects both the depth and concentration of stress on each asset.
The design prioritizes transparency. Every score includes a full signal breakdown showing exactly which factors contributed and at what weight. An analyst reviewing a flagged property can immediately see whether the score is driven by active litigation, delinquent taxes, code violations, overleveraged debt, or some combination. No black box.
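The actual 13 signals, weights, and decay parameters are internal to the platform, but the scoring mechanics described above (severity weights, per-unit normalization, recency decay, combo amplification, and an auditable breakdown) can be sketched with hypothetical values:

```python
from datetime import date

# Hypothetical weights and decay half-lives for three example signals;
# the platform's real signal set and parameters are not public.
WEIGHTS = {"tax_delinquency": 10.0, "litigation": 8.0, "code_violations": 5.0}
HALF_LIFE_DAYS = {"tax_delinquency": 730, "litigation": 365, "code_violations": 365}

def score_property(events, units, today=None):
    """Return (total_score, per-signal breakdown) for one property.

    events: list of (signal, count, most_recent_date) tuples
    units:  unit count, used for per-unit normalization so large
            buildings are not penalized just for having more records
    """
    today = today or date.today()
    breakdown = {}
    for signal, count, last_seen in events:
        age_days = (today - last_seen).days
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS[signal])  # recency decay
        per_unit = count / max(units, 1)                    # size normalization
        breakdown[signal] = WEIGHTS[signal] * per_unit * decay
    total = sum(breakdown.values())
    if len(breakdown) >= 2:  # combo multiplier: converging signals amplify
        total *= 1.25
    return total, breakdown
```

Returning the breakdown alongside the total is what keeps the score out of black-box territory: an analyst can see exactly which signal contributed how many points.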
The Research Supports This Sequence
McKinsey's analysis of AI-driven operations finds that workflow redesign, not model sophistication, is the strongest predictor of measurable AI impact. Organizations that restructured their data and process architecture before deploying AI saw significantly higher returns than those that layered AI onto existing workflows. The Distressed Asset Intelligence Platform followed this principle: the data pipeline and scoring architecture were designed as a complete system, not an add-on to manual research.
Source: McKinsey, “Why agents are the next frontier of generative AI,” 2025
System Architecture
How the Platform Works
1. Data Ingestion: 23 datasets from NYC open data APIs
2. Normalization: BBL key unification across all sources
3. Signal Scoring: 13 weighted signals with decay and normalization
4. Tiered Output: ranked properties with full breakdown
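The four stages above can be sketched as a single flow. This is a structural illustration only; the interfaces (a list of source callables, a scoring function, records already carrying a BBL) are assumptions, not the platform's actual API:

```python
def run_pipeline(sources, score_fn):
    """Sketch of the four-stage flow: ingest each source, unify
    records under their BBL key, score, and emit ranked output."""
    properties = {}
    # Stages 1-2: ingestion and normalization. Every record from every
    # source is merged under its BBL key.
    for fetch in sources:            # each source yields raw records
        for record in fetch():
            bbl = record["bbl"]      # assumed already normalized
            properties.setdefault(bbl, []).append(record)
    # Stage 3: scoring each property's accumulated records.
    scored = {bbl: score_fn(recs) for bbl, recs in properties.items()}
    # Stage 4: tiered output, rank-ordered with highest distress first.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

The key design property is that scoring never sees a source-specific format: by the time stage 3 runs, every record, regardless of agency, hangs off one BBL.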
Operational Throughput
335,454 properties
Every scored property is evaluated across all 13 distress signals simultaneously. Each signal applies tiered scoring, per-unit normalization, and temporal decay, yielding over 4.3 million individual signal calculations per run.
Time Compression
Weeks → Hours
What previously required weeks of manual research to evaluate a few hundred properties now executes across the entire NYC market in a single automated run. Full pipeline ingestion and scoring completes in hours, not weeks.
Automation Rate
95%+ Automated
From API ingestion through scoring and tiered output, the pipeline runs end to end without manual intervention. Human review is reserved for the final stage: evaluating top tier opportunities that the system surfaces.
System Scale
23 Sources, 13 Models
Data drawn from 23 distinct datasets across 6 NYC agencies. Scoring uses 13 independent signal models with 4 combo multipliers, 3 temporal decay schedules, and per-unit normalization across 3 signal types. The full pipeline processes ~15 GB of raw data into ~2 GB of structured, queryable output.
The 13 Distress Signals
Each property is evaluated against every signal. Points are weighted by severity, with the most actionable indicators carrying the highest weight.
Results
Tiered Output for Prioritized Action
The platform produces a fully scored and tiered view of every property in the NYC market. Rather than delivering a flat list, the system segments properties into actionable tiers based on distress severity, allowing different teams and strategies to focus on the segment that matches their risk appetite and deal structure.
The primary output segment, properties scoring 30 or above, contains 6,535 assets showing meaningful convergence of multiple distress signals. Within that group, 1,750 properties score above 40, representing severe multi-signal distress with combo-multiplier amplification. The system also tracks signal convergence across the full market: 139,000 properties show two or more simultaneous signals, and 21,020 exhibit five or more converging indicators of deep distress.
| Score Band | Classification | Properties | Interpretation |
|---|---|---|---|
| 40+ | Severe | 1,750 | Maximum signal density with combo amplification |
| 30 – 39.9 | High | ~4,785 | Strong financial and physical distress convergence |
| 20 – 29.9 | Elevated | ~14,500 | Multiple converging distress signals |
| 10 – 19.9 | Moderate | ~65,000 | Two or more meaningful signals present |
| 1 – 9.9 | Low | ~249,000 | Early or isolated signal activity |
Context: What This Type of Build Typically Costs
Industry benchmarks for mid-market AI implementations (systems involving data pipeline construction, custom scoring models, and production deployment) typically range from $60,000 to $150,000. FGP built this system internally using open source tools at near zero infrastructure cost, demonstrating the kind of efficiency that becomes possible when the build is led by practitioners who understand both the domain and the technology.
Source: Sparkout Tech Solutions, “AI Implementation Cost Analysis,” 2025
How This Applies to Your Business
The Same Patterns, Applied to Your Operations
The Distressed Asset Intelligence Platform is FGP's internal production system, built by the same team that works with clients. Every component of this system maps directly to the services FGP offers through its AI and automation practice. The patterns used here (structured data ingestion, scoring model design, automated pipeline orchestration, and tiered output for human review) are the same patterns we apply to client engagements across industries.
Identifying the Opportunity
Before building anything, FGP mapped the full landscape of NYC's public data sources, evaluated data quality and coverage, and identified which signals would carry predictive value for distressed asset identification. This is the same discovery process we run with clients: understanding what data exists, where the gaps are, and which workflows have the highest automation potential.
Constructing the System
The pipeline itself (23 data source integrations, 13 scoring models, normalization logic, temporal decay, and combo multipliers) represents a full workflow build. FGP designed, developed, and deployed this system end to end. For clients, this phase involves the same pattern: translating an assessed opportunity into a working, automated system with measurable output.
Keeping It Running
The platform runs on a recurring schedule, ingesting updated data from city APIs, recalculating scores, and producing fresh tiered output. This ongoing operation (monitoring data quality, adjusting signal weights as market conditions shift, and ensuring system reliability) mirrors the managed operations FGP provides to clients who need their systems maintained and optimized over time.
Get Started
Ready to Build Your Intelligence Layer?
Whether you are looking to automate a manual research process, build a scoring system for your market, or scale an existing data pipeline, FGP can help you get there.
Start a Conversation