NYCID is a private property intelligence platform that generates comprehensive dossiers for New York City residential properties — fusing municipal open data, building records, environmental assessments, and market signals into a single verified record per property. Built for real estate operators, investors, and underwriters who need depth beyond what any listing platform provides.
New York City publishes extraordinary amounts of property data across dozens of disconnected agencies — DOB permits, DOF assessments, HPD violations, DEP environmental records, ACRIS ownership transfers, DCP zoning — but no system unifies them into a coherent property-level view. NYCID does.
New York City is the most data-rich property market in the world. The Departments of Buildings, Finance, Housing Preservation and Development, Environmental Protection, City Planning, and the ACRIS system collectively publish more data about more properties than any other municipality on earth. The problem is that each agency designed its data model independently: each uses its own property identifiers, formats addresses differently, and publishes on its own cadence with its own schema.
Existing platforms — StreetEasy, PropertyShark, CoStar — provide listing and transaction data, but not deep operational intelligence: permit history, violation patterns, environmental flags, assessment trends, ownership chain, zoning constraints. Assembling that intelligence for a single property manually takes hours across a dozen different government portals. For a portfolio of properties, it's impractical at any useful scale.
The first architectural decision, and the one that made everything else possible, was to use the Borough-Block-Lot (BBL) identifier as the universal join key across all data sources. A BBL is a 10-digit key: a one-digit borough code, a five-digit tax block, and a four-digit tax lot. It is the one identifier that NYC property law requires to be consistent across agencies, because it's how the tax system tracks ownership. Every other identifier (address, building ID, parcel number) varies by agency. BBL doesn't.
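As a rough illustration of what a BBL join key involves, the sketch below normalizes BBLs that arrive in different shapes (dashed strings, integers) into one canonical 10-digit string and groups rows from several sources under it. The names `parse_bbl`, `join_by_bbl`, the field names, and the sample rows are hypothetical and are not taken from NYCID's codebase; the validation ranges are assumptions.

```python
from typing import NamedTuple

# Borough codes used in the first digit of a BBL.
BOROUGHS = {1: "Manhattan", 2: "Bronx", 3: "Brooklyn", 4: "Queens", 5: "Staten Island"}


class BBL(NamedTuple):
    borough: int
    block: int
    lot: int

    def __str__(self) -> str:
        # Canonical form: 1 borough digit + 5 block digits + 4 lot digits.
        return f"{self.borough}{self.block:05d}{self.lot:04d}"


def parse_bbl(raw) -> BBL:
    """Parse a BBL given as '1008350041', '1-00835-0041', or an int."""
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {raw!r}")
    borough, block, lot = int(digits[0]), int(digits[1:6]), int(digits[6:])
    # Assumed sanity checks: known borough, non-zero block and lot.
    if borough not in BOROUGHS or block == 0 or lot == 0:
        raise ValueError(f"invalid BBL components in {raw!r}")
    return BBL(borough, block, lot)


def join_by_bbl(**sources):
    """Group rows from several named sources under their canonical BBL."""
    merged = {}
    for name, rows in sources.items():
        for row in rows:
            key = str(parse_bbl(row["bbl"]))
            merged.setdefault(key, {})[name] = row
    return merged


# Illustrative rows only; the values are made up for this example.
joined = join_by_bbl(
    dob=[{"bbl": "1-00835-0041", "permit": "A1"}],
    dof=[{"bbl": 1008350041, "assessed": 100}],
)
```

Here both rows land under the same key even though one source stores the BBL with dashes and the other as an integer, which is the property a universal join key needs.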
The problem is that not every dataset carries a BBL, and some record it incorrectly. For those cases, a fallback chain of fuzzy address matching resolves ambiguous references to probable BBL candidates, which are then verified through geocoding. This two-path identity resolution strategy handles the full range of data quality across the 12+ sources while maintaining a 97% overall match rate.
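The two-path idea can be sketched roughly as follows: trust a well-formed BBL that exists in the registry, otherwise fuzzy-match the address to candidate BBLs and accept one only if a geocoder agrees. This is a minimal sketch under stated assumptions, not NYCID's implementation: the `registry` shape, the `geocode` callable, the 0.85 threshold, the abbreviation table, and all sample values are hypothetical, and `difflib.SequenceMatcher` stands in for whatever fuzzy matcher is actually used.

```python
import re
from difflib import SequenceMatcher

# Assumed, deliberately tiny abbreviation table for address normalization.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "boulevard": "blvd",
                 "west": "w", "east": "e", "north": "n", "south": "s"}


def normalize_address(addr: str) -> str:
    """Lowercase, drop punctuation, abbreviate, and strip ordinal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", addr.lower()).split()
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]
    tokens = [re.sub(r"^(\d+)(st|nd|rd|th)$", r"\1", t) for t in tokens]
    return " ".join(tokens)


def resolve_bbl(record, registry, geocode, threshold=0.85):
    """Return (bbl, path), where path is 'bbl', 'address', or 'unresolved'.

    registry: dict mapping canonical 10-digit BBL -> known address (assumed shape).
    geocode:  hypothetical callable, address -> BBL string or None.
    """
    # Path 1: a well-formed BBL that the registry already knows.
    bbl = re.sub(r"\D", "", str(record.get("bbl") or ""))
    if len(bbl) == 10 and bbl in registry:
        return bbl, "bbl"

    # Path 2: fuzzy address match, then verification through geocoding.
    address = record.get("address")
    if not address:
        return None, "unresolved"
    target = normalize_address(address)
    candidates = [
        cand for cand, known in registry.items()
        if SequenceMatcher(None, target, normalize_address(known)).ratio() >= threshold
    ]
    if candidates and geocode(address) in candidates:
        return geocode(address), "address"
    return None, "unresolved"


# Illustrative data only; BBLs and addresses are sample values.
registry = {"1008350041": "350 5th Avenue", "3012340007": "12 Example Street"}


def fake_geocode(address):
    # Stand-in for a real geocoding service.
    return "1008350041" if "350" in address else None


direct = resolve_bbl({"bbl": "1-00835-0041"}, registry, fake_geocode)
fuzzy = resolve_bbl({"bbl": None, "address": "350 5TH AVE."}, registry, fake_geocode)
missing = resolve_bbl({"bbl": "9999", "address": "1 Nowhere Blvd"}, registry, fake_geocode)
```

Note that this sketch cannot detect a BBL that is well-formed but points at the wrong property; catching that case would require cross-checking the BBL against the record's address even on the first path.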
The architecture is designed for both scale and freshness: the registry of 8.1M properties needs to stay current as agencies publish updates, and individual dossier requests need to resolve in under 3 seconds. These two requirements drove the key architectural decisions: BigQuery for the registry (batch-optimized, analytical), Firestore for user state and saved dossiers (real-time, query-optimized), and Cloud Run for the dossier generation engine (stateless, scalable).