
B2C platform that analyzes county property data to find comparable properties with lower valuations and generates appeal reports that help US homeowners save thousands on property taxes. Architected the full stack, from ML algorithms to production deployment, serving 50+ counties.
Multi-layer system with specialized components:
Designed a custom distance metric combining Hamming distance for categorical features (location, property type, construction quality) and weighted Euclidean distance for numerical features (square footage, lot size, age). Both components are normalized to the [0,1] range, with α tuned per county based on local assessment methodology. Formula: distance = α * hamming + (1-α) * euclidean, where a lower value indicates a closer comparable.
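A minimal Python sketch of this kind of mixed metric, treating the combined score as a distance (lower means more comparable). The function names, the example weights, and the α value are illustrative assumptions, not the production code. It also assumes numerical features are already min-max scaled to [0,1], and it divides by the weight sum so the Euclidean term stays in [0,1].

```python
from math import sqrt


def hamming_distance(cat_a, cat_b):
    """Fraction of categorical fields that differ, always in [0, 1]."""
    if len(cat_a) != len(cat_b):
        raise ValueError("categorical vectors must have the same length")
    if not cat_a:
        return 0.0
    return sum(x != y for x, y in zip(cat_a, cat_b)) / len(cat_a)


def weighted_euclidean(num_a, num_b, weights):
    """Weighted Euclidean distance over features pre-scaled to [0, 1].

    Dividing by the sum of the weights keeps the result in [0, 1].
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    squared = sum(w * (x - y) ** 2 for w, x, y in zip(weights, num_a, num_b))
    return sqrt(squared / total)


def mixed_distance(cat_a, cat_b, num_a, num_b, weights, alpha):
    """alpha * hamming + (1 - alpha) * euclidean, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (alpha * hamming_distance(cat_a, cat_b)
            + (1 - alpha) * weighted_euclidean(num_a, num_b, weights))


# Hypothetical subject property and candidate comparable.
# Categorical: (zip code, property type, construction quality)
# Numerical (scaled): (square footage, lot size, age)
subject_cat, subject_num = ("77002", "single_family", "B"), (0.40, 0.25, 0.30)
comp_cat, comp_num = ("77002", "single_family", "C"), (0.45, 0.25, 0.35)

d = mixed_distance(subject_cat, comp_cat, subject_num, comp_num,
                   weights=(0.5, 0.3, 0.2), alpha=0.4)
print(f"mixed distance: {d:.4f}")
```

Because both terms are bounded by 1 and α lies in [0,1], the combined distance is also bounded by 1, which makes a county-specific α directly comparable across counties.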
Reduced search time from 30+ seconds to under 2 seconds through spatial indexing (KD-trees, ball trees), pre-computed similarity matrices for common searches, and caching of county-specific model weights. Implemented nearest-neighbor search with scikit-learn, tuned for the characteristics of property data.
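The spatial-indexing idea can be sketched with scikit-learn's `KDTree`. This example covers only the numerical part of the metric and runs on synthetic data. It relies on the fact that multiplying each column by the square root of its weight makes plain Euclidean distance in the scaled space equal to the weighted Euclidean distance. The helper names `build_index` and `nearest_comparables` are hypothetical. Handling the categorical term, for example by re-ranking the retrieved candidates, is an assumption not shown here.

```python
import numpy as np
from sklearn.neighbors import KDTree


def build_index(features, weights):
    """Build a KD-tree over numeric features scaled by sqrt(weight).

    Euclidean distance on the scaled columns equals the weighted
    Euclidean distance on the original columns.
    """
    scale = np.sqrt(np.asarray(weights, dtype=float))
    tree = KDTree(np.asarray(features, dtype=float) * scale)
    return tree, scale


def nearest_comparables(tree, scale, subject, k=5):
    """Return (indices, distances) of the k nearest properties."""
    query = np.asarray(subject, dtype=float).reshape(1, -1) * scale
    dist, idx = tree.query(query, k=k)
    return idx[0], dist[0]


# Synthetic stand-in for one county's min-max scaled numeric features:
# columns are (square footage, lot size, age).
rng = np.random.default_rng(0)
features = rng.random((1000, 3))
weights = [0.5, 0.3, 0.2]  # illustrative county-specific weights

tree, scale = build_index(features, weights)
idx, dist = nearest_comparables(tree, scale, features[0], k=5)
print(idx, dist)
```

Building the tree once per county and reusing it across requests is what turns each lookup from a full scan into a tree query. The tree and its weights are natural candidates for the kind of per-county caching described above.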
Built an ETL system handling diverse county formats (CSV, Excel, PDF scraping), with automated validation catching more than 90% of errors, heuristic-based imputation for missing values, and pluggable parsers for rapid county onboarding. Enriched property data with geographic features (school districts, crime rates, market trends).
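One way a pluggable-parser pipeline of this shape could look is sketched below, using only the Python standard library. The registry, the county name, the column names (`PARCEL`, `SQFT`, `VALUE`), the validation rules, and the median-based imputation are all illustrative assumptions rather than the actual system.

```python
import csv
import io
from statistics import median

PARSERS = {}  # county name -> parser function


def register_parser(county):
    """Decorator that plugs a county-specific parser into the registry."""
    def decorator(func):
        PARSERS[county] = func
        return func
    return decorator


def _to_float(value):
    """Convert a raw cell to float, treating blanks as missing (None)."""
    if value is None or not value.strip():
        return None
    return float(value)


@register_parser("example_county")
def parse_example_county(raw_text):
    """Map one hypothetical county's CSV layout to a common schema."""
    return [
        {
            "parcel_id": (row["PARCEL"] or "").strip(),
            "sqft": _to_float(row["SQFT"]),
            "assessed_value": _to_float(row["VALUE"]),
        }
        for row in csv.DictReader(io.StringIO(raw_text))
    ]


def validate(record):
    """Return a list of validation errors (empty if the record is usable)."""
    errors = []
    if not record["parcel_id"]:
        errors.append("missing parcel_id")
    if record["assessed_value"] is None or record["assessed_value"] <= 0:
        errors.append("assessed_value must be positive")
    if record["sqft"] is not None and record["sqft"] <= 0:
        errors.append("sqft must be positive")
    return errors


def impute_sqft(records):
    """Heuristic imputation: fill missing sqft with the median known value."""
    known = [r["sqft"] for r in records if r["sqft"] is not None]
    if not known:
        return records
    fill = median(known)
    return [dict(r, sqft=fill) if r["sqft"] is None else r for r in records]


def ingest(county, raw_text):
    """Parse, drop invalid records, then impute missing values."""
    records = PARSERS[county](raw_text)
    valid = [r for r in records if not validate(r)]
    return impute_sqft(valid)


sample = "PARCEL,SQFT,VALUE\nA-1,1800,250000\nA-2,,240000\nA-3,2200,-5\n"
for record in ingest("example_county", sample):
    print(record)
```

With this layout, onboarding a new county amounts to adding one decorated parser function, while validation and imputation are shared across counties.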
Microservices design isolating property search, report generation, and payments. Kubernetes horizontal pod autoscaling based on traffic patterns, PostgreSQL read replicas for report-heavy workloads, and CDN for static assets. Deployed with GitOps CI/CD ensuring zero-downtime updates.
The average user saves $2,500 annually, with a 70%+ appeal success rate. The platform automates 95% of report generation while meeting state-specific legal requirements. Supports 50+ counties and processes millions of property records.