
Machine learning platform for J.G. Wentworth, a debt resolution company, predicting customer behavior and optimizing marketing spend across their multimillion-dollar acquisition pipeline. Built as an outsourced ML engineer through CML Insights, processing 30K+ monthly leads with real-time scoring.
End-to-end ML system with integrated data sources:
Developed Measure of Match scoring from scratch, evaluating each feature's predictive power by measuring the distribution overlap between positive and negative classes. For numerical features, it builds histograms over shared bins, converts counts to per-class probabilities, and computes the overlap metric Σ min(P_pos[bin_i], P_neg[bin_i]). An overlap near 0 means the classes separate cleanly on that feature; an overlap near 1 means the feature carries little signal. For categoricals, it uses frequency distributions with top-N grouping for high cardinality. Missing values are handled explicitly as their own category, since missingness itself is predictive in financial data.
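The overlap metric described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production scorer: the function names, the default of 10 bins, the top-N default, and the `__missing__` / `__other__` labels are all hypothetical, and both classes are assumed to be non-empty.

```python
from collections import Counter

import numpy as np


def numeric_overlap(values, labels, n_bins=10):
    """Sum of min(P_pos, P_neg) over shared bins, plus one extra bin for NaN."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels).astype(bool)
    missing = np.isnan(values)
    # Shared bin edges so both classes are compared on the same grid.
    edges = np.histogram_bin_edges(values[~missing], bins=n_bins)

    def bin_probs(mask):
        counts, _ = np.histogram(values[mask & ~missing], bins=edges)
        # Missing values get their own bin instead of being dropped.
        counts = np.append(counts, np.sum(mask & missing))
        return counts / counts.sum()

    return float(np.minimum(bin_probs(labels), bin_probs(~labels)).sum())


def categorical_overlap(values, labels, top_n=10):
    """Same overlap on category frequencies; rare levels collapse to one group."""
    cleaned = ["__missing__" if v is None else str(v) for v in values]
    top = {c for c, _ in Counter(cleaned).most_common(top_n)}
    grouped = [c if c in top else "__other__" for c in cleaned]
    pos = Counter(g for g, y in zip(grouped, labels) if y)
    neg = Counter(g for g, y in zip(grouped, labels) if not y)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    return sum(min(pos[c] / n_pos, neg[c] / n_neg) for c in set(pos) | set(neg))
```

Features can then be ranked by ascending overlap. How the overlap is turned into the final Measure of Match score is not specified here, so the sketch stops at the raw overlap value.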
Built three gradient boosting classifiers: enrollment prediction (14-day window, tertile scoring for sales prioritization), contact propensity (70+ features with top predictors: total debt 20%, marketing channel 12%, FICO 4%), and cancellation prediction (90-day churn risk with intervention triggers). Applied stratified sampling to handle the class imbalance from an 8% enrollment rate, class weights in training, and probability calibration for realistic score distributions.
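The imbalance handling, calibration, and tertile scoring described above fit together roughly as follows. This is a sketch on synthetic data, with several assumptions: scikit-learn's `GradientBoostingClassifier` stands in for whichever boosting library was used, "balanced" sample weights stand in for the actual class weights, and isotonic regression on a held-out split is one possible calibration method, not necessarily the one used in the project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_sample_weight

# Synthetic stand-in for lead data with roughly 8% positives.
X, y = make_classification(
    n_samples=6000, n_features=10, n_informative=5, weights=[0.92], random_state=0
)

# Stratified splits keep the positive rate the same in every partition.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)
X_cal, X_test, y_cal, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

# Class weights: up-weight the rare positive class during training.
train_weights = compute_sample_weight("balanced", y_train)
model = GradientBoostingClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train, sample_weight=train_weights)

# Weighted training inflates raw scores, so map them back to realistic
# probabilities using an unweighted held-out calibration split.
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(model.predict_proba(X_cal)[:, 1], y_cal)

raw_test = model.predict_proba(X_test)[:, 1]
calibrated = calibrator.predict(raw_test)

# Tertile scoring: 0 = bottom third, 2 = top third, ranked on the raw score.
cuts = np.quantile(raw_test, [1 / 3, 2 / 3])
tertile = np.digitize(raw_test, cuts)
```

Calibrating on unweighted data is the design choice that matters here: it keeps the ranking benefit of class weighting while restoring probabilities that average out near the true base rate.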
Designed polynomial saturation modeling (degree-2 polynomials via scikit-learn) capturing diminishing returns in CPL and realizable effect as lead volume scales. Built a constrained optimizer with scipy that maximizes Σ(revenue - cost) subject to the total budget, per-channel saturation limits, and contractual minimums. Validated through A/B tests and holdout analysis, delivering $500K+ in reallocation recommendations (±20% adjustments across channels).
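A minimal sketch of the saturation-plus-optimization idea follows. Everything numeric in it is hypothetical: the channel names, the curve parameters, the budget, and the bounds are made up for illustration, and modeling revenue as a quadratic in spend (in $K) is an assumed simplification of the CPL and realizable-effect curves. SLSQP is one scipy solver that supports these constraints; the solver actually used is not stated.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Hypothetical channels: revenue = a * spend - b * spend**2 (spend in $K).
TRUE_CURVES = {"search": (3.0, 0.010), "social": (2.5, 0.012), "affiliate": (2.0, 0.004)}


def fit_saturation_curve(a, b):
    """Fit a degree-2 polynomial to noisy spend/revenue observations."""
    spend = np.linspace(5, 100, 30).reshape(-1, 1)
    revenue = a * spend[:, 0] - b * spend[:, 0] ** 2 + rng.normal(0, 2.0, 30)
    features = PolynomialFeatures(degree=2, include_bias=False).fit_transform(spend)
    reg = LinearRegression().fit(features, revenue)
    # Columns are [spend, spend**2], so coef_[1] < 0 signals diminishing returns.
    return np.array([reg.intercept_, reg.coef_[0], reg.coef_[1]])


coefs = np.array([fit_saturation_curve(a, b) for a, b in TRUE_CURVES.values()])


def negative_profit(spend):
    """Negated sum of (revenue - cost), since scipy minimizes."""
    revenue = coefs[:, 0] + coefs[:, 1] * spend + coefs[:, 2] * spend ** 2
    return -np.sum(revenue - spend)


budget = 150.0
# Lower bound = contractual minimum, upper bound = saturation limit.
bounds = [(10.0, 100.0)] * len(TRUE_CURVES)
x0 = np.full(len(TRUE_CURVES), budget / len(TRUE_CURVES))

result = minimize(
    negative_profit,
    x0,
    method="SLSQP",
    bounds=bounds,
    constraints=[{"type": "ineq", "fun": lambda s: budget - np.sum(s)}],
)
allocation = dict(zip(TRUE_CURVES, result.x.round(1)))
```

Comparing `allocation` against the equal split in `x0` gives the kind of per-channel shift that a reallocation recommendation would report.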
Integrated TransUnion credit reports (trade lines, balances, inquiries), Census economic indicators (DP03 by ZIP), and Salesforce interactions into a 70-80 feature pipeline. Handled different refresh cadences (daily leads, monthly Census data), schema evolution, PII anonymization, and high cardinality through target encoding and geographic hierarchies (ZIP→county→state).
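One way to combine target encoding with the ZIP→county→state hierarchy is to shrink each level's target rate toward its parent level and fall back up the hierarchy for unseen keys. The sketch below assumes that approach; the function names, the column names, and the smoothing strength `m` are hypothetical, and it assumes each ZIP belongs to exactly one county and each county to one state.

```python
import pandas as pd


def fit_hierarchical_encoding(df, levels, target, m=20.0):
    """Smoothed target rates per level; `levels` runs coarse to fine."""
    global_mean = df[target].mean()
    estimate = pd.Series(global_mean, index=df.index)
    tables = {}
    for level in levels:
        grouped = df.groupby(level)[target]
        # Shrink each group's rate toward its parent estimate; small groups
        # (few leads in a ZIP) stay close to the county or state rate.
        estimate = (grouped.transform("sum") + m * estimate) / (
            grouped.transform("count") + m
        )
        tables[level] = estimate.groupby(df[level]).first()
    return global_mean, tables


def encode(row, levels, global_mean, tables):
    """Use the finest level seen in training, else back off to coarser ones."""
    for level in reversed(levels):
        if row[level] in tables[level].index:
            return float(tables[level].loc[row[level]])
    return float(global_mean)


# Toy data for illustration only.
leads = pd.DataFrame(
    {
        "state": ["CA", "CA", "CA", "CA", "TX", "TX"],
        "county": ["LA", "LA", "SF", "SF", "Harris", "Harris"],
        "zip": ["90001", "90001", "94102", "94103", "77001", "77002"],
        "enrolled": [1, 0, 0, 0, 1, 1],
    }
)
LEVELS = ["state", "county", "zip"]
global_mean, tables = fit_hierarchical_encoding(leads, LEVELS, "enrolled", m=2.0)

# An unseen ZIP in a known county falls back to the county-level estimate.
unseen_zip = encode({"state": "CA", "county": "LA", "zip": "99999"}, LEVELS, global_mean, tables)
```

In a real pipeline the tables would be fit out-of-fold, since encoding a row with its own label leaks the target into the feature.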
Delivered a 25% improvement in contact-to-enrollment conversion, a 15% reduction in customer acquisition cost, and a 10% decrease in 90-day churn. The system processes 30K+ leads monthly with automated scoring, and monthly model retraining maintains accuracy as market conditions shift.