Led the launch of accounts recommender / similar companies (2017) using domain-specific word2vec (CNN) and model features extracted from corporate descriptions and news articles, including: company specialities/offerings (harvested via descriptions), company popularity, location, revenue, employee count, news agents count, news count, co-occurrence of companies in 'email watch list', co-occurrence of companies in news…
Identity (contact) matching and resolution service
Launched contact matching service at 94.5% accuracy (+17%) and 89.8% recall (+23%) to enable the CRM contact "Refresh" product launch in Q3 2018. Record linking model features include: executive name, employment titles, skill-based similarity, job level and job function based similarity, title co-occurrence based similarity, email match, canonical LI URL, executive biography, education, age, location, and other social handles.
Company Data Web Harvester
Increased firmographic richness (completion) using web crawls, machine-assisted modules, and human-in-the-loop pipelines for the top 1M companies to 90% coverage across all key attributes (2018): addresses, industry classifications, revenue, employee counts, and URLs. Led to industry-leading company data quality at 90% accuracy. Examples of the breadth and depth of primary sources include: US Department of Labor…
Company entity matching (and deduplication)
Scoped, provided outside-in knowledge through partnerships / OEM vendors, and developed monetization strategy for entity matching / record linkage as-a-service. Migrated from rule-based record linkage to ML-based models (random forest & logistic regression) using Hadoop and Spark applications, with a 20% increase in F1 and a 40% increase in recall while maintaining 90% precision. …