Improving home insurance pricing with synthetic geolocation data

Customer Company Size
  • Large Corporate
Region
  • America
Country
  • United States
Product
  • Synthetic Geolocation Data
Tech Stack
  • Synthetic Data Generation
  • Data Modeling
  • Public Databases
Implementation Scale
  • Enterprise-wide Deployment
Impact Metrics
  • Cost Savings
  • Productivity Improvements
  • Digital Expertise
Technology Category
  • Analytics & Modeling - Data-as-a-Service
  • Analytics & Modeling - Predictive Analytics
Applicable Industries
  • Finance & Insurance
Applicable Functions
  • Business Operation
  • Quality Assurance
Use Cases
  • Regulatory Compliance Monitoring
Services
  • Data Science Services
  • System Integration
About The Customer
The customer is a large insurance company operating across the United States, providing home insurance to a diverse range of clients. The company struggles to price policies accurately because climate features and risk profiles vary widely between regions. It is also bound by strict regulations such as CCPA and HIPAA, which prevent it from using personal data like customer addresses in its risk assessment models. This limitation has made it difficult for the company to assess risk accurately and set appropriate prices for its policies.
The Challenge
Home insurance pricing was a risky business for our client. The insurance company covered homes across the United States in areas with vastly different climate features and risk profiles. CCPA and HIPAA barred the data science team from using customers’ personal data, such as their addresses, in their modeling, so they could not assess location-based risk or reflect it in their pricing.
The Solution
The insurance company supplied its modeling teams with synthetic geolocation data. The team could use synthetic home addresses to look up five climate features, such as fire and flood hazards, in public databases. The pricing model trained on synthetic data scored as well as the model trained on real data. Using synthetic home addresses eliminated the risk of re-identification and unlocked new insights. The team established a synthetization framework tailored to modeling, based on privacy-risk classification, and shortened time-to-data from 6 months to 3 days. The process kept 100% of the data's utility, perfectly retaining the statistical dispersion of the original and providing an as-good-as-real alternative to real data for training.
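As a rough illustration of the lookup step described above, the sketch below joins synthetic addresses to a hazard table keyed by ZIP code. Everything in it is hypothetical: the record layout, the ZIP-keyed table, the placeholder addresses, and the hazard scores are assumptions made for the example. The case study does not say which public databases were queried or which five features were used beyond fire and flood hazards.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a public hazard database, keyed by ZIP code.
# The scores are made-up illustrative values, not real hazard data.
HAZARD_BY_ZIP = {
    "00001": {"fire": 0.8, "flood": 0.1},
    "00002": {"fire": 0.2, "flood": 0.7},
}


@dataclass(frozen=True)
class SyntheticAddress:
    """A synthetic (non-real) home address; field names are assumptions."""
    street: str
    zip_code: str


def climate_features(address, hazard_table=HAZARD_BY_ZIP):
    """Return hazard features for an address, with None values if the ZIP is unknown."""
    default = {"fire": None, "flood": None}
    # Copy the entry so callers cannot mutate the lookup table.
    return dict(hazard_table.get(address.zip_code, default))


addresses = [
    SyntheticAddress("12 Example St", "00001"),
    SyntheticAddress("34 Sample Ave", "00002"),
    SyntheticAddress("56 Placeholder Rd", "99999"),  # ZIP not in the table
]
features = [climate_features(a) for a in addresses]
```

Because the addresses are synthetic, the joined feature rows can be handed to a pricing model without exposing any real customer's location.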
Operational Impact
  • Using synthetic home addresses eliminated the risk of re-identification and unlocked new insights.
  • The team established a synthetization framework tailored to modeling based on privacy-risk classification.
  • The time-to-data was significantly shortened from 6 months to 3 days.
  • The process kept 100% utility of the data, perfectly retaining the statistical dispersion of the original.
  • The synthetic data provided an as-good-as-real alternative to real data for training.
Quantitative Benefit
  • 15M synthetic home addresses generated.
  • 60x shorter time-to-data.
  • 100% utility of the data retained.
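The 60x figure is consistent with the reported reduction from 6 months to 3 days if a month is taken as roughly 30 days. That conversion is an assumption; the case study does not state how the ratio was computed.

```python
DAYS_PER_MONTH = 30  # assumed approximation, not stated in the case study

before_days = 6 * DAYS_PER_MONTH  # 6 months of time-to-data
after_days = 3                    # 3 days with the synthetic data framework

speedup = before_days / after_days
print(f"{speedup:.0f}x shorter time-to-data")  # prints "60x shorter time-to-data"
```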
