Building Karmic’s Data Infrastructure

Company Size
11-200
Region
  • America
Country
  • United States
Product
  • Amazon Redshift
  • Stitch Data
  • Apache Airflow
  • Postgres
Tech Stack
  • ETL
  • Data Warehousing
  • Cloud Platform
  • Serverless Architecture
Implementation Scale
  • Enterprise-wide Deployment
Impact Metrics
  • Customer Satisfaction
  • Digital Expertise
  • Productivity Improvements
Technology Category
  • Analytics & Modeling - Big Data Analytics
  • Application Infrastructure & Middleware - Data Exchange & Integration
  • Platform as a Service (PaaS) - Data Management Platforms
Applicable Industries
  • Finance & Insurance
  • Retail
Applicable Functions
  • Business Operation
  • Sales & Marketing
Services
  • Cloud Planning, Design & Implementation Services
  • Software Design & Engineering Services
  • System Integration
About The Customer
Karmic Labs delivers the future of expense management with a platform for employers, banks, and retailers to manage debit card and fund distribution among their customers and members. During a period of rapid growth, Karmic’s ability to build a strong, scalable data infrastructure became increasingly critical. Echoing what Airbnb refers to as “Data Democratization,” Karmic’s Data Science Product Manager, Yang Wang, explains that “the more accessible data is, the faster we can iterate, and the further we can get in the game.” Yang joined Karmic when that data infrastructure was largely nonexistent, and filling that gap soon became one of the team’s highest priorities. “The second you build a software, you want to know what works and what doesn’t. We desperately needed more high-level analysis,” he said.
The Challenge
One of the biggest challenges Yang faced was choosing and leveraging third-party tools. “How do you weigh vendors when you don’t really know what your needs are, and how those needs will change over time?” For a data warehousing solution, Yang chose Amazon Redshift, as it met all of his needs for storage, speed, and security. But to get data into Redshift, he needed an ETL solution to match it. Stitch Data was the first provider that caught his eye, and he implemented it, but it wasn’t long before his team outgrew it. “Plug-and-play tools like Stitch work great for straightforward workflows, but we needed more customization and access under the hood to not only comply with our security requirements, but also stay competitive with companies that have more developed data infrastructures,” said Yang. “The fact that we didn’t have control over transformations forced us to consider other, more comprehensive options.”
The Solution
In his search for alternatives, Yang came across Astronomer’s Managed Apache Airflow module. Although he hadn’t heard of Apache Airflow before, his research showed that the open-source software had a strong community behind it and was a good fit for the job. “There were no other managed Airflow services out there, and we didn’t have the DevOps resources to run it ourselves,” he said. Not long thereafter, he migrated his workflows to our Cloud platform. Karmic now uses Apache Airflow on Astronomer to sync their application database (Postgres) to their data warehouse (Amazon Redshift). Directly on our platform, Yang created a dynamic workflow that both automates that process and complies with Karmic’s security requirements.
Due to the sensitive nature of their business, Karmic requires a whitelisted IP and SSH for some database connections. Because Astronomer’s Cloud Airflow service runs in a serverless architecture where each task instance runs in a separate container, there was no immediately obvious place to store the key file needed for an SSH connection (in this case, for Postgres). By working with Astronomer, Karmic was able to configure a custom Airflow hook that opens an SSH tunnel in each task instance that requires access to that database and closes that tunnel once the task finishes.
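The open-then-close tunnel pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Karmic’s or Astronomer’s actual hook: the names `temporary_key_file`, `ssh_tunnel`, and `tunnel_factory` are hypothetical, and the sketch assumes the private key reaches the container as a string (for example from a secret store) rather than as a file on disk.

```python
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_key_file(key_material):
    """Write SSH key material to a private temp file; delete it afterwards."""
    fd, path = tempfile.mkstemp(prefix="ssh_key_")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(key_material)
        # Owner read/write only (0600), which SSH clients typically require.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


@contextmanager
def ssh_tunnel(key_material, tunnel_factory):
    """Keep a tunnel open for the duration of one task, then close it.

    tunnel_factory is a hypothetical callable: given the key file path, it
    returns an object with start() and stop() methods, for instance a thin
    wrapper around whichever SSH client library is in use.
    """
    with temporary_key_file(key_material) as key_path:
        tunnel = tunnel_factory(key_path)
        tunnel.start()
        try:
            yield tunnel
        finally:
            # Runs whether the task succeeds or raises.
            tunnel.stop()
```

A task would wrap its database work in `with ssh_tunnel(key_material, tunnel_factory):`, so the tunnel and the key file exist only while that task instance runs and are cleaned up even if the task fails.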
Operational Impact
  • With Astronomer, Karmic can trust that their data in Redshift remains reliable for both external and internal reporting.
  • At an organizational level, Astronomer allows Yang to fulfill a two-pronged goal: to make sure that data is widely available and, more importantly, accessible.
  • For Karmic, having reliable data in Redshift is the gateway to leveraging complementary analytics tools used by the rest of the team.
  • Astronomer Airflow not only allows Karmic’s ETL processes to comply with security requirements, but also ensures that data is in a place where the entire company can use it and learn from it.
  • Since implementing Astronomer, Karmic has been able to more fully embrace data-centric solutions, better understand and serve their users, and stay agile at a time of constant product iteration.
