
Power massive scale, real-time data processing by modernizing legacy ETL frameworks

Product
  • Gathr
  • ClearInsight
Tech Stack
  • Spark
  • Kafka
  • HBase
  • Solr
  • Redis
Implementation Scale
  • Enterprise-wide Deployment
Impact Metrics
  • Cost Savings
  • Innovation Output
  • Productivity Improvements
Technology Category
  • Analytics & Modeling - Real Time Analytics
  • Application Infrastructure & Middleware - Data Exchange & Integration
  • Platform as a Service (PaaS) - Data Management Platforms
Applicable Industries
  • Security & Public Safety
Applicable Functions
  • Discrete Manufacturing
  • Quality Assurance
Use Cases
  • Edge Computing & Edge Intelligence
  • Predictive Maintenance
  • Real-Time Location System (RTLS)
Services
  • Cloud Planning, Design & Implementation Services
  • Data Science Services
About The Customer
The customer is a leading security and intelligence software provider that builds intelligence and investigation technologies for federal and state-level security agencies. Their solutions help these agencies understand cyber threats through communication-data interception, data integration, and advanced analytics that apply artificial intelligence models to big data. They set out to modernize their existing big data applications and needed a scalable solution that could process the 1.5 billion transactions generated per day across multiple real-time feeds.
The Challenge
Enterprises need to analyze large volumes of data from many sources in real time to make strategic business decisions. They often build custom frameworks to process these data sets, which accumulates technical debt and creates dependency on the IT teams who remember the historical choices made during the initial platform design; this puts the business at risk and drives up customization costs. The customer, a leading security and intelligence software provider, wanted to modernize their existing big data applications. They were looking for an easy-to-use, scalable solution that could process 1.5 billion transactions generated per day from multiple real-time feeds. Specifically, they needed a near-zero-code ETL solution that could perform real-time ingestion and complex processing, sustain high throughput while indexing and storing the data, and detect anomalies in transactions.
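Anomaly detection is listed as a requirement but not elaborated in the case study. Purely as an illustration of what such a check can look like inside a Spark pipeline, the sketch below flags transactions that deviate sharply from a per-account historical baseline; the column names, the baselines table, and the three-sigma rule are assumptions, not details from the source.

```scala
import org.apache.spark.sql.{DataFrame, functions => F}

// Illustrative sketch only. Flags a transaction as anomalous when its amount
// deviates from the account's historical baseline by more than three standard
// deviations. `baselines` is a static DataFrame (account, meanAmount,
// stddevAmount) precomputed from historical data; stream-static joins of this
// kind are supported by Spark Structured Streaming.
def flagAnomalies(transactions: DataFrame, baselines: DataFrame): DataFrame =
  transactions
    .join(baselines, Seq("account"), "left")
    .where(
      F.abs(F.col("amount") - F.col("meanAmount")) >
        F.lit(3) * F.col("stddevAmount")
    )
```

In a near-zero-code platform such as Gathr, an equivalent rule would be configured visually rather than hand-written; the code above only shows the shape of the computation.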
The Solution
Using Gathr, the customer implemented the applications as structured streaming data pipelines running on a scalable Spark compute engine. Gathr's extensive library of components for data acquisition, processing, enrichment, and storage served as the ETL toolkit, and the entire data flow was created and orchestrated in Gathr's Web Studio using a low-code methodology. The key technologies were Kafka for real-time data streaming, Gathr's out-of-the-box ETL components for data processing, and Gathr processors and storage components combined into a polyglot persistence architecture. ClearInsight, a rapid application development platform, was also used. The new solution replaced a legacy system built on various real-time and non-blocking I/O processing frameworks, and it addressed that system's main shortcomings: lengthy development cycles, the inability to alter processing behavior through configuration changes, time-consuming debugging and rectification, the lack of version management and hot-swap features, complex operations, and stringent SLAs on data availability and query results.
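Gathr assembles such pipelines visually, but each one ultimately executes as a Spark Structured Streaming job. As a sketch of the pattern described above, the following shows a Kafka ingestion stage fanning out to several stores from a single query; the broker address, topic name, schema, and writer helpers are hypothetical placeholders, not Gathr-generated code or the customer's actual configuration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession, functions => F}
import org.apache.spark.sql.types._

object TransactionPipeline {
  // Assumed payload schema; the actual feeds and fields are not described
  // in the case study.
  val schema: StructType = new StructType()
    .add("txnId", StringType)
    .add("account", StringType)
    .add("amount", DoubleType)
    .add("eventTime", TimestampType)

  // Hypothetical stand-ins for the storage connectors; in a real job the
  // HBase, Solr, and Redis Spark connectors would be configured here.
  def writeToHBase(df: DataFrame): Unit = ()
  def indexInSolr(df: DataFrame): Unit = ()
  def cacheInRedis(df: DataFrame): Unit = ()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("transaction-etl")
      .getOrCreate()

    // Ingest: read the real-time feed from Kafka and parse the JSON payload.
    val transactions: DataFrame = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // assumed address
      .option("subscribe", "transactions")              // assumed topic
      .load()
      .select(F.from_json(F.col("value").cast("string"), schema).as("t"))
      .selectExpr("t.*")

    // Fan out: foreachBatch lets a single streaming query write each
    // micro-batch to several stores, the essence of a polyglot persistence
    // layer (HBase for records, Solr for search, Redis for fast lookups)
    // fed from one pipeline.
    transactions.writeStream
      .foreachBatch { (batch: DataFrame, _: Long) =>
        writeToHBase(batch)
        indexInSolr(batch)
        cacheInRedis(batch)
      }
      .option("checkpointLocation", "/tmp/checkpoints/transactions") // assumed
      .start()
      .awaitTermination()
  }
}
```

The polyglot split matters because each store serves a different access pattern: HBase for high-volume record storage, Solr for indexed search, and Redis for low-latency lookups.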
Operational Impact
  • Replaced roughly 1 million lines of code in about 3 weeks using Gathr's frameworks.
  • Achieved a high throughput of 100,000+ transactions per second, enabling processing of 1.5 billion records per day (a quick arithmetic check follows this list).
  • Reduced the overall release cycle from 8 months to 8 weeks.
  • Reduced the release cycle for new changes from 3 weeks to 3 days.
  • Reduced overall project cost by designing the solution to run on commodity hardware.
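The headline figures above are mutually consistent: 1.5 billion records per day averages out to roughly 17,400 transactions per second, so a sustained capacity of 100,000+ transactions per second leaves ample headroom for traffic peaks. A quick check:

```scala
// Sanity check on the reported figures (pure arithmetic).
val recordsPerDay = 1.5e9
val secondsPerDay = 24 * 60 * 60                 // 86,400
val averageRate   = recordsPerDay / secondsPerDay
println(f"average rate: $averageRate%.0f tx/s")  // ~17361 tx/s, well under
                                                 // the 100,000+ tx/s capacity
```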
Quantitative Benefit
  • Reduced codebase by approximately 1 million lines.
  • Increased throughput to 100,000+ transactions per second.
  • Reduced release cycle from 8 months to 8 weeks.
  • Reduced release cycle for new changes from 3 weeks to 3 days.
  • Saved on project costs by using commodity hardware.
