About Customer

The customer aims to be the best end-to-end logistics platform and to revolutionize the global transport logistics sector. It is committed to delivering a better quality of life for drivers. Serving over 8 million customers and revolutionizing logistics one delivery at a time, it enables numerous businesses to transport anything on demand.

AWS Managed Transactional Databases Add to Time and Cost

The customer’s AWS managed transactional database (OLTP) could not keep up with demand spikes or support the analytics team’s daily business-critical reporting. While looking for a more scalable solution, the logistics company evaluated Snowflake’s capabilities and found its runtime performance promising. Scaling compute resources in the data center was challenging for the in-house team. Refreshing the materialized view for daily reporting and generating metrics reports was time-intensive, as well as complex and costly.

Having set out to find a scalable data warehouse solution, with Snowflake as its preferred platform, the customer turned to Blazeclan to create a Proof of Concept (POC). The POC involved adopting Snowflake for the data analytics workloads and migrating a Python-based report to the Data Cloud, which would be a unique engagement in the history of Snowflake adoption.

What Causes Inefficiencies in the Existing Environment

In the customer’s cloud data center, OLTP data is stored in PostgreSQL and is also used for analytics workloads. Running both on the production environment caused several performance bottlenecks and delayed the daily metrics reports, which are crucial for business decision-making.

How Blazeclan’s Approach Will Benefit the Customer

Data and cloud experts from Blazeclan worked closely with the customer’s team to understand the core challenges and peripheral requirements of the data ecosystem. The Blazeclan team has created a proof of concept, which will:

  • Use an automated, incremental load pipeline to refresh data from Postgres to Snowflake every 6 hours, so that scaling with demand becomes significantly more cost-effective.
  • Decouple storage from the compute resources, enabling independent scalability for both.
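
An incremental load of this kind is commonly driven by a watermark: each run picks up only the rows changed since the previous run. The snippet below is a minimal sketch of that idea in plain Python, not Blazeclan's actual pipeline. The `Row` type, the `updated_at` column, and the sample data are hypothetical and only illustrate the selection logic.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Row:
    id: int
    updated_at: datetime  # hypothetical change-tracking column on the source table


def incremental_batch(rows, watermark):
    """Return the rows changed after `watermark` and the watermark for the next run."""
    changed = [r for r in rows if r.updated_at > watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r.updated_at for r in changed), default=watermark)
    return changed, new_watermark


# Hypothetical source rows; in the real setup these would come from Postgres.
source = [
    Row(1, datetime(2024, 1, 1, 0, 30)),
    Row(2, datetime(2024, 1, 1, 5, 45)),
    Row(3, datetime(2024, 1, 1, 7, 10)),
]

# Assume the previous load covered everything up to 06:00.
batch, watermark = incremental_batch(source, datetime(2024, 1, 1, 6, 0))
```

Only the row updated after 06:00 ends up in `batch`, and the returned `watermark` becomes the starting point for the next 6-hour refresh.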

  • +60%: Storage Costs Saved
  • 30%: Reduced execution time of daily metric reports
  • +90%: Time Saved in Data Refresh

Solution Proposed in POC Offered to the Customer

After conducting a detailed assessment of the customer’s existing environment, Blazeclan created a proof of concept that involved:

  • OLTP data migration from PostgreSQL to Snowflake Cloud Data Platform
  • Converting the daily report from a materialized view to a Snowflake view and loading the data into a Snowflake table
  • Creating an efficient data pipeline for faster processing of daily metrics reports by converting the existing Python scripts into Snowpark scripts
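
The view conversion in the second item can be pictured as two SQL statements: one that defines a plain view over the report query, and one that loads the view's result into a table. The helper below is a hedged sketch that only builds those statements as strings. The function name, the object names `daily_metrics_v` and `daily_metrics`, and the sample query are all hypothetical, and the source does not say which statements the POC actually used.

```python
def report_statements(view_name, table_name, select_sql):
    """Build statements that replace a materialized view with a plain view
    plus a table loaded from that view.

    Names are interpolated directly, so they must come from trusted config.
    """
    query = select_sql.strip().rstrip(";")
    return [
        f"CREATE OR REPLACE VIEW {view_name} AS {query}",
        f"CREATE OR REPLACE TABLE {table_name} AS SELECT * FROM {view_name}",
    ]


# Hypothetical report query and object names, for illustration only.
statements = report_statements(
    "daily_metrics_v",
    "daily_metrics",
    "SELECT order_date, COUNT(*) AS deliveries FROM orders GROUP BY order_date;",
)
```

Each string in `statements` could then be executed in order by whatever client runs the reporting job.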

The Approach to Implementing the Proposed Solution

  • Historical data migration from PostgreSQL to Snowflake Cloud Data Platform
  • Data ingestion from OLTP to an Amazon S3 bucket with Blazeclan’s file ingestion framework, developed using AWS Glue, Python, and Apache Spark
  • Reading Amazon S3 files from the ingestion layer using Blazeclan’s database ingestion framework and loading the data into the Snowflake table for analytics purposes
  • Completing a one-time historical load for all in-scope databases and running the incremental load every 6 hours
  • Data pipeline creation for daily incremental load and orchestrating the same with Amazon Managed Workflows for Apache Airflow (MWAA)
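
To make the 6-hour cadence concrete, the sketch below computes the time window a scheduled run would cover. It is plain Python rather than an MWAA DAG, and the cron expression and function name are assumptions for illustration. The source states only that the load runs every 6 hours, not which schedule string is used.

```python
from datetime import datetime, timedelta

REFRESH_HOURS = 6
# Assumed cron schedule: fires at 00:00, 06:00, 12:00 and 18:00.
CRON = "0 */6 * * *"


def closed_window(run_time):
    """Return the [start, end) interval that most recently finished at or before run_time."""
    floored_hour = run_time.hour - run_time.hour % REFRESH_HOURS
    end = run_time.replace(hour=floored_hour, minute=0, second=0, microsecond=0)
    return end - timedelta(hours=REFRESH_HOURS), end


# A run triggered at noon covers the data that arrived between 06:00 and 12:00.
start, end = closed_window(datetime(2024, 1, 1, 12, 0))
```

Passing the window boundaries to the extraction step keeps each run idempotent: re-running a failed refresh reloads the same interval instead of whatever happens to be new at retry time.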

Tech Stack

  • AWS Glue
  • AWS Secrets Manager
  • Amazon S3
  • Amazon Managed Workflows for Apache Airflow (MWAA)
  • MySQL
  • Snowflake
