Case Study
How a U.S. food retailer moved from a legacy to a modern data platform to handle 2.5 TB of data per month
CLIENT
This U.S.-based food retailer subsidiary was recently created as part of a merger between two industry leaders. With retail and ecommerce operations, the company employs several thousand people and generates more than $50 billion in annual revenue.
CHALLENGE
Tackling legacy data systems to drive data-driven decision-making
Our client wanted to replace its outdated reporting platform with a modern cloud-based data lake to improve reporting, analytics, and archival capabilities. The company was struggling to fetch, consolidate, transform, curate, and analyze data from its disparate data sources, including mainframe, DB2, Informix, SQL Server, and Oracle. Additionally, the legacy data systems prevented the company from designing sophisticated data analytics pipelines, severely limiting business insights. The company needed an experienced data engineering partner to support business intelligence.
TRANSFORMATION
Cloud-based data analytics leads to comprehensive information management processes
UST created a modern, centralized data lake on Azure Data Lake Storage (ADLS) Gen2 to enable big data analytics and improve data management and storage. Now, the grocery retailer can:
- Seamlessly ingest data into the new data lake—using Azure Data Factory to extract and consolidate data from the company’s disparate sources
- Create data pipelines—to schedule and orchestrate the movement and transformation of data in the new data lake
- Automate data transformation—using Azure Databricks and Azure HDInsight, so the data is ready for analysis
- Split data processing workloads—across many processors to scale up or down server capacity as needed to meet analytics needs
- Receive automated failure notifications—to avoid wasted time and effort, using Azure Logic Apps
- Easily run reports on the transformed data—that’s stored in a data warehouse connected to Microsoft Power BI
- Manage data pipelines, versions, and code—using Azure DevOps Services
- Store, test, and debug large Hadoop source tables—using Hive and Hadoop Distributed File System (HDFS)
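The automated failure-notification step above typically posts pipeline errors to an Azure Logic Apps HTTP trigger. The sketch below shows that pattern in plain Python; the webhook URL, payload shape, and helper names are hypothetical (a real Logic App defines its own trigger URL and request schema), so treat this as an illustration rather than the client's implementation.

```python
import json
import urllib.request

# Hypothetical endpoint -- a real URL is generated by the Logic App's
# "When a HTTP request is received" trigger.
LOGIC_APP_WEBHOOK = "https://example.logic.azure.com/workflows/notify"

def notify_failure(pipeline, error, post=None):
    """POST a failure payload to the Logic Apps webhook.

    `post` can be overridden (e.g., for testing); by default it sends
    an HTTP request with urllib.
    """
    payload = json.dumps({"pipeline": pipeline, "error": str(error)}).encode()
    if post is None:
        def post(data):
            req = urllib.request.Request(
                LOGIC_APP_WEBHOOK,
                data=data,
                headers={"Content-Type": "application/json"},
            )
            return urllib.request.urlopen(req)
    return post(payload)

def run_step(name, step, post=None):
    """Run one pipeline step; on failure, notify and re-raise."""
    try:
        return step()
    except Exception as exc:
        notify_failure(name, exc, post=post)
        raise
```

Wrapping each orchestrated step in `run_step` means a failed ingest or transform immediately produces an alert instead of being discovered hours later in a stalled report.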
IMPACT
Unleashing data-driven decisions across the company
With the data lake in place, the company has:
- Increased data system scalability and agility—The data lake architecture handles 2.5 TB of data per month and supports 300 to 400 concurrent users.
- Improved reporting efficiency—Users can easily generate reports with various data visualization tools for business intelligence, enabling faster and more insightful decision-making across the organization.
- Empowered employees—Democratized data, with accessible reporting and analytics tools, puts the power of insights in everyone's hands.
RESOURCES
https://www.ust.com/en/what-we-do/digital-transformation/data-analytics