Case Study
How UST CodeCrafter helped a global CPG company modernize its data analytics 88% faster with GenAI
OUR CLIENT
Founded several decades ago, this multinational food and beverage company has grown to become one of the largest CPG companies in the world. With products sold in hundreds of countries, its portfolio includes iconic brands with operations that encompass manufacturing, marketing, and distribution. The company employs approximately 200,000 people and generates more than $100 billion in revenue annually.
THE CHALLENGE
Complicated migration of a legacy database to PySpark
For years, the company relied on an outdated analytics system. Its approximately 1,600 legacy database scripts had been created over many years by different in-house and offshore teams, and the company’s analytics team struggled to gain valuable business insights, particularly about product sales in its 50 key markets around the world. The cumbersome system had become expensive to operate, and lengthy run times slowed new product launches.
Meanwhile, the company’s board mandated a cloud-based data modernization initiative, prompting the IT team to migrate data and reports from the legacy analytics system to modern PySpark. Given the inconsistent coding patterns and embedded business logic in the scripts, the project was estimated to take two years with six full-time developers and a large quality assurance team. To compound the situation, performance ceilings and scheduling constraints on existing on-premises hardware would cause prolonged dual-run periods, adding cost and complexity to the project.
After a previous vendor converted just sixty scripts in nine months—an unacceptable project pace—the company realized it needed a different strategy with a new vendor.
THE TRANSFORMATION
An AI-driven approach to large-scale code migration: code refactoring using generative AI
UST applied UST CodeCrafter, an innovative AI-assisted modernization framework and process. CodeCrafter combines a proprietary Abstract Syntax Tree (AST) parsing engine and control flow graphs with chain-of-thought prompts for a generative AI (GenAI) large language model (LLM) to refactor the legacy code into reusable, modern PySpark templates. The engagement accelerated code conversion while preserving the business logic of the original scripts. Human-in-the-loop validation, automated test harnesses, and synthetic data supported reliability from day one.
Within three weeks, the UST team converted approximately 1,600 legacy database scripts to PySpark jobs running on the company’s Databricks environment. Nine additional weeks covered regression testing, CI/CD readiness, and performance tuning—with zero unplanned downtime.
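To make the general idea concrete, here is a minimal sketch of turning a legacy statement into a structured representation and then into a step-by-step prompt. It is not CodeCrafter's engine, which is proprietary: a small regular expression stands in for a real AST parser, and the names SelectNode, parse_select, and build_prompt, the sample query, and the prompt wording are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SelectNode:
    """Hypothetical, simplified tree node for one SELECT statement."""
    columns: List[str]
    table: str
    where: Optional[str] = None


# Assumption: only handles SELECT <cols> FROM <table> [WHERE <condition>].
# A real parser would build a full syntax tree; this regex is a stand-in.
_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_select(sql: str) -> SelectNode:
    """Parse a very simple SELECT statement into a SelectNode."""
    match = _SELECT.match(sql)
    if match is None:
        raise ValueError("statement is outside this sketch's tiny grammar")
    # Naive comma split: breaks on function calls that contain commas.
    columns = [c.strip() for c in match.group("cols").split(",")]
    return SelectNode(columns, match.group("table"), match.group("where"))


def build_prompt(node: SelectNode) -> str:
    """Build a step-by-step conversion prompt from the parsed structure."""
    steps = [f"Read the source table '{node.table}' into a DataFrame."]
    if node.where:
        steps.append(f"Apply the filter: {node.where}.")
    steps.append(f"Select the columns: {', '.join(node.columns)}.")
    steps.append("Emit equivalent PySpark code and explain each step.")
    numbered = [f"{i}. {text}" for i, text in enumerate(steps, start=1)]
    header = "Convert this legacy query to PySpark, reasoning step by step:"
    return "\n".join([header] + numbered)


if __name__ == "__main__":
    node = parse_select("SELECT region, sales FROM orders WHERE year = 2023;")
    print(build_prompt(node))
```

Feeding the model a parsed structure rather than raw text is one plausible way to keep embedded business logic, such as the filter condition here, explicit in the prompt.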
The thorough, iterative AI-driven process delivered:
- Standardized, optimized, reusable PySpark templates: Following a week-over-week learning loop, the data engineering team used the AI-based CodeCrafter accelerator to continuously improve script conversion quality.
- Auto-generated unit tests and synthetic test data: Automated test cases verified individual units of code to detect and resolve defects. A built-in synthetic data generator validated the converted scripts at scale.
- End-to-end regression test packs: This comprehensive testing approach helped ensure the converted data analytics scripts functioned correctly on the target Databricks environment.
- CI/CD deployment pipelines: The deployment-ready codebase supported smooth promotion of data pipelines to staging and production, enabling same-day data enhancements instead of the typical week-long change window.
- Detailed knowledge transfer: Our thorough documentation, playbooks, and hands-on training ensured the company’s data analytics team could create new, standardized PySpark scripts in a single day.
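The synthetic-data and regression-testing deliverables above can be illustrated with a small sketch. It is an assumption-laden example, not the generator or test packs used in the engagement: the column names, market codes, and both aggregation functions are hypothetical, and plain Python stands in for the legacy script and the converted PySpark job.

```python
import math
import random
from itertools import groupby


def make_synthetic_sales(n_rows, seed=0):
    """Generate reproducible fake sales rows (hypothetical schema)."""
    rng = random.Random(seed)
    markets = ["US", "DE", "IN", "BR"]
    return [
        {
            "market": rng.choice(markets),
            "units": rng.randint(0, 500),
            "unit_price": round(rng.uniform(0.5, 20.0), 2),
        }
        for _ in range(n_rows)
    ]


def legacy_revenue_by_market(rows):
    """Stand-in for the legacy script: an explicit accumulation loop."""
    totals = {}
    for row in rows:
        revenue = row["units"] * row["unit_price"]
        totals[row["market"]] = totals.get(row["market"], 0.0) + revenue
    return totals


def converted_revenue_by_market(rows):
    """Stand-in for the converted job: a grouped aggregation."""
    keyed = sorted(rows, key=lambda r: r["market"])
    return {
        market: sum(r["units"] * r["unit_price"] for r in group)
        for market, group in groupby(keyed, key=lambda r: r["market"])
    }


def outputs_match(expected, actual, tol=1e-6):
    """Compare two result sets key by key, tolerating float rounding."""
    return expected.keys() == actual.keys() and all(
        math.isclose(expected[k], actual[k], rel_tol=tol, abs_tol=tol)
        for k in expected
    )


if __name__ == "__main__":
    rows = make_synthetic_sales(10_000, seed=42)
    same = outputs_match(
        legacy_revenue_by_market(rows), converted_revenue_by_market(rows)
    )
    print("outputs match:", same)
```

Seeding the generator keeps every run reproducible, so a failing comparison can be replayed against the same rows.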
The global CPG company now has a modern, cloud-native analytics solution that accelerates business insights and facilitates data-driven decision-making.
THE IMPACT
Delivering a data migration project 88% faster with GenAI and UST CodeCrafter
The successful AI-assisted code modernization effort quickly met the board’s mandate to deliver cloud-based, AI-ready data analytics to supply chain, finance, and other key business teams. The standardized, optimized, modern PySpark code and CI/CD data pipelines paved the way for rapid experimentation, giving business teams fresh insights into product demand and supply variability that help them counter competitive pressures and boost global sales. The engagement also delivered these impressive results:
- 88% faster migration timeline: UST’s AI-driven approach collapsed a 24-month manual project schedule into twelve weeks (three weeks of conversion plus nine weeks of testing and tuning), showing that GenAI combined with human-in-the-loop expertise can accelerate technology modernization at scale.
- 40% improvement in runtime performance for critical data pipelines: The cloud-native data pipelines accelerate data insights, enabling company and line-of-business leaders to keep pace with the dynamic global CPG marketplace.
- Significant migration cost savings: The CodeCrafter modernization framework and iterative process enabled UST’s concentrated team of AI and data engineering experts to complete the work about 21 months sooner than initially estimated, which reduced the overall cost of the migration.
- Reallocated four full-time employees to more strategic initiatives: This productivity boost enables the company to focus on innovation and competitive advantage rather than tedious data management tasks.
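The 88% and 21-month figures follow from the timeline stated in this case study. The short calculation below reproduces them under two assumptions that are not in the original: a 52-week year, and a month counted as 52/12 weeks.

```python
# Figures from the case study: a two-year estimate versus
# three weeks of conversion plus nine weeks of testing and tuning.
ESTIMATED_MONTHS = 24
ACTUAL_WEEKS = 3 + 9

# Assumptions for unit conversion (not stated in the source).
WEEKS_PER_YEAR = 52
WEEKS_PER_MONTH = WEEKS_PER_YEAR / 12

estimated_weeks = ESTIMATED_MONTHS * WEEKS_PER_MONTH   # 104 weeks
reduction = 1 - ACTUAL_WEEKS / estimated_weeks         # share of time saved
months_saved = ESTIMATED_MONTHS - ACTUAL_WEEKS / WEEKS_PER_MONTH

if __name__ == "__main__":
    print(f"Timeline reduction: {reduction:.1%}")       # 88.5%
    print(f"Months saved: {months_saved:.1f}")          # 21.2
```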
Ready to compress your application modernization timelines? Talk to a GenAI expert today to find out how UST CodeCrafter can transform your migration projects.