Masthead Data x RealTruck

One of the main reasons RealTruck chose Masthead was its unique architecture, which does not access our data. Masthead, as a Google Cloud Partner complementary to Google Cloud BigQuery, is compliant with data privacy and security regulations at the architectural level, ensuring that our data remains secure and helping us gain granular control of the entire data platform.

Chris Wall,
Director of BI & Analytics
35%
improved pipeline health
Industry
Motor Vehicle Manufacturing
Niche
Ecommerce
Company size
5000 employees
Headquarters
Ann Arbor, Michigan
Data stack
Google BigQuery, Google Dataform, Python, Data Transfer Services, DOMO, Airbyte

How RealTruck Drives Data Reliability and Cuts Compute Costs with Masthead and BigQuery

The Challenge

RealTruck, a leader in aftermarket accessories for trucks and off-road vehicles, stands out with its omnichannel approach, integrating over 12,000 dealers and a robust online presence at RealTruck.com. Operating from 47 locations across North America, the company faces complex data challenges due to its extensive offline network and diverse customer touchpoints.


1. Limited visibility into pipeline performance: The use of numerous data sources and various solutions for data ingestion into the data platform made it difficult to track pipeline failures or data system errors. This limitation hindered RealTruck’s ability to maintain reliable data.

2. Gaining cost control: Google Cloud BigQuery enabled the RealTruck team to develop a decentralized data platform, boosting agility in creating data pipelines and assets. However, given BigQuery’s scalable processing power, this approach required more refined management of resources to ensure cost-effectiveness. To sustain efficiency, the team sought granular visibility into every process and its associated costs.

3. Automatically identifying data anomalies across the data platform: Tables in Google Cloud BigQuery are sources for data products used by business users and require vigilant monitoring for issues like stale data, volume spikes, or missing values. Addressing these challenges is key to building trust in the data platform among business users.

The Solution

To overcome these challenges, RealTruck implemented Masthead Data to enhance the reliability of BigQuery data pipelines and assets in its data platform. Masthead enables the RealTruck data team to:


1. Automate Observability on Pipeline Performance: The use of various ingestion tools added complexity to the RealTruck data platform. Masthead provides visibility into syntax errors and system problems, enabling RealTruck to detect these errors in pipelines and the data environment in real time and address them before they impact downstream data products and users.


2. Anomaly Detection at Scale: Masthead’s log-based approach to monitoring time-series tables for freshness, volume, and schema changes gives RealTruck an overarching view of the health of all tables in BigQuery without increasing compute costs.


3. Identify Metrics Outliers: Through its integration with Google Dataplex, Masthead allows the RealTruck team to implement rule-based data quality checks to catch anomalies in metrics.


4. Control Compute and Storage Costs: RealTruck leveraged Masthead’s Compute Cost Insights functionality for BigQuery. Granular visibility into BigQuery storage costs, the cost of each pipeline, and the overall cost of any third-party solution running in RealTruck’s BigQuery enabled the team to identify and clean up orphan processes and expired assets. This makes the cost of the data platform more manageable and transparent.


5. Troubleshoot Data Downtime Within Minutes: With real-time alerts, robust column-level lineage, and data dictionary features, Masthead allows the RealTruck data team to trace errors or anomalies and assess their impact on pipelines and tables in BigQuery. Column-level lineage allows the data team to respond quickly to any errors or anomalies and collaborate more effectively in resolving issues.

Results

RealTruck managed to stay on top of a vast data infrastructure covering 5,000+ associates operating from 72 facilities across four countries, as well as 12,000+ dealers. Using Masthead in combination with BigQuery allowed RealTruck to improve its pipeline health by 35%. RealTruck also gained valuable insights into expired assets and orphaned processes. By eliminating them, RealTruck cut compute resource consumption by 10%.
