Masthead Helps Mnemonic Cut Pipeline Compute Costs
The Challenge
Mnemonic indexes and processes billions of blockchain transactions and terabytes of blockchain data each month. On top of the ingested data, it builds complex data and machine learning (ML) pipelines that generate actionable insights, and it makes this data available to customers via its API. With some of its largest customers making up to a billion API calls each month, Mnemonic faces significant pressure on its data infrastructure. To handle this demand, Mnemonic relies on PostgreSQL, BigQuery, and Dataform to keep data highly available and reliable in real time. Under these conditions, however, observing every pipeline becomes a challenge. In particular, it is difficult for Mnemonic to accurately track the BigQuery compute capacity (slots) consumed by regular queries and so keep BigQuery fees under control. Any bug or unexpected spike in compute consumption can have a severe impact on fees, potentially adding more than $10,000 per month.
Solution
Masthead Data was a strong fit as a data observability solution for Mnemonic because of its ability to track BigQuery fees. This lets Mnemonic closely monitor the costs associated with each pipeline and data model, giving the team control over what drives its spending. Within 24 hours of onboarding, Masthead detected inefficient pipelines that were consuming a significant share of compute, causing large spikes in Mnemonic’s BigQuery costs. Masthead also helped Mnemonic identify numerous irrelevant queries, each costing more than $500 per month.
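To make the idea of per-pipeline cost tracking concrete, here is a minimal sketch in Python. It is not Masthead’s implementation. It assumes a hypothetical `JobRecord` type holding a pipeline name and the bytes billed for one query job, as might be derived from cloud logs, and it assumes on-demand pricing with an illustrative price per TiB (`ASSUMED_USD_PER_TIB`). Under capacity-based pricing, the same aggregation would be done over slot usage instead. The $500 threshold mirrors the figure mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumption: illustrative on-demand price in USD per TiB billed.
# Check the current pricing for your region before relying on it.
ASSUMED_USD_PER_TIB = 6.25
BYTES_PER_TIB = 2 ** 40


@dataclass
class JobRecord:
    """Hypothetical, simplified view of one query job taken from logs."""
    pipeline: str      # name or label of the pipeline that ran the query
    bytes_billed: int  # bytes billed for this job


def monthly_cost_by_pipeline(jobs, usd_per_tib=ASSUMED_USD_PER_TIB):
    """Sum the estimated cost of all jobs, grouped by pipeline."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job.pipeline] += job.bytes_billed / BYTES_PER_TIB * usd_per_tib
    return dict(totals)


def flag_expensive(costs, threshold_usd=500.0):
    """Return (pipeline, cost) pairs above the threshold, most expensive first."""
    return sorted(
        ((name, cost) for name, cost in costs.items() if cost > threshold_usd),
        key=lambda item: item[1],
        reverse=True,
    )
```

Given one month of job records, `flag_expensive(monthly_cost_by_pipeline(jobs))` returns the pipelines whose estimated cost exceeds the threshold, which is the kind of shortlist a team would review first.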
Coupled with Masthead’s data lineage and real-time data anomaly detection, BigQuery cost tracking lets Mnemonic identify the factors influencing its cloud costs and take appropriate measures to minimize them. Mnemonic also gains comprehensive observability of its data pipelines, which helps it deliver accurate, up-to-date, and reliable data to customers.
Additionally, Masthead is secure by design, operating solely on cloud logs and metadata. Because this non-intrusive approach poses no risk to the company’s or its customers’ data, the Mnemonic team deployed Masthead within 15 minutes and saw tangible results less than 24 hours after onboarding. Masthead also doesn’t run SQL queries against Mnemonic’s data, so it doesn’t consume BigQuery compute capacity. As a result, Masthead doesn’t add to Mnemonic’s cloud costs.
Results
Masthead helped Mnemonic achieve data observability and reduce its Google BigQuery compute costs by 20% within 24 hours of onboarding. With Masthead as its partner, Mnemonic can now track the factors influencing its cloud expenses and act to mitigate or control them. Thanks to Masthead, Mnemonic can maintain the reliability of its data without running costly SQL queries.
“A 24-hour onboarding was enough for finding a couple of pesky queries that were consuming substantial compute cost in our project. Masthead helped us save 20% of our BigQuery costs in 24 hours after onboarding by pinpointing an irrelevant process.” – Andrii Yasinetsky, CEO and Co-Founder at Mnemonic