January 6, 2025

Migrating from SQL-First Vendor to Masthead: How Log-Based Data Observability Reduced Costs and Improved Coverage

Yuliia Tkachova
Co-founder & CEO, Masthead Data

Last year we won a client from a SQL-first solution and successfully migrated them to Masthead.

Context

For ethical reasons, I’m not going to name the account or the SQL-first vendor. I have great appreciation for both parties – the vendor and our newly onboarded customer. I’m writing this post because we had an experience that demonstrated how our log-based approach provides more complete coverage (across data, pipelines, and compute) than the SQL-first vendor.

The main hurdle was that our new client deeply loved their SQL-first solution, as it provided great value in saving data team time when issues occurred. The data team was therefore skeptical about Masthead’s value proposition. However, the team was very interested in Masthead’s Data FinOps capabilities, as they wanted to optimize their data warehouse costs and improve their migration strategy.

Breaking Through Initial Skepticism

Honestly, the demo was not going well. At some point, one of the senior engineers on the client side said out loud that they had no interest in Masthead, as they already had everything they needed in their current solution.

I quickly realized that we needed to go beyond feature comparison and deliver value for the end user, in our case the data engineer. I regrouped and directly asked the biggest skeptic on the call, “What if we could free up enough budget for you to hire one more data engineer on your team?” His face lit up. It very much felt like a challenge accepted.

The Proof of Value Process

Next, we needed to decide how to run the Proof of Value (PoV). Once the team realized that Masthead requires neither permissions to read data nor access to INFORMATION_SCHEMA, they simply forwarded the Terraform script to the DevOps team, with no paperwork whatsoever. (Mind-blowing!)

Detailed Feature Comparison

The team onboarded Masthead and began comparing their existing SQL-first vendor solution with Masthead.

Lineage and Visibility

Masthead was onboarded via Terraform in 20 minutes by their DevOps team, and within 2 hours they had visibility into all of their pipelines and tables, along with cost insights.

In Masthead, they were able to see full column-level and pipeline lineage across projects.
The SQL-first solution only provided table-level lineage, with no cross-project visibility, column lineage, or pipeline lineage. 

I’m including a screenshot of Masthead’s lineage visualization. We work hard to make it both usable and visually appealing. The team explicitly emphasized that the lineage looks great and they enjoy using it.

Masthead column-level lineage


Pipeline Observability

They instantly gained visibility into all pipelines, errors, cost spike anomalies, and execution failures (including dbt, Fivetran, scheduled queries, and custom jobs running in Python orchestrated via Airflow). They didn’t need to integrate Masthead with any of these technologies – we automatically mapped them in the UI, showing pipeline destinations, owners, execution frequency, tags, source code, last execution time, hash values, and associated costs. 

With their previous vendor, they only had visibility into Fivetran and dbt pipelines where integration was possible. However, this provided neither full visibility nor cost and frequency tracking; it merely duplicated the error reporting already available in the Fivetran and dbt UIs.

Masthead data pipeline observability

Anomaly Detection Approach

At Masthead, we provide automatic anomaly detection for all time-series tables, while with the SQL-first vendor, our customer needed to manually select which tables to observe. The number of observed tables directly affected the client’s monthly costs. At Masthead, we firmly believe that observability should not be limited to specific tables the way basic monitoring (for example, through Google Cloud Monitoring) is. When data issues occur, they have a blast-radius effect, so measuring every node (table) is crucial to understanding the full impact. The vendor they were using had exactly these limitations.
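The blast-radius idea boils down to a simple graph traversal over lineage: everything downstream of a broken table is potentially affected. The following is a minimal illustrative sketch, not Masthead’s implementation; the lineage graph and table names are hypothetical.

```python
from collections import deque

def blast_radius(lineage, start):
    """Return every table downstream of `start` in a lineage graph.

    `lineage` maps each table to the tables that read from it.
    """
    impacted, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in impacted:
                impacted.add(child)
                queue.append(child)
    return impacted

# Hypothetical lineage: raw_events feeds two marts, which feed a dashboard table.
lineage = {
    "raw_events": ["mart_sessions", "mart_revenue"],
    "mart_sessions": ["dashboard_kpis"],
    "mart_revenue": ["dashboard_kpis"],
}

# A freshness issue in raw_events affects all three downstream tables.
print(sorted(blast_radius(lineage, "raw_events")))
# → ['dashboard_kpis', 'mart_revenue', 'mart_sessions']
```

This is why monitoring only a hand-picked subset of tables understates impact: an issue in one unmonitored upstream node can silently propagate to every monitored table below it.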

The difference between Masthead and the SQL-first solution was tremendous. We provided fully automated anomaly detection out of the box, while with the SQL-first solution, they needed to enable each rule manually. In contrast to Masthead’s approach, they needed to prioritize which tables and datasets to focus on first. While Masthead completed all learning on day one, the SQL-first solution required a 14-day onboarding period to learn the data.

We do not request permissions to read the client’s data (bigquery.tables.getData) or access to INFORMATION_SCHEMA; we parse logs and report on the system state in real time. In contrast, the SQL-first vendor ran scheduled queries to check whether values in INFORMATION_SCHEMA had changed. Our approach is not only more efficient but also faster at identifying breakages or misalignments in the client’s data environment. Any freshness or volume issue is surfaced immediately from logs in Masthead, while the SQL-first vendor relies on scheduled queries to check whether the data looks okay.

For example, in this client’s case, a table was updated every 3 hours, but the SQL vendor’s freshness checks ran every 6 hours to avoid false alerts. This meant the table could miss two update cycles before an issue was detected. For some tables that were updated multiple times per day, the checks ran only once daily. This clearly demonstrates the substantial disadvantages and limitations of the SQL-first approach, while Masthead maintains full observability based on real-time events in the client’s data environment.
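The detection gap here is simple arithmetic: in the worst case, every update due between two scheduled checks can fail unnoticed. A back-of-the-envelope sketch, where the 3-hour/6-hour figures come from the example above and the daily-check figures are hypothetical:

```python
def missed_cycles_before_detection(update_interval_h, check_interval_h):
    """Worst-case number of update cycles a table can miss before a
    scheduled freshness check notices.

    If a failure occurs just after a check runs, every update due before
    the next check can be missed undetected. A log-based approach instead
    reacts as soon as the expected update event fails to appear.
    """
    return check_interval_h // update_interval_h

# The case from the post: updates every 3 h, freshness check every 6 h.
print(missed_cycles_before_detection(3, 6))   # → 2 missed update cycles

# Hypothetical: a table updated every 6 h but checked once daily.
print(missed_cycles_before_detection(6, 24))  # → 4 missed update cycles
```

The tension is structural: checking more often closes the gap but raises both compute cost and false-alert risk, which is exactly why the vendor spaced the checks out in the first place.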

Cost Insights and Optimization

Regarding cost insights, this was a feature the client was particularly interested in from the start. Since we work with all pipelines running in a client’s environment without requiring integration, we can show current costs within any chosen timeframe and identify optimization opportunities. 

To be fair, the SQL-first vendor also provided cost insights by showing spend per service account, but this had far less impact than Masthead’s granular view of how compute costs could be restructured.

On the day they onboarded Masthead, the client discovered an opportunity to optimize their BigQuery compute costs by 80% through restructuring their data workflows. Initially, they were skeptical as it seemed too good to be true. However, after thorough verification, they were thrilled with the insights we provided.

Masthead Pipeline Cost Optimization
Masthead identifies the optimal BigQuery pricing model (On-demand or Editions) to reduce pipeline compute costs

Another significant finding was related to dead-end tables. Because Masthead observes all processes and tables within the client’s data environment, we can identify tables that have recurring compute costs but aren’t being used. For example, they discovered pipelines that were running daily without any downstream consumption. This insight alone revealed potential compute savings of 10%, not including the associated storage cost reductions.
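Conceptually, dead-end detection amounts to comparing write targets against read sources across all observed jobs: a table that keeps getting rebuilt but is never read downstream is a candidate for retirement. A minimal sketch with simplified, hypothetical log records (real BigQuery audit logs carry equivalent information, e.g. referenced and destination tables per job):

```python
# Hypothetical, simplified job-log records: each entry records the table a
# job wrote to and the tables it read from.
jobs = [
    {"writes": "mart_orders",   "reads": ["raw_orders"]},
    {"writes": "mart_sessions", "reads": ["raw_events"]},
    {"writes": "daily_export",  "reads": ["mart_orders"]},
]

def dead_end_tables(jobs):
    """Tables that pipelines keep rebuilding but nothing downstream reads."""
    written = {job["writes"] for job in jobs}
    read = {table for job in jobs for table in job["reads"]}
    return written - read

print(sorted(dead_end_tables(jobs)))
# → ['daily_export', 'mart_sessions']
```

In practice the read side must also include BI tools, console queries, and exports, not just pipeline jobs, which is why full coverage of the environment matters before declaring a table truly unused.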

Masthead Dead-End Tables
Masthead tracks BigQuery compute costs for unused tables with no activity in reporting, console, or downstream pipelines

Data Quality Management

Regarding Data Quality, the SQL-first vendor had access to client data and ran SQL rules for clients to check specific metrics in tables, such as MRR per channel and Daily_Churn. Surprisingly, despite having unlimited access to client data and the ability to run any data quality checks, the client had enabled only a handful of data quality rules; everything else consisted of data anomaly checks like freshness and volume.

After discussing this with the data team, one of the senior data engineers stated that Masthead’s out-of-the-box anomaly detection provided sufficient visibility. Another team member indicated that he only needed a couple of quality checks – a fraction of what they had with the SQL-first vendor – as Masthead’s complete coverage with automated anomaly checks replaced many manually built checks.

This was a surprising result for us. Together with the client, we agreed to move the necessary quality checks to Dataplex. In this arrangement, Masthead still has no access to client data but receives the aggregated results of quality checks executed by Dataplex.

BI Integration

One disadvantage of Masthead’s current offering for the client was the absence of BI tool integration, which the SQL-first vendor already had. We swiftly agreed to provide this integration within Q1 2025, as we are very interested in expanding our offering and tailoring it to our clients’ needs.

Total Cost of Ownership Analysis

Another fascinating discovery emerged regarding compute costs. Since the SQL-first vendor was running frequent SQL queries to identify anomalies and quality issues, it consumed significant client BigQuery compute resources. What no one had previously realized was that the annual compute cost generated by the SQL-first vendor actually amounted to 70% of its annual price tag.
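To make the math concrete: the effective cost of a SQL-first tool is its license fee plus the warehouse compute its own checks consume. A sketch with a hypothetical annual fee; only the 70% overhead ratio comes from this case.

```python
def total_cost_of_ownership(annual_fee, compute_overhead_ratio):
    """Annual license fee plus the BigQuery compute the vendor's own
    scheduled checks consume in the client's environment."""
    return annual_fee * (1 + compute_overhead_ratio)

# Hypothetical $50k fee; the 0.70 overhead ratio is the figure from this case.
fee = 50_000
print(total_cost_of_ownership(fee, 0.70))  # → 85000.0
```

A log-based approach keeps the overhead ratio at zero, since no queries run against the client’s data, so the sticker price and the total cost of ownership coincide.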

This was a major revelation for the team, as they discovered several opportunities to optimize their total cost of ownership using Masthead. This comprehensive visibility is possible because of our unique capability to observe all pipelines, regardless of the solution used in the client’s environment.
Masthead UI: Cost Attribution Across Client's Data Stack

Results and Comparative Analysis

Data Anomaly
- SQL-first vendor: 100+ rules (less than 10% are data quality rules; the rest are data anomaly detection rules performed by querying the client’s data).
- Masthead: Full coverage across all connected BigQuery projects and instances. Fully automated anomaly detection available within 2 hours of deployment.

Pipeline Observability
- SQL-first vendor: Limited; only available with Fivetran and dbt code integration.
- Masthead: Unlimited and fully automated regardless of the technology used: dbt, Python, scheduled queries, Airflow, etc. No-code, full pipeline visibility available within 30 minutes of deployment.

Cost Insights
- SQL-first vendor: Shows service account consumption data, matching the information available in the Google Cloud billing tab.
- Masthead: Insights into BigQuery compute restructuring with 80% optimization potential, plus identification of dead-end tables to reduce compute costs for unused resources.

Data Quality
- SQL-first vendor: A core capability that enabled the data team to implement 10-15 quality rules, but it requires direct access to query the client’s data.
- Masthead: Direct integration with Dataplex allowed the client to migrate all necessary data quality rules without giving Masthead any permissions to read their data.

Data Lineage
- SQL-first vendor: Table-level visibility only, with no column view, pipeline view, or cross-project availability.
- Masthead: Full cross-project, column-level, and pipeline-level lineage within 30 minutes of deployment.

Total Cost of Ownership
- SQL-first vendor: Because the solution runs checks via SQL queries in the client’s environment, it generates BigQuery compute costs even for incremental and well-clustered data. For this client, checking less than 1% of tables in their data warehouse generated compute costs equal to 70% of the vendor’s annual fee, amounting to thousands of dollars.
- Masthead: By leveraging logs, we avoid querying the client’s data, so our system generates no additional compute costs. This allows us to observe the client’s entire data warehouse for data and pipeline anomalies in real time at no extra cost.

Data Access
- SQL-first vendor: Requires read permissions on the client’s data in BigQuery to run data quality or anomaly checks, meaning the data could potentially be exposed or compromised in a security incident.
- Masthead: We don’t require any permissions to read the client’s data since we don’t execute SQL at all. This significantly simplifies onboarding, as architecturally we have no access to the client’s data.

Conclusion

I don’t claim this is a complete comparison, and it might appear biased. Still, this was one of our fastest deals, despite starting with significant pushback and skepticism from the data team.

That said, I believe SQL-first vendors in the data observability space are doing a great job. Their approach of querying data works best with warehouses like Snowflake or Redshift, which store system logs in tables similar to BigQuery’s INFORMATION_SCHEMA. For data teams on BigQuery and the Google Cloud stack, both we and our customers see Masthead as the most effective approach for delivering real-time insights into data anomalies, pipeline issues, and data FinOps.

If you’d like to learn more or have questions, please feel free to contact us.