The Masthead data observability platform is designed to observe all the data pipelines that data teams use. The platform uses ML to learn how your data behaves, detect invalid data, assess its impact, and notify the people concerned. Once Masthead detects a data anomaly, it notifies the data team, which helps avoid costly business decisions based on inaccurate data.
The Masthead data observability platform is there for you:
● Data never leaves your environment. Data security matters more than ever, so the Masthead architecture is designed not to extract data from your environment.
● Seamless, no-code onboarding. Masthead is an out-of-the-box solution that integrates with your existing stack.
● End-to-end coverage in real time. A single view of the quality and health of your data pipelines across your environment, in real time. Masthead uses historical data to learn and to act when an anomaly occurs.
● Know the problem and its root cause fast. Instantly know what happened to your data and where. When an issue occurs, Masthead notifies you and helps pinpoint the root cause of the data downtime.
To enable data observability in your environment, the Masthead platform needs to:
● Collect metadata from your data warehouse or data lake to enable observability of data via logs
● Monitor the logs of your data to learn how your data travels across your data warehouse and environment
The high-level overview diagram below shows how the Masthead data observability platform accomplishes this.

Illustration

Below is the architecture diagram, showing which components Masthead uses.

Illustration

Field Metrics automatically tracks and reports anomalies across the field types listed in the table below. These are the metrics most commonly tracked across all tables in a data warehouse or data lake to catch data quality issues. By monitoring these metrics, Masthead can notify the responsible teams as soon as an anomaly is detected.

The following table lists all metrics that Field Metrics tracks automatically.

| Metric | Description | Field type | Analytics | Anomaly detection |
|---|---|---|---|---|
| % Null | Percentage of rows that have a NULL value | All | Yes | Yes |
| # Unique | Distinct values by total number of values | All | Yes | Yes |
| % Zero | Percentage of rows that have the value 0 | Numeric | Yes | Yes |
| % Negative | Percentage of rows that have a negative value | Numeric | Yes | Yes |
| Mean | Mean value across all rows | Numeric | Yes | Yes |
| Std | Standard deviation of numeric values | Numeric | Yes | Yes |
| Max | Maximum across all rows | Numeric | Yes | Yes |
| Min length | Minimum count of characters over non-null values | Strings | Yes | Pending |
| Mean length | Mean count of characters over non-null values | Strings | Yes | Pending |
| Std length | Standard deviation of the count of characters in string values | Strings | Yes | Pending |
| Max length | Maximum count of characters | Strings | Yes | Pending |
| % Whitespace | Percentage of rows where the string contains only whitespace characters | Strings | Yes | Pending |
| % Integer | Percentage of rows where the string represents a valid integer number | Strings | Yes | Yes |
| % "Null"/"None" | Percentage of rows where the string is 'null', 'None', etc. | Strings | Yes | Yes |
| % Float | Percentage of rows where the string represents a valid floating-point number | Strings | Yes | Pending |
| % UUID | Percentage of rows where the string represents a valid UUID | Strings | Yes | Pending |
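
To make the definitions above concrete, here is a minimal Python sketch of how a subset of these metrics could be computed over a single column held in memory. It is an illustration only and does not show how Masthead computes them. The function names (`numeric_metrics`, `string_metrics`), the use of `None` to stand in for NULL, the choice of population standard deviation, and the reliance on Python's own `int`, `float`, and `uuid.UUID` parsers to decide what counts as "valid" are all assumptions.

```python
import uuid
from statistics import mean, pstdev


def _pct(count, total):
    """Percentage of rows; 0.0 for an empty column."""
    return 100.0 * count / total if total else 0.0


def _parses(parser, text):
    """True if `parser` accepts `text` (assumption: Python's parsers define validity)."""
    try:
        parser(text)
        return True
    except ValueError:
        return False


def numeric_metrics(values):
    """Metrics for a numeric column; None stands in for NULL (hypothetical helper)."""
    total = len(values)
    present = [v for v in values if v is not None]
    return {
        "pct_null": _pct(total - len(present), total),
        "pct_zero": _pct(sum(1 for v in present if v == 0), total),
        "pct_negative": _pct(sum(1 for v in present if v < 0), total),
        "mean": mean(present) if present else None,
        "std": pstdev(present) if present else None,  # population std, an assumption
        "max": max(present) if present else None,
    }


def string_metrics(values):
    """Metrics for a string column; None stands in for NULL (hypothetical helper)."""
    total = len(values)
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    null_like = {"null", "none"}  # assumed spellings of 'null', 'None', etc.
    return {
        "pct_null": _pct(total - len(present), total),
        "min_length": min(lengths) if lengths else None,
        "mean_length": mean(lengths) if lengths else None,
        "std_length": pstdev(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        # str.isspace() is False for the empty string, so "" is not counted.
        "pct_whitespace": _pct(sum(1 for v in present if v.isspace()), total),
        "pct_integer": _pct(sum(1 for v in present if _parses(int, v)), total),
        "pct_null_like": _pct(
            sum(1 for v in present if v.strip().lower() in null_like), total
        ),
        # Note: float() also accepts integer strings and values such as "nan".
        "pct_float": _pct(sum(1 for v in present if _parses(float, v)), total),
        "pct_uuid": _pct(sum(1 for v in present if _parses(uuid.UUID, v)), total),
    }
```

All percentages here use the total row count as the denominator, matching the "percentage of rows" wording in the table. A production system would more likely compute these aggregates inside the warehouse rather than pull column values into memory.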
