March 14, 2023

Wayfair’s Journey to Data Mesh

Yuliia Tkachova
Co-founder & CEO, Masthead Data

Wayfair is a global marketplace platform and one of the most notable players in the $800B+ home goods market. The Wayfair marketplace boasts more than 22 million active customers and more than 23,000 suppliers selling 30+ million products. The Wayfair brand is recognized in North America and Western Europe. What started as a collection of 240+ e-commerce websites primarily selling furniture is now a multi-billion-dollar brand aiming to embrace a data-driven approach across all departments. In this article, we tell you the story behind Wayfair’s ongoing transformation into a data-driven company with a data mesh architecture.

We recently had a conversation with Nachiket Mehta, a Data and Analytics executive at Wayfair, about the company’s data journey. Nachiket shared insights into their data organization’s evolution, including the technology they adopted, the hiring of team members, and the challenges they encountered along the way.

Phase 1: Origins of the company’s data transformation

Business context: The modern story of Wayfair dates back to 2011, when a group of e-commerce websites known as CSN Stores decided to rebrand itself and become Wayfair, a global e-commerce platform aiming to take a leading market position.

Objectives:

  • Starting the transformation towards a data-driven architecture that would help the company become a multi-billion-dollar e-commerce giant
  • Establishing basic data-driven practices across the entire organization instead of having all data-driven practices confined to a centralized BI team

Challenges: The new data-driven culture was novel to most Wayfair teams and stakeholders. Most of the employees working with data were primarily BI-focused. The other challenge was the broad organizational structure: numerous domain teams, each working with numerous subdomains and using dozens of applications, all needed to be supplied with data.

Since Wayfair’s early days, data had primarily been used by the centralized BI team, which was working with their respective business partners but lacked the cross-domain collaboration across data producers and consumers. The primary technology stack for the BI team included Vertica and SSAS cubes, powering Excel sheets for end users.

This structure worked at the earliest stages of Wayfair’s development and even helped the company achieve a market value of $7-8 billion. However, growth goals created new demands for Wayfair’s data strategy. To correctly organize business activities and take maximum advantage of data to become a $50B+ company, Wayfair had to make all departments work with domain-specific data and exchange that data with other departments for whom it was relevant.

Courtesy of Masthead Data team

Phase 2: Data revolution

Business context: Wayfair’s activities could be roughly divided into five to seven domains, each having a dozen subdomains and involving the use of a few dozen applications. The data mesh approach, which presumes data practices across all teams, perfectly fit this organizational structure.

Objectives:

  • Getting teams across departments to embrace the data-driven culture
  • Moving to a more sophisticated data management stack to improve data management and velocity
  • Increasing the number of data experts within the company
  • Having most teams move their data from on-premises and monolithic databases to the cloud

Challenges: Wayfair teams were running a huge number of processes that had to be transformed. The teams had to be expanded with more data experts.

Cultivating a data culture became a strategic objective for Wayfair in late 2018, and in 2019, a new CTO with solid experience in data-driven decision-making joined the company. That’s when Wayfair’s data revolution began. Wayfair focused on hiring more data experts to create more efficient and flexible data architecture. The company also started creating separate teams with different data competencies within each domain. In particular, the centralized BI team became federated, with domain-driven teams each having a separate responsibility area and working closely with data producers.

Courtesy of Masthead Data team

There were also significant changes in Wayfair’s data technology stack. In late 2019, Wayfair teams started using Looker for self-service data exploration, Data Studio for visualization, and AtScale to continue supporting end users familiar with Excel. At some point, Wayfair’s data-driven teams used Tableau for BI reporting, but they quickly switched to Google Data Studio (today known as Looker Studio) for its business-related benefits. Wayfair’s data management technology stack also included Vertica. Meanwhile, the data engineering teams embarked on a journey to become one of the first tech teams to move from an on-premises, monolithic architecture to Google Cloud Platform. However, some issues remained, such as getting data producers to treat data as a product and establishing data contracts.

Phase 3: Data velocity

Business context: Wayfair teams embraced two data storage and management approaches involving monolithic and on-premises data architectures. Wayfair set cost optimization goals, leading them to focus on developing a cloud-based microservices architecture. For this reason, Wayfair’s teams started updating their data technology stack.

Objectives:

  • Migrating to the cloud
  • Transitioning towards an event-driven data architecture
  • Optimizing the company’s data technology stack

Challenges: Different Wayfair data teams used different technology stacks. Many data producers were using cost-inefficient monolithic databases.

In 2020, Wayfair embraced the data velocity initiative promoted by the new CTO. Some teams started moving their data storage from Vertica to Google BigQuery. By 2021, moving to an event-driven data management approach, using Kafka/PubSub and a schema registry, had become a key objective in the company’s data-driven strategy. Wayfair enhanced Scribe, a service that serves as the company’s primary event data platform. As a platform processing many terabytes of data from multiple single-purpose databases, Scribe powers teams across the entire company.
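To make the event-driven idea concrete, here is a minimal sketch of topic-based publish/subscribe, the pattern that brokers such as Kafka and PubSub implement at scale. It is an in-memory toy and not a description of Scribe: the `InMemoryEventBus` class, the `order.shipped` topic, and the event fields are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class InMemoryEventBus:
    """Toy stand-in for a message broker (illustration only, hypothetical class)."""

    def __init__(self) -> None:
        # Maps each topic name to the handlers subscribed to it.
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer callback for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = InMemoryEventBus()
warehouse_rows: List[Event] = []   # pretend analytical sink
alerts: List[str] = []             # pretend operational consumer

# Two independent consumers subscribe to the same (hypothetical) topic.
bus.subscribe("order.shipped", warehouse_rows.append)
bus.subscribe("order.shipped", lambda e: alerts.append(f"order {e['order_id']} shipped"))

# The producer emits one event and does not need to know who consumes it.
delivered = bus.publish("order.shipped", {"order_id": 42, "carrier": "hypothetical-carrier"})
print(delivered, alerts)
```

The point of the sketch is the decoupling: the producer publishes once, and any number of consumers can be added later without changing the producer.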

In parallel, the supply chain data engineering team developed the Lighthouse platform, an event-driven data catalog, and published the internal Supply Chain Data Ontology whitepaper to align supply chain operational events and milestones with the systematic events emitted by data producers. This fundamental work also helped the supply chain technology team start a Knowledge Graph, showing graphical connections between data producers and data consumers.

To make the data architecture more consistent and cost-efficient, Wayfair’s data teams embraced Google BigQuery, Cloud Storage, and Dataflow. Ontology best practices, along with end-to-end process maps, led to the development of event contracts with data producers. Now, many Wayfair data producers are emitting events directly from their applications and are migrating their monolithic databases to single-purpose databases.
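An event contract can be thought of as an agreed list of fields and types that a producer promises to emit. The following is a minimal sketch of that idea under stated assumptions: the `EventContract` class, the `shipment.milestone_reached` event name, and its fields are hypothetical and do not reflect Wayfair’s actual contracts or schema registry.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class EventContract:
    """Hypothetical contract: an event name, a version, and required typed fields."""

    name: str
    version: int
    required_fields: Mapping[str, type]

    def violations(self, event: Mapping[str, object]) -> List[str]:
        """Return human-readable problems; an empty list means the event conforms."""
        problems: List[str] = []
        for field_name, expected_type in self.required_fields.items():
            if field_name not in event:
                problems.append(f"missing field: {field_name}")
            elif not isinstance(event[field_name], expected_type):
                actual = type(event[field_name]).__name__
                problems.append(
                    f"wrong type for {field_name}: "
                    f"expected {expected_type.__name__}, got {actual}"
                )
        return problems


# Assumed example contract for a supply chain milestone event.
contract = EventContract(
    name="shipment.milestone_reached",
    version=1,
    required_fields={"shipment_id": str, "milestone": str, "occurred_at": str},
)

good_event = {
    "shipment_id": "S-1001",
    "milestone": "delivered",
    "occurred_at": "2023-03-14T10:00:00Z",
}
bad_event = {"shipment_id": 123, "milestone": "delivered"}

good_problems = contract.violations(good_event)
bad_problems = contract.violations(bad_event)
print(good_problems)
print(bad_problems)
```

In practice a check like this would typically run on the producer side before an event is emitted, so that consumers can rely on the shape of what they receive.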

Wayfair’s current data technology stack includes native Google Cloud services such as PubSub, BigQuery, Cloud Storage, BigTable, Dataflow, and Cloud Composer, along with open source technologies including Apache Kafka, Apache Flink, Debezium, Meltano, and dbt. Because some data producers have not yet moved from monolithic databases to single-purpose databases, Wayfair data engineers use custom-built CDC solutions and Meltano, which provides data extraction features that mitigate the inconsistencies between database types. Another increasingly popular technology helping Wayfair’s data teams with micro-batch data transformations is dbt. Kafka/PubSub/Flink and Meltano handle data extraction and loading, while dbt, Kronos (a custom-built solution), and Dataflow handle data transformations and the loading of data into various sinks. In addition to the analytics tools mentioned above, the GraphQL API framework enables Wayfair’s curated data to be consumed via APIs by other software applications.
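For readers unfamiliar with change data capture (CDC), the sketch below shows one of its simplest forms: comparing two snapshots of a table and emitting insert, update, and delete events. This is an assumption-laden illustration and not Wayfair’s custom-built solution; log-based tools such as Debezium read the database’s transaction log instead of diffing snapshots. The `diff_snapshots` function and the sample rows are hypothetical.

```python
from typing import Dict, List, Mapping

Row = Mapping[str, object]


def diff_snapshots(
    before: Mapping[int, Row], after: Mapping[int, Row]
) -> List[Dict[str, object]]:
    """Compare two snapshots keyed by primary key and return change events."""
    changes: List[Dict[str, object]] = []
    # Keys present only in the newer snapshot are inserts.
    for key in sorted(after.keys() - before.keys()):
        changes.append({"op": "insert", "key": key, "row": dict(after[key])})
    # Keys present in both snapshots are updates if the row content differs.
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            changes.append({"op": "update", "key": key, "row": dict(after[key])})
    # Keys present only in the older snapshot are deletes.
    for key in sorted(before.keys() - after.keys()):
        changes.append({"op": "delete", "key": key, "row": None})
    return changes


# Hypothetical inventory table at two points in time.
before = {1: {"sku": "A", "qty": 5}, 2: {"sku": "B", "qty": 1}}
after = {1: {"sku": "A", "qty": 3}, 3: {"sku": "C", "qty": 9}}

changes = diff_snapshots(before, after)
for change in changes:
    print(change)
```

Each resulting change could then be published as an event, which is how CDC lets a database that has not yet been migrated still feed an event-driven pipeline.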

Future plans: Data mesh driven by an event stream

Business context: Wayfair aims to become a company where each team treats data as a precious asset that brings many practical benefits. According to Wayfair’s plans, all departments will have data specialists capable of processing data and getting insights from it. Additionally, this culture should ensure great data velocity, with each event triggering data flows that carry relevant data to its consumers.

Objectives:

  • Transitioning all data teams to an event-driven data architecture as part of the company’s data mesh initiative
  • Having all data teams switch from monolithic databases to single-purpose databases within one year

Challenges: The company’s data architecture still has some inconsistencies. Some teams need time to move from familiar monolithic databases to single-purpose databases.

Although there’s still a long path ahead, Wayfair has taken solid steps towards creating a vibrant data architecture where various events enhance automated data distribution among corresponding data consumers.

Conclusion

Data mesh is a decentralized sociotechnical concept, and its implementation is difficult. Apart from technological transformations, it involves shifting the technical perspective of employees across the company that adopts it. These shifts require a multi-year roadmap.

At first, creating a data mesh wasn’t a goal for Wayfair’s teams, but it turned out to be the perfect data architecture approach to complement Wayfair’s business goals and organizational structure and to close the gap between the operational and analytical planes.

Now, Wayfair and its organizational structure benefit from the data culture leading the company to a data mesh. Although there’s still a long road ahead, Wayfair has already taken many steps toward implementing a data mesh architecture and reaping its many benefits.