Google’s BigQuery is one of the most widely used data warehouse solutions in the world, with a market share of 13.21% in 2022 according to 6Sense. BigQuery enables organizations across the globe to efficiently store, organize, process, and analyze data. Its pricing has two main components, both based on resource consumption: storage and compute (analysis).
In this article, we cover only the analysis cost, so let’s start by clarifying how BigQuery charges for running queries.
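BigQuery has two primary query analysis pricing models: on-demand pricing, where you pay per terabyte of data processed by your queries, and flat-rate pricing, where you pay a fixed cost for dedicated query capacity (slots).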
Recently, BigQuery introduced a new flat-rate pricing model involving three separate editions: Standard, Enterprise, and Enterprise Plus.
Starting July 5, 2023, BigQuery editions will completely replace the annual and monthly flat-rate pricing plans, as well as the Flex Slots pricing plan. Customers who currently use these plans can begin transitioning their capacity to the edition that fits their business needs and, later, move to other edition tiers as their requirements evolve.
How can you choose which plan works best for you? Let’s consider several scenarios:
The on-demand pricing model is a good way to get started. For example, say you start building your data architecture with small data warehouses that store and process small amounts of data. The on-demand pricing model lets you track compute resource consumption at low data volumes and keep an eye on all your cloud spending.
If you follow best practices for controlling costs, you will most likely be billed only for what you use. We strongly recommend setting up custom cost controls to limit the amount of data that can be queried within a period of time that you define.
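To make the idea of a cost control concrete, here is a minimal Python sketch that estimates the on-demand cost of a query from the number of bytes it would scan and checks that figure against a daily cap. The price per TiB, the cap, and both function names are assumptions for illustration, not part of any BigQuery API; check the current price list for your region. In practice, the bytes figure could come from a dry run of the query.

```python
TIB = 2 ** 40  # number of bytes in one tebibyte


def estimate_on_demand_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Estimate the on-demand cost of a query that scans `bytes_processed` bytes."""
    return bytes_processed / TIB * price_per_tib


def fits_daily_cap(bytes_used_today: int, query_bytes: int, daily_cap_bytes: int) -> bool:
    """Return True if running the query keeps today's scanned bytes within the cap."""
    return bytes_used_today + query_bytes <= daily_cap_bytes


# Hypothetical numbers: replace them with your own price and limits.
ASSUMED_PRICE_PER_TIB = 6.25   # assumption, not taken from this article
DAILY_CAP_BYTES = 10 * TIB     # assumption: a self-imposed daily limit

query_bytes = 2 * TIB          # e.g. the estimate reported by a dry run
cost = estimate_on_demand_cost(query_bytes, ASSUMED_PRICE_PER_TIB)
allowed = fits_daily_cap(9 * TIB, query_bytes, DAILY_CAP_BYTES)

print(f"Estimated cost: ${cost:.2f}")
print(f"Within daily cap: {allowed}")
```

A check like this can run before a scheduled query is submitted, so that an unexpectedly large scan is flagged instead of billed.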
As your BigQuery usage grows, so do your costs. If you’re using the on-demand pricing model, you should look into BigQuery reservations as a way to reduce costs.
BigQuery offers flat-rate pricing for customers who prefer a fixed cost for queries rather than paying the on-demand price per terabyte of data processed. You can also combine these two pricing models.
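One rough way to decide between the two models is to compute the monthly data volume at which a fixed cost and the on-demand cost are equal. The sketch below does this in Python. The function names and all the prices in it are hypothetical placeholders, so plug in the figures from your own quote or the current price list.

```python
def break_even_tib(monthly_flat_cost: float, price_per_tib: float) -> float:
    """Monthly TiB processed at which on-demand cost equals the fixed cost."""
    return monthly_flat_cost / price_per_tib


def cheaper_model(monthly_tib: float, monthly_flat_cost: float, price_per_tib: float) -> str:
    """Name the cheaper model for a given monthly volume (ties go to on-demand)."""
    on_demand_cost = monthly_tib * price_per_tib
    return "on-demand" if on_demand_cost <= monthly_flat_cost else "flat-rate"


# Hypothetical inputs, for illustration only.
FLAT_COST = 2000.0     # assumed fixed monthly cost of a slot commitment
PRICE_PER_TIB = 6.25   # assumed on-demand price per TiB

threshold = break_even_tib(FLAT_COST, PRICE_PER_TIB)
print(f"Break-even volume: {threshold:.0f} TiB per month")
print(cheaper_model(100, FLAT_COST, PRICE_PER_TIB))
print(cheaper_model(500, FLAT_COST, PRICE_PER_TIB))
```

This comparison looks only at price. It ignores the fact that a fixed number of slots also limits how fast queries run, which matters once many workloads share the same capacity.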
When you’re just getting started with BigQuery, you can use the Flex Slots commitment. With Flex Slots, you can rapidly scale your data warehouse up or down, for periods as short as 60 seconds. In other words, you can use the Flex Slots commitment to quickly handle a big data load in the data warehouse on Black Friday or during a popular media event. However, the Flex Slots commitment will be decommissioned on July 5, 2023, so you will have to move to a corresponding BigQuery edition.
If you’re uncertain about committing to a long-term slot reservation for your workloads, Standard Edition will be a perfect substitute for Flex Slots. It will allow you to test a dedicated reservation for a brief period, helping you to determine whether a more extended slot commitment is suitable for your needs.
It’s worth noting that you can adopt a combination of pricing models to address various workloads. For example, if you have multiple workloads centered around BigQuery — such as data ingestion, ELT-style data transformation, and reporting or ad-hoc querying — you can choose different pricing models to optimize costs for each workload.
Lastly, suppose you have a good estimate of the amount of data that your ELT jobs will process on a daily basis. In that case, you may consider using on-demand pricing for ELT workloads, as you can predict the number of bytes processed. As a result, you can choose to execute your ELT workloads in a project that is not assigned to a reservation and instead utilize on-demand resources.
On top of the previously mentioned scenarios, suppose your data warehouse has power users who continuously query your data and heavily ingest new data into BigQuery. The more data you have, the more data pipelines are created. If you are dealing with a large number of data pipelines, it may be a good time to choose a flat-rate pricing model.
If your data science and advanced analytics tasks involve custom Python pipelines or Dataform jobs, you may be able to achieve cost savings by running those workloads in a Google Cloud project that is assigned a slot reservation.
But be careful! When one of our customers started using our new feature that monitors query insights, they observed that some daily scheduled queries were taking ten times longer than usual to execute on a specific day, which delayed the creation of necessary tables and views. It turned out that a combination of other scheduled high-load queries was occupying the reserved slots and created a long queue of SQL queries, resulting in a longer than usual execution time.
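The effect described above can be illustrated with a deliberately simplified Python model. It assumes that the slots of a reservation are split evenly among all queries running at the same time and that each query represents a fixed amount of work in slot-seconds. Real scheduling is more dynamic than this, and the function name and all numbers are hypothetical.

```python
def estimated_runtime_seconds(work_slot_seconds: float, reserved_slots: int,
                              concurrent_queries: int) -> float:
    """Rough runtime of one query when slots are shared evenly.

    Assumes every concurrent query gets an equal share of the reservation
    for its entire run, which is a simplification.
    """
    if reserved_slots <= 0 or concurrent_queries <= 0:
        raise ValueError("reserved_slots and concurrent_queries must be positive")
    slots_per_query = reserved_slots / concurrent_queries
    return work_slot_seconds / slots_per_query


# Hypothetical daily query: 6,000 slot-seconds of work on a 100-slot reservation.
alone = estimated_runtime_seconds(6000, reserved_slots=100, concurrent_queries=1)
busy = estimated_runtime_seconds(6000, reserved_slots=100, concurrent_queries=10)

print(f"Alone: {alone:.0f} s")
print(f"Alongside 9 other queries: {busy:.0f} s")
print(f"Slowdown: {busy / alone:.0f}x")
```

Under these assumptions, nine competing queries are enough to stretch the runtime tenfold, which is why it helps to stagger heavy scheduled queries or give them a separate reservation.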
Once your infrastructure has reached a level of maturity that requires you to address the challenge of merging siloed data from multiple sources into a single consolidated source, you may need to use multiple pricing models to achieve cost efficiency and meet your objectives. These could include:
You can learn more about BigQuery pricing models on the Google Cloud website. Masthead has rolled out cost attribution functionality that shows how much each solution that you run in your cloud contributes to your cloud costs. Check out our changelog to learn more about our cost attribution features.
CTO & co-founder, Masthead