Google BigQuery: Key Components and Capabilities
BigQuery is a fully managed serverless data warehouse provided by Google Cloud, designed to enable fast SQL queries and interactive analysis of massive datasets. It is scalable and can handle petabytes of data, integrating with various data analysis and business intelligence tools.
By leveraging Google’s infrastructure, BigQuery provides users with a highly reliable platform for data analysis without the need for managing hardware or software. This lets organizations focus on extracting valuable insights from their data, rather than worrying about the underlying technology.
In the first part of this article series, I outlined the features and capabilities included in BigQuery editions (Standard, Enterprise and Enterprise Plus, as well as on-demand pricing) and presented some suggestions for optimizing BigQuery usage to minimize costs. In this article, I’ll dive into the biggest question you may be asking: what does it cost?
This is part of a series of articles about Google Cloud cost. For more insight, read our guide to GCP compute pricing.
BigQuery Pricing Key Components
Pricing in BigQuery is determined by compute and storage costs.
Compute
Compute costs in BigQuery are primarily associated with the processing power used when executing SQL queries. How that processing is billed depends on the pricing model you choose.
BigQuery offers a choice between on-demand pricing, which charges for the amount of data each query processes rather than the time it takes to execute, and capacity-based pricing, which charges for reserved compute capacity (slots) over time regardless of how much data is scanned. On-demand pricing ties cost directly to usage, while capacity-based pricing makes budgeting more predictable.
Storage
The storage costs in BigQuery are divided into two main categories: active storage for tables or partitions modified within the last 90 days, and long-term storage for data that has not been modified for 90 consecutive days. Both are charged based on the amount of data stored per month, with long-term storage billed at a reduced rate.
This pricing structure encourages efficient data management, allowing companies to store large amounts of data in BigQuery cost-effectively. Users can manage storage costs by deleting unnecessary data and by avoiding needless modifications to older tables or partitions, since unmodified data moves to the long-term rate automatically.
Pricing for Specific Capabilities
In addition to compute and storage, BigQuery has designated pricing for the following features and capabilities:
- BigQuery Omni: Enables querying of data across Google Cloud, Amazon Web Services (AWS) and Microsoft Azure without managing the data replication and movement, priced based on the amount of data processed and region. Charges for cross-cloud data transfer and managed storage apply.
- BigQuery ML (Machine Learning): Allows creating and executing machine learning models directly within BigQuery. Charges are based on the amount of data processed during model training, evaluation and prediction, with separate pricing for built-in models and external models (e.g., models trained using Vertex AI).
- BI (Business Intelligence) Engine: An in-memory analysis service that provides fast SQL query performance for data stored in BigQuery. Pricing is based on the amount of BI Engine memory reserved per project, facilitating real-time analysis and dashboarding.
- Data Transfer Service: Automates data movement into BigQuery from Google Cloud services, Software as a Service (SaaS) applications and other cloud services. Price is based on the amount of data transferred, with some services offering free transfers.
- Data ingestion: Refers to loading data into BigQuery from various sources. Pricing varies depending on the method (streaming or batch loads) and the volume of data ingested.
- Data extraction: Involves exporting data from BigQuery to other Google Cloud services or directly to the user. Priced based on the amount of data extracted.
- Data replication: Ensures data is replicated and consistent across multiple regions or multicloud environments. Pricing depends on the volume of data replicated and the regions involved.
- External services: When integrating BigQuery with external services or applications, pricing may involve API call costs, additional data processing or specific integration features, depending on the service used.
BigQuery Compute Pricing
Let’s compare the on-demand and reserved capacity pricing models for BigQuery compute.
Note: BigQuery pricing provided in this and the following sections is correct as of the time of this writing and is subject to change. For up-to-date pricing and additional details, see the official pricing page.
On-Demand Compute Pricing
Relevant for product editions: On-Demand
In this model, charges are based on the number of terabytes (TBs) processed. Users benefit from a free tier, which allows the first 1TB of data processed each month to be free of charge.
Beyond this free tier, the cost is $6.25 per TB of data processed. This pricing model is particularly useful for users with varying workload sizes, as it offers the flexibility to pay only for the amount of data processed, without any upfront costs or commitments.
Users can access up to 2,000 concurrent slots, which are shared across all queries within a single project. There are instances where BigQuery might temporarily exceed this limit to expedite smaller queries, though, at times, fewer slots may be available due to high demand in a particular region.
It’s important to note that charges are rounded up to the nearest megabyte, with a minimum of 10MB of data processed per table referenced and per query executed. Additionally, partitioning and clustering tables can significantly reduce the data processed, thereby lowering costs.
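The on-demand rules above can be turned into a rough estimator. This is a minimal sketch, not an official calculator: it assumes "MB" and "TB" mean binary units (2^20 and 2^40 bytes), that the 10MB minimum is applied per table first and then per query, and that rounding happens once per query. The function names are hypothetical.

```python
import math

MB = 2**20  # assumption: binary megabyte
TB = 2**40  # assumption: binary terabyte

PRICE_PER_TB = 6.25        # USD per TB, from the text above
FREE_TB_PER_MONTH = 1      # first 1TB each month is free
MIN_BYTES = 10 * MB        # 10MB minimum per table and per query


def billed_bytes(bytes_per_table):
    """Billable bytes for one query, given bytes scanned in each referenced table."""
    # Apply the 10MB minimum to every table the query references.
    total = sum(max(scanned, MIN_BYTES) for scanned in bytes_per_table)
    # Apply the 10MB minimum to the query as a whole.
    total = max(total, MIN_BYTES)
    # Round up to the nearest whole megabyte.
    return math.ceil(total / MB) * MB


def monthly_on_demand_cost(queries):
    """Estimated monthly cost in USD; `queries` is a list of per-table byte lists."""
    total_bytes = sum(billed_bytes(q) for q in queries)
    billable_tb = max(total_bytes / TB - FREE_TB_PER_MONTH, 0)
    return billable_tb * PRICE_PER_TB
```

For example, a month consisting of a single query that scans 3TB from one table would be estimated at 2 billable TB after the free tier. A partitioned or clustered table lowers the per-table byte counts passed in, which is where the savings mentioned above come from.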
Capacity Compute Pricing
Relevant for product editions: Standard, Enterprise, Enterprise Plus
Capacity compute, or reserved capacity, pricing introduces a more predictable cost structure for BigQuery queries by charging for compute capacity in terms of slots (virtual CPUs) over time. This model is appropriate for users with consistent or high-volume query workloads, as it allows reservation of compute capacity ahead of time.
BigQuery offers this model through its editions, providing options for autoscaling and commitments of one or three years. BigQuery editions offer slot capacities in three tiers: Standard, Enterprise and Enterprise Plus, with costs varying by edition:
- Standard edition: Charges $0.04 per slot hour for pay-as-you-go, without any commitment.
- Enterprise edition: Starts at $0.06 per slot hour for pay-as-you-go, with reduced rates for one-year ($0.048 per slot hour) and three-year ($0.036 per slot hour) commitments.
- Enterprise Plus edition: Increases the pay-as-you-go rate to $0.10 per slot hour, with one-year ($0.08 per slot hour) and three-year ($0.06 per slot hour) commitments offering savings for longer-term planning.
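To compare the editions, it can help to convert the slot-hour rates into a monthly figure. The sketch below uses the rates listed above; the dictionary keys, function names and the 730-hour month are my own illustrative assumptions, and the break-even helper ignores the on-demand free tier and autoscaling.

```python
# USD per slot hour, copied from the list above.
SLOT_HOUR_RATES = {
    "standard": {"payg": 0.04},
    "enterprise": {"payg": 0.06, "1yr": 0.048, "3yr": 0.036},
    "enterprise_plus": {"payg": 0.10, "1yr": 0.08, "3yr": 0.06},
}

ON_DEMAND_PRICE_PER_TB = 6.25  # USD, on-demand rate quoted earlier


def capacity_cost(edition, plan, slots, hours):
    """Cost in USD of running `slots` slots for `hours` hours."""
    plans = SLOT_HOUR_RATES[edition]
    if plan not in plans:
        # For example, the Standard edition lists no commitment plans.
        raise ValueError(f"{edition} has no '{plan}' plan")
    return plans[plan] * slots * hours


def break_even_tb(edition, plan, slots, hours):
    """TB of on-demand scanning that would cost the same as the reservation."""
    return capacity_cost(edition, plan, slots, hours) / ON_DEMAND_PRICE_PER_TB
```

Under these assumptions, 100 Standard slots running for a 730-hour month cost about $2,920, which matches the on-demand price of scanning roughly 467TB. Workloads that scan less than the break-even volume may be cheaper on demand.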
BigQuery Storage Pricing
Active Storage
Relevant for product editions: On-Demand, Standard, Enterprise, Enterprise Plus
This storage covers any table or table partition that has been modified within the last 90 days. For active storage, BigQuery charges $0.02 per GB per month, with the first 10GB provided free of charge each month. This pricing applies to what is known as active logical storage.
For active physical storage, which is billed on the compressed bytes actually stored on disk rather than on the logical (uncompressed) size of the data, the rate is $0.04 per GB per month, also with the first 10GB free each month. Having both billing models lets users pick the one that costs less for their data, especially for data that is frequently updated.
Long-Term Storage
Relevant for product editions: On-Demand, Standard, Enterprise, Enterprise Plus
Long-term storage pricing is applicable for any table or table partition not modified for 90 consecutive days. Once data qualifies for long-term storage, the cost is automatically reduced by approximately 50%.
The charge for long-term logical storage drops to $0.01 per GB per month, and for long-term physical storage, it’s $0.02 per GB per month, with the first 10GB free each month in both cases. This reduced pricing facilitates cost-effective data management for older, less frequently accessed data.
Importantly, each partition of a table is evaluated separately for eligibility for long-term storage rates, ensuring that storage costs are minimized based on the actual activity and data modification patterns.
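Because eligibility is decided per partition, a storage estimate has to classify each partition before applying a rate. The following sketch uses the rates quoted above. It assumes a single 10GB monthly allowance that is consumed by active storage first, which is my reading of the free tier rather than something stated in this article; the function names are hypothetical.

```python
LONG_TERM_AFTER_DAYS = 90
FREE_GB_PER_MONTH = 10  # assumption: one shared allowance, used by active data first

# USD per GB per month, from the text above.
RATES_PER_GB = {
    "logical": {"active": 0.02, "long_term": 0.01},
    "physical": {"active": 0.04, "long_term": 0.02},
}


def storage_class(days_since_modified):
    """A partition unmodified for 90 consecutive days is billed as long-term."""
    if days_since_modified >= LONG_TERM_AFTER_DAYS:
        return "long_term"
    return "active"


def monthly_storage_cost(partitions, billing_model="logical"):
    """Estimate USD per month; `partitions` holds (size_gb, days_since_modified) pairs."""
    rates = RATES_PER_GB[billing_model]
    totals = {"active": 0.0, "long_term": 0.0}
    # Each partition is evaluated separately for long-term eligibility.
    for size_gb, days in partitions:
        totals[storage_class(days)] += size_gb
    free = FREE_GB_PER_MONTH
    cost = 0.0
    for cls in ("active", "long_term"):
        covered = min(free, totals[cls])
        free -= covered
        cost += (totals[cls] - covered) * rates[cls]
    return cost
```

A table with a 110GB partition modified five days ago and a 50GB partition untouched for 200 days would be estimated at about $2.50 per month under logical billing.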
Pricing for Additional BigQuery Capabilities
Let’s review pricing for two of the BigQuery capabilities that have a special pricing structure: BigQuery Omni and BigQuery ML. Other special capabilities are beyond the scope of this article; consult the official pricing page for details.
BigQuery Omni Pricing
Let’s compare the pricing models for BigQuery Omni, which offers flexible deployment options across different cloud platforms, allowing you to query data stored in AWS, Azure, or Google Cloud without moving or duplicating datasets.
On-Demand Compute Pricing
Relevant for product editions: On-Demand
In this model, charges are based on the amount of data your queries process. Queries draw on a shared pool of concurrent slots across all queries in a single project. BigQuery Omni occasionally bursts beyond the pool’s normal limit to speed up smaller queries, but there might be times when fewer slots are available due to high demand in specific locations.
The on-demand (per-TB) query pricing varies by region:
- AWS North Virginia (aws-us-east-1): $7.82 per TB
- Azure North Virginia (azure-eastus2): $9.13 per TB
- AWS Seoul (aws-ap-northeast-2): $10.00 per TB
- AWS Oregon (aws-us-west-2): $7.82 per TB
- AWS Ireland (aws-eu-west-1): $8.60 per TB
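Since the rate depends on where the data lives, a simple lookup keyed by region identifier is enough to estimate an Omni query. This sketch copies the rates from the list above; the function name is hypothetical and no free tier or minimum charge is modeled.

```python
# USD per TB processed, keyed by the region identifiers listed above.
OMNI_PRICE_PER_TB = {
    "aws-us-east-1": 7.82,
    "azure-eastus2": 9.13,
    "aws-ap-northeast-2": 10.00,
    "aws-us-west-2": 7.82,
    "aws-eu-west-1": 8.60,
}


def omni_query_cost(region, tb_processed):
    """Estimated USD cost of processing `tb_processed` TB in an Omni region."""
    if region not in OMNI_PRICE_PER_TB:
        raise ValueError(f"no rate listed for region {region!r}")
    return OMNI_PRICE_PER_TB[region] * tb_processed


def cheapest_region():
    """Region identifier with the lowest listed per-TB rate (ties: first listed)."""
    return min(OMNI_PRICE_PER_TB, key=OMNI_PRICE_PER_TB.get)
```

For instance, `omni_query_cost("aws-ap-northeast-2", 2.5)` estimates a 2.5TB scan in AWS Seoul at $25.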
Omni Cross Cloud Data Transfer
Relevant for product editions: Enterprise, Enterprise Plus
When leveraging Omni’s Cross Cloud capabilities, such as Cross Cloud Transfer, Create Table as Select and Cross Cloud Joins, data transferred from AWS or Azure to Google Cloud incurs additional charges. During the Preview, no charges apply for Cross-Cloud Materialized Views, Create Table as Select and Cross Cloud Joins.
The rates for data transfer are (per GB from AWS or Azure to Google Cloud):
- AWS North Virginia to Google Cloud North America: $0.09
- Azure North Virginia to Google Cloud North America: $0.0875
- AWS Seoul to Google Cloud Asia: $0.126
- AWS Oregon to Google Cloud North America: $0.09
- AWS Ireland to Google Cloud Europe: $0.09
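Transfer charges add up across jobs, so it can be useful to total them by source region. The sketch below is illustrative only: it reuses the per-GB rates above, keys them by the source region identifiers from the earlier Omni list (an assumption about how you would label transfers), and uses a hypothetical function name.

```python
from collections import defaultdict

# USD per GB transferred to Google Cloud, keyed by source region.
TRANSFER_PRICE_PER_GB = {
    "aws-us-east-1": 0.09,
    "azure-eastus2": 0.0875,
    "aws-ap-northeast-2": 0.126,
    "aws-us-west-2": 0.09,
    "aws-eu-west-1": 0.09,
}


def transfer_costs_by_region(transfers):
    """Sum estimated USD transfer cost per source region.

    `transfers` is an iterable of (source_region, gigabytes) pairs.
    """
    totals = defaultdict(float)
    for region, gigabytes in transfers:
        if region not in TRANSFER_PRICE_PER_GB:
            raise ValueError(f"no transfer rate listed for {region!r}")
        totals[region] += gigabytes * TRANSFER_PRICE_PER_GB[region]
    return dict(totals)
```

Moving 1,000GB out of Azure North Virginia, for example, would be estimated at $87.50 before any query or storage charges.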
Omni Managed Storage
Relevant for product editions: Enterprise, Enterprise Plus
For Omni’s Cross Cloud Materialized Views, charges apply for the creation of local materialized views on BigQuery Managed Storage on AWS. The rates for physical storage used by the local materialized view are:
- Active physical storage in AWS North Virginia: $0.05 per GB per month
- Long-term physical storage in AWS North Virginia: $0.025 per GB per month
There are similar rates across other supported regions including Azure North Virginia, AWS Seoul, AWS Oregon and AWS Ireland.
BigQuery ML Pricing
BigQuery Machine Learning (ML) extends the capabilities of BigQuery by enabling users to create and execute machine learning models within Google Cloud’s data warehouse environment.
Built-In Models Pricing
Relevant for product editions: On-Demand
BigQuery ML built-in models are designed to be trained directly within BigQuery, covering a range of algorithms including linear regression, logistic regression, K-means clustering, principal component analysis (PCA) and time series forecasting models like ARIMA_PLUS.
For creating these models, BigQuery offers a pricing structure that charges $312.50 per TB of data processed. The first 10GB of data processed by the CREATE MODEL statements each month are included in BigQuery’s free tier, offering an initial cost-saving benefit.
This pricing applies to a variety of operations, from logistic and linear regression model creation to K-means and PCA model creation. Each type of model creation is identified and billed separately under BigQuery’s billing system to distinguish between ML model creation and regular BigQuery operations.
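Model creation is billed at a much higher per-TB rate than ordinary queries, so a quick estimate before training is worthwhile. This is a minimal sketch using the figures above; it assumes binary units (1GB = 2^30 bytes, 1TB = 2^40 bytes) and a hypothetical function name.

```python
GB = 2**30  # assumption: binary gigabyte
TB = 2**40  # assumption: binary terabyte

CREATE_MODEL_PRICE_PER_TB = 312.50  # USD, from the text above
QUERY_PRICE_PER_TB = 6.25           # USD, on-demand query rate quoted earlier
FREE_CREATE_MODEL_BYTES = 10 * GB   # monthly free tier for CREATE MODEL


def create_model_cost(bytes_processed_this_month):
    """Estimated USD cost of all CREATE MODEL statements run in one month."""
    billable = max(bytes_processed_this_month - FREE_CREATE_MODEL_BYTES, 0)
    return billable / TB * CREATE_MODEL_PRICE_PER_TB


# Training data costs this many times more per TB than a regular query.
TRAINING_PREMIUM = CREATE_MODEL_PRICE_PER_TB / QUERY_PRICE_PER_TB
```

With these numbers the premium works out to 50 times the on-demand query rate, which is why sampling or pre-aggregating training data can matter more for cost than it does for regular analysis.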
Capacity-Based Pricing
Relevant for product editions: Enterprise, Enterprise Plus
BigQuery ML also offers pricing based on compute capacity (number of slots) for customers who prefer a more predictable cost model over the on-demand approach. The Enterprise and Enterprise Plus Editions allow customers to use all features of BigQuery ML under a capacity-based pricing model.
These editions cater to users with consistent or high-volume machine learning workloads, enabling efficient budget management and resource allocation for ML tasks.
Evaluation, Inspection and Prediction Pricing
Relevant for product editions: On-Demand, Enterprise, Enterprise Plus
For all model types, BigQuery ML charges $6.25 per TB for operations related to model evaluation, inspection and prediction. The data these operations process counts toward BigQuery’s general analysis free tier, which covers the first TB of data processed per month.
This approach enables users to leverage BigQuery ML’s predictive capabilities while maintaining cost-effectiveness, particularly for iterative processes like model evaluation and prediction.
Conclusion
Understanding and optimizing Google BigQuery costs is crucial for effectively managing your data warehouse expenses. BigQuery offers a range of product editions — Standard, Enterprise and Enterprise Plus — each tailored to different usage patterns and business needs. On-demand pricing provides flexibility for unpredictable workloads, while capacity-based pricing offers predictability and potential savings for consistent high-volume usage.
Implementing best practices such as previewing queries, capping maximum bytes billed and leveraging tools like the Google Cloud Pricing Calculator can help significantly reduce BigQuery costs. By carefully planning and monitoring usage, organizations can harness the full power of BigQuery while keeping expenses under control.