Google BigQuery

Data Warehouse · read · scheduled · service account

Live

DataFlowX queries your BigQuery datasets directly using service account credentials and the BigQuery Storage Read API — zero data movement, no duplication. Run scheduled refreshes or on-demand queries against petabyte-scale tables and surface results in any DataFlowX dashboard instantly.

Service account auth · AES-256 key encryption · Cost-aware queries · SOC 2 certified
BigQuery Query Monitor · Healthy
Last query: 4 min ago

Table                     Rows    Size     Refresh
analytics.events          2.1B    840 GB   5 min
sales.opportunities       3.2M    1.2 GB   15 min
finance.revenue_monthly   48k     12 MB    1 hr
ops.supply_chain_logs     920M    380 GB   30 min

Supported Data Patterns

DataFlowX works with every BigQuery object type — from standard tables and views to partitioned tables and ML models.

Tables & Views

  • Standard tables
  • External tables
  • Materialized views
  • Logical views
  • Authorized views

Partitioned Tables

  • Date/timestamp partitions
  • Integer range partitions
  • Partition pruning
  • Partition expiry

Datasets

  • Cross-region datasets
  • Dataset access controls
  • Dataset labels
  • Multi-region replication

Scheduled Queries

  • Native scheduled query jobs
  • DataFlowX-managed refresh
  • Incremental query modes
  • Cost forecasting

BigQuery ML

  • Model predictions
  • ARIMA forecasts
  • Logistic regression outputs
  • Custom BQML model results

Analytics Hub

  • Linked datasets
  • Shared data exchanges
  • Cross-project queries
  • Publisher analytics

How It Works

🔍

Storage Read API

High-throughput reads via the BigQuery Storage Read API deliver up to 10× faster performance than standard export jobs, with no intermediate files.

⏱️

Flexible refresh cadence

Choose from 5-minute micro-batches to daily snapshots. DataFlowX only queries changed partitions where possible, keeping costs low.
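The "only queries changed partitions" behaviour amounts to adding a partition filter derived from the last successful sync, so BigQuery can prune untouched partitions. A minimal sketch in Python — the helper and column name are hypothetical illustrations, not DataFlowX's actual implementation:

```python
from datetime import date

def incremental_query(table: str, partition_col: str, last_synced: date) -> str:
    """Build a query that scans only partitions newer than the last sync.

    BigQuery prunes partitions when the filter compares the partitioning
    column against a constant, so only new partitions are billed.
    (Illustrative helper, not DataFlowX's real code.)
    """
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE {partition_col} >= DATE '{last_synced.isoformat()}'"
    )

sql = incremental_query("analytics.events", "event_date", date(2024, 6, 1))
print(sql)
```

On a table partitioned by `event_date`, this filter means a 5-minute micro-batch touches at most one or two partitions rather than the full 840 GB.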

🔑

Service account auth

Attach a GCP service account JSON key with the minimum required roles: BigQuery Data Viewer + BigQuery Job User. Keys are encrypted at rest with AES-256.

💸

Cost-aware dry runs

DataFlowX performs a dry run before every query to estimate bytes processed. The estimated cost appears in the UI before execution, preventing surprise bills.
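The dry-run mechanism is standard BigQuery: submit the query with `dry_run=True`, read back `total_bytes_processed`, and convert bytes to an on-demand price. A sketch under the assumption of 6.25 USD per TiB scanned (check current GCP pricing; the rate and helpers here are illustrative, not DataFlowX's internals). The `dry_run_bytes` function requires credentials and is shown but never called:

```python
def estimated_cost_usd(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Convert a dry run's byte estimate into an on-demand cost.

    Assumes on-demand pricing per TiB scanned (6.25 USD/TiB is used
    here for illustration only; verify against current GCP pricing).
    """
    return round(bytes_processed / 2**40 * usd_per_tib, 4)

def dry_run_bytes(sql: str) -> int:
    """Ask BigQuery for the byte estimate without running the query.

    Needs google-cloud-bigquery and valid credentials; included as a
    sketch and deliberately not invoked in this snippet.
    """
    from google.cloud import bigquery
    client = bigquery.Client()
    job = client.query(sql, job_config=bigquery.QueryJobConfig(dry_run=True))
    return job.total_bytes_processed

# A full scan of the 840 GB analytics.events table at on-demand rates:
print(estimated_cost_usd(840 * 10**9))
```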

📐

Column-level security

DataFlowX respects BigQuery column-level security policies and data masking rules — your security configuration is always honoured.

🔗

Cross-source joins

Once in DataFlowX, join BigQuery data with Salesforce, Jira, Stripe, or any other connector — without writing SQL or moving data.

Connect in 3 Steps

Typical setup time: under 10 minutes.

1

Create a GCP service account

In the GCP Console, navigate to IAM → Service Accounts. Create a new service account and grant it two roles: BigQuery Data Viewer and BigQuery Job User.

💡 Use the principle of least privilege — Data Viewer + Job User is all DataFlowX needs.

2

Generate & upload JSON key

In the service account detail page, create a JSON key file. Copy-paste its contents into the DataFlowX connector setup screen. The key is encrypted immediately using AES-256.

💡 You can rotate the key at any time from the DataFlowX connector settings page.
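A pasted key can be sanity-checked client-side before upload by confirming the JSON carries the fields every service-account key file contains. A hypothetical pre-flight check (not DataFlowX's actual validation logic):

```python
import json

# Fields present in every GCP service-account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def validate_key(raw: str) -> str:
    """Check a pasted service-account key before upload.

    Returns the service account email if the JSON looks like a valid
    key file; raises ValueError otherwise. (Illustrative pre-flight
    check, not DataFlowX's real validation.)
    """
    key = json.loads(raw)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"key file missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError("not a service-account key")
    return key["client_email"]

sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "dataflowx@my-project.iam.gserviceaccount.com",
})
print(validate_key(sample))
```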

3

Select datasets & set schedule

Browse your GCP project's datasets. Pick the tables or views you want to surface. Choose your refresh cadence. DataFlowX will show an estimated monthly cost before you confirm.

💡 Partition filters are applied automatically to minimise bytes scanned.

What Teams Build With This Integration

🔬

Data & Analytics

  • Self-serve BI on warehouse data
  • Anomaly detection on event streams
  • Cohort and funnel analysis
  • Custom metric definitions

💵

Finance

  • Revenue recognition dashboards
  • Budget vs actual tracking
  • Gross margin analysis
  • Forecast model outputs

🎯

Product

  • Feature adoption funnels
  • Retention and churn metrics
  • A/B test result dashboards
  • User journey visualisations

Frequently Asked Questions

Does DataFlowX move or copy my BigQuery data?

No. DataFlowX queries BigQuery in place using the Storage Read API. Data is never stored in DataFlowX's own storage layer — only query result sets (aggregated metrics) are cached for dashboard rendering.

How does DataFlowX handle partitioned tables?

DataFlowX automatically applies partition filters based on your chosen date range and refresh cadence. This means only new or changed partitions are scanned, minimising bytes processed and BigQuery costs.

Can I connect multiple GCP projects?

Yes. You can add multiple BigQuery connections — one per GCP project — and join data across them inside DataFlowX dashboards without any additional infrastructure.

What happens if my BigQuery query costs exceed my budget?

DataFlowX's dry-run cost estimation will warn you before executing any query that would exceed your configured cost threshold. You can set a monthly byte-processed budget and receive alerts when you approach it.
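The threshold logic described above reduces to comparing the dry-run estimate plus month-to-date usage against a configured budget. A hypothetical sketch (the function name, 80% warn level, and return values are assumptions for illustration):

```python
def budget_check(estimated_bytes: int, used_bytes: int, monthly_budget_bytes: int) -> str:
    """Classify a query against a monthly bytes-processed budget.

    Block queries that would push the month over budget; warn once
    projected usage passes 80% of it. (Illustrative sketch of the
    behaviour described in the FAQ, not DataFlowX's actual code.)
    """
    projected = used_bytes + estimated_bytes
    if projected > monthly_budget_bytes:
        return "block"
    if projected > 0.8 * monthly_budget_bytes:
        return "warn"
    return "ok"

TB = 10**12
print(budget_check(estimated_bytes=2 * TB, used_bytes=1 * TB, monthly_budget_bytes=10 * TB))
```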

Is BigQuery ML supported?

Yes. If your service account has the BigQuery Connection User role in addition to the base roles, DataFlowX can read BigQuery ML model prediction tables and surface forecasts and classification scores in dashboards.

Ready to connect BigQuery?

Surface petabyte-scale warehouse data in your dashboards in minutes.