Google BigQuery
DataFlowX queries your BigQuery datasets directly using service account credentials and the BigQuery Storage Read API — zero data movement, no duplication. Run scheduled refreshes or on-demand queries against petabyte-scale tables and surface results in any DataFlowX dashboard instantly.
Supported Data Patterns
DataFlowX works with every BigQuery object type — from standard tables to partitioned tables, views, and ML models.
Tables & Views
- Standard tables
- External tables
- Materialized views
- Logical views
- Authorized views
Partitioned Tables
- Date/timestamp partitions
- Integer range partitions
- Partition pruning
- Partition expiry
Datasets
- Cross-region datasets
- Dataset access controls
- Dataset labels
- Multi-region replication
Scheduled Queries
- Native scheduled query jobs
- DataFlowX-managed refresh
- Incremental query modes
- Cost forecasting
BigQuery ML
- Model predictions
- ARIMA forecasts
- Logistic regression outputs
- Custom BQML model results
Analytics Hub
- Linked datasets
- Shared data exchanges
- Cross-project queries
- Publisher analytics
How It Works
Storage Read API
High-throughput reads via the BigQuery Storage Read API are up to 10× faster than export-based extraction — no intermediate files, no staging buckets.
Flexible refresh cadence
Choose from 5-minute micro-batches to daily snapshots. DataFlowX only queries changed partitions where possible, keeping costs low.
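The incremental behaviour described above can be pictured as a simple selection over partition metadata. DataFlowX's internal implementation is not public, so the function and variable names below are illustrative assumptions, not its real API:

```python
from datetime import datetime, timezone

def select_stale_partitions(partitions, last_refresh):
    """Return the partition IDs modified since the last refresh.

    `partitions` maps a partition ID (e.g. "20240115") to its
    last-modified timestamp, as reported by BigQuery table metadata.
    Only these partitions need re-querying, so unchanged history
    never contributes to bytes scanned.
    """
    return sorted(
        pid for pid, modified in partitions.items()
        if modified > last_refresh
    )

# Only partitions touched after the last refresh are selected --
# including an old partition that received a late-arriving update.
partitions = {
    "20240113": datetime(2024, 1, 13, 23, 59, tzinfo=timezone.utc),
    "20240114": datetime(2024, 1, 15, 2, 30, tzinfo=timezone.utc),  # late update
    "20240115": datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc),
}
last_refresh = datetime(2024, 1, 14, 0, 0, tzinfo=timezone.utc)
print(select_stale_partitions(partitions, last_refresh))  # ['20240114', '20240115']
```

Note that the check uses the partition's modification time, not its date label: a backfill into an old partition still gets picked up on the next refresh.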
Service account auth
Attach a GCP service account JSON key with the minimum required roles: BigQuery Data Viewer + BigQuery Job User. Keys are encrypted at rest with AES-256.
Cost-aware dry runs
DataFlowX runs a dry run before every query to estimate how many bytes it will process. You see the estimated cost in the UI before executing, preventing surprise bills.
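The arithmetic behind such an estimate is straightforward. Assuming BigQuery's on-demand rate of roughly $6.25 per TiB scanned (confirm the current rate for your region — pricing changes), a hypothetical estimator looks like:

```python
def estimate_query_cost(bytes_processed: int, usd_per_tib: float = 6.25) -> float:
    """Convert a dry run's bytes-processed figure into an estimated USD cost.

    BigQuery bills on-demand queries per TiB scanned (1 TiB = 2**40 bytes).
    The default rate is an assumption; check your region's current pricing.
    """
    tib = bytes_processed / 2**40
    return round(tib * usd_per_tib, 4)

# A dry run reporting 550 GB scanned:
print(estimate_query_cost(550 * 10**9))  # 3.1264
```

The useful part of a dry run is that BigQuery returns this byte count without executing the query, so the estimate itself costs nothing.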
Column-level security
DataFlowX respects BigQuery column-level security policies and data masking rules — your security configuration is always honoured.
Cross-source joins
Once in DataFlowX, join BigQuery data with Salesforce, Jira, Stripe, or any other connector — without writing SQL or moving data.
Connect in 3 Steps
Typical setup time: under 10 minutes.
Create a GCP service account
In the GCP Console, navigate to IAM & Admin → Service Accounts. Create a new service account and grant it two roles: BigQuery Data Viewer and BigQuery Job User.
💡 Use the principle of least privilege — Data Viewer + Job User is all DataFlowX needs.
Generate & upload JSON key
In the service account detail page, create a JSON key file. Copy-paste its contents into the DataFlowX connector setup screen. The key is encrypted immediately using AES-256.
💡 You can rotate the key at any time from the DataFlowX connector settings page.
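Before pasting, it can be worth sanity-checking that the file really is a service account key and not, say, an OAuth client config. A minimal check of the standard fields every service account JSON key contains (the function name is ours, not part of DataFlowX):

```python
import json

REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def validate_service_account_key(raw: str) -> str:
    """Parse a pasted JSON key and return its client_email if it looks valid.

    Raises ValueError for malformed JSON, missing fields, or a key whose
    "type" is not "service_account" (e.g. an OAuth authorized_user file).
    """
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f'expected type "service_account", got "{key["type"]}"')
    return key["client_email"]

sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "dataflowx-reader@my-project.iam.gserviceaccount.com",
})
print(validate_service_account_key(sample))
```

Catching a wrong file here saves a confusing authentication failure at query time.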
Select datasets & set schedule
Browse your GCP project's datasets. Pick the tables or views you want to surface. Choose your refresh cadence. DataFlowX will show an estimated monthly cost before you confirm.
💡 Partition filters are applied automatically to minimise bytes scanned.
What Teams Build With This Integration
Data & Analytics
- Self-serve BI on warehouse data
- Anomaly detection on event streams
- Cohort and funnel analysis
- Custom metric definitions
Finance
- Revenue recognition dashboards
- Budget vs actual tracking
- Gross margin analysis
- Forecast model outputs
Product
- Feature adoption funnels
- Retention and churn metrics
- A/B test result dashboards
- User journey visualisations
Frequently Asked Questions
Does DataFlowX move or copy my BigQuery data?
No. DataFlowX queries BigQuery in place using the Storage Read API. Data is never stored in DataFlowX's own storage layer — only query result sets (aggregated metrics) are cached for dashboard rendering.
How does DataFlowX handle partitioned tables?
DataFlowX automatically applies partition filters based on your chosen date range and refresh cadence. This means only new or changed partitions are scanned, minimising bytes processed and BigQuery costs.
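As an illustration, the predicate appended for a date-partitioned table might look like the following sketch. The helper is our own; `_PARTITIONDATE` is BigQuery's real pseudocolumn for ingestion-time partitioned tables, while column-partitioned tables would filter on the partitioning column instead:

```python
from datetime import date

def partition_filter(column: str, start: date, end: date) -> str:
    """Build a SQL predicate restricting a query to a partition range,
    so BigQuery can prune every partition outside [start, end]."""
    return f"{column} BETWEEN DATE '{start.isoformat()}' AND DATE '{end.isoformat()}'"

clause = partition_filter("_PARTITIONDATE", date(2024, 1, 1), date(2024, 1, 7))
print(clause)
# _PARTITIONDATE BETWEEN DATE '2024-01-01' AND DATE '2024-01-07'
```

Because the filter compares the partitioning column directly against constants, BigQuery's planner can prune partitions before any data is read.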
Can I connect multiple GCP projects?
Yes. You can add multiple BigQuery connections — one per GCP project — and join data across them inside DataFlowX dashboards without any additional infrastructure.
What happens if my BigQuery query costs exceed my budget?
DataFlowX's dry-run cost estimation will warn you before executing any query that would exceed your configured cost threshold. You can set a monthly byte-processed budget and receive alerts when you approach it.
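The threshold logic amounts to a comparison of the running monthly total against the configured budget. The 80% warning level in this sketch is an assumed default for illustration, not a documented DataFlowX setting:

```python
def budget_status(bytes_used: int, monthly_budget_bytes: int,
                  warn_ratio: float = 0.8) -> str:
    """Classify monthly bytes-processed usage against a configured budget.

    Returns "ok", "warn" once usage crosses the warning ratio, or
    "exceeded" once the budget itself is passed.
    """
    if monthly_budget_bytes <= 0:
        raise ValueError("budget must be positive")
    ratio = bytes_used / monthly_budget_bytes
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= warn_ratio:
        return "warn"
    return "ok"

TIB = 2**40
print(budget_status(3 * TIB, 10 * TIB))   # ok
print(budget_status(9 * TIB, 10 * TIB))   # warn
print(budget_status(11 * TIB, 10 * TIB))  # exceeded
```

Combined with the per-query dry run, this gives two guard rails: a pre-execution check on each query and a running check on the month's cumulative scan volume.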
Is BigQuery ML supported?
Yes. If your service account has the BigQuery Connection User role in addition to the base roles, DataFlowX can read BigQuery ML model prediction tables and surface forecasts and classification scores in dashboards.
Ready to connect BigQuery?
Surface petabyte-scale warehouse data in your dashboards in minutes.