Ask AI

You are viewing an unreleased or outdated version of the documentation

Integrating BigQuery & dbt with Dagster+ Insights#

BigQuery costs can be integrated into the Dagster Insights UI. The dagster-cloud package contains utilities for capturing and submitting BigQuery cost metrics about data operations to Dagster+.

If you use dbt to materialize tables in BigQuery, use this guide to integrate BigQuery cost metrics into the Insights UI. For instructions on integrating direct BigQuery queries, see Integrating Direct BigQuery Usage with Dagster+ Insights.


Prerequisites#

To complete the steps in this guide, you'll need:

  • A Dagster+ account on the Pro plan

  • Access to the Dagster+ Insights feature

  • BigQuery credentials which have access to the INFORMATION_SCHEMA.JOBS table (e.g. BigQuery Resource viewer role). These credentials should be provided used by your dbt profile. For more information on granting access to this table, see the BigQuery documentation.

  • To install the following libraries:

    pip install dagster dagster-cloud dagster-dbt dagster-gcp
    

    Note: If you already have dagster-cloud installed, make sure you're using version 1.7.0 or newer.


Step 1: Instrument your Dagster code#

First, append with_insights() to the dbt CLI call in your Dagster @dbt_assets function:

@dbt_assets(...)
def my_asset(context: AssetExecutionContext, dbt: DbtCliResource):
    # Chain `with_insights` after any other metadata fetch, e.g. `fetch_row_count`
    yield from dbt_resource.cli(["build"], context=context).stream().with_insights()

This passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics. These metrics are obtained by querying the underlying INFORMATION_SCHEMA.JOBS table, using the BigQuery client from the dbt adapter.


Step 2: Update dbt_project.yml#

Next, add the following to your dbt project's dbt_project.yml:

query-comment:
  comment: "bigquery_dagster_dbt_v1_opaque_id[[[{{ node.unique_id }}:{{ invocation_id }}]]]"
  append: true

This allows you to add a comment, containing the dbt invocation ID and unique ID, to every query recorded in BigQuery's INFORMATION_SCHEMA.JOBS table. Using this data, Insights will attribute cost metrics in BigQuery to the corresponding Dagster jobs and assets.


Step 3: View BigQuery usage in the Dagster UI#

Typically within 24 hours, the BigQuery metrics should be available in the Insights tab in the Dagster UI:

BigQuery metrics in the Insights tab of the Dagster UI

The BigQuery cost metric is based off of the bytes billed for queries wrapped with with_insights, based on a unit price of $6.25 USD per TiB.