Dagster Pipes tutorial

In this guide, we’ll show you how to use Dagster Pipes with PipesSubprocessClient, Dagster’s built-in subprocess client, to run a local subprocess with a given command and environment. The subprocess can then send information such as structured metadata and log messages back to Dagster, where it is visible in the Dagster UI.

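To get there, you'll create a Dagster asset that executes a subprocess, then send metadata and log messages from that subprocess back to Dagster.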

This guide focuses on the out-of-the-box PipesSubprocessClient resource. To customize the subprocess invocation further, use the open_dagster_pipes approach instead.


Prerequisites

To use Dagster Pipes to run a subprocess, you’ll need to have Dagster (dagster) and the Dagster UI (dagster-webserver) installed. Refer to the Installation guide for more info.
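One quick way to confirm the prerequisites is to check that the packages are importable from the Python environment you plan to use. The sketch below is only an illustration: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and it assumes the import names `dagster` and `dagster_webserver` for the two packages, plus `pandas` because the example script on this page imports it.

```python
from importlib.util import find_spec

# Assumed import names for this tutorial's prerequisites:
# dagster -> "dagster", dagster-webserver -> "dagster_webserver",
# and pandas, which the example script imports.
REQUIRED_MODULES = ("dagster", "dagster_webserver", "pandas")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```

If anything is reported as missing, install it by following the Installation guide before continuing.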

You'll also need an existing Python script. This tutorial uses the script below, which will be invoked by the Dagster asset that you’ll create later. Because the script imports pandas, pandas must also be installed in the environment that runs it.

Create a file named external_code.py and paste the following into it:

import pandas as pd


def main():
    orders_df = pd.DataFrame({"order_id": [1, 2], "item_id": [432, 878]})
    total_orders = len(orders_df)
    print(f"processing total {total_orders} orders")


if __name__ == "__main__":
    main()
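Before wiring the script into Dagster, it can help to confirm that it runs on its own as a child process. The following is a plain standard-library sketch of "a command plus an environment"; it is not how PipesSubprocessClient is implemented, and `run_script` is a hypothetical helper introduced only for illustration.

```python
import os
import subprocess
import sys


def run_script(script_path, extra_env=None):
    """Run a Python script in a child process and capture its output."""
    # Start from the parent's environment and layer any overrides on top.
    env = {**os.environ, **(extra_env or {})}
    # Use the current interpreter so the child sees the same installed packages.
    command = [sys.executable, script_path]
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Example usage (assumes external_code.py is in the working directory):
#   result = run_script("external_code.py")
#   print(result.returncode, result.stdout)
```

With pandas installed, running external_code.py this way should exit with return code 0 and print "processing total 2 orders", since the example DataFrame has two rows.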

Ready to get started?

When you've fulfilled all the prerequisites for the tutorial, you can get started by creating a Dagster asset that executes a subprocess.