Lakeflow Job

Introduction

  • Lakeflow Jobs is workflow automation for Databricks, providing orchestration for data processing workloads so that you can coordinate and run multiple tasks as part of a larger workflow. You can optimize and schedule the execution of frequent, repeatable tasks and manage complex workflows.

  • Jobs consist of one or more tasks and support custom control-flow logic such as branching (if/else statements) and looping (for each statements), authored through a visual UI.

Job

Trigger

Scheduled

Triggers a job run based on a time-based schedule. See Run jobs on a schedule.
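A scheduled trigger is defined with a Quartz cron expression in the job settings. The sketch below builds the `schedule` block of a Jobs API job-settings payload as a Python dict; the field names follow the Databricks Jobs API, while the job name and cron values are made-up examples.

```python
# Sketch of a Jobs API job-settings payload with a time-based schedule.
# Field names follow the Databricks Jobs API; the job name is hypothetical.
job_settings = {
    "name": "nightly-etl",  # hypothetical job name
    "schedule": {
        # Quartz cron syntax: run at 02:30 every day
        "quartz_cron_expression": "0 30 2 * * ?",
        "timezone_id": "UTC",
        # PAUSED keeps the schedule defined but inactive
        "pause_status": "UNPAUSED",
    },
}
```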

Table update

Triggers a job run when source tables are updated. See Trigger jobs when source tables are updated.
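A table-update trigger can be sketched as the `trigger` block of the job settings. The field names below follow the Jobs API trigger settings as I understand them; the table names are hypothetical.

```python
# Sketch of a table-update trigger in a Jobs API payload.
# Field names follow the Jobs API; table names are hypothetical.
trigger = {
    "table_update": {
        "table_names": ["main.sales.orders", "main.sales.customers"],
        # ALL_UPDATED waits until every listed table has been updated;
        # ANY_UPDATED fires when any one of them is updated
        "condition": "ALL_UPDATED",
        # debounce: wait at least 10 minutes between triggered runs
        "min_time_between_triggers_seconds": 600,
    }
}
```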

File arrival

Triggers a job run when new files arrive in a monitored Unity Catalog storage location. See Trigger jobs when new files arrive.
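A file-arrival trigger points at an external location and can debounce bursty uploads. This sketch uses Jobs API field names; the storage URL is a hypothetical example.

```python
# Sketch of a file-arrival trigger in a Jobs API payload.
# The storage URL is hypothetical; field names follow the Jobs API.
trigger = {
    "file_arrival": {
        # a Unity Catalog external location path (example value)
        "url": "abfss://landing@myaccount.dfs.core.windows.net/incoming/",
        # wait at least 60s between triggered runs
        "min_time_between_triggers_seconds": 60,
        # let uploads settle for 60s after the last change before firing
        "wait_after_last_change_seconds": 60,
    }
}
```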

Continuous

Keeps the job always running by triggering a new job run whenever a run completes or fails. See Run jobs continuously.
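Continuous mode is a small block in the job settings rather than a schedule. A minimal sketch, assuming the Jobs API `continuous` field (the job name is made up):

```python
# Sketch: a continuous job in a Jobs API payload. A continuous job cannot
# also have a schedule; the platform restarts runs automatically.
job_settings = {
    "name": "always-on-stream",  # hypothetical job name
    "continuous": {"pause_status": "UNPAUSED"},
}
```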

None (manual)

Runs are triggered manually with the Run now button or programmatically using other orchestration tools. See Trigger a single job run.
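Programmatic triggering goes through the Jobs run-now endpoint. The sketch below builds a request body for `POST /api/2.1/jobs/run-now`; the job ID and parameter names are hypothetical.

```python
# Sketch: request body for the Jobs run-now endpoint.
# job_id and parameter names are hypothetical examples.
run_now_request = {
    "job_id": 123456789,
    # job_parameters override the job-level parameter defaults for this run
    "job_parameters": {"run_date": "2024-01-01"},
}
```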

Control Flow

Retries

  • Retries specify how many times a particular task should be re-run if it fails. Errors are often transient and resolved through a restart.

  • If you specify retries for a task, the task restarts up to the specified number of times when it encounters an error.
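Retries are configured per task. A sketch of a task definition with retry settings, using Jobs API field names (the task key and notebook path are hypothetical):

```python
# Sketch: a task with retry settings in a Jobs API payload.
# task_key and notebook path are hypothetical.
task = {
    "task_key": "ingest",
    "notebook_task": {"notebook_path": "/Workspace/etl/ingest"},
    "max_retries": 3,                    # re-run up to 3 times on failure
    "min_retry_interval_millis": 60000,  # wait 1 minute between attempts
    "retry_on_timeout": False,           # do not retry if the task timed out
}
```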

Conditional Task

  • You can use Run if conditions to run a task based on the outcome of its upstream tasks. You add tasks to your job and specify their upstream dependencies.

  • Use the If/else task type to branch the workflow based on the value of an expression.
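An If/else condition task compares two operands, and downstream tasks attach to one branch via the `outcome` of the dependency. A sketch with Jobs API field names; the task keys and the task value being compared are hypothetical.

```python
# Sketch: an If/else condition task plus a downstream task that runs only
# on the "true" branch. Task keys and the referenced task value are hypothetical.
condition = {
    "task_key": "check_rows",
    "condition_task": {
        "op": "GREATER_THAN",
        # a task value set by an upstream task (hypothetical reference)
        "left": "{{tasks.ingest.values.row_count}}",
        "right": "0",
    },
}
downstream = {
    "task_key": "transform",
    "notebook_task": {"notebook_path": "/Workspace/etl/transform"},
    # attach to the true branch of the condition task
    "depends_on": [{"task_key": "check_rows", "outcome": "true"}],
}
```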

For each Task

  • Use the For each task type to run another task in a loop, passing a different set of parameters to each iteration of the task.
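A For each task wraps an inner task and feeds it one input per iteration via the `{{input}}` reference. The sketch below uses Jobs API field names; task keys, notebook path, and the date list are hypothetical.

```python
# Sketch: a For each task iterating over dates, with bounded parallelism.
# Task keys, notebook path, and input values are hypothetical.
task = {
    "task_key": "backfill",
    "for_each_task": {
        # inputs is a JSON array encoded as a string
        "inputs": '["2024-01-01", "2024-01-02", "2024-01-03"]',
        "concurrency": 2,  # at most 2 iterations run in parallel
        "task": {
            "task_key": "backfill_one",
            "notebook_task": {
                "notebook_path": "/Workspace/etl/backfill",
                # {{input}} resolves to the current iteration's value
                "base_parameters": {"date": "{{input}}"},
            },
        },
    },
}
```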

Task Dependencies

  • The Run if dependencies field lets you add control-flow logic to tasks based on the success, failure, or completion of their upstream tasks.

  • Dependencies are visually represented in the job DAG as lines between tasks.
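In the job settings, dependencies and the Run if condition are separate task fields. A sketch of a cleanup task that runs once both upstream loads finish, regardless of their outcome (task keys are hypothetical; `run_if` values such as `ALL_SUCCESS`, `ALL_DONE`, `NONE_FAILED`, and `AT_LEAST_ONE_FAILED` follow the Jobs API):

```python
# Sketch: a task that depends on two upstream tasks and runs whenever
# both have finished, even if one failed. Task keys are hypothetical.
cleanup = {
    "task_key": "cleanup",
    "notebook_task": {"notebook_path": "/Workspace/etl/cleanup"},
    "depends_on": [{"task_key": "load_a"}, {"task_key": "load_b"}],
    "run_if": "ALL_DONE",  # default is ALL_SUCCESS
}
```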

Monitoring & Observability

Troubleshooting

Identify the Point of Failure

  • Click the link in the Start time column for the failed run.

  • In the Graph View, the failed task will be highlighted in red.

  • Timeline View: Use this to see if the failure was due to a timeout or a dependency bottleneck.

Analyze Logs and Errors

  • Click the failed task node in the graph to open the Task run details side pane.

  • Analyze the error messages and logs to determine the root cause.

Fixing and Repairing

  • Repair failed or canceled multi-task jobs by re-running only the subset of unsuccessful tasks and any dependent tasks.

  • Edit the task directly from the run details page to update its configuration or notebook path.
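A repair is issued against the failed run, naming only the tasks to re-run. The sketch below builds a request body for the Jobs runs/repair endpoint; the run ID and task keys are hypothetical.

```python
# Sketch: request body for the Jobs runs/repair endpoint.
# run_id and task keys are hypothetical.
repair_request = {
    "run_id": 987654321,                 # the failed or canceled run
    "rerun_tasks": ["transform", "load"],  # only the unsuccessful tasks
    # alternatively: "rerun_all_failed_tasks": True
}
```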

Notification

  • Job notifications are a built-in mechanism to alert you or your team about the status of your workflows. They can be configured at both the Job level (for the entire workflow) and the Task level (for individual steps).

Trigger Events

You can set up notifications to trigger based on the following lifecycle events:

  • Start: When a job or task run begins.

  • Success: When a run completes successfully (Note: "Succeeded with failures" is considered a success state).

  • Failure: When a run fails.

  • Duration Warning: When a run exceeds a pre-configured time threshold (useful for identifying "hanging" jobs).

  • Streaming Backlog: Specifically for streaming jobs, triggered when the backlog exceeds a certain threshold for 10 minutes.
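The lifecycle events above map onto notification and health fields in the job settings. A sketch using Jobs API field names, with hypothetical email addresses and a duration-warning health rule set at one hour:

```python
# Sketch: email notifications plus a duration-warning health rule in a
# Jobs API payload. Addresses and the threshold are hypothetical.
notifications = {
    "email_notifications": {
        "on_start": [],
        "on_success": ["team@example.com"],
        "on_failure": ["oncall@example.com"],
    },
    "health": {
        "rules": [
            # duration warning: alert if the run exceeds 3600 seconds
            {"metric": "RUN_DURATION_SECONDS", "op": "GREATER_THAN", "value": 3600}
        ]
    },
}
```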

Notification Destinations

Databricks allows you to send these alerts to:

  • Email addresses: Direct emails to individuals or distribution lists.

  • System Destinations: Integration with third-party tools (must be configured by an admin), including:

    • Slack

    • Microsoft Teams

    • PagerDuty

    • Generic HTTP Webhooks: To send custom JSON payloads to any external API.
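System destinations are referenced from the job by destination ID rather than by URL. A sketch of the `webhook_notifications` block, assuming the Jobs API shape; the destination ID is a hypothetical placeholder configured by an admin:

```python
# Sketch: webhook notifications referencing an admin-configured system
# destination by ID. The ID value is a hypothetical placeholder.
webhook_notifications = {
    "on_failure": [{"id": "0a1b2c3d-example-destination-id"}],
}
```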
