Lakeflow Job
Introduction
Lakeflow Jobs provide workflow automation for Databricks, orchestrating data processing workloads so that you can coordinate and run multiple tasks as part of a larger workflow. You can optimize and schedule the execution of frequent, repeatable tasks and manage complex workflows.
Jobs consist of one or more tasks, and support custom control flow logic like branching (if / else statements) or looping (for each statements) using a visual authoring UI.
Job
Trigger
Scheduled
Triggers a job run based on a time-based schedule. See Run jobs on a schedule.
Table update
Triggers a job run when source tables are updated. See Trigger jobs when source tables are updated.
File arrival
Triggers a job run when new files arrive in a monitored Unity Catalog storage location. See Trigger jobs when new files arrive.
Continuous
Keeps the job always running by triggering a new job run whenever a run completes or fails. See Run jobs continuously.
None (manual)
Runs are triggered manually with the Run now button or programmatically using other orchestration tools. See Trigger a single job run.
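The trigger types above can be sketched as fragments of a job definition. This is a minimal sketch using Jobs API field names; the cron expression and storage path are hypothetical examples.

```python
# Time-based schedule: Quartz cron syntax with an explicit timezone.
scheduled = {
    "schedule": {
        "quartz_cron_expression": "0 0 6 * * ?",  # every day at 06:00
        "timezone_id": "UTC",
        "pause_status": "UNPAUSED",
    }
}

# File arrival: fire when new files land in a monitored
# Unity Catalog location (hypothetical volume path).
file_arrival = {
    "trigger": {
        "file_arrival": {"url": "/Volumes/main/raw/landing/"},
    }
}

# Continuous: start a new run whenever the previous one finishes.
continuous = {"continuous": {"pause_status": "UNPAUSED"}}
```

A job uses at most one of these trigger configurations at a time; omitting all of them leaves the job manual (Run now only).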
Control Flow
Retries
Retries specify how many times a task should be re-run if it fails with an error. Errors are often transient and resolved by a restart.
If you specify retries for a task, the task restarts up to the specified number of times when it encounters an error.
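Retry behavior is configured per task. A minimal sketch using Jobs API retry fields; the task key and notebook path are hypothetical.

```python
# Per-task retry settings: re-run up to 3 times on failure,
# waiting 60 seconds between attempts.
task = {
    "task_key": "ingest",
    "notebook_task": {"notebook_path": "/Workspace/jobs/ingest"},
    "max_retries": 3,
    "min_retry_interval_millis": 60_000,
    "retry_on_timeout": False,  # timeouts are not retried here
}
```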
Conditional Task
Use the Run if setting to run later tasks conditionally based on the outcome of the tasks they depend on. Add tasks to your job and specify their upstream dependencies.
Use the If/else condition task type to branch the workflow based on the value of an expression.
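An If/else branch can be sketched as a condition task. This sketch uses the Jobs API `condition_task` fields; the task keys and the upstream task value are hypothetical.

```python
# If/else condition task: downstream tasks attach to this task's
# "true" or "false" outcome to form the two branches.
condition = {
    "task_key": "check_row_count",
    "condition_task": {
        # Compare a value published by an upstream task to a threshold.
        "left": "{{tasks.ingest.values.row_count}}",
        "op": "GREATER_THAN",
        "right": "0",
    },
}
```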
For each Task
Use the For each task type to run another task in a loop, passing a different set of parameters to each iteration of the task.
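The loop can be sketched with the Jobs API `for_each_task` structure: the nested task runs once per input value. The task keys, notebook path, and region list are hypothetical.

```python
# For each task: iterate over a JSON-encoded list of inputs,
# running the nested task once per element.
for_each = {
    "task_key": "process_regions",
    "for_each_task": {
        "inputs": '["us", "eu", "apac"]',
        "concurrency": 2,  # run up to 2 iterations in parallel
        "task": {
            "task_key": "process_one_region",
            "notebook_task": {
                "notebook_path": "/Workspace/jobs/process_region",
                # Each iteration receives its input as a parameter.
                "base_parameters": {"region": "{{input}}"},
            },
        },
    },
}
```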
Task Dependencies
The Run if dependencies field lets you add control flow logic to tasks based on other tasks' success, failure, or completion.
Dependencies are visually represented in the job DAG as lines between tasks.
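Dependencies and Run if rules combine like this in a task definition. A sketch using Jobs API field names; the task keys are hypothetical.

```python
# A cleanup task that runs after both upstream tasks finish,
# regardless of whether they succeeded.
cleanup = {
    "task_key": "cleanup",
    "depends_on": [
        {"task_key": "ingest"},
        {"task_key": "transform"},
    ],
    # ALL_DONE runs even after upstream failures; other values
    # include ALL_SUCCESS (the default), AT_LEAST_ONE_SUCCESS,
    # and AT_LEAST_ONE_FAILED.
    "run_if": "ALL_DONE",
}
```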
Monitoring & Observability
Troubleshooting
Identify the Point of Failure
Click the link in the Start time column for the failed run.
In the Graph view, the failed task is highlighted in red.
Timeline View: Use this to see if the failure was due to a timeout or a dependency bottleneck.
Analyze Logs and Errors
Click the failed task node in the graph to open the Task run details side pane.
Analyze the error messages and logs to determine the cause of the failure.
Fixing and Repairing
Repair failed or canceled multi-task jobs by re-running only the subset of unsuccessful tasks and any dependent tasks.
Edit a task directly from the run details to update its configuration or notebook path.
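A repair re-runs only the tasks you name (plus their dependents) within the same job run. A sketch of the request body for the Jobs API repair-run endpoint; the run ID and task keys are hypothetical.

```python
# Request body for repairing a run (POST /api/2.1/jobs/runs/repair):
# only the listed unsuccessful tasks and their dependents re-run.
repair_request = {
    "run_id": 455644833,  # hypothetical run ID from the failed run
    "rerun_tasks": ["transform", "publish"],
}
```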
Notifications
Job notifications are a built-in mechanism to alert you or your team about the status of your workflows. They can be configured at both the Job level (for the entire workflow) and the Task level (for individual steps).
Trigger Events
You can set up notifications to trigger based on the following lifecycle events:
Start: When a job or task run begins.
Success: When a run completes successfully (Note: "Succeeded with failures" is considered a success state).
Failure: When a run fails.
Duration Warning: When a run exceeds a pre-configured time threshold (useful for identifying "hanging" jobs).
Streaming Backlog: Specifically for streaming jobs, triggered when the backlog exceeds a certain threshold for 10 minutes.
Notification Destinations
Databricks allows you to send these alerts to:
Email addresses: Direct emails to individuals or distribution lists.
System Destinations: Integration with third-party tools (must be configured by an admin), including:
Slack
Microsoft Teams
PagerDuty
Generic HTTP Webhooks: To send custom JSON payloads to any external API.
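The notification settings above can be sketched as a fragment of job settings. This sketch uses Jobs API field names; the email addresses, webhook destination ID, and duration threshold are hypothetical, and webhook destinations are referenced by the ID an admin assigned when configuring them.

```python
# Job-level notification settings.
notifications = {
    "email_notifications": {
        "on_start": [],
        "on_success": ["data-team@example.com"],
        "on_failure": ["oncall@example.com"],
    },
    "webhook_notifications": {
        # Admin-configured system destination (e.g. Slack, PagerDuty).
        "on_failure": [{"id": "pagerduty-destination-id"}],
    },
    # Duration warning: alert if the run exceeds 2 hours.
    "health": {
        "rules": [
            {"metric": "RUN_DURATION_SECONDS",
             "op": "GREATER_THAN",
             "value": 7200}
        ]
    },
}
```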