Conventional code-editor debugging (breakpoints and stepping) is not very helpful with Durable Functions, because the replay-based execution model makes issue tracking difficult. This article shows how to analyze a durable orchestration using Azure Storage tables, and shares tips for writing an efficient workflow.
Note: This blog assumes you are familiar with Durable Functions, task hubs, and Storage Explorer, that Azure Table Storage is the back-end data store, and that C# is the coding language. There may be slight differences if you are using another storage provider or language.
Durable Functions are great for running stateful workflows using orchestrations and Azure Functions. Under the hood, they use the Durable Task Framework.
Durable Task Framework:
It uses event sourcing to manage the state, checkpoints, and replays of orchestrations. As the function's state is logged, it is easier to track down what happened during execution.
How does a durable orchestration work?
The orchestrator function has the sole responsibility of defining the workflow; it delegates the workflow steps to activity functions. A function with the DurableOrchestrationClient binding starts the orchestration. The orchestrator function is invoked when a new workflow starts, schedules the first activity, and goes to sleep. When an activity function's task completes, the orchestrator function is reactivated: the workflow execution restarts and the next activity gets scheduled.

Events such as OrchestratorCompleted are stored in the storage table. When the orchestration wakes up, it replays the workflow from the beginning, but it does not rerun already completed activity functions. Instead, it checks the history table for an entry with the same activity input and execution Id; if the entry is present, it takes the output from the table and continues execution. Debugging therefore becomes more straightforward once we know how orchestration events and instances are tracked in the tables.
The diagram below shows the execution of the durable orchestration.
How to read Azure Storage task hub tables?
The task hub name set in host.json configures the names of the event execution tracking tables. We will have two tables in Table Storage; in our case, those tables are DurableHubInstances and DurableHubHistory. We added the word Hub to the name for easy identification.
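For instance, a task hub named DurableHub can be configured in host.json like this (a sketch; values are illustrative):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "DurableHub"
    }
  }
}
```

With this setting, the Azure Storage provider creates the DurableHubInstances and DurableHubHistory tables.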
The DurableHubInstances table stores the orchestrator instance runtime status. Each row in this table represents one instance of the orchestrator. Below are a few columns which we will be using while tracing errors.
The DurableHubHistory table contains a record of historical events related to the orchestrator. The PartitionKey is the same as the orchestration InstanceId, and the RowKey is a sequential key generated for each event. Here you can see several columns that help you identify the root cause of an orchestration failure.
Let's consider the below example of durable orchestration:
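A minimal sketch of such an orchestration might look like the following (only the function names come from the example; the bindings, inputs, and bodies are illustrative):

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class DurableDemo
{
    [FunctionName("Starter_DurableDemo")]
    public static async Task<HttpResponseMessage> Starter(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        // Starts a new orchestration instance and returns its InstanceId.
        string instanceId = await starter.StartNewAsync("Starter_Orchestrator", null);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
        return starter.CreateCheckStatusResponse(req, instanceId);
    }

    [FunctionName("Starter_Orchestrator")]
    public static async Task<string> Orchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Scheduling the activity produces TaskScheduled/TaskCompleted history events.
        string result = await context.CallActivityAsync<string>("Activity_Function", "payload");

        // Calling the sub-orchestrator produces a SubOrchestrationInstanceCreated event.
        return await context.CallSubOrchestratorAsync<string>("O_DurableSubOrchestrator", result);
    }

    [FunctionName("O_DurableSubOrchestrator")]
    public static async Task<string> SubOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        return await context.CallActivityAsync<string>("Activity_Function", context.GetInput<string>());
    }

    [FunctionName("Activity_Function")]
    public static string Activity([ActivityTrigger] string input) => $"processed: {input}";
}
```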
Execution of the orchestration begins with Starter_DurableDemo, which calls the Starter_Orchestrator function; Starter_Orchestrator then calls O_DurableSubOrchestrator. On the execution of Activity_Function, two events, namely TaskScheduled and TaskCompleted, get added to the history table. When the sub-orchestrator is called, a SubOrchestrationInstanceCreated event is added to the DurableHubHistory table, and an instance entry is made in the DurableHubInstances table.
Below is a snippet of the DurableHubInstances table. The PartitionKey represents the InstanceId of the Starter_Orchestrator and O_DurableSubOrchestrator functions. You can use the InstanceId to query the history table.
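Assuming the Azure.Data.Tables SDK, querying the history table by InstanceId might look like this sketch (the connection string source and instance id are illustrative):

```csharp
using System;
using Azure.Data.Tables;

string connectionString = Environment.GetEnvironmentVariable("AzureWebJobsStorage");
string instanceId = "your-instance-id"; // illustrative placeholder

var tableClient = new TableClient(connectionString, "DurableHubHistory");

// The PartitionKey in the history table equals the orchestration InstanceId.
foreach (TableEntity e in tableClient.Query<TableEntity>(
    filter: $"PartitionKey eq '{instanceId}'"))
{
    // EventType shows what happened (TaskScheduled, TaskCompleted, TaskFailed, ...).
    Console.WriteLine($"{e.RowKey}: {e.GetString("EventType")} {e.GetString("Name")}");
}
```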
If an activity or sub-orchestrator fails, the output of the orchestration instance will contain the details of the failure, including the name of the failed function. You can use the InstanceId and the function name to find the exact error in the history table.
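The same failure details can also be pulled programmatically via the client binding; a sketch (the trigger and instance id are illustrative):

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class StatusCheck
{
    [FunctionName("CheckStatus")]
    public static async Task Run(
        [TimerTrigger("0 */5 * * * *")] TimerInfo timer, // any trigger works; timer is illustrative
        [DurableClient] IDurableOrchestrationClient client,
        ILogger log)
    {
        // showHistory: true also returns the instance's event history.
        DurableOrchestrationStatus status =
            await client.GetStatusAsync("your-instance-id", showHistory: true);

        // For a failed instance, Output contains the failure details,
        // including the name of the failed activity or sub-orchestrator.
        log.LogInformation($"{status.RuntimeStatus}: {status.Output}");
    }
}
```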
Things to remember:
- Since there is no compile-time type checking when calling an activity or orchestrator function, specify the same return type as defined in the function definition.
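For example, if an activity is declared to return a string, the orchestrator must call it with a matching type parameter; a mismatch only surfaces at runtime (function names here are illustrative):

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class TypedCalls
{
    // Activity declared to return a string.
    [FunctionName("GetGreeting")]
    public static string GetGreeting([ActivityTrigger] string name) => $"Hello, {name}!";

    [FunctionName("TypedOrchestrator")]
    public static async Task<string> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // The generic argument matches the activity's declared return type.
        string greeting = await context.CallActivityAsync<string>("GetGreeting", "Tokyo");

        // This would compile but fail at runtime when the result is deserialized:
        // int wrong = await context.CallActivityAsync<int>("GetGreeting", "Tokyo");

        return greeting;
    }
}
```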
- If an exception occurs that an Activity/SubOrchestrator does not handle, it propagates back to the orchestration function, which updates its state with the error. To avoid a complete failure of your orchestration, you should handle exceptions from the Activity/SubOrchestrator and write compensation logic for them.
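One way to do this is to catch the failure in the orchestrator and run a compensating activity; a sketch (activity names are illustrative):

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class CompensationDemo
{
    [FunctionName("Orchestrator_WithCompensation")]
    public static async Task Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        string orderId = context.GetInput<string>();
        try
        {
            await context.CallActivityAsync("ChargePayment", orderId);
        }
        catch (FunctionFailedException)
        {
            // An unhandled activity exception surfaces here as FunctionFailedException;
            // run compensation instead of letting the whole orchestration fail.
            await context.CallActivityAsync("RefundPayment", orderId);
        }
    }
}
```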
- Activity and Orchestrator functions are typical Azure functions with a maximum timeout (To know more about hosting options and function timeouts, you can refer to this link). So even if a Durable Orchestration workflow is long-running, the individual functions are still time-constrained.
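The per-function timeout can be adjusted in host.json (the value shown is illustrative; the allowed maximum depends on the hosting plan):

```json
{
  "version": "2.0",
  "functionTimeout": "00:10:00"
}
```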
- If your orchestration has a long list of items to process, performance degrades because the entire event history is loaded into memory on every replay, for each item. To work around this, you can use IDurableOrchestrationContext.ContinueAsNew. This method truncates the saved history of the orchestration instance and restarts the orchestration.
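A sketch of processing a large list in batches and restarting with truncated history via ContinueAsNew (the batch size and activity name are illustrative):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class EternalBatch
{
    [FunctionName("Orchestrator_EternalBatch")]
    public static async Task Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        List<string> remaining = context.GetInput<List<string>>();

        // Process one batch of items in this generation of the orchestration.
        foreach (string item in remaining.Take(100))
        {
            await context.CallActivityAsync("ProcessItem", item);
        }

        List<string> rest = remaining.Skip(100).ToList();
        if (rest.Count > 0)
        {
            // Restarts the orchestration with fresh input and truncated history.
            context.ContinueAsNew(rest);
        }
    }
}
```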
I hope some of these tips will help you debug and write an efficient durable orchestration workflow.