Camunda External Tasks - Error-Handling and Retry-Behavior


The External Task pattern turns the push mechanism of service tasks into a pull principle. When running a process with Camunda Platform, one or more separate applications fetch External Tasks via the REST API of the process engine, execute them, and return the results. In a previous article, we discussed potential advantages and disadvantages of using External Tasks and showed how to use Camunda's official Java client. In this article, we discuss the error behavior of External Tasks, show how you can simplify it by externalizing common behavior, and finally provide a Spring Boot Starter.

A German version of this blog post can be found here.

Errors? Errors do not occur! (The Happy Path)

Ideally, errors never occur during the execution of software, so one would not need to think about error handling at all. However, no matter how much a developer hopes for it, error handling should never be left out of the software architecture. That is why it is important to prepare for any possible error, whether it is thrown by the application itself or caused by invalid user input or unreliable surrounding systems. The latter in particular are hard for any developer to influence. The keyword is "resilience".

We use the following example process to illustrate our points: A clerk composes an e-mail and specifies the address, and finally the e-mail is sent by the service task "Send E-mail".

Figure: send e-mail process

External-Task-Worker

The associated External Task Worker in this scenario could be implemented as follows. It would obtain the recipient's address and the e-mail content from the process context, which was previously created in a user task. Finally, a mail service sends the data and after that the task is completed: taskService.complete(task). This implementation is valid and works - at least as long as no errors occur.

Note
Our examples are based on the official Java client from Camunda. If you use a custom implementation, similar considerations are necessary, but the implementation may vary. Since version 7.15, Camunda also provides a Spring Boot Starter for External Tasks in addition to its Java client. With this starter, the previously required configuration becomes obsolete, and External Task Workers can be controlled through an application.yaml and the annotation @ExternalTaskSubscription. More information is available in the Camunda blog. We also use this approach in our code samples.

What could go wrong?

In the previous example, no error handling is considered at all. This could lead to the service task being completed although the e-mail has never been sent. This would happen if the mail service reacted inappropriately to errors and the complete method were called mistakenly. However, it could also lead to the task never being completed. In this case, the e-mail might even be sent several times, namely if an error occurs after sending the mail but before the task is completed. The process engine would release the External Task after the configured lock duration, some External Task Worker would execute it again, and it might run into the same error again: an infinite loop, at least until the error is detected and fixed. With conventional Java Delegates, implementing error behavior explicitly is a good idea as well, but it is not mandatory, because in the worst case the process engine creates an incident by default, which is displayed in the cockpit.

Implement error behavior

Handling technical errors might look as follows: The entire business logic resides within a try block. Errors of the mail service are caught and handled in the catch block. The handleFailure() method requires the remaining number of retries, the waiting time until the next retry, and an error cause. In contrast to a Java Delegate, when implementing an External Task Worker you have to take care of these issues yourself - it is not sufficient to configure some retry behavior in the process model. As soon as there are no retries left, an incident is created at the corresponding process step. The remaining retries can be obtained from the task object using getRetries() (attention: this value can be null). Based on this, our next example implements a fixed, ascending retry behavior: the task is retried up to five times, and each retry is delayed by one additional minute.
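The retry calculation described above can be sketched without any Camunda dependency. This is a minimal sketch under assumptions, not the code of the original worker: the class and method names are hypothetical, and a null retry count is interpreted as "this task has not failed before". The two results would be passed as the retries and retryTimeout arguments of handleFailure().

```java
import java.time.Duration;

// Hypothetical helper, not part of the Camunda client.
final class AscendingRetryPolicy {

    static final int MAX_RETRIES = 5;

    /** Retries to report after a failure; null means the task has not failed before. */
    static int remainingRetries(Integer currentRetries) {
        if (currentRetries == null) {
            return MAX_RETRIES;
        }
        // Never report a negative value; 0 makes the engine create an incident.
        return Math.max(currentRetries - 1, 0);
    }

    /** Waiting time before the next attempt: 1 minute for the first retry, 2 for the second, and so on. */
    static long retryTimeoutMillis(int remainingRetries) {
        int retryNumber = MAX_RETRIES - remainingRetries + 1;
        return Duration.ofMinutes(Math.max(retryNumber, 1)).toMillis();
    }
}
```

Inside the catch block, a worker would first call remainingRetries(task.getRetries()) and then derive the timeout from that result, so the null case is handled in exactly one place.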

Recipient unknown

Besides technical errors, business errors might occur, e.g. if the data is incorrect. Such business errors will not be fixed even by multiple retries, because the data always stays the same. In this case, the retry behavior can be skipped completely, and the External Task Worker might end with a BPMN error. In our example, we extend the process model to include a user task if the recipient cannot be found.

Figure: send e-mail process with error

Business errors

The different error behavior in the case of an invalid e-mail address is implemented by an additional catch block: As soon as the mail service throws a RecipientNotFoundException, the logic to determine the next retry is skipped, and the External Task is finished using the handleBpmnError() method. Neither the number of remaining retries nor a delay is necessary. Instead, an error code and a message as well as optional variables can be returned. Caution: BPMN errors must be handled in the process model, as the error boundary event does in our example. If a BPMN error is thrown which is not mapped in the model, the process instance terminates without any further notification.
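The order of the catch blocks is what separates business errors from technical ones. The following sketch shows only that decision and leaves out the Camunda client. The class, the enum, and the definition of RecipientNotFoundException are hypothetical stand-ins. In a real worker, the three outcomes would correspond to calling complete(), handleBpmnError(), and handleFailure() on the task service.

```java
// Hypothetical sketch of the decision logic; no Camunda types are used here.
final class MailErrorHandling {

    /** What the worker should report back to the process engine. */
    enum Outcome { COMPLETE, BPMN_ERROR, FAILURE }

    /** Stand-in for the business exception thrown by the mail service. */
    static class RecipientNotFoundException extends RuntimeException {
        RecipientNotFoundException(String message) {
            super(message);
        }
    }

    /** Runs the business logic and decides which outcome to report. */
    static Outcome execute(Runnable sendMail) {
        try {
            sendMail.run();
            return Outcome.COMPLETE;
        } catch (RecipientNotFoundException e) {
            // Business error: a retry cannot help because the data stays the same.
            return Outcome.BPMN_ERROR;
        } catch (RuntimeException e) {
            // Technical error: report a failure with remaining retries and a delay.
            return Outcome.FAILURE;
        }
    }
}
```

The more specific exception has to be caught first. With the blocks in the opposite order the code would not compile, because the second catch would be unreachable.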

Configure error behavior

With conventional Java Delegates, process architects and developers are used to specifying the retry behavior as part of the process model itself. As soon as a service task is marked as "Asynchronous Before", the text field "Retry Time Cycle" appears. It specifies when and how often the execution of a task should be retried if it fails. Using the ISO-8601 standard, a possible value looks like "R3/PT5M", which means three retries, each after five minutes. Also possible: "PT5M,PT30M,PT1H" - a first retry after five minutes, a second retry after 30 minutes if necessary, and finally another retry after one hour.

This feature can also be emulated for External Tasks by using the extension properties of a task element. The retry behavior must be specified under a defined name, so that it can be obtained at runtime by calling getExtensionProperties(). Depending on whether a custom notation or the official ISO-8601 standard is used, corresponding logic must be provided that calculates the subsequent retry.
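A minimal sketch of such logic, assuming only the two notations mentioned above ("R3/PT5M" and a comma-separated list of durations). The class and method names are hypothetical, and this is not the implementation of any particular library. It relies on java.time.Duration.parse, which understands ISO-8601 durations such as PT5M or PT1H.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that turns a retry time cycle into concrete waiting times.
final class RetryTimeCycle {

    /** Expands "R3/PT5M" or "PT5M,PT30M,PT1H" into one waiting time per retry. */
    static List<Duration> parse(String cycle) {
        String value = cycle.trim();
        if (value.startsWith("R")) {
            String[] parts = value.substring(1).split("/", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected R<n>/<duration>: " + cycle);
            }
            int repetitions = Integer.parseInt(parts[0]);
            return Collections.nCopies(repetitions, Duration.parse(parts[1]));
        }
        List<Duration> delays = new ArrayList<>();
        for (String part : value.split(",")) {
            delays.add(Duration.parse(part.trim()));
        }
        return delays;
    }

    /** Retries to report after a failure; null means the task has not failed before. */
    static int remainingRetries(List<Duration> delays, Integer currentRetries) {
        return currentRetries == null ? delays.size() : Math.max(currentRetries - 1, 0);
    }

    /** Waiting time that belongs to the given number of remaining retries. */
    static Duration nextDelay(List<Duration> delays, int remainingRetries) {
        if (remainingRetries <= 0 || delays.isEmpty()) {
            return Duration.ZERO; // no retry left, the engine creates an incident
        }
        int index = delays.size() - remainingRetries; // 0 for the first retry
        return delays.get(Math.max(index, 0));
    }
}
```

For "PT5M,PT30M,PT1H", the first failure yields three remaining retries and a delay of five minutes, and the last retry is scheduled after one hour.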

Note
Making extension properties available at runtime must be enabled for each worker that should use this feature, either within the application.yaml (available in the Spring Boot Starter) or in the original worker configuration of the Spring Boot application.

camunda.bpm.client:
  base-url: http://localhost:8080/engine-rest
  subscriptions:
    send-mail:
      include-extension-properties: true

Automate retry behavior (Retry-Aspect)

We used the previously described approach to generalize the desired behavior in a separate Spring Boot Starter. This way, the retry behavior does not have to be added to each External Task Worker individually as boilerplate code. Instead, it is added automatically by adding a single dependency to the project.

More information about this project is available on GitHub: https://github.com/viadee/external-task-retry-aspect

<dependency>
    <groupId>de.viadee.bpm.camunda</groupId>
    <artifactId>external-task-retry-aspect-spring-boot-starter</artifactId>
    <version>${version.retry-aspect}</version>
</dependency>

External-Task-Retry-Aspect

With our Spring Boot Starter, every failure during an External Task execution leads to at least some sort of error behavior, even without any other measure implemented. By default, this is three retry attempts, each after five minutes. Both the default behavior and the retry behavior per task are configurable.

Furthermore, business errors can be simply created by throwing an ExternalTaskBusinessError, which corresponds to the call of the handleBpmnError() method from above. Additionally, it is possible to entirely skip the error behavior to create an incident immediately by using an InstantIncidentException. Both features are optional, so that a developer might focus on the business logic exclusively without implementing any try-catch blocks. The default error behavior will still be present, because in any case a rudimentary treatment takes place in the background, so endless loops are avoided for sure. Nonetheless, errors can still be handled explicitly if the default behavior is insufficient.

Conclusion

The External Task pattern still offers great potential in different categories. We have already highlighted this in our previous article. Decoupling the process engine from the execution through External Tasks is a future-proof concept and is becoming more and more popular. This is also indicated by the new Spring Boot Starter for External Task clients provided by Camunda since version 7.15. However, developers have to be aware of the differences compared to the classic variant with Java Delegates: Previously, in the worst case, things such as forgetting a try-catch block could lead to unnecessary retries and potentially an incident visible in the cockpit. With External Tasks, however, missing or faulty error handling might lead to endless loops or worse. The External-Task-Retry-Aspect represents a workaround for this drawback, adds basic error handling capabilities to External Task Workers, and can be added simply in a Spring Boot Starter manner.

Code samples on GitHub: https://github.com/viadee/bpmnExternalTaskWorkerExample

External-Task-Retry-Aspect: https://github.com/viadee/external-task-retry-aspect

Update 15. September 2022: Please also read our new blogpost about Camunda External Task Workers and Quarkus.
