Distributed Task Synchronization: Leveraging ShedLock in Spring

In this article, learn how to execute tasks in distributed systems using ShedLock, a useful tool for coordinating tasks in complex Spring applications.

In today’s distributed computing landscape, coordinating tasks across multiple nodes while ensuring they execute without conflicts or duplication presents significant challenges. Whether managing periodic jobs, batch processes, or critical system tasks, maintaining synchronization and consistency is crucial for seamless operations.

The Problem

Let’s say we need to run some tasks on a schedule, such as a database cleanup or a data generation job. The direct approach is the @Scheduled annotation included in Spring Framework, which runs a method at fixed intervals or on a cron schedule. But what if more than one instance of our service is running? In that case, the task is executed on every instance.

ShedLock

ShedLock makes sure that a scheduled task is executed at most once at any given time. The library implements a lock via an external store. If a task is running on one instance, the lock is set, and all other instances skip the execution instead of waiting for the lock. This implements “at most once execution.” The external store can be a relational database (PostgreSQL, MySQL, Oracle, etc.) accessed via JDBC, a NoSQL store (Mongo, Redis, DynamoDB), or one of many others (the full list can be found on the project page).
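
To make the “skip, do not wait” behavior concrete, here is a minimal in-memory sketch. It is not ShedLock code: the class SkipNotWaitDemo and the method runIfFree are hypothetical names, and a plain in-process set stands in for the external store that real instances would share.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified in-memory model of "skip instead of wait".
// Hypothetical names; this is not ShedLock's API.
final class SkipNotWaitDemo {
    // Stands in for the external store shared by all instances.
    private final Set<String> heldLocks = ConcurrentHashMap.newKeySet();

    /** Runs the task if the named lock is free; otherwise returns false at once. */
    boolean runIfFree(String lockName, Runnable task) {
        if (!heldLocks.add(lockName)) {
            return false; // another caller holds the lock: skip, do not block
        }
        try {
            task.run();
            return true;
        } finally {
            heldLocks.remove(lockName); // release once the task is done
        }
    }

    public static void main(String[] args) {
        SkipNotWaitDemo store = new SkipNotWaitDemo();
        boolean[] secondRan = new boolean[1];
        // Instance A runs the task; while it is still running, instance B fires too.
        boolean firstRan = store.runIfFree("exampleTask", () ->
                secondRan[0] = store.runIfFree("exampleTask", () -> {}));
        System.out.println("A ran: " + firstRan + ", B ran: " + secondRan[0]);
        // After A has finished, the next trigger gets the lock again.
        System.out.println("next run: " + store.runIfFree("exampleTask", () -> {}));
    }
}
```

The nested call plays the role of the second instance: it fires while the first run still holds the lock, returns immediately, and the task body is not executed a second time.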

Let’s consider an example of working with PostgreSQL. First, let’s start the database using Docker:

Shell

docker run -d -p 5432:5432 --name db \
    -e POSTGRES_USER=admin \
    -e POSTGRES_PASSWORD=password \
    -e POSTGRES_DB=demo \
  postgres:alpine

Next, we need to create a lock table. The project page provides the SQL script for PostgreSQL:

SQL

CREATE TABLE shedlock(
    name VARCHAR(64) NOT NULL,
    lock_until TIMESTAMP NOT NULL,
    locked_at TIMESTAMP NOT NULL, 
    locked_by VARCHAR(255) NOT NULL, 
  PRIMARY KEY (name));

Here:

  • name – Unique identifier for the lock, typically representing the task or resource being locked
  • lock_until – Timestamp indicating until when the lock is held
  • locked_at – Timestamp indicating when the lock was acquired
  • locked_by – Identifier of the entity (e.g., application instance) that acquired the lock
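
How the four columns work together can be sketched with an in-memory stand-in for the table. This is only an illustration under assumptions: LockTableSketch, LockRow, and tryAcquire are hypothetical names, and the rule used here (a lock can be taken when no row exists or when lock_until is no longer in the future) is a simplified reading of the column descriptions above, not ShedLock's actual implementation.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// In-memory stand-in for the shedlock table.
// Hypothetical names; this is not ShedLock's API.
final class LockTableSketch {
    // One row of the table, mirroring the four columns.
    record LockRow(String name, Instant lockUntil, Instant lockedAt, String lockedBy) {}

    // Keyed by the primary key column "name".
    private final Map<String, LockRow> rows = new HashMap<>();

    /** Takes the lock if there is no row yet or the existing row has expired. */
    synchronized Optional<LockRow> tryAcquire(
            String name, String instance, Instant now, Duration lockAtMostFor) {
        LockRow current = rows.get(name);
        if (current != null && current.lockUntil().isAfter(now)) {
            return Optional.empty(); // lock_until is still in the future: lock is held
        }
        LockRow taken = new LockRow(name, now.plus(lockAtMostFor), now, instance);
        rows.put(name, taken);
        return Optional.of(taken);
    }

    public static void main(String[] args) {
        LockTableSketch table = new LockTableSketch();
        Instant t0 = Instant.parse("2024-02-18T08:08:00Z");
        Duration atMost = Duration.ofSeconds(50);
        // node-a takes the lock at t0.
        System.out.println(table.tryAcquire("exampleTask", "node-a", t0, atMost));
        // node-b tries 10 seconds later and is refused.
        System.out.println(table.tryAcquire("exampleTask", "node-b", t0.plusSeconds(10), atMost));
        // node-a never released (say it crashed); after lock_until has passed, node-b takes over.
        System.out.println(table.tryAcquire("exampleTask", "node-b", t0.plusSeconds(60), atMost));
    }
}
```

The first and third calls return a row and the second returns an empty Optional. The third call shows why lock_until exists at all: a row left behind by a crashed instance stops blocking others once its timestamp has passed.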

Next, create a Spring Boot project and add the necessary dependencies to build.gradle:

Groovy

implementation 'net.javacrumbs.shedlock:shedlock-spring:5.10.2'
implementation 'net.javacrumbs.shedlock:shedlock-provider-jdbc-template:5.10.2'

Now let’s define the configuration:

Java

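import javax.sql.DataSource;

import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.EnableScheduling;

@Configuration
@EnableScheduling
// Default upper bound for locks that do not set lockAtMostFor themselves
@EnableSchedulerLock(defaultLockAtMostFor = "10m")
public class ShedLockConfig {
    @Bean
    public LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(
                JdbcTemplateLockProvider.Configuration.builder()
                        .withJdbcTemplate(new JdbcTemplate(dataSource))
                        // Use the database clock instead of each instance's own clock
                        .usingDbTime()
                        .build()
        );
    }
}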

Let’s create an ExampleTask that will start once a minute and perform some time-consuming action. For this purpose, we will use the @Scheduled annotation:

Java

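import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class ExampleTask {
    // Fires at second 0 of every minute
    @Scheduled(cron = "0 * * ? * *")
    @SchedulerLock(name = "exampleTask", lockAtMostFor = "50s", lockAtLeastFor = "20s")
    public void scheduledTask() throws InterruptedException {
        System.out.println("task scheduled!");
        Thread.sleep(15000); // simulate 15 seconds of work
        System.out.println("task executed!");
    }
}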

Here, we use Thread.sleep for 15 seconds to simulate the execution time of the task. Once the application is started and the task execution starts, a record will be inserted into the database:

Shell

docker exec -ti <CONTAINER ID> bash
psql -U admin demo
psql (12.16)
Type "help" for help.

demo=# SELECT * FROM shedlock;
    name     |         lock_until         |         locked_at          |   locked_by
-------------+----------------------------+----------------------------+---------------
 exampleTask | 2024-02-18 08:08:50.055274 | 2024-02-18 08:08:00.055274 | MacBook.local

If, at the same time, another application tries to run the task, it will not be able to get the lock and will skip the task execution:

Shell

2024-02-18 08:08:50.057 DEBUG 45988 --- [   scheduling-1] n.j.s.core.DefaultLockingTaskExecutor : Not executing 'exampleTask'. It's locked.

When the first application acquires the lock, a record is created in the database with lock_until set to the current time plus lockAtMostFor from the lock settings. This upper bound ensures that the lock is not held forever if the application crashes or terminates for some reason (for example, when a pod is evicted from one node to another in Kubernetes). After the task executes successfully, the application updates the database entry and shortens lock_until to the current time. If the task finishes very quickly, however, lock_until is never set earlier than locked_at plus lockAtLeastFor. This lower bound guards against clock differences between instances: without it, a very short task could finish and release the lock before another instance with a slightly lagging clock fires the same trigger, and the task would run twice.
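
The timing rules can be worked through with the values from ExampleTask (lockAtMostFor = 50s, lockAtLeastFor = 20s, a task that takes 15 seconds). The sketch below only reproduces the arithmetic; LockTimes, onAcquire, and onRelease are hypothetical names, not part of ShedLock.

```java
import java.time.Duration;
import java.time.Instant;

// Arithmetic behind lock_until.
// Hypothetical names; this is not ShedLock's API.
final class LockTimes {
    /** lock_until written when the lock is acquired. */
    static Instant onAcquire(Instant lockedAt, Duration lockAtMostFor) {
        return lockedAt.plus(lockAtMostFor);
    }

    /** lock_until written when the task finishes: the finish time, but never before lockedAt + lockAtLeastFor. */
    static Instant onRelease(Instant lockedAt, Instant finishedAt, Duration lockAtLeastFor) {
        Instant earliest = lockedAt.plus(lockAtLeastFor);
        return finishedAt.isAfter(earliest) ? finishedAt : earliest;
    }

    public static void main(String[] args) {
        Instant lockedAt = Instant.parse("2024-02-18T08:08:00Z");

        // Lock taken at 08:08:00 with lockAtMostFor = 50s.
        System.out.println(onAcquire(lockedAt, Duration.ofSeconds(50)));
        // prints 2024-02-18T08:08:50Z

        // Task finishes after 15s, which is before the 20s minimum.
        System.out.println(onRelease(lockedAt, lockedAt.plusSeconds(15), Duration.ofSeconds(20)));
        // prints 2024-02-18T08:08:20Z
    }
}
```

The first value matches the lock_until shown in the table output above. The second shows that the 15-second task still keeps the lock for the full 20 seconds, because it finished before the lockAtLeastFor minimum had elapsed.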

Conclusion

ShedLock is a useful tool for coordinating tasks in complex Spring applications. It ensures that tasks run smoothly and only once, even across multiple instances. It is easy to set up and provides Spring applications with reliable task-handling capabilities, making it a valuable tool for anyone dealing with distributed systems.

The project code is available on GitHub.
