
How to Implement a Distributed Lock

Every Java developer is no stranger to locks, which come up frequently in both work and interviews. For most small-scale projects or single-machine applications, Java's JUC package (java.util.concurrent) is generally sufficient. Yet as applications grow and move into distributed systems, relying solely on in-process tools like synchronized and Lock becomes inadequate. This article discusses several common ways to implement distributed locks in such systems.

Database Method for Implementing Distributed Locks 

First, let's talk about the database method, something everyone is very familiar with. Here, we use MySQL as the database. The details might vary slightly with other databases, but the general idea is the same. I will assume that everyone is familiar with the ACID properties of transactions and the isolation levels, so I won't spend time going over them.

In single-machine applications, we use locks, such as synchronized, to guard a certain resource, ensuring that only one thread can operate on it at any given time. Similarly, in a distributed system, we want to ensure that at any given time only one thread, on one machine in the whole cluster, can process the task.

Table Record Method 

The simplest way to implement such a lock is through table records. We could design the table as follows.

create table lock_table
(
 id int auto_increment comment 'primary key',
 value varchar(64) null comment 'resource which need be locked',
 constraint lock_table_pk
  primary key (id)
);

create unique index lock_table_value_uindex
 on lock_table (value);

As you can see, the ‘value’ field here is unique. So when we need to acquire a lock, we insert a record into this table. If the insertion is successful, we can consider the lock to be acquired. If another transaction tries to insert the same data at this point, the insertion will fail. To release the lock, simply delete this record.
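To make this concrete, here is a minimal JDBC sketch of acquiring and releasing such a lock. It assumes a DataSource configured elsewhere; the table and column names follow the lock_table definition above, everything else is illustrative.

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class DbTableLock {
    private final DataSource dataSource;

    public DbTableLock(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns true if the row was inserted, i.e. the lock was acquired. */
    public boolean tryLock(String resource) {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "insert into lock_table (value) values (?)")) {
            ps.setString(1, resource);
            return ps.executeUpdate() == 1;
        } catch (SQLException duplicateKey) {
            // A duplicate-key error means another caller already holds the lock;
            // other SQL errors are swallowed here only for brevity.
            return false;
        }
    }

    /** Releases the lock by deleting the row. */
    public void unlock(String resource) throws SQLException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "delete from lock_table where value = ?")) {
            ps.setString(1, resource);
            ps.executeUpdate();
        }
    }
}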

Obviously, this design has several issues:

  • It is not re-entrant. The same transaction cannot re-acquire the lock before releasing it.
  • There is no expiration time. If the unlocking fails, the lock will be retained indefinitely.
  • It’s non-blocking. If the lock is not acquired, the thread will not wait upon failure. To acquire it again, you must re-trigger the lock acquisition operation.
  • It’s highly dependent on the availability of the database. If the database goes down, the business becomes unavailable.

However, there are solutions to these problems:

  • Add a ‘count’ field and a field to record the request ID. If the same request comes in again, just increase the count instead of inserting a new record.
  • Add an expiration time field and use a scheduled task for data cleanup.
  • Use a while loop or similar method to constantly trigger the acquisition of the lock.
  • Set up a backup database to avoid having only a single node, thus ensuring high availability.

Although the above method is feasible and simple, if there are many transactions competing for the lock constantly, there will be continuous failures to acquire the lock and subsequent exceptions. Its performance is obviously not very good, and in a normal scenario, we would not use this method.

Optimistic Lock Method 

In databases, locks are commonly categorized as optimistic or pessimistic. As the name suggests, an optimistic lock always assumes the best scenario: it expects that nobody else will modify the data while it holds it, so it does not lock anything when reading. Only at update time does it check whether anyone else has changed the data in the meantime. This can be achieved with a version number mechanism and the CAS (Compare-and-Swap) idea. Here, you can add a 'version' column to the table and carry this version number with every update. For example, consider the following table:

create table lock_table
(
 id int auto_increment comment 'primary key',
 value varchar(64) null comment 'locked resource',
 version int default 0 null comment 'version',
 constraint lock_table_pk
  primary key (id)
);

Under normal circumstances, if there is no version number, our modification statement should be:

update lock_table set value = #{newValue} where id = #{id};

However, under concurrency, the old value may already have been changed by another transaction by the time we write the new one. MySQL's default isolation level is REPEATABLE READ (RR), which means a transaction reads the same snapshot every time, so it will not notice modifications made by other transactions. As a result, the update may silently overwrite someone else's change: instead of changing the value from the "abc" we originally read to "newValue", we may actually be overwriting some "unknownValue" written in between. This phenomenon is known as a lost update, one of the classic problems of concurrent transactions.

With the version number added, the SQL statement to update the value would then become:

update lock_table set value = #{newValue}, version = #{version} + 1 where id = #{id} and version = #{version};

If the data has been modified by another transaction in the meantime, the stale version number we are holding no longer matches any row, so the update affects zero rows and fails.
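In practice, the version-guarded update is usually wrapped in a small retry loop. Here is a rough JDBC sketch, assuming the lock_table defined above and a DataSource configured elsewhere; the retry budget of three attempts is arbitrary.

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OptimisticLockUpdater {
    /** Tries the version-guarded update a few times; returns true once it succeeds. */
    public boolean update(DataSource ds, int id, String newValue) throws SQLException {
        for (int attempt = 0; attempt < 3; attempt++) {
            try (Connection conn = ds.getConnection()) {
                int currentVersion;
                try (PreparedStatement select = conn.prepareStatement(
                        "select version from lock_table where id = ?")) {
                    select.setInt(1, id);
                    try (ResultSet rs = select.executeQuery()) {
                        if (!rs.next()) {
                            return false;              // row does not exist
                        }
                        currentVersion = rs.getInt("version");
                    }
                }
                try (PreparedStatement update = conn.prepareStatement(
                        "update lock_table set value = ?, version = version + 1 "
                                + "where id = ? and version = ?")) {
                    update.setString(1, newValue);
                    update.setInt(2, id);
                    update.setInt(3, currentVersion);
                    if (update.executeUpdate() == 1) {
                        return true;                   // the CAS-style update succeeded
                    }
                    // 0 rows matched: another transaction bumped the version first, so retry.
                }
            }
        }
        return false;
    }
}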

The optimistic lock, unlike the table record method, does not rely on the database's own locking mechanism to detect conflicts, so it adds little overhead. However, it requires an extra column in the table, which adds some redundancy to the schema. Furthermore, under high concurrency the 'version' value changes frequently, causing many requests to fail and hurting availability. Optimistic locks are therefore generally used in read-heavy, write-light scenarios where concurrency is not very high.

Pessimistic Lock 

The meaning of a pessimistic lock is easy to understand: it always assumes the worst case, that someone will modify the data every time we fetch it, so it locks the data on every read. Anyone else who wants the data is blocked until they can acquire the lock. To use pessimistic locks in MySQL, you need to turn off the default auto-commit mode, i.e. set autocommit=0.

Pessimistic locks come in two forms: shared locks and exclusive locks. A shared lock is a lock on a resource that different transactions can hold at the same time; appending "lock in share mode" to a query adds a shared lock to the selected rows. Once one transaction holds a shared lock, other transactions can still add their own shared locks but cannot add an exclusive lock. An exclusive lock, by contrast, can be held by only one transaction at a time; you add it by appending "for update" to the query. For the scenario above, the statement to lock the row we are querying would be:

select id, value from lock_table where id = #{id} for update;

At this point, if other transactions try to modify this row, they will be blocked until the current transaction commits, which keeps the data safe. The drawback of pessimistic locks is that every request incurs the overhead of taking a lock, and requests that have not acquired the lock block while waiting for it. Under high concurrency this can leave a large number of requests blocked and hurt system availability.
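A minimal JDBC sketch of this flow, assuming a DataSource configured elsewhere: auto-commit is turned off, the row is locked with "for update", and the commit (or rollback) releases the lock.

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PessimisticDbLock {
    public void doWithRowLock(DataSource dataSource, int id) throws SQLException {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);              // same effect as set autocommit=0
            try (PreparedStatement ps = conn.prepareStatement(
                    "select id, value from lock_table where id = ? for update")) {
                ps.setInt(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        // The row is now locked; other transactions that ask for a lock
                        // on it will block until we commit or roll back.
                        // ... business code operating on the locked resource ...
                    }
                }
                conn.commit();                      // commit releases the row lock
            } catch (SQLException e) {
                conn.rollback();                    // rollback also releases the lock
                throw e;
            }
        }
    }
}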

In addition, if the query does not filter on a primary key (or an indexed column), or the condition is not selective (like id > 0), and rows are matched, MySQL will lock far more than a single row, effectively locking the whole table. And even when we do filter on the primary key and expect a row lock, MySQL may decide not to use the index if it estimates that a full table scan is faster, which typically happens when the table is small.

Implementing Distributed Locks with Redis 

Next, let's talk about implementing distributed locks with Redis. This is far more common than using a database: databases are relatively fragile, and putting high-concurrency pressure on them can easily bring them down. The classic command for implementing a distributed lock in Redis is setnx, which stands for "set if not exists". As shown below, if the key does not exist, the command sets it and returns 1; otherwise it does nothing and returns 0.

127.0.0.1:6379> setnx name aa
(integer) 1
127.0.0.1:6379> get name
"aa"
127.0.0.1:6379> setnx name bb
(integer) 0

Since a lock generally needs an expiration time, the expire command can be used to set one. However, if you call setnx first and then expire, the two operations are separate and not atomic: if something goes wrong in between, for example the client crashes before the expire command runs, the lock will never be released. There are two solutions to this problem. One is to wrap the setnx and expire commands in a Lua script; Redis executes a Lua script atomically, so the two commands either take effect together or not at all. The other is to use a different command:

set key value [EX seconds][PX milliseconds][NX|XX]

  • EX seconds: Set expiration time in seconds 
  • PX milliseconds: Set expiration time in milliseconds 
  • NX: Set the value only when the key does not exist 
  • XX: Set the value only when the key exists
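For example, with the Jedis client the second method might look like the following sketch; the key, request ID, and expiration are illustrative, and any client that exposes the SET options works the same way.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLock {
    /** Tries to acquire the lock; returns true only if the key did not exist before. */
    public boolean tryLock(Jedis jedis, String lockKey, String requestId, int seconds) {
        // NX = only set if the key is absent, EX = expire after the given number of seconds;
        // both happen in a single atomic SET command.
        String result = jedis.set(lockKey, requestId, SetParams.setParams().nx().ex(seconds));
        return "OK".equals(result);
    }
}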

The second method is more common and is recommended, but it is still not absolutely safe. Consider an example: thread A acquires a lock with the key 'lock_key' and an expiration time of 10 seconds, then goes off to do its work. At the 10-second mark the key expires, but A has not finished yet; because of the expiration, thread B successfully acquires the lock. At the 11-second mark A finishes and releases the lock, but by now the lock actually belongs to B, so A ends up deleting B's lock. The result is that between the 10th and 11th second both threads hold the lock and run their business code, and the final release deletes the wrong lock. Here is the pseudocode:

// acquire the lock: requestId identifies the caller, 10 is the expiration in seconds
redisService.set("lock_key", requestId, 10);
// business code
doSomething();
// release the lock
redisService.release("lock_key");

The root cause of the wrong release above is that the lock is deleted purely by key, without checking whether the value still belongs to the current requester. If, before deleting, we check that the value is still the one we originally set, we can prevent releasing someone else's lock. Again, a Lua script is the right tool here, because "check the value, then delete the key" is two steps that must be atomic. As for how to write the script, you can look it up yourself; I honestly have not studied Lua yet, and I won't copy it from other people's articles.

As for the other problem mentioned above, where two threads hold the lock at the same time between the 10th and 11th seconds: this does not meet our original requirement, since we designed the lock precisely so that only one thread can hold it at a time. If the business has not finished within the lock's validity period, it would be more appropriate to extend the lock until the work is done and only then release it. This is where a common Redis tool, Redisson, comes in. It uses a mechanism called the watchdog to keep postponing the lock's expiration. Its behaviour is controlled by the lockWatchdogTimeout property, which defaults to 30000 ms. This parameter only takes effect when the lock is acquired without an explicit leaseTime, for example via lock() or the tryLock(waitTime, unit) variant; as shown below, the underlying implementation passes -1 for leaseTime in that case.

public RFuture<Boolean> tryLockAsync(long threadId) {
    // the second parameter is leaseTime
    return this.tryAcquireOnceAsync(-1L, -1L, (TimeUnit) null, threadId);
}

The following is the explanation of this parameter in the Redisson documentation. Note that the watchdog only renews the lock while the client holding it is alive; if the client crashes or can no longer renew, the lock still expires after lockWatchdogTimeout (30 seconds by default), so even this cannot absolutely guarantee that two threads will never hold the lock at the same time. In actual production, if a piece of business holds a lock for a minute and still has not finished, you should consider whether to extend the expiration time, or whether the code or the environment itself is unstable.

lockWatchdogTimeout Default value: 30000

Lock watchdog timeout in milliseconds. This parameter is only used if the lock has been acquired without leaseTimeout parameter definition. The lock will expire after the lockWatchdogTimeout if the watchdog did not extend it to the next lockWatchdogTimeout time interval. This prevents against infinity locked locks due to Redisson client crash or any other reason when the lock can’t be released in a proper way.
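To make the watchdog behaviour concrete, here is a minimal usage sketch; the Redis address and the 5-second wait time are assumptions for illustration. Because no leaseTime is passed, Redisson keeps extending the lock while this client is alive.

import java.util.concurrent.TimeUnit;
import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class WatchdogExample {
    public static void main(String[] args) throws InterruptedException {
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379"); // assumed address
        RedissonClient redisson = Redisson.create(config);

        RLock lock = redisson.getLock("lock_key");
        // No leaseTime argument, so the watchdog keeps renewing the lock's expiration.
        if (lock.tryLock(5, TimeUnit.SECONDS)) {
            try {
                // business code; it may safely run longer than lockWatchdogTimeout
            } finally {
                lock.unlock();
            }
        }
        redisson.shutdown();
    }
}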

In addition to the mis-deletion problem above, there is another issue. In production, Redis is usually not a single machine but a cluster. Under sentinel mode, for example, a client may acquire a lock on the master node, and the master may go down before the lock key has been replicated to its slaves. A failover then promotes a slave to be the new master, and another client can acquire the same lock on it, so two clients end up holding the same lock at the same time. For this reason, the Redis website itself has an article on implementing distributed locks with Redis, whose core algorithm is called RedLock. It assumes 5 independent Redis master nodes, deployed on different machines or virtual machines, and the client then acquires the lock with the following steps:

  1. Get the current millisecond timestamp 
  2. Try to acquire the lock on each of the 5 instances in turn, using the same key and random value. In this step, each request to an individual instance must use a timeout that is much smaller than the lock's auto-release time; for example, if the lock auto-releases after 10 seconds, the timeout could be 5~50 milliseconds. This prevents the client from staying blocked on an instance that is down; if an instance is unavailable, we should move on to the next one as soon as possible. 
  3. The client calculates how long acquiring the lock took, i.e. the difference between the current time and the timestamp obtained in step 1. The lock is considered successfully acquired only if the client obtained it on a majority of instances (N/2+1; here N=5, so 3) and the total acquisition time is less than the lock's validity time. 
  4. If the lock is successfully acquired, its actual validity time is the initial validity time minus the time spent as calculated in step 3. 
  5. If the client fails to acquire the lock for some reason (it did not get the lock on N/2+1 instances, or the remaining validity time is negative), it tries to release the lock on all instances (even those where it did not manage to lock anything). 

Redisson also ships a RedLock implementation, RedissonRedLock, but it has a significant performance cost, so it is generally not adopted unless truly necessary; this extreme situation can often be handled instead by changing the business design or using other technical solutions.
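For reference, here is a minimal usage sketch of RedissonRedLock; the three RedissonClient instances are assumed to be created elsewhere, each pointing at an independent Redis master as the RedLock steps require.

import org.redisson.RedissonRedLock;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;

public class RedLockExample {
    public void runWithRedLock(RedissonClient redisson1,
                               RedissonClient redisson2,
                               RedissonClient redisson3) {
        RLock lock1 = redisson1.getLock("lock_key");
        RLock lock2 = redisson2.getLock("lock_key");
        RLock lock3 = redisson3.getLock("lock_key");

        // The combined lock is considered acquired only when a majority
        // of the underlying locks are acquired.
        RedissonRedLock redLock = new RedissonRedLock(lock1, lock2, lock3);
        redLock.lock();
        try {
            // business code protected by the red lock
        } finally {
            redLock.unlock();
        }
    }
}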

Implementing distributed locks using ZooKeeper 

Next, let's introduce the last way to implement distributed locks: ZooKeeper. Everyone is surely familiar with it, though many developers only use it together with Dubbo as a registry. Its capabilities go far beyond that: it is a typical solution for distributed data consistency, and on top of it distributed applications can implement data publishing/subscription, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, distributed queues, and more. Here I will first cover two basic concepts that come up later when introducing distributed locks. ZooKeeper of course has many other concepts and features, but they are not the focus of this article, so I won't cover them here.

Basic concepts 

Data node: znode

The first is the data node, or znode. In ZooKeeper there are actually two kinds of nodes: the machines that make up the cluster, which we call machine nodes, and the so-called data nodes, znodes, which are the data units in the data model. ZooKeeper keeps all data in memory. The data model is a tree whose paths are separated by slashes, and each path corresponds to a znode. Every znode stores its own data content along with a series of attributes.

Znodes fall into two categories. Persistent nodes, once created, stay on ZooKeeper until they are explicitly removed. Temporary (ephemeral) nodes, as the name suggests, have a life cycle bound to the client session: once the session ends, all temporary nodes created by that client are removed.

In addition, ZooKeeper allows a special attribute to be set on a node: SEQUENTIAL. When a node with this attribute is created, ZooKeeper automatically appends a monotonically increasing integer, maintained by the parent node, to its name. Both persistent and temporary nodes can be sequential, giving persistent sequential nodes and temporary sequential nodes.

Version 

For each znode, ZooKeeper maintains a Stat data structure. Stat records three version numbers for the znode: version (the version of the znode's data), cversion (the version of its child nodes), and aversion (the version of its ACL, ZooKeeper's Access Control List).

Event Listener Watcher 

The other concept is the Watcher, an event listener. ZooKeeper allows users to register Watchers on specific nodes; when certain events are triggered, the ZooKeeper server notifies the interested clients. This is an important building block of ZooKeeper's distributed coordination services.

Lock implementation method 

After introducing the basic concepts, let’s start to implement distributed locks using ZooKeeper. ZooKeeper represents a lock through a data node. Next, let’s talk about several ways to implement locks with ZooKeeper, also according to the classification of optimistic locks and pessimistic locks.

Optimistic Lock

The main idea of an optimistic lock is the CAS algorithm, so with ZooKeeper we likewise rely on the version number.

https://cdn-images-1.medium.com/max/1600/1*Rlf7boSL_BBnlppKIJ6h3Q.png

Every time data is updated, it will carry a version number. If the version number is inconsistent with the current one, the update will fail, and it needs to try to update again. This part of the logic is relatively simple so it’s not described in detail.
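As a rough illustration, here is how such a version-guarded update might look with the official ZooKeeper Java client; the path is an assumption, and the caller is expected to retry when the update fails.

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ZkOptimisticUpdate {
    /** Returns true if the update won the CAS race on the node's version. */
    public boolean casUpdate(ZooKeeper zk, String path, byte[] newData)
            throws KeeperException, InterruptedException {
        Stat stat = new Stat();
        zk.getData(path, false, stat);                    // read data and current version
        try {
            zk.setData(path, newData, stat.getVersion()); // fails if the version has moved on
            return true;
        } catch (KeeperException.BadVersionException e) {
            return false;                                 // someone updated first; caller retries
        }
    }
}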

Pessimistic Lock

Exclusive Lock

The concept of an exclusive lock was explained when introducing how databases implement distributed locks. It is also referred to as a write lock or X lock, and it is a basic type of lock. If transaction T1 adds an exclusive lock to data object O1, then for the duration of the lock only T1 is allowed to read or update O1; no other transaction can perform any operation on the object until T1 releases the lock. The following is a schematic diagram of an exclusive lock:

https://cdn-images-1.medium.com/max/1600/0*iQQfcB31l6cGF9pK.jpg

When you need to obtain an exclusive lock, all clients will try to call the create() interface to create a temporary subnode /exclusive_lock/lock under the /exclusive_lock node. ZooKeeper guarantees that among all clients, only one client can create successfully, and this client is deemed to have obtained the lock. Meanwhile, all clients who have not obtained the lock need to register a Watcher listener for subnode changes on the /exclusive_lock node, so as to monitor the changes of the lock node in real-time.

Lock release occurs in two situations: one is when the client machine currently holding the lock goes down, the temporary node on ZooKeeper will be removed. Another situation is after the normal execution of business logic, the client will actively delete the temporary node it created. No matter when the lock node is removed, ZooKeeper will notify all clients that have registered the Watcher listener for subnode changes on the /exclusive_lock node. These clients, upon receiving the notification, re-initiate the acquisition of the distributed lock, repeating the “acquiring lock” process.
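Below is a simplified sketch of this flow using the raw ZooKeeper Java client. It assumes the parent node /exclusive_lock already exists, and, as a small simplification, it watches the lock node itself via exists() rather than registering for child changes on the parent.

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class ZkExclusiveLock {
    private static final String LOCK_PATH = "/exclusive_lock/lock";
    private final ZooKeeper zk;

    public ZkExclusiveLock(ZooKeeper zk) {
        this.zk = zk;
    }

    /** Blocks until this client manages to create the ephemeral lock node. */
    public void lock() throws KeeperException, InterruptedException {
        while (true) {
            try {
                // Ephemeral: the node is removed automatically if this client's session dies.
                zk.create(LOCK_PATH, new byte[0],
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                return;                                   // lock acquired
            } catch (KeeperException.NodeExistsException e) {
                // Someone else holds the lock: watch the node and wait for its deletion.
                CountDownLatch latch = new CountDownLatch(1);
                Stat stat = zk.exists(LOCK_PATH, event -> {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        latch.countDown();
                    }
                });
                if (stat != null) {
                    latch.await();                        // woken up when the lock node is deleted
                }
                // Loop back and try to create the node again.
            }
        }
    }

    public void unlock() throws KeeperException, InterruptedException {
        zk.delete(LOCK_PATH, -1);                         // -1 means "any version"
    }
}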

In general, the entire process of the exclusive lock can be represented by the following diagram:

https://cdn-images-1.medium.com/max/1600/0*kfDpInQJSpmb4FMm.jpg

Shared Lock

Next, let’s talk about shared locks, also known as read locks. If transaction T1 adds a shared lock to data object O1, the current transaction can only perform read operations on O1, and other transactions can only add shared locks to this data object until all shared locks on this data object are released. Similar to the exclusive lock, a lock is represented by a data node on ZooKeeper, which is a temporary sequential node similar to “/shared_lock/[Hostname]-request type-serial number”. This node represents a shared lock.

https://cdn-images-1.medium.com/max/1600/0*kCwnDr-kzLvEy_gZ.jpg

When acquiring a shared lock, all clients create a temporary sequential node under the /shared_lock node. A read request creates a node such as /shared_lock/192.168.0.1-R-000000001, while a write request creates a node such as /shared_lock/192.168.0.1-W-000000002. Newly created nodes are ordered by the sequence number ZooKeeper appends to their names, so by looking at the sequence we can tell the order in which requests arrived and grant locks accordingly.

A read request is considered to have obtained the lock as long as no write request precedes it in the sequence. A write request, on the other hand, obtains the lock only when its node has the smallest sequence number; any read request queued behind a write must wait until that write operation is complete.

In terms of releasing locks, shared locks behave like exclusive locks. Release again happens in two situations: either the client machine holding the lock goes down and its temporary node is removed, or the client finishes its business logic and actively deletes the node it created. Either way, ZooKeeper notifies all clients that registered a Watcher for child changes on the /shared_lock node, and those clients re-initiate lock acquisition, repeating the "acquiring lock" process.
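In practice you rarely wire this up by hand; Apache Curator already ships read/write lock recipes built on this scheme. A minimal usage sketch, assuming a ZooKeeper server at 127.0.0.1:2181:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessReadWriteLock;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SharedLockExample {
    public static void main(String[] args) throws Exception {
        // Assumed ZooKeeper address; adjust to your environment.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessReadWriteLock rwLock =
                new InterProcessReadWriteLock(client, "/shared_lock");

        rwLock.readLock().acquire();          // shared (read) lock
        try {
            // read-only business code; other readers may hold the lock concurrently
        } finally {
            rwLock.readLock().release();
        }

        client.close();
    }
}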

The entire process of a shared lock can be represented by the following diagram:

Conclusion

Distributed locks are an important part of distributed systems, and they are necessary to ensure data consistency and reliability. Although they can be complex to implement, tools such as ZooKeeper provide robust and efficient solutions. Whether to use pessimistic locks or optimistic locks depends on the specific application and its requirements. Both types have their advantages and drawbacks, and the choice between them should be made based on a thorough understanding of their workings and implications.
