Discussion:
[redis-db] AOF behavior on NFS server
rpbear
2018-11-14 13:25:40 UTC
Hello world,
We've developed an NFS service and store the Redis AOF files on our
NFS server. We found that the AOF rewrite sometimes fails, and I'm not sure
whether this is a bug in Redis or in the NFS client.
Reproducing the issue is simple: start Redis with "appendonly yes" and with
the data directory on an NFS volume, then run:
[***@10-9-182-230 ~]# redis-cli -h 127.0.0.1
127.0.0.1:6379> SET 1 1
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 2 2
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 3 3
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 4 4
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 5 5
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 6 6
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379>
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started
127.0.0.1:6379> SET 7 7
OK
127.0.0.1:6379> SET 8 8
(error) MISCONF Errors writing to the AOF file: Stale file handle
127.0.0.1:6379> SET 9 9
(error) MISCONF Errors writing to the AOF file: Stale file handle
127.0.0.1:6379>

Eventually, writes to the AOF fail with "Stale file handle". I read the Redis
(v3.2.12) source code, and the AOF rewrite logic has these steps:
1. Redis forks, and the child writes to the file "temp-rewriteaof-1076.aof",
then renames it to "temp-rewriteaof-bg-1076.aof" once writing has finished.
2. After step 1 finishes, the parent opens "temp-rewriteaof-bg-1076.aof"
(say it gets fd = 10), renames "temp-rewriteaof-bg-1076.aof" to
"appendonly.aof", and assigns fd = 10 to aof_fd.
3. Subsequent AOF data is written to aof_fd = 10.
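
The three steps above can be sketched outside Redis as follows. This is only
a minimal Python illustration, not the Redis code itself: the helper name
rewrite_sequence and its arguments are hypothetical, and the fork is replaced
by plain sequential calls. On a filesystem whose rename keeps the file
identity, the final write through the already-open descriptor should succeed;
on a server that invalidates the handle on rename, that write would be
expected to fail with ESTALE ("Stale file handle").

```python
import errno
import os
import tempfile


def rewrite_sequence(data_dir, pid=1076):
    """Mimic the open-then-rename order described above (hypothetical helper)."""
    tmp = os.path.join(data_dir, "temp-rewriteaof-%d.aof" % pid)
    tmp_bg = os.path.join(data_dir, "temp-rewriteaof-bg-%d.aof" % pid)
    aof = os.path.join(data_dir, "appendonly.aof")

    # Step 1: the "child" writes the temp file, then renames it.
    with open(tmp, "wb") as f:
        f.write(b"rewritten\n")
    os.rename(tmp, tmp_bg)

    # Step 2: the "parent" opens the renamed file, then renames it again
    # while still holding the descriptor.
    fd = os.open(tmp_bg, os.O_WRONLY | os.O_APPEND)
    try:
        os.rename(tmp_bg, aof)

        # Step 3: later appends go through the descriptor opened before
        # the second rename.
        try:
            os.write(fd, b"appended\n")
            os.fsync(fd)
            return "ok"
        except OSError as e:
            if e.errno == errno.ESTALE:
                return "stale"
            raise
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Point this at a directory on the NFS mount to exercise the server;
    # a local temporary directory is used here only so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as d:
        print(rewrite_sequence(d))
```

Running it in a loop against a directory on the NFS volume might show how
often the stale handle appears without involving Redis at all.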

The problem is that the rename semantics of our NFS server change the file
handle of the renamed file, so after the rename, fd = 10 refers to a
dangling file and every AOF write should fail. Surprisingly, the writes
sometimes succeed and sometimes fail. When they succeed, I found that the
NFS client sends the file handle of "temp-rewriteaof-1076.aof" instead of
that of "temp-rewriteaof-bg-1076.aof". But judging from the Redis code, it
seems impossible for it to behave like this.
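
One way to narrow this down from user space, without reading the kernel
client, might be to compare what the client reports for the open descriptor
with what it reports for the new path after a rename. The sketch below is a
rough probe under stated assumptions, not a diagnosis: same_file_after_rename
is a hypothetical helper, and device/inode numbers are only the client's view
of the file, not the NFS file handle itself. Attribute caching on the client
may also hide a stale handle, so a True result is not proof that writes will
succeed.

```python
import errno
import os
import tempfile


def same_file_after_rename(directory):
    """Open a file, rename it, and compare the fd's identity with the new path.

    Returns True or False for the (st_dev, st_ino) comparison, or the
    string "stale" if fstat on the open descriptor fails with ESTALE.
    """
    old = os.path.join(directory, "probe-old")
    new = os.path.join(directory, "probe-new")
    with open(old, "wb") as f:
        f.write(b"x")

    fd = os.open(old, os.O_WRONLY | os.O_APPEND)
    try:
        os.rename(old, new)
        try:
            by_fd = os.fstat(fd)
        except OSError as e:
            if e.errno == errno.ESTALE:
                return "stale"
            raise
        by_path = os.stat(new)
        return (by_fd.st_dev, by_fd.st_ino) == (by_path.st_dev, by_path.st_ino)
    finally:
        os.close(fd)
        # Remove whichever probe file is left behind.
        for p in (old, new):
            if os.path.exists(p):
                os.unlink(p)


if __name__ == "__main__":
    # Replace the temporary directory with a path on the NFS mount to test it.
    with tempfile.TemporaryDirectory() as d:
        print(same_file_after_rename(d))
```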

Has anyone run into the same problem? I have no experience reading the NFS
client code, which lives in the kernel, and I'm out of ideas for now.
Thanks in advance for any suggestions :-).