How to recover ZooKeeper from java.io.EOFException after a server crash?

The solution for me was to find the most recent log file, which was 0 bytes long, and delete it.

You will find it inside the version-2 directory of your ZooKeeper dataDir:

ls -l -r --sort=time

-rw-r--r-- 1 chris chris  67108880 Jan 24 10:37 log.23c6a70
-rw-r--r-- 1 chris chris         0 Jan 24 10:37 log.23d3fb4
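
If the directory holds many files, the empty log can also be picked out directly instead of reading through the listing. A minimal sketch, assuming GNU or BSD find; the helper name find_empty_logs is made up for illustration, and its argument is your dataDir:

```shell
# find_empty_logs (hypothetical helper): print every zero-length
# transaction log directly inside <dataDir>/version-2.
# $1 = the ZooKeeper dataDir
find_empty_logs() {
    find "$1/version-2" -maxdepth 1 -type f -name 'log.*' -size 0
}

# Example call (path is only an example):
#   find_empty_logs /var/lib/zookeeper
```

It only prints the candidates, so nothing is changed on disk.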

At first I tried deleting the snapshot and the last two log files. That also works, but you end up with a state that is "a bit" older.

-rw-r--r-- 1 chris chris  3685904 Jan 24 00:56 snapshot.23c6a6e

To be safe, you may have to delete the last snapshot file and the last log file together with the 0-length log file.

By the way, the log file and the snapshot each carry a hex suffix in their names, and the two have to correspond:

log.23c6a70

snapshot.23c6a6e

Once the remaining log and snapshot files match and are consistent with each other, the problem should be fixed.
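
As far as I understand, the suffix is the ZooKeeper transaction id (zxid) written in hexadecimal, so the two names are usually close together rather than identical; in the example above they differ by 2. A small sketch for comparing them, where zxid_of and zxid_gap are hypothetical helper names and not part of ZooKeeper:

```shell
# zxid_of (hypothetical helper): print the hex suffix of a snapshot or
# log file name as a decimal number. $1 = file name or path.
zxid_of() {
    printf '%d\n' "0x${1##*.}"
}

# zxid_gap (hypothetical helper): print how far the log's first zxid is
# from the snapshot's zxid. $1 = snapshot file, $2 = log file.
zxid_gap() {
    echo $(( $(zxid_of "$2") - $(zxid_of "$1") ))
}

# Example call:
#   zxid_gap snapshot.23c6a6e log.23c6a70
```

A small gap suggests the snapshot and the log belong together; a very large one suggests files are missing in between.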


It looks like you have encountered a known Apache ZooKeeper bug. There are a few different Apache JIRA issues related to this: ZOOKEEPER-1621 and ZOOKEEPER-2332. See the comments in those issues if you're interested in root cause analysis and some potential proposed fixes.

Unfortunately, there is no Apache ZooKeeper release that contains a fix for the bug at this time. There are a few potential workarounds that you could try:

  1. Create your own local build of ZooKeeper with one of the patches attached to the linked JIRA issues applied. Please be advised that these patches have not yet been accepted by the ZooKeeper community, so use at your own risk.
  2. Delete the offending log file. The root cause of the problem is that a log file from a prior run of ZooKeeper was written with an incomplete header. Since the header is at the start of the file, and the header itself is incomplete, we can assume that there is no transaction data in the log file after that point. Therefore, it should be safe to delete without causing any data loss.
  3. If it's easier, you might consider just reformatting this ZooKeeper cluster. This may be an appropriate solution if all of the data in your ZooKeeper installation is ephemeral and doesn't require long-term persistence.
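
For workaround 2, an incomplete header means the file is shorter than one full header. A rough sketch for spotting such files, assuming the header is 16 bytes (a 4-byte magic number, a 4-byte version and an 8-byte database id, which is my reading of the format and worth checking against your ZooKeeper version); find_truncated_logs is a hypothetical helper name:

```shell
# find_truncated_logs (hypothetical helper): print transaction logs in
# <dataDir>/version-2 that are too short to hold a complete header.
# ASSUMPTION: a complete header is 16 bytes long.
# $1 = the ZooKeeper dataDir
find_truncated_logs() {
    # "-size -16c" matches files smaller than 16 bytes, including empty ones
    find "$1/version-2" -maxdepth 1 -type f -name 'log.*' -size -16c
}

# Example call (path is only an example):
#   find_truncated_logs /var/lib/zookeeper
```

Any file it prints cannot contain transaction data after the header, which is the argument for why removing it should not lose anything.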

The solution for me was to find the 0-length log file in /hadoop/zookeeper/version-2 (or wherever your dataDir is) and delete it. Start ZooKeeper afterwards.
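
If you would rather not delete anything outright, the same fix can be sketched as a move into a backup directory. This is only a sketch: quarantine_empty_logs is a hypothetical helper name, the backup path is an example, and how you stop and start ZooKeeper depends on your installation (zkServer.sh is shown in the comments as one possibility):

```shell
# quarantine_empty_logs (hypothetical helper): move zero-length
# transaction logs out of <dataDir>/version-2 instead of deleting them.
# $1 = the ZooKeeper dataDir, $2 = backup directory (created if missing)
quarantine_empty_logs() {
    mkdir -p "$2" || return 1
    find "$1/version-2" -maxdepth 1 -type f -name 'log.*' -size 0 \
        -exec mv {} "$2"/ \;
}

# Possible order of steps (not executed here):
#   zkServer.sh stop
#   quarantine_empty_logs /hadoop/zookeeper /tmp/zk-empty-logs
#   zkServer.sh start
```

Once ZooKeeper has started cleanly and the data looks right, the backup directory can be removed.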