Flink Development Platform: Apache Zeppelin (Part-5) Recovering
In the last few posts, I talked about how to develop Flink jobs (via Table API, SQL, and UDFs) in Zeppelin, and how to keep a Flink job running even when the Zeppelin server is down. But there is still one pitfall: the Flink job link at the top right of the paragraph is missing after Zeppelin is restarted, and the realtime dashboard (as shown below) won't work after the restart even though the Flink job is still running. This is because the connection between the Zeppelin server and the Flink job is broken. Fortunately, the latest Zeppelin 0.9 (preview2) fixes this via recovery.
How to enable recovery
It is pretty easy to enable; there are two things you need to do.
Step 1. Configure two properties in zeppelin-site.xml
<property>
  <name>zeppelin.recovery.storage.class</name>
  <value>org.apache.zeppelin.interpreter.recovery.LocalRecoveryStorage</value>
  <description>RecoveryStorage implementation based on java native local filesystem</description>
</property>

<property>
  <name>zeppelin.recovery.dir</name>
  <value>recovery</value>
  <description>Location where recovery metadata is stored</description>
</property>
By default zeppelin.recovery.storage.class is org.apache.zeppelin.interpreter.recovery.NullRecoveryStorage, which means recovery is disabled. By setting it to org.apache.zeppelin.interpreter.recovery.LocalRecoveryStorage, Zeppelin will store interpreter process metadata in local files. You can also set it to org.apache.zeppelin.interpreter.recovery.FileSystemRecoveryStorage, which stores the recovery metadata in HDFS. zeppelin.recovery.dir configures the folder where the interpreter process metadata is stored.
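For example, if you prefer to keep the recovery metadata in HDFS, the configuration might look like the following sketch (the path /user/zeppelin/recovery is just an assumed example location, not a required one):

<property>
  <name>zeppelin.recovery.storage.class</name>
  <value>org.apache.zeppelin.interpreter.recovery.FileSystemRecoveryStorage</value>
  <description>RecoveryStorage implementation that stores metadata in HDFS</description>
</property>

<property>
  <name>zeppelin.recovery.dir</name>
  <value>/user/zeppelin/recovery</value>
  <description>HDFS folder where recovery metadata is stored</description>
</property>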
Step 2. Enable Hadoop support (optional)
If you are using FileSystemRecoveryStorage, then you need to enable Hadoop. Zeppelin 0.9 doesn't ship with Hadoop jars; instead you need to include them yourself via the following two steps (a small verification sketch follows this list):
- Set USE_HADOOP to true in zeppelin-env.sh:
export USE_HADOOP=true
- Make sure the hadoop command is on your PATH, because internally Zeppelin runs the command hadoop classpath to get all the Hadoop jars and put them on the classpath of the Zeppelin server.
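As a quick sanity check before restarting Zeppelin, you can verify that the hadoop command resolves; this is only a sketch, and the install path below is an assumed example:

# Verify the hadoop command is on PATH and prints a classpath
which hadoop
hadoop classpath

# If it is not found, add your Hadoop installation's bin directory to PATH
# (/opt/hadoop/bin is an assumed example location)
export PATH=$PATH:/opt/hadoop/bin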
Verify Recovery
Once you have made the above configuration, you need to restart Zeppelin for it to take effect. Now your interpreter process will keep running even when you stop the Zeppelin server, and after you restart Zeppelin you will see the running paragraphs recovered. Here's one screenshot where I use the Flink interpreter; you will notice that the Flink job is recovered after I restart the Zeppelin server.
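A minimal way to try this out from the command line is sketched below; it assumes a default local install with the "recovery" directory configured above, relative to the Zeppelin home directory, so adjust paths for your setup:

# Stop only the Zeppelin server; the interpreter process (and Flink job) keeps running
bin/zeppelin-daemon.sh stop

# The recovery metadata written by LocalRecoveryStorage lives in the configured directory
ls recovery/

# Start Zeppelin again; it reads the metadata and reconnects to the running interpreter
bin/zeppelin-daemon.sh start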
Summary
The Zeppelin community is still working to improve and evolve the whole user experience of Flink on Zeppelin. You can join the Zeppelin Slack to discuss with the community: http://zeppelin.apache.org/community.html#slack-channel
Besides this, I have also made a series of videos to show you how to do that; you can check them out at this youtube link.