Jdbc Interpreter on Zeppelin Tutorial

Jeff Zhang
Analytics Vidhya
Published in
3 min readJun 30, 2020

Apache Zeppelin is web-based notebook that enables data-driven,
interactive data analytics and collaborative documents with SQL, Scala and more. Jdbc interpreter can connect to any data source that supports jdbc protocol and it is widely used by many users for data analytics.

In this post, I would like to talk about how to use jdbc interpreter to connect with mysql, and other advanced features zeppelin provide to support sql-based analytics.

Outline

  • Configuration
  • Basic usage
  • Built-in visualization
  • Multiple sql statement
  • Realtime dashboard
  • Dynamic forms

Configure

Step 1. Create a new interpreter for your database.

By default, there’s one interpreter named jdbc in zeppelin. You can use this interpreter directly via %jdbc or create a new interpreter for your database. For me, I prefer to create a new interpreter for each database. Here I will create a new interpreter named mysql like following

Step 2. Configure this interpreter

There’ re several properties you need to configure for this mysql interpreter. Here I list several must-have properties.

  • default.driver
  • default.url
  • default.user
  • default.password

Besides that you also need to add mysql jdbc driver in the Dependencies.

Basic Usage

After the above configuration, you can use mysql interpreter to query mysql database such as following:

Besides the basic function of running sql queries, there’s other features

  • Built-in visualization
  • Multiple sql statements
  • Realtime dashboard
  • Dynamic form
  • Credential injection

Built-in Visualization

Zeppelin supports to display tabular data via 6 kinds format

  • Table
  • Bar Chart
  • Pie Chart
  • Area Chart
  • Line Chart
  • Scatter Chart

Multiple sql statements

You can write multiple sql statements in one paragraph, each sql statement is separated with semicolon. Here’s one example,

Realtime dashboard

Besides executing sql statement and displaying result in static dashboard, Zeppelin also support realtime dashboard by running sql queries every n seconds. There’re 2 kinds of realtime dashboard

  • html based dashboard

This type of realtime dashboard is used for the sql statement that only return one row, such as the following example. refreshInterval is for specifying the refresh time interval in milli-seconds, template is for specifying html template. And you can use {i} as placeholder for the ith column of result.

  • chart based dashboard

Chart based dashboard is used for other cases where the sql result has multiple rows, such as following example

Dynamic forms

Apache Zeppelin dynamically creates input forms via dynamic forms, so that you have custom input and more interaction with your data. Here’re 2 examples, one create textbox, another use dropdown list. You can check for more details on this link

Mysql is just one of databases that is supported by Zeppelin, you can find more database that is supported by Zeppelin here.

References

--

--

Jeff Zhang
Analytics Vidhya

Apache Member, Open source veteran, Big Data, Data Science,