Jdbc Interpreter on Zeppelin Tutorial
Apache Zeppelin is web-based notebook that enables data-driven,
interactive data analytics and collaborative documents with SQL, Scala and more. Jdbc interpreter can connect to any data source that supports jdbc protocol and it is widely used by many users for data analytics.
In this post, I would like to talk about how to use jdbc interpreter to connect with mysql, and other advanced features zeppelin provide to support sql-based analytics.
Outline
- Configuration
- Basic usage
- Built-in visualization
- Multiple sql statement
- Realtime dashboard
- Dynamic forms
Configure
Step 1. Create a new interpreter for your database.
By default, there’s one interpreter named jdbc
in zeppelin. You can use this interpreter directly via %jdbc
or create a new interpreter for your database. For me, I prefer to create a new interpreter for each database. Here I will create a new interpreter named mysql
like following
Step 2. Configure this interpreter
There’ re several properties you need to configure for this mysql
interpreter. Here I list several must-have properties.
- default.driver
- default.url
- default.user
- default.password
Besides that you also need to add mysql jdbc driver in the Dependencies.
Basic Usage
After the above configuration, you can use mysql
interpreter to query mysql database such as following:
Besides the basic function of running sql queries, there’s other features
- Built-in visualization
- Multiple sql statements
- Realtime dashboard
- Dynamic form
- Credential injection
Built-in Visualization
Zeppelin supports to display tabular data via 6 kinds format
- Table
- Bar Chart
- Pie Chart
- Area Chart
- Line Chart
- Scatter Chart
Multiple sql statements
You can write multiple sql statements in one paragraph, each sql statement is separated with semicolon. Here’s one example,
Realtime dashboard
Besides executing sql statement and displaying result in static dashboard, Zeppelin also support realtime dashboard by running sql queries every n seconds. There’re 2 kinds of realtime dashboard
- html based dashboard
This type of realtime dashboard is used for the sql statement that only return one row, such as the following example. refreshInterval
is for specifying the refresh time interval in milli-seconds, template
is for specifying html template. And you can use {i} as placeholder for the ith column of result.
- chart based dashboard
Chart based dashboard is used for other cases where the sql result has multiple rows, such as following example
Dynamic forms
Apache Zeppelin dynamically creates input forms via dynamic forms, so that you have custom input and more interaction with your data. Here’re 2 examples, one create textbox, another use dropdown list. You can check for more details on this link
Mysql is just one of databases that is supported by Zeppelin, you can find more database that is supported by Zeppelin here.