Apache Zeppelin is an open source tool for interactive data analytics across many data sources, including relational databases, Hive, Spark, Python, HDFS, SAP HANA and more.
It allows for:
- Data Ingestion
- Data Discovery
- Data Analytics
- Data Visualization & Collaboration
SAP HANA is an in-memory data platform for storing and analyzing data. At its core it is a columnar database that can also store tables in row format.
Install Apache Zeppelin from https://zeppelin.apache.org/download.html with all interpreters. Once Zeppelin is installed, go to the bin directory and run
./zeppelin-daemon.sh start
You can view the Zeppelin interface at
Click on the menu to go to interpreter setup.
Choose the Create button.
Pick the JDBC interpreter group in the pull down.
Fill in the HANA properties with your own server name and port, and name the interpreter hana.
The important parts are:
Save the interpreter with the save button.
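As a rough sketch of what the interpreter settings usually look like, the snippet below builds the property names used by Zeppelin's generic JDBC interpreter (default.driver, default.url, default.user, default.password) together with the HANA driver class com.sap.db.jdbc.Driver and the jdbc:sap:// URL form. The host name, port, user and password are hypothetical placeholders, not values from this setup, so substitute your own.

```python
# Sketch of the properties one might enter for a HANA JDBC interpreter in Zeppelin.
# Host, port, user and password below are hypothetical placeholders.

def hana_interpreter_properties(host, port, user, password):
    """Return Zeppelin JDBC interpreter properties for a HANA connection."""
    return {
        "default.driver": "com.sap.db.jdbc.Driver",  # driver class inside ngdbc.jar
        "default.url": f"jdbc:sap://{host}:{port}",  # HANA JDBC URL form
        "default.user": user,
        "default.password": password,
    }

props = hana_interpreter_properties("hana.example.com", 30015, "MY_USER", "MY_PASSWORD")
for name in ("default.driver", "default.url", "default.user"):
    print(f"{name} = {props[name]}")
```

Each key and value pair corresponds to one row in the interpreter's properties table.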
Install JDBC Driver
Find the jar file called ngdbc.jar and place it in
The ngdbc.jar may be embedded in another jar file. You will need to install SAP HANA Studio or the SQL client drivers. To find ngdbc.jar, I had to unzip the com.sap.ndb.studio.jdbc_2.3.6.jar file by changing its extension to .zip. Once it was unzipped, I found ngdbc.jar in the lib directory.
com.sap.ndb.studio.jdbc_2.3.6.jar was found in a directory called
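Because a jar file is an ordinary zip archive, the driver can also be pulled out with a short script instead of renaming the file. The following is a minimal sketch using Python's standard zipfile module; the function name extract_driver and the destination directory are illustrative assumptions, and the archive is searched by file name so the driver's exact location inside it does not matter.

```python
import zipfile
from pathlib import Path, PurePosixPath

def extract_driver(archive_path, dest_dir, driver_name="ngdbc.jar"):
    """Extract the first entry named driver_name from a jar/zip archive.

    Returns the path of the extracted file, or None if no entry matches.
    """
    with zipfile.ZipFile(archive_path) as archive:
        for entry in archive.namelist():
            # Zip entries always use forward slashes; compare the file name only,
            # so an entry such as lib/ngdbc.jar is matched as well.
            if PurePosixPath(entry).name == driver_name:
                return Path(archive.extract(entry, path=dest_dir))
    return None
```

For example, extract_driver("com.sap.ndb.studio.jdbc_2.3.6.jar", "hana-driver") would write the driver below the hana-driver directory, keeping the internal folder structure of the archive.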
Make a New Note
Go to the menu and create a new note.
Pick hana as your default interpreter and give your note a name.
This will create a workspace where you can create your tables and graphs from data in HANA and also annotate your report with text. It is also possible to pull in data from other database systems you have already configured.
Now enter your query in the text area called the paragraph.
Click the arrow to run the query and then you can select what kind of graph to display.
Now you can add additional paragraphs with markdown text in them to describe your information.
We have now successfully configured Apache Zeppelin to pull data from HANA using a JDBC driver.
How to Use HANA from Spark
You may also want to connect to HANA directly from Spark using Scala, Python or R code. The best way to do this is to reference the JDBC jar, ngdbc.jar, in the Spark interpreter. Go to the Zeppelin settings menu.
Pick Interpreter, find the Spark section and press edit.
Then add the path where you have the SAP HANA JDBC driver, ngdbc.jar, installed.
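Once the driver is visible to the Spark interpreter, a HANA table can be read through Spark's generic JDBC data source. The sketch below is one way to do that from a Python paragraph; it assumes the spark session object that Zeppelin provides in Spark paragraphs, and the helper names, host, port, table and credentials are hypothetical placeholders.

```python
def hana_jdbc_options(host, port, table, user, password):
    """Build the option map for Spark's JDBC data source (placeholder values)."""
    return {
        "url": f"jdbc:sap://{host}:{port}",   # HANA JDBC URL form
        "driver": "com.sap.db.jdbc.Driver",   # driver class inside ngdbc.jar
        "dbtable": table,                     # table or view to load
        "user": user,
        "password": password,
    }

def read_hana_table(spark, options):
    """Load a HANA table as a Spark DataFrame through the JDBC source."""
    return spark.read.format("jdbc").options(**options).load()

# Usage inside a Zeppelin Spark paragraph, where `spark` already exists:
# opts = hana_jdbc_options("hana.example.com", 30015, "MY_SCHEMA.MY_TABLE",
#                          "MY_USER", "MY_PASSWORD")
# df = read_hana_table(spark, opts)
# df.show()
```

Keeping the options in a plain dictionary makes it easy to reuse the same connection settings for several tables.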
This will get you started on using Apache Zeppelin with SAP HANA. We have further tutorials on how to build interfaces in Zeppelin:
- Zeppelin Maps the Easy Way
- Zeppelin Maps the Hard Way
- Using Zeppelin to Explore a Database
- SAP HANA Query Builder On Apache Zeppelin Demo
- Visualizing HANA Graph with Zeppelin