Posts

HANA Zeppelin Query Builder with Map Visualization

SAP HANA Query Builder On Apache Zeppelin Demo

HANA Zeppelin Query Builder with Map Visualization

HANA Zeppelin Query Builder with Map Visualization

In working with Apache Zeppelin I found that users wanted a way to explore data and build charts without needing to know SQL right away. This is an attempt to build a note in Zeppelin that would allow a new data scientist to get familiar with the data structure of their database. And it allows them to build simple single table queries that allow for building charts and maps quickly. In addition it shows the SQL used to perform the work.

Demo

This video will demonstrate how it works. I have leveraged work done by Randy Gelhausen’s query builder post on how to make a where clause builder.  I also used Damien Sorel’s jQuery Query Builder. These were used to make a series of paragraphs to lookup tables and columns in HANA and allow the user to build a custom query. This data can be quickly graphed using the Zeppelin Helium visualizations.

The Code

This is for those data scientists and coders that want to replicate this in their Zeppelin.

Note that this code is imperfect as I have not worked out all the issues with it. You may need to make changes to get it to work. It only works on Zeppelin 0.8.0 Snapshot. It is also made to work with SAP HANA as the databases.

It only has one type of aggregation – sum and it does not have a way to perform a having statement. But these features could easily be added.

This Zeppelin note is dependent on code from a previous post. Follow the directions in Using Zeppelin to Explore a Database first.

Paragraph One

%spark
//Get list of columns on a given table
def columns1(table: String) : Array[(String)] = {
 sqlContext.sql("select * from " + table + " limit 0").columns.map(x => x.asInstanceOf[String])
}

def columns(table: String) : Array[(String, String)] = {
 sqlContext.sql("select * from " + table + " limit 0").columns.map(x => (x, x))
}

def number_column_types(table: String) : Array[String] = {
 var columnType = sqlContext.sql("select column_name from table_columns where table_name='" +
    table + "' and data_type_name = 'INTEGER'")
 
 columnType.map {case Row(column_name: String) => (column_name)}.collect()
}

// set up the tables select list
val tables = sqlContext.sql("show tables").collect.map(s=>s(1).asInstanceOf[String].toUpperCase())
z.angularBind("tables", tables)
var sTable ="tables"
z.angularBind("selectedTable", sTable)


z.angularUnwatch("selectedTable")
z.angularWatch("selectedTable", (before:Object, after:Object) => {
 println("running " + after)
 sTable = after.asInstanceOf[String]
 // put the id for paragraph 2 and 3 here
 z.run("20180109-121251_268745664")
 z.run("20180109-132517_167004794")
})


var col = columns1(sTable)
col = col :+ "*"
z.angularBind("columns", col)
// hack to make the where clause work on initial load
var col2 = columns(sTable)
var extra = ("1","1")
col2 = col2 :+ extra
z.angularBind("columns2", col2)
var colTypes = number_column_types(sTable)
z.angularBind("numberColumns", colTypes)
var sColumns = Array("*")
// hack to make the where clause work on initial load
var clause = "1=1"
var countColumn = "*"
var limit = "10"

// setup for the columns select list
z.angularBind("selectedColumns", sColumns)
z.angularUnwatch("selectedColumns")
z.angularWatch("selectedColumns", (before:Object, after:Object) => {
 sColumns = after.asInstanceOf[Array[String]]
 // put the id for paragraph 2 and 3 here
 z.run("20180109-121251_268745664")
 z.run("20180109-132517_167004794")
})
z.angularBind("selectedCount", countColumn)
z.angularUnwatch("selectedCount")
z.angularWatch("selectedCount", (before:Object, after:Object) => {
 countColumn = after.asInstanceOf[String]
})
// bind the where clause
z.angularBind("clause", clause)
z.angularUnwatch("clause")
z.angularWatch("clause", (oldVal, newVal) => {
 clause = newVal.asInstanceOf[String]
})

z.angularBind("limit", limit)
z.angularUnwatch("limit")
z.angularWatch("limit", (oldVal, newVal) => {
 limit = newVal.asInstanceOf[String]
})

This paragraph is Scala code that sets up some functions that are used to query the table with the list of tables and the table with the list of columns. You must have the tables loaded into Spark as views or tables in order to see them in the select lists. This paragraph performs all the binding so that the next paragraph which is Angular code can get the data built here.

Paragraph Two

%angular
<link rel="stylesheet" href="https://cdn.rawgit.com/mistic100/jQuery-QueryBuilder/master/dist/css/query-builder.default.min.css">
<script src="https://cdn.rawgit.com/mistic100/jQuery-QueryBuilder/master/dist/js/query-builder.standalone.min.js"></script>

<script type="text/javascript">
  var button = $('#generateQuery');
  var qb = $('#builder');
  var whereClause = $('#whereClause');
 
  button.click(function(){
    whereClause.val(qb.queryBuilder('getSQL').sql);
    whereClause.trigger('input'); //triggers Angular to detect changed value
  });
 
  // this builds the where statement builder
  var el = angular.element(qb.parent('.ng-scope'));
  angular.element(el).ready(function(){
    var integer_columns = angular.element('#numCol').val()
    //Executes on page-load and on update to 'columns', defined in first snippet
    window.watcher = el.scope().compiledScope.$watch('columns2', function(newVal, oldVal) {
      //Append each column to QueryBuilder's list of filters
      var options = {allowEmpty: true, filters: []}
      $.each(newVal, function(i, v){
        if(integer_columns.split(',').indexOf(v._1) !== -1){
          options.filters.push({id: v._1, type: 'integer'});
        } else if(v._1.indexOf("DATE") !== -1) {
          options.filters.push({id: v._1, type: 'date'})
        } else { 
          options.filters.push({id: v._1, type: 'string'});
        }
      });
      qb.queryBuilder(options);
    });
  });
</script>
<input type="text" ng-model="numberColumns" id="numCol"></input>
<form class="form-inline">
 <div class="form-group">
 Please select table: Select Columns:<br>
 <select size=5 ng-model="selectedTable" ng-options="o as o for o in tables" 
       data-ng-change="z.runParagraph('20180109-151738_134370871')"></select>
 <select size=5 multiple ng-model="selectedColumns" ng-options="o as o for o in columns">
 <option value="*">*</option>
 </select>
 Sum Column:
 <select ng-model="selectedCount" ng-options="o as o for o in columns">
 <option value="*">*</option>
 </select>
 <label for="limitId">Limit: </label> <input type="text" class="form-control" 
       id="limitId" placeholder="Limit Rows" ng-model="limit"></input>
 </div>
</form>
<div id="builder"></div>
<button type="submit" id="generateQuery" class="btn btn-primary" 
       ng-click="z.runParagraph('20180109-132517_167004794')">Run Query</button>
<input id="whereClause" type="text" ng-model="clause" class="hide"></input>

<h3>Query: select {{selectedColumns.toString()}} from {{selectedTable}} where {{clause}} 
   with a sum on: {{selectedCount}} </h3>

Paragraph two uses javascript libraries from jQuery and jQuery Query Builder. In the z.runParagraph  command use the paragraph id from paragraph three.

Paragraph Three

The results of the query show up in this paragraph. Its function is to generate the query and run it for display.

%spark
import scala.collection.mutable.ArrayBuffer

var selected_count_column = z.angular("selectedCount").asInstanceOf[String]
var selected_columns = z.angular("selectedColumns").asInstanceOf[Array[String]]
var limit = z.angular("limit").asInstanceOf[String]
var limit_clause = ""
if (limit != "*") {
 limit_clause = "limit " + limit
}
val countColumn = z.angular("selectedCount")
var selected_columns_n = selected_columns.toBuffer
// remove from list of columns
selected_columns_n -= selected_count_column

if (countColumn != "*") {
 val query = "select "+ selected_columns_n.mkString(",") + ", sum(" + selected_count_column +
     ") "+ selected_count_column +"_SUM from " + z.angular("selectedTable") + " where " + 
      z.angular("clause") + " group by " + selected_columns_n.mkString(",") + " " + 
      limit_clause
 println(query)
 z.show(sqlContext.sql(query))
} else {
 val query2 = "select "+ selected_columns.mkString(",") +" from " + z.angular("selectedTable") + 
      " where " + z.angular("clause") + " " + limit_clause
 println(query2)
 z.show(sqlContext.sql(query2))
}

Now if everything is just right you will be able to query your tables without writing SQL. This is a limited example as I have not provided options for different types of aggregation, advanced grouping or joins for multiple tables.

 

Please follow us on our website at https://volumeintegration.com and on twitter at volumeint.

Volume Integration Sponsors Education with Robots

Volume Integration Sponsors Education with Robots

We humans are fascinated with robots since at least the 10th century BC when Yan Shi is recorded to have created a human like automaton. Now we teach about robotics beginning in elementary school. Two popular technologies in our area are the First Lego League and the VEX Robotics Competition

Marsteller Robotics

The competition consists of moving objects into a goal with 15 seconds of autonomous operation followed by a 1 minute and 45 seconds of remote control driving. In addition the robots are paired up with a second robot to work as a team.

This process teaches the competitors social and teamwork skills in the process of building something that requires them to learn and exercise math, science, engineering and technology (STEM) skills.

Software

The robots are programmed in a language specifically created for education and robots called RobotC. It is an altered version of C. This sample from the programmer, Elizabeth, on team 14g from last year shows the autonomous sequence.

#pragma config(Motor, port7, claw, tmotorVex393_MC29, openLoop)
void clawopen (int time)
{
motor[claw]=80;
wait1Msec(time);
motor[claw]=0;
}
task autonomous()
{
if(SensorValue(button1)==1)
{
driveforward (400);
clawclose(700);
armup (1600);
drivebackwards(890);
turnleft(400);
driveforward(500);
armdown(2200);
wait1Msec(500);
clawopen (1000);
drivebackwards(1000);
}
else
{
swingleft(200);//blue
clawclose(700);
armup (1600);
drivebackwards(800);
turnright(550);
driveforward(500);
armdown(2200);
wait1Msec(500);
clawopen (1000);
drivebackwards(1000);
}
}
view rawautonomous.c hosted with ❤ by GitHub

A series of functions that the team programmer Elizabeth wrote, such as clawopen, armup and turnright are executed. These functions call internal robot functions that control motors and read from sensors. This snip of code shows the clawopen function which references the claw motor defined in the pragma at the top. The config ties the port number on the cortex to a specific motor type.

Hardware

The VEX competition requires the use of standard parts so teams are working from the same base. The brain of the robot is an item they call the cortex. It is an ARM processor with ports for motor and sensor connections. There is a joystick to control the robot and then there are sensor and motor modules. In addition there are collections of gears, wheels, belts and structural pieces to build the body.

Optical Shaft encoder to calculate distance traveled

Vex Motor 393

 

Teamwork

Each robot is built by a team of four to six students working together to design, build and program it. They also must work together during the competitions in determining strategy and managing their time as they sometimes need to make adjustments between matches.

Between matches they compete in individual events where they are judged on programming skills and robot driving skills.

Robotics Marsteller

During the preliminaries they are teamed with another robot team. So all the matches are two against two. They learn to problem solve under pressure and learn to work with the strengths and weaknesses of the random parings over many matches.

In the semi-finals of the tournament their math skills are tested as they determine the best teams to align with during the semi-finals. The top teams pick two other teams they want to compete with in the finals. It is a loud dash as teams negotiate and jockey for positions.

In the end there are multiple winners. They select the most inventive robot, the tournament champions, the second place team, programming skills winner, robot skills winner, design (quality of their engineering notebook) and judges award.

The best teams during the season go to the Virginia State competition and then the top teams go to a world championship.

Volume Integration and Volume Labs is proud to sponsor this fun and educational technological sport. Be sure and check back for updates on the progress of the Marsteller Team.

I Code Like a Girl

I Code Like a Girl

Programmers Betty Jean Jennings (left) and Fran Bilas (right) operate ENIAC's main control panel at the Moore School of Electrical Engineering. (U.S. Army photo from the archives of the ARL Technical Library)

Programmers Betty Jean Jennings (left) and Fran Bilas (right) operate ENIAC’s main control panel at the Moore School of Electrical Engineering. (U.S. Army photo from the archives of the ARL Technical Library)

The American computer industry seems dominated by people like me – males. But it was not always this way, in fact I stand on the shoulders of women.

While writing about SAGE in previous posts, I learned that at the beginning of the computer age, most programmers were women. My skills, including programming languages and tools, have a foundation built on the contributions of those who came before me. It turns out that I code like a girl, using concepts pioneered by women.

Computers: Human Women or Machine?

In the 1940’s and 50’s, there was a peak in women working as computers. Mathematically intensive industries like nuclear research, ballistics, and engineering often employed women to perform computations.

Do you remember “When Computers Were Women?” The article reminds us that a “computer” was actually a human being until around 1945. Afterward, a computer was a machine and humans were called operators.

ENAIC

In 1946, the first electronic general-purpose computer called ENIAC (Electronic Numerical Integrator And Computer) was built. This began a process of reducing and automating manual calculations.

A select number of women operators and former “computers” were enlisted to become programmers for ENIAC. Notably, the first ENIAC programming team was comprised entirely of women: Kay McNulty, Betty Jennings, Betty Snyder, Marlyn Wescoff, Fran Bilas, and Ruth Lichterman.

Women programmer School of Electrical Engineering. (U.S. Army photo from the archives of the ARL Technical Library)

School of Electrical Engineering. (U.S. Army photo from the archives of the ARL Technical Library)

At the time, there was a strong division between the male domain of hardware and the female sphere of software. Male electronic engineers built the ENAIC system. But since software design and programming were considered clerical work, women instructed the the 27-ton hand-built mass of wires and vacuum tubes to perform calculations in sequence.

In true programmer fashion, the women learned by doing. The hardware engineers dropped the blueprints and wiring documents on them and said, “Here figure out how the machine works and then figure out how to program it.” So the women crawled around the massive frame and learned how each component worked. They successfully understood the interplay between hardware and software and how the computer’s behavior could be traced to a hardware or a software issue.

Unfortunately, all the credit for creating ENIAC went to the men who conceived it and built the hardware. The media covered the debut of ENIAC in February 1946, which showed off the centerpiece calculation of a trajectory. The program created by Betty Snyder and Betty Jennings impressed the VIPs because it allowed the computer to calculate faster than the projectile itself. But the women were not mentioned, seen in pictures of the event, nor invited to the press lunch with the men. In the end, the computer was the star of the show, depicted as an autonomous brain.

Gender Code

Words powerfully describe gender roles. What is now considered a male-dominated field, was once defined as “women’s work.” In the days of the first computers, the norms were as follows:

Male Female
Hard sciences Soft sciences
Engineering Programming
Hardware Software

Society was keen on recognizing men’s contributions, while neglecting those of women. In the book, _Recoding Gender: Women’s Changing Participation in Computing_, Janet Abbate found that publicity materials for ENIAC state that the machine reduced 25 man-months of human computer time to two hours on the ENIAC. However, it fails to mention that most of the human computers were really women. The materials also neglect to highlight the years of labor by both men (on the hardware) and women (writing software) to create the system. The only human labor noted in the press was the initial design of the machine, which was performed by men.

But even women of the time seemed to define their computer jobs as gender specific. Elise Shutt was a programmer on a later version of ENIAC called ORDVAC. When she was hired by Raytheon in 1953, she said, “It really amazed me that these men were programmers, because I thought it was women’s work.”

In another example, Grace Hopper compared programming to tasks like sewing clothes, making a recipe, and the work of a mother teaching a child. Thus, she defined programming as a female occupation. But this seems to have been lost on her supervisor, Howard Aiken who said in praise of Grace, “Grace was a good man.

Recruiting materials were also used to attract women to programming with various metaphors and generalizations. In the 1940s, MIT had a shortage of men and highlighted skills such as needlework and knitting as characteristics useful for programmers. Others noted that female pursuits like crosswords and puzzles would make good programmers.

Recognition

The first women who pioneered programming on the ENIAC finally gained the recognition they deserved in 1997 when they were inducted into the Women in Technology Hall of Fame in 1997 and IEEE in 1997 and 2008.

It seems that we continue to struggle with metaphors and defining skills used to train and attract the next generation to computer work. From recent statistics, we are finding a wider gender gap in the computer industry. The reasons for this are inconclusive and give us a reason for self-evaluation and consideration of language used to hire and promote, treatment of women, and how skills are evaluated.

It becomes increasingly important to value each member of a programming team regardless of gender, age, race, or creed to attract and keep the best minds to build our future software. There seems to be no end to the amount of programming work needed. Code on!

 

Check out more of our work at Volume Integration and follow us on Twitter.