Validate Hibernate Search Input with an Analyzer

Stop Words

Hibernate Search lets you easily assign an @Analyzer on Fields, which are used to process terms before they are written to the index. An anlyzer can be used for instance for stemming and removing of words which are so frequent that they are insignificant for the results. These are examples for stop words:

[“a”, “an”, “and”, “are”, “as”, “at”, “be”, “but”, “by”, “for”, “if”,
 “in”, “into”, “is”, “it”, “no”, “not”, “of”, “on”, “or”, “such”, “that”,
 “the”, “their”, “then”, “there”, “these”, “they”, “this”, “to”, “was”, “will”, “with”]```


It is a common technique, to split input search terms into single keywords and use these keywords for combining a complex queries over several fields. The problem with this approach is that if a user provides such stop words as input and you manually split the input string into a list of keywords, for instance with a split method, it can occur that a stop word becomes a single keyword for Hibernate Search to process.

List keywordsList = Arrays.asList(searchKeywords.split(” “));```

Hibernate does not know what to search for an replies with this error message and also suggests a solution.

The query string 'of' applied on field 'title' has no meaningfull tokens to be matched. Validate the query input against the Analyzer applied on this field.```


In orderb to validate the string with the same analyzer as hibernate does when building the index, we can retrieve the analyzer from the class we want to search on and parse the string by providing it as a stream to the analyzer. First we get the analyser like this:

Validate.notNull(entityManager, “Entity manager can’t be null”); FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search .getFullTextEntityManager(entityManager); QueryBuilder builder = fullTextEntityManager.getSearchFactory() .buildQueryBuilder().forEntity(MyClass.class).get();```

And then we can write a simple method, which takes the whole string as input and chops the whole string into a list of keywords, which is sent through the analyzer. Thus if your input string contains a stop word and it would not be added to the index, it will also not be included in the list. Thus Hibernate won’t even try to search for it, as it never sees the stemmed word in the first place.

/**
     * Validate input against the tokenizer and return a list of terms.
     * @param analyzer
     * @param string
     * @return
     */
    public static List<String> tokenizeString(Analyzer analyzer, String string)
    {
        List<String> result = new ArrayList&lt;String&gt;();
        try
        {
            TokenStream stream = analyzer.tokenStream(null, new StringReader(string));
            stream.reset();
            while (stream.incrementToken())
            {
                result.add(stream.getAttribute(CharTermAttribute.class).toString());
            }
            stream.close();
        } catch (IOException e)
        {
            // not thrown b/c we're using a string reader...
            throw new RuntimeException(e);
        }
        return result;
    }
}```




<div class="twttr_buttons">
  <div class="twttr_twitter">
    <a href="http://twitter.com/share?text=Validate+Hibernate+Search+Input+with+an+Analyzer" class="twitter-share-button" data-via="" data-hashtags=""  data-size="default" data-url="https://blog.stefanproell.at/2017/04/13/validate-hibernate-search-input-with-an-analyzer/"  data-related="" target="_blank">Tweet</a>
  </div>
  
  <div class="twttr_followme">
    <a href="https://twitter.com/@stefanproell" class="twitter-follow-button" data-show-count="true" data-size="default"  data-show-screen-name="false"  target="_blank">Follow me</a>
  </div>
</div>

Using Hibernate Search with Spring Boot

Spring Boot is a framework, that makes it much easier to develop Spring based applications, by following a convention over configuration principle (while in contrast Spring critics claim that the framework’s principle is rather configuration over everything). In this article, I am going to provide an example how to achieve the following:

  • Create a simple Web application based on Spring Boot
  • Persist and access data with Hibernate
  • Make it searchable with Hibernate Search (Lucine)

I use Eclipse with a Gradle plugin for convenience. MySQL will be our back-end for storing the data. The full example can be obtained from my [Github Repository][1].

Bootstrapping: Create a Simple Spring Boot Webapp

The easiest way to start with Spring Boot is heading over to [start.spring.io][2] and create a new project. In this example, I will use Gradle for building the application and handling the dependencies and I add Web and JPA starters.

[][3]

Download the archive to your local drive and extract it to a folder. I called the project SearchaRoo.

Import the Project with Eclipse

Import it as an existing Gradle Project in Eclipse by using the default settings. You will end up with a nice little project structure as shown below:

[][4]

We have a central application starter class denoted SearchaRooAppication.java, package definitions, application properties and even test classes. The great thing with Spring Boot is that it is very simple to start and that you can debug it as every other local Java application. There is no need for remote debugging or complex application server setups.

Prepare the Database

We need a few permissions on our MySQL instance before we can start.

CREATE DATABASE spring_employees;
CREATE USER 'dev'@'localhost' IDENTIFIED BY 'sEcReT';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, FILE, REFERENCES, INDEX,
    ALTER, SHOW DATABASES, SUPER, LOCK TABLES, CREATE VIEW, SHOW VIEW 
    on spring_employees.* TO 'dev'@'localhost';
GRANT RELOAD on *.* TO 'dev'@'localhost';
FLUSH PRIVILEGES;```


We can then add the connection details into the application.properties file. We will edit this file several times when the complexity of this project increases.

===============================

= JPA / HIBERNATE

===============================

Specify the DBMS

spring.jpa.database = MYSQL

Show or not log for each sql query

spring.jpa.show-sql = true spring.datasource.url=jdbc:mysql://127.0.0.1/employees?createDatabaseIfNotExist=true spring.datasource.username=dev spring.datasource.password=sEcReT spring.datasource.driver-class-name=com.mysql.jdbc.Driver```

Now the basic database setup is done. We can then start adding model classes.

Getting some Employees on Board

MySQL offers a rather small but well documented sample database called employees, which is hosted on Github. Obtain and import the data as follows:

git clone https://github.com/datacharmer/test_db.git
cd test_db
mysql -u dev -p sEcReT < employees.sql

The script creates a new schema called employees and you will end up with a schema like this:

[][5]

In the course of this article, we are going to model this schema with Java POJOs by annotating the entities and the a appropriate fields with JPA.

Dependencies

Before we can start modelling the entities in Java, have a look at the Gradle build file. We include additional dependencies for the MySQL connector and Apache commons.

buildscript {
	ext {
		springBootVersion = '1.5.1.RELEASE'
	}
	repositories {
		mavenCentral()
	}
	dependencies {
		classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
	}
}

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'

jar {
	baseName = 'SearchaRoo'
	version = '0.0.1-SNAPSHOT'
}

sourceCompatibility = 1.8

repositories {
	mavenCentral()
}


dependencies {
	compile('org.springframework.boot:spring-boot-starter-data-jpa')
	compile('org.springframework.boot:spring-boot-starter-web')
	testCompile('org.springframework.boot:spring-boot-starter-test')
	compile("mysql:mysql-connector-java")
	compile('org.apache.commons:commons-lang3:3.5')
}

Modelling Reality

The next step covers modelling the data which we imported with Java POJOs. Obviously this is not the most natural way, because in general you would create the model first and then add data to it, but as we already had the data we decided to go in this direction. In the application.properties file, set the database to the imported employees database and set the Hibernate create property to validate. With this setting, we can confirm that we modelled the Java classed in accordance with the database model defined by the MySQL employees database.

An example of such a class is shown below, the other classes can be found in the Github repository.

package at.stefanproell.model;

import java.util.Date;
import java.util.List;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity(name="employees")
public class Employee {
&nbsp; &nbsp;@Id
&nbsp; &nbsp;@Column(name="emp_no")
&nbsp; &nbsp;private int employee_id;
&nbsp; &nbsp;
&nbsp; &nbsp;@Column(name="birth_date")
&nbsp; &nbsp;private Date birthdate;
&nbsp; &nbsp;
&nbsp; &nbsp;@Column(name="first_name")
&nbsp; &nbsp;private String firstname;
&nbsp; &nbsp;
&nbsp; &nbsp;@Column(name="last_name")
&nbsp; &nbsp;private String lastname;
&nbsp; &nbsp;
&nbsp; &nbsp;@Column(name="gender",columnDefinition = "ENUM('M', 'F', 'UNKNOWN') DEFAULT 'UNKNOWN'")
&nbsp; &nbsp;@Enumerated(EnumType.STRING)
&nbsp; &nbsp;private Gender gender;
&nbsp; &nbsp;
&nbsp; &nbsp;@Column(name="hire_date")
&nbsp; &nbsp;private Date hireDate;

&nbsp; &nbsp;@OneToMany(mappedBy="employee")
&nbsp; &nbsp;List<Title> titles; 
&nbsp; &nbsp;
&nbsp; &nbsp;@OneToMany(mappedBy="employee")
&nbsp; &nbsp;List<Department_Employee> employee_department;

&nbsp; &nbsp;// Setters and getters
&nbsp; &nbsp;
&nbsp; &nbsp;
}```


Now that we have prepared the data model, our schema is now fixed and does not change any more. We can deactivate the Hibernate based dynamic generation of the database tables and use the Spring database initialization instead.To see if we modelled the data correctly, we import MySQL employee data dump we obtained before and import it into our newly created schema, which maps the Java POJOs.

## Importing the Initial Data

In the next step, we import the data from the MySQL employee database into our schema spring_hibernate. This schema contains the tables that Hibernate created for us. The following script copies the data between the two schemata. If you see an error, then there is an issue with your model.

<pre class="theme:github lang:mysql decode:true">-- The original data is stored in the database called employees
-- Spring created the new schema called spring_employees
USE `spring_employees`;

-- Departments

INSERT INTO `spring_employees`.`departments`
(`dept_no`,
`dept_name`)
SELECT `departments`.`dept_no`,
    `departments`.`dept_name`
FROM `employees`.`departments`;

-- Employees

INSERT INTO `spring_employees`.`employees`
(`emp_no`,
`birth_date`,
`first_name`,
`gender`,
`hire_date`,
`last_name`)
SELECT `employees`.`emp_no`,
    `employees`.`birth_date`,
    `employees`.`first_name`,
    `employees`.`gender`,
    `employees`.`hire_date`,
    `employees`.`last_name`
FROM `employees`.`employees`;

-- Join table 
INSERT INTO `spring_employees`.`dept_emp`
(`emp_no`,
`dept_no`,
`from_date`,
`to_date`)
SELECT 
`dept_emp`.`emp_no`,
    `dept_emp`.`dept_no`,
    `dept_emp`.`from_date`,
    `dept_emp`.`to_date`
FROM `employees`.`dept_emp`;

-- Join table 

INSERT INTO `spring_employees`.`dept_manager`
(
`emp_no`,
`dept_no`,
`from_date`,
`to_date`)
SELECT `dept_manager`.`emp_no`,
    `dept_manager`.`dept_no`,
    `dept_manager`.`from_date`,
    `dept_manager`.`to_date`
FROM `employees`.`dept_manager`;

-- Titles

INSERT INTO `spring_employees`.`titles`
(`emp_no`,
`title`,
`from_date`,
`to_date`)
SELECT `titles`.`emp_no`,
    `titles`.`title`,
    `titles`.`from_date`,
    `titles`.`to_date`
FROM `employees`.`titles`;

-- Salaries

INSERT INTO `spring_employees`.`salaries`
(`emp_no`,
`salary`,
`from_date`,
`to_date`)
SELECT `salaries`.`emp_no`,
    `salaries`.`salary`,
    `salaries`.`from_date`,
    `salaries`.`to_date`
FROM `employees`.`salaries`;```


We now imported the data in the database schema that we defined for our project. Spring can load schema and initial data during start-up. So we provide two files, one containing the schema and the other one containing the data. To do that, we create two dumps of the database. One containing the schema only, the other one containing the data only.

mysqldump -u dev -psEcReT –no-data –databases spring_employees > src/main/resources/schema.sql mysqldump -u dev -psEcReT –no-create-info –databases employees > src/main/resources/data.sql```

By deactivating the Hibernate data generation and activating the Spring way, the database gets initialized every time the application starts. Change and edit the following lines in the application.properties

spring.jpa.hibernate.ddl-auto=none
spring.datasource.initialize=true
spring.datasource.schema=classpath:/schema.sql
spring.datasource.data=classpath:/data.sql```


Before we can import the data with the scripts, make sure to drop the schema and disable foreign key checks in the schema file and enable them again at the end. Spring ignores the actionable MySQL comments. So your schema file should contain this

<pre class="theme:github lang:default decode:true ">DROP DATABASE IF EXISTS `spring_employees`;
SET foreign_key_checks = 0;

// rest of the code

SET foreign_key_checks = 1;```


And also insert the two foreign key statements to the data file. Note that the import can take a while. If you are happy with the initialized data, you can deactivate the initialization by setting the variable to false: <span class="lang:default decode:true crayon-inline">spring.datasource.initialize=false</span>

The application.properties file meanwhile looks like this:

<pre class="theme:github lang:default decode:true "># ===============================
# = JPA / HIBERNATE
# ===============================

# Specify the DBMS
spring.jpa.database = MYSQL

# Show or not log for each sql query
spring.jpa.show-sql = true
spring.datasource.url=jdbc:mysql://127.0.0.1/spring_employees?createDatabaseIfNotExist=true
spring.datasource.username=dev
spring.datasource.password=sEcReT
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.connectionProperties=useUnicode=true;characterEncoding=utf-8;
spring.jpa.properties.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
# Do not initialize anything
spring.jpa.hibernate.ddl-auto=none
spring.datasource.initialize=false
spring.datasource.schema=classpath:/schema.sql
spring.datasource.data=classpath:/data.sql
spring.datasource.platform=mysql```


# Adding Hibernate Search

Hibernate search offers full-text search capabilities by using a dedicated index. We need to add the dependencies to the build file.

<pre class="theme:github lang:default decode:true ">dependencies {
	compile('org.springframework.boot:spring-boot-starter-data-jpa')
	compile('org.springframework.boot:spring-boot-starter-web')
	testCompile('org.springframework.boot:spring-boot-starter-test')
	compile("mysql:mysql-connector-java")
	compile('org.apache.commons:commons-lang3:3.5')
	compile("org.hibernate:hibernate-search-orm:5.5.6.Final")
	compile('org.springframework.boot:spring-boot-starter-test')
	compile('org.springframework.boot:spring-boot-starter-logging')
	compile('org.springframework.boot:spring-boot-starter-freemarker')
}```


Refresh the gradle file after including the search dependencies.

## Adding Hibernate Search Dependencies

In this step, we annotate the model POJO classes and introduce the full-text search index. Hibernate search utilises just a few basic settings to get started. Add the following variables to tne application properties file.

===============================

= HIBERNATE SEARCH

===============================

Spring Data JPA will take any properties under spring.jpa.properties.* and

pass them along (with the prefix stripped) once the EntityManagerFactory is

created.

Specify the DirectoryProvider to use (the Lucene Directory)

spring.jpa.properties.hibernate.search.default.directory_provider = filesystem

Using the filesystem DirectoryProvider you also have to specify the default

base directory for all indexes (make sure that the application have write

permissions on such directory)

spring.jpa.properties.hibernate.search.default.indexBase = /tmp/SearchRroo/```

Please not that storing the Lucene index in the tmp directory is not the best idea, but for testing we can use this rather futile location. We also use the filesystem to store the index, as this is the simplest approach.

Create a Service

In order to facilitate Hibernate Search on our data, we add a service class, which offers methods for searching. The service uses a configuration, which is injected by Spring during run time. The configuration is very simple.

package at.stefanproell.service;

import javax.persistence.EntityManager;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;



@Configuration
public class HibernateSearchConfiguration {
	private final Logger logger = LoggerFactory.getLogger(HibernateSearchConfiguration.class);

	@Autowired
	private EntityManager entityManager;

	@Bean
	HibernateSearchService hibernateSearchService() {
		HibernateSearchService hibernateSearchService = new HibernateSearchService(entityManager);
		hibernateSearchService.initializeHibernateSearch();
		return hibernateSearchService;
	}
}```


The @Configuration is loaded when Spring builds the application context. It provides a bean of our service, which can then be injected into the application. The service itself provides methods for creating and searching the index. In this example, the search method is very simple: it only searches on the first and the last name of an employee and it allows users to make one mistake (distance 1).

<pre class="theme:github lang:default decode:true">package at.stefanproell.service;

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.NoResultException;
import javax.persistence.PersistenceContext;

import org.apache.lucene.search.Query;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;
import org.hibernate.search.query.dsl.QueryBuilder;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;


import at.stefanproell.model.Employee;
import ch.qos.logback.classic.Logger;

@Service
public class HibernateSearchService {

&nbsp; &nbsp;private final Logger logger = (Logger) LoggerFactory.getLogger(HibernateSearchService.class);

&nbsp; &nbsp;private final EntityManager entityManager;

&nbsp; &nbsp;@Autowired
&nbsp; &nbsp;public HibernateSearchService(EntityManager entityManager) {
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;super();
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;this.entityManager = entityManager;
&nbsp; &nbsp;}


&nbsp; &nbsp;public void initializeHibernateSearch() {

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;try {
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;fullTextEntityManager.createIndexer().startAndWait();
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;} catch (InterruptedException e) {
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;// TODO Auto-generated catch block
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;e.printStackTrace();
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;}
&nbsp; &nbsp;}
&nbsp; &nbsp;
&nbsp; &nbsp;@Transactional
&nbsp; &nbsp;public List<Employee> fuzzySearch(String searchTerm){

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;QueryBuilder qb = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Employee.class).get();
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;Query luceneQuery = qb.keyword().fuzzy().withEditDistanceUpTo(1).withPrefixLength(1).onFields("firstname", "lastname")
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;.matching(searchTerm).createQuery();

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;javax.persistence.Query jpaQuery = fullTextEntityManager.createFullTextQuery(luceneQuery, Employee.class);

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;// execute search

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;List<Employee> employeeList = null;
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;try {
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;employeeList&nbsp; = jpaQuery.getResultList();
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;} catch (NoResultException nre) {
&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;logger.warn("No result found");

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;}

&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;return employeeList;

&nbsp; &nbsp;
&nbsp; &nbsp;}
}

The service implementation currently only contains an initialization method, which used for creating the Lucene index on the filesystem. Before we can test the index, we need to have at least one indexed entity. This can be achieved by simply adding the annotation @Indexed to the POJO.

Hibernate Search and Spring Boot: Building Bridges

Hibernate Search is a very convenient way for storing database content in a Lucine index and add fulltext search capabilities to data driven projects simply by annotating classes. It can be easily integrated into Spring Boot applications and as long as only the basic features are used, it works out of the box. The fun starts when the Autoconfiguration cannot find out how to properly configure things automatically, then it gets tricky quite quickly. Of course this is natural behaviour, but one gets spoiled quite quickly. 

Using the latest Features: Hibernate ORM, Hibernate Search and Spring Boot

The current version of Spring Boot is 1.5.2. This version uses Hibernate ORM 5.0. The latest stable Hibernate Search versions are 5.6.1.Final and 5.7.0.Final, which in in contrast require Hibernate ORM 5.1 and 5.2 respectively. Also you need Java 8 now. For this reason if you need the latest Spring Search features in combination with Spring Boot, you need to adapt the dependencies as follows:

buildscript {
	ext {
		springBootVersion = '1.5.1.RELEASE'
	}
	repositories {
		mavenCentral()
	}
	dependencies {
		classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
	}
}

apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'org.springframework.boot'

jar {
	baseName = 'SearchaRoo'
	version = '0.0.1-SNAPSHOT'
}

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
	mavenCentral()
}


dependencies {
	compile('org.springframework.boot:spring-boot-starter-data-jpa')
	compile('org.springframework.boot:spring-boot-starter-web')
	testCompile('org.springframework.boot:spring-boot-starter-test')
	compile("mysql:mysql-connector-java")
	compile('org.apache.commons:commons-lang3:3.5')
	compile('org.springframework.boot:spring-boot-starter-test')
	compile('org.springframework.boot:spring-boot-starter-logging')
	compile('org.springframework.boot:spring-boot-starter-freemarker')

	// Hibernate Search
    compile("org.hibernate:hibernate-core:5.2.8.Final")
    compile("org.hibernate:hibernate-search-orm:5.7.0.Final")    
            configurations.all 
    {
    	exclude group: "org.hibernate:", module: "hibernate-entitymanager"
	}
}```


Note that the Hibernate Entity Manager needs to be excluded, because it has been integrated into the core in the new Hibernate version. Details are given in the [Spring Boot documentation][1].

## Enforcing the Dependencies to be Loaded in the Correct Sequence

As written earlier, Spring Boot takes care of a lot of configurations for us. Most of the time, this works perfectly and reduces the pain for configuring a new application manually. In some particular cases, Spring cannot figure out that there exists a dependency between different services, which needs to be resolved in a specified order. A typical use case is the implementation of FieldBridges for Hibernate Search. FieldBrides translate between the actual Object from the Java World and the representation of such an object in the Lucene index. Typically an [EnumBridge][2]is used for indexing Enums, which are often used for realizing internationalization (I18n).

When the Lucene Index is created, Hibernate checks if Enum fields need to be indexed and if there exist Bridge that converts between the object and the actual record in the Index. The problem here is that Hibernate JPA is loaded at a very early stage in the Spring Boot startup proces. The problem only arises if the BridgeClass utilises @Autowired&nbsp;fields which get injected. Typically, these fields would get injected when the&nbsp;AnnotationBeanConfigurerAspect bean is loaded.&nbsp;Hibernate creates the session with the session factory auto configuration before the&nbsp;spring configurer aspect bean was loaded. So the FieldBridge used by Hibernate during the initialization of the index does not have the service injected yet, causing a nasty Null Pointer Exception.&nbsp;

### Example EnumBridge

The following EnumBridge example utilises an injected Service, which needs to be available before Hibernate starts. If not taken care of, this causes a Null Pointer Exception.

@Configurable public class HibernateSearchEnumBridgeExample extends EnumBridge { private I18nMessageService i18nMessageService;

@Autowired
@Required
public void setI18nMessageService(I18nMessageService service) {
this.i18nMessageService = service;
}

@Override
public String objectToString(Object object)
{
     return  i18nMessageService.getMessageForEnum(object);
}


@Override
public Enum<? extends Enum> stringToObject(String name)
{
    return Enum.valueOf(name);

}

@Override
public void setAppliedOnType(Class<?> returnType)
{
    super.setAppliedOnType(returnType);
}

}```

Enforce Loading the Aspect Configurer Before the Session Factory

In order to enforce that theAnnotationBeanConfigurerAspect is created before the Hibernate Session Factory is created, we simply implement our own HibernateJpaAutoConfiguration by extension and add the AnnotationBeanConfigurerAspect to the constructor. Spring Boot now knows that it needs to instantiate the AnnotationBeanConfigurerAspect before it can instantiate the HibernateJpaAutoConfiguration and we then have wired Beans ready for the consumption of the bridge. I found the correct hint [here][3] and [here][4].

@Configuration
public class HibernateSearchConfig extends HibernateJpaAutoConfiguration {

	public HibernateSearchConfig(DataSource dataSource, JpaProperties jpaProperties,
				AnnotationBeanConfigurerAspect beanConfigurerAspect,
				ObjectProvider<JtaTransactionManager> jtaTransactionManager,
				ObjectProvider<TransactionManagerCustomizers> transactionManagerCustomizers) {

			super(dataSource, jpaProperties, jtaTransactionManager, transactionManagerCustomizers);
	}
}```


As it has turned out, using @DependsOn annotations did not work and also @Ordering the precedence of the Beans was not suffucient. With this little hack, we can ensure the correct sequence of initialization.

<div class="twttr_buttons">
  <div class="twttr_twitter">
    <a href="http://twitter.com/share?text=Hibernate+Search+and+Spring+Boot%3A+Building+Bridges" class="twitter-share-button" data-via="" data-hashtags=""  data-size="default" data-url="https://blog.stefanproell.at/2017/03/10/hibernate-search-and-spring-boot/"  data-related="" target="_blank">Tweet</a>
  </div>
  
  <div class="twttr_followme">
    <a href="https://twitter.com/@stefanproell" class="twitter-follow-button" data-show-count="true" data-size="default"  data-show-screen-name="false"  target="_blank">Follow me</a>
  </div>
</div>

 [1]: http://docs.spring.io/spring-boot/docs/1.5.2.RELEASE/reference/htmlsingle/
 [2]: https://docs.jboss.org/hibernate/search/5.7/api/
 [3]: http://blog.piotrturski.net/2014/05/spring-boot-dependson-is-not-enough.html
 [4]: https://hibernate.atlassian.net/browse/HSEARCH-1316

Deploying MySQL in a Local Development Environment

Installing MySQL via apt-get is a simple task, but the migration between different MySQL versions requires planning and testing. Thus installing one central instance of the database system might not be suitable, when the version of MySQL or project specific settings should be switched quickly without interfering with other applications. Using one central instance can quickly become cumbersome. In this article, I will describe how any number of MySQL instances can be stored and executed from within a user’s home directory.

Adapting MySQL Data an Log File Locations

Some scenarios might require to run several MySQL instances at once, other scenarios cover sensitive data, where we do not want MySQL to write any data on non-encrypted partitions. This is especially true for devices which can get easily stolen, for instance laptops. If you use a laptop for developing your applications from time to time, chances are good that you need to store sensitive data in a database, but need to make sure that the data is encrypted when at rest. The data stored in a database needs to be protected when at rest.

This can be solved with full disk encryption, but this technique has several disadvantages. First of all, full disk encryption only utilises one password. This entails, that several users who utilise a device need to share one password, which reduces the reliability of this approach. Also when the system needs to be rebooted, full disk encryption can become an obstacle, which increases the complexity further.

Way easier to use is the transparent home directory encryption, which can be selected during many modern Linux setup procedures out of the box. We will use this encryption type for this article, as it is reasonable secure and easy to setup. Our goal is to store all MySQL related data in the home directory and run MySQL with normal user privileges.

Creating the Directory Structure

The first step is creating a directory structure for storing the data. In this example, the user name is stefan, please adapt to your needs.

Flashing a NanoPc T3 with DietPi

The NanoPc T3 is a 64 bit octa core single board computer, quite similar to the famous Raspberry Pi boards. It is also often referred to as NanoPi T3 as well.

Hardware Specification

The single board computer has eight cores with up to 1.4GhZ and 1 GB of DDR3 RAM. It has a lot of nice interfaces, the specification below is taken from [here][1].

SoC: Samsung S5P6818 Octa-Core Cortex-A53, 400M Hz - 1.4G Hz
    Power Management Unit: AXP228 PMU, it supports software power-off and wake-up.
    System Memory: 1GB/2GB 32bit DDR3 RAM
    Storage: 1 x SD Card Socket
    Ethernet: Gbit Ethernet(RTL8211E)
    WiFi: 802.11b/g/n
    Bluetooth: 4.0 dual mode
    Antenna: Porcelain Antenna IPX Interface
    eMMC: 8GB
    Video Input: DVP Camera/MIPI-CSI (two camera interfaces)
    Video Output: HDMI Type-A / LVDS / Parallel RGB-LCD / MIPI-DSI (four video output interfaces)
    Audio: 3.5 mm audio jack / via HDMI
    Microphone: onboard Microphone
    USB Host: 4 x USB 2.0 Host, two type A ports and two 2.54 mm pitch pin-headers
    MicroUSB: 1 x MicroUSB 2.0 Client, Type A
    LCD Interface: 0.5mm pitch 45 pin FPC seat, full color RGB 8-8-8
    HDMI: 1.4A Type A, 1080P
    DVP Camera: 0.5mm pitch 24 pin FPC seat
    GPIO: 2.54 mm pitch 30 pin-header
    Serial Debug Port: 2.54mm pitch 4-pin-header
    User Key: K1 (power), Reset
    LED: 1 x power LED and 2 x GPIO LED
    Other Resources: CPU’s internal TMU
    RTC Battery: RTC Battery Seat
    Heat Sink: 1 x Heat Sink with mounting holes
    Power: DC 5V/2A
    PCB: Six Layer
    Dimension: 100 mm x 60 mm
    OS/Software: uboot, Android and Debian

Overview

The device offers quite a lot considering its small measurements. The picture below is an overview picture taken from [here][2].

[][3]The device with the heat sink and attached cables is shown below.

[][4]

Comparison with the Raspberry Pi Model 3B

It costs about twice as much as the Raspberry Pi 3, but comes with eight cores at 1.4GHz instead of four cores with 1.2GHz, GBit Ethernet instead of just 100 MBit and several additional interfaces. It has a dedicated power switch, supports soft poweroff and provides reset and boot buttons. It comes with an SD card slot instead of micro SD, has only two standard USB ports but also one micro USB port. This port however is not for powering the device, but only for data.

Some remarks at First

The board can get quite warm, so I would recommend buying the heat sins that fit directly on the board as well. The wifi signal is also rather weak, I would recommend investing in the external antenna if the device is in an area with low signal reception. Also it requires an external 5V power source and does not provide a micro USB port for power like similar boards use.

Buying and Additional Information

The board can be obtained for 60$ from [here][5] and there also exists a [wiki page][1] dedicated to the T3. The images are stored at a One-Click share hoster and the download is very slow. Also the files are not that well organized and can be easily confused with other platforms offered by the same company.

  • Nano PC T3 ($60)
  • Heat sink ($1.99)
  • Power supply ($20)
  • SD card (~ $10)

Additionally there is shipping ($20 to Europe) and also very likely some toll to pay.

Initial Setup

The NanoPi T3 has an internal eMMC storage with 8GB capacity. It comes pre-installed with Android, which is not really useful for my applications. Instead, there exist different ISO images wich can be obtained here. The wiki page documents how to create bootable SD cards with Windows and Linux and there are also scripts offered, which automate the process. Unfortunately, the scripts are not documented well and some of the links are already broken, which reduces the usability of the provided information. Also as the images should be downloaded from some Sharehoster, there is no way of verifying, what kind of image you actually obtained. This is a security risk and not applicable in many scenarios. Fortunately, there also exist alternative images which are more transparent to use.

By default, the device boots from the eMMC flash storage. By pressing the boot button in the lower right corner, we can also boot from the SD card. This is a nice feature, but if you want to reboot the system unattended, then we need to replace the default operating system. In the course of this article, we are going to write an alternative Debian image to the flash memory and boot this OS automatically.

DietPi

[DietPi][6] is a Debian based distribution, which claims to be an optimized and lightweight alternative for single board PCs. The number of supported devices is impressive and luckily, also the NanoPC T3 is in the list. It also comes with a list of nice features for the configuration and the backup of the system. DietPi can be dowloaded [here][7] and the documentation is available [here][8].

The following steps are requried:

  1. Download the DietPi Image
  2. Write the image to the SD card
  3. Mount the SD card on your desktop and copy the DietPi image to the card
  4. Boot the NanoPC T3 from the card
  5. Flash the DietPi image to the eMMC
  6. Reboot
  7. Configure

Creating a Bootable SD Card

The fist step involves creating a bootable SD card by writing the DietPi image with dd to the card. To do so, download the DietPi image to your local Desktop and then write the file with dd. The process does not differ from other single board machines and is described [here][9]. The next step might seem a bit odd. After you finished writing the SD card, mount it on your local Dekstop and copy the DietPi image to the tmp directory of the SD card.The reason we do this is that we need to have a running Linux system so that we can flash the integrated eMMC of the T3. We then use the DietPi Linux zu actually flash the eMMC of the T3 also with the DietPi image. By copying the image we save some time for downloading and we have the image right available in the next step.

Boot the SD Card

Make sure the T3 is powered off and insert the SD card into the board. Hold and keep pressed the boot button and flip the power switch. The T3 then should boot into the DietPi system. It is easier if you attach a monitor and a keyboard to the system for the further configuration. Alternatively, you can also configure the networking settings in advance, by mounting the SD card at the Desktop and edit the configuration files there, but as we simply use this system for installing the actual operating system, this might be a bit too much effort. Press CRTL+ALT+F2 to switch to a new TTY and login. The standard login for the DietPi system is with the user root and password dietpi.

First, create a backup of the original eMMC content, just in case anything does wrong. Use fdisk, to see the available drives.

Switch the Git Clone Protocol from HTTPS to SSH

Gitlab offers several options for interacting with remote repositories: git, http, https and ssh. The first option – git – is the native transport protocol and does not encrypt the traffic. The same applies for http, rendering https and ssh the only feasible protocols if you commit and retrieve data via insecure networks. Ssh and https are also both available via the web interfaces of Github and Gitlab. In both systems you can simply copy and paste the clone URLs including the protocol. The following screenshot shows the Github version.

[][1]

HTTPS

The simplest way to fetch the repository is to just copy the default HTTPS URL and clone it to the local drive. Git will ask you for the Github credentials.

:~/Projects$ git clone https://github.com/username/test-project.git
Klone nach 'test-project' ...
Username for 'https://github.com': 

You will be asked for the credentials every time you interact with the Github remote repository. Per default, git stores credentials for 5 minutes. Instead of waiting so long, we can just drop the credentials and proceed with an empty cache again.

git credential-cache exit```


To make our live a little easier, we can store the username. In this example, we store this information only locally, valid for this cloned repository only. The same settings can also be applied globally.

<pre class="theme:github lang:default decode:true">git config user.email "user@example.org"
git config user.name "username"

Git will store that information locally (i.e. inside the repository) in the config file:

Switching Kernels: Using Python 2.7 and Python 3.5 in Jupyter Notebooks

Jupyter Notebooks are a great way for working with Python interactively. The integration of Python code into documents is very useful for reports or for writing executable documentation of algorithms and functions. The text can be structured and exported in various formats. With the ever increasing popularity of Python based on the data science hype, more and more libraries are available. Although Python3 is considered to be the future of Python, consensus on the question Python 2.7 vs Python 3.5 is not yet reached. There are quite a few differences and Python 3 is not backwards compatible and therefore the code cannot be executed with both versions without modification. When you install Jupyter Notebooks via Anaconda, Python3 is recommended but Python 2.7 packages also exist.

As there is a large number of libraries, which have not yet been ported to Python 3, it can be useful to switch between the language version within a Jupyter Notebook. The following example assumes that you have both Python versions already installed.

Installing a new Kernel

In Jupyter Notebooks, the kernel is responsible for executing Python code. When you install the Anaconda System for Python3, this version also becomes the default for the notebooks. In order to enable Python 2.7 in your notebooks, you need to install a new kernel like this:

Getting Familiar with Eclipse Again: Git Integration in Comparison with IntelliJ IDEA

Eclipse and IntelliJ are both great Java IDEs, which have their own communities, advantages and disadvantages. After having spent a few years in JetBrains IntelliJ Community Edition, I got accustomed to the tight and clean Git integration into the user interface. Now I consider switching back up Eclipse, I stumbled over a few things that I try to describe in this post.

IntelliJ and Eclipse Handle Project Structures Differently

Eclipse utilises a workspace concept, which allows to work on several projects at the same time. IntelliJ in contrast allows only one open project and organizes substructures in modules. A comparison of these concepts can be found here. These two different viewpoints have effects on the way how Git is integrated into the workflow.

Sharing Projects

The different views on project structures of both IDEs imply that Git repositories are also treated differently. While IntelliJ utilises the root of a repository directly, Eclipse introduces a subfolder for the project. This leads to the interesting observation that importing a project from Git again into an Eclipse workspace requires a small adaption in order to let Eclipse recognize the structure again.

A Small Workflow

In order to get familiar again with Eclipse, I created a small test project, which I then shared by pushing it to a Git repository. I then deleted the project and tried to re-import it again.

Step 1

Create new test project. In this case a Spring Boot Project, which works with Maven. Note that the new project is stored in the Eclipse workspace.

Step 2

As the second step, we create a new repository. Login into Github or your Gitlab instance and create a new project. Initialize it so that you have a master branch ready and copy the URL of the repository. We will then add this repository by opening the Git Repository perspective in Eclipse and add the repository. You can provide a default location for your local repositories in the Eclipse -> Team -> Git properties. In the Git Repository perspective, you can then see the path of the local storage location and some information about the repository, for instance that the local and the master branch are identical (they have the same commit hash). Note that the Git path is different than your workspace project path.

Step 3

We now have a fresh Java Maven based project in our Eclipse workspace and an empty Git repository in the default location for Git repositories in a different location. What we need to do next is to share the source code, by moving it into the Git storage location and add it to the Git index. Eclipse can help us with that,, by using the Team->Share menu from the Project Explorer view, when right clicking on the project.

Step 4

In the next step, we need to specify the Git repository we want to push our code to. This is an easy step, simply select the repository we just created from the drop down menu. In the menu you can see that the content of the current project location on the left side will be moved to the right side. Eclipse created a new subfolder within the repository for our project. This is something that IntelliJ would not do.

In this step, eclipse separates the local and custom project metadata from the actual source code, which is a good thing.

Step 5

In the fifth step, we simply apply some changes and commit and push them to the remote repository using the git staging window.

After this step, the changes are visible in the remote repository as well and available to collaborators. In order to simulate someone, who is going to checkout our little project from Gitlab, we delete the project locally and also remove the git repository.

Step 6

Now we start off with a clean workspace and clone the project from Gitlab. In the git repositories window, we can select clone project and provide the same URL again.

Step 7

In the next screen, we select the local destination for the cloned project. This could be for instance the default directory for Git projects or any other location on your disk. Confirm this dialogue.

Step 8

Now comes the tricky part, which did not work as expected in Eclipse Neon 4.6.1. Usually, one would tell Eclipse, that the cloned project is a Maven project, and it should detect the pom.xml file and download the dependencies. Todo so, we would select Import-> Git -> Projects from Git and clone the repository from this dialogue. Then, as a next step, we would select the Configure -> Convert to Maven Project option, but Eclipse does not seem to recognize the Maven structure. It would only show the files and directories, but not consider the Maven dependencies specified in the pom.xml file.

What happens is that Eclipse tries to add a new pom.xml file and ignores the actual one.

Of course this is a problem and does not work.

Step 9 – Solution

Instead of using the method above, just clone the repository from the Git Repository perspective and then go back to the Project Explorer. Now instead of importing the project via the Git menu, chose the existing Maven project and select the path of the Git repository we cloned before.

And in the next dialogue, specify the path:

As you can see, now Eclipse found the correct pom.xml file and provides the correct dependencies and structure!

Conclusion

Which IDE you prefer is a matter of taste and habit. Both environments do provide a lot of features for developers and differ in the implementation of these features. With this short article, we tried to understand some basic implications of the two philosophies how Eclipse and IntelliJ handle project structures. Once we anticipate these differences, it becomes easy to work with both of them.

A MySQL 5.7 Cluster Based on Ubuntu 16.04 LTS – Part 2

In a recent article, I described how to setup a basic MySQL Cluster with two data nodes and a combined SQL and management node. In this article, I am going to highlight a hew more things and we are going to adapt the cluster a little bit.

Using Hostnames

For making our lives easier, we can use hostnames which are easier to remember than IP addresses. Hostnames can be specified for each VM in the file /etc/hosts. For each request to the hostname, the operating system will lookup the corresponding IP address. We need to change this file on all three nodes to the following example:

A MySQL 5.7 Cluster Based on Ubuntu 16.04 LTS – Part 1

A Cluster Scenario

In this example we create the smallest possible MySQL cluster based on four nodes running on three machines. Node 1 will run the cluster management software, Node 2 and Node 3 will serve as dats nodes and Node 4 is the MySQSL API, which runs on the same VM on Node 1.

Das Phrasensammelsurium

Wer viel liest, dem stechen sie unweigerlich ins Auge: sinnbefreite und nervige Phrasen, die sich wie aus dem Nichts in verschiedenen Medien erscheinen und sich plötzlich überall ausbreiten. Journalisten, Autoren und Wissenschafter sind meist Vielleser und machen sich – ganz unbewusst – Ausdrucksweisen, einzelne Begriffe und ganze Phrasen zu eigen. Diese übernehmen sie dann in ihren eigenen Wortschatz und das Drama nimmt seinen Lauf.

Aufgrund mangelnden Detailwissens meinerseits möchte und kann ich gar nicht auf die sprachwissenschaftlichen Hintergründe eingehen. Man kann sich aber beispielsweise mit Hilfe des Google Ngram Viewers ansehen, wann bestimmte Begriffe im deutschen Buch-Korpus auftauchen. Leider gibt es diese Daten für die deutsche Sprache nur bis einschließlich 2008, weswegen ganz aktuelle Phrasen noch nicht enthalten sind. Abgebildet ist ein Beispiel für das Wort Narrativ, das nun allenthalben herhalten muss.

Selbstverständlich gibt es verschiedenste Projekte, wie beispielsweise die Floskelwoche oder dieser Artikel im Österreichischen Journalist, die sich ganz diesem Thema verschrieben haben. Ich möchte hier dennoch meine persönliche Liste nervtötender Phrasen und Füllwörter in mehr oder weniger alphabetischer Reihenfolge festhalten. Einsendungen sind sehr willkommen.

  • “Aber nun der Reihe nach …”
  • Alternativlos
  • “Am Ende des Tages”
  • “Ganz einem Thema verschrieben”
  • Inflationär
  • “X kann Y”. Beispiel: “Karl-Heinz kann Social Media”
  • Narrativ
  • “X neu denken”.
  • Postfaktisch
  • “So muss X”. Beispiel: “So muss Technik
  • Spannend!
  • “Wir sind X”. Beispiel: “Wir sind Papst

Phrase einreichen

Parsing SQL Statements

JDBC and the Limits of ResultSet Metadata

For my work in the area of data citation, I need to analyse queries, which are used for creating subsets. I am particularly interested in query parameters, sortings and filters. One of the most commonly used query languages is SQL, which is used by many relational database management systems such as MySQL. In some cases, the interaction with databases is abstract, meaning that there is hardly any SQL statements executed directly. The SQL statements are rather built on the fly by object relational mappers such as Hibernate. Other scenarios use SQL statements as String and also prepared statements, which are executed via JDBC. However,  analysing SQL statements is tricky as the language is very flexible.

In order to understand what columns have been selected, it is sufficient to utilise the ResultSet Metadata and retrieve the column names from there. In my case I need to extract this imformation from the query in advance and potentially enforce a specific sorting by adding columns to the ORDER BY clause. In this scenario, I need to parse the SQL statement and retrieve this information from the statement itself. Probably the best way to do this would be to implement a parser for the SQL dialect with ANTLR (ANother Tool for Language Recognition). But this is quite a challenge, so I decided to take a shortcut: FoundationDB.

The FoundationDB Parser

FoundationDB was a NoSQL database which provided several layers for supporting different paradigms at once. I am using past tense here, because the project got acquired by Apple in 2015 and since then does pursue the open source project any more. However, the Maven libraries for the software are still available at Maven Central. FoundationDB uses its own SQL parser, which understands standard SQL queries. These queries can be interpreted as a tree and the parser library allows traversing SQL statements and analyse the nodes. We can use this tree to parse and interpret SQL statements and extract additional information.

The Foundations of FoundationDB

The FoundationDB parser can be included into your own project with the following Maven dependency:

<dependency>
    <groupId>com.foundationdb&lt;/groupId&gt;
    <artifactId>fdb-sql-parser&lt;/artifactId&gt;
    <version>1.6.1&lt;/version&gt;
</dependency>

The usage of the parser is straight forward. We use the following example SQL statement as input:

	FROM tableA AS a, tableB AS b 
	WHERE a.firstColumn = b.secondColumn AND 
	b.thirdColumn < 5 
	ORDER BY a.thirdColumn,a.secondColumn DESC

The following function calls the parser and prints the tree of the statement.

	/*
     * Print a SQL statement
     * @param sqlString
     */
	public void parseSQLString(String sqlString) {
                Parser parser = new Parser(); 
		StatementNode stmt;
		try {
		    stmt = this.parser.parseStatement(sqlString);
                    stmt.treePrint();

		} catch (StandardException e) {
			e.printStackTrace();
		}
	}

The resulting tree is listed below. The statement has also been normalized, which ensures a stable sequence of the parameters.

name: null
updateMode: UNSPECIFIED
statementType: SELECT
resultSet: 	
	com.foundationdb.sql.parser.SelectNode@21b8d17c
	isDistinct: false
	resultColumns: 		
		com.foundationdb.sql.parser.ResultColumnList@6433a2

		[0]:		
		com.foundationdb.sql.parser.ResultColumn@5910e440
		exposedName: firstcolumn
		name: firstcolumn
		tableName: null
		isDefaultColumn: false
		type: null
		expression: 			
			com.foundationdb.sql.parser.ColumnReference@6267c3bb
			columnName: firstcolumn
			tableName: a
			type: null
		[1]:		
		com.foundationdb.sql.parser.ResultColumn@533ddba
		exposedName: secondcolumn
		name: secondcolumn
		tableName: null
		isDefaultColumn: false
		type: null
		expression: 			
			com.foundationdb.sql.parser.ColumnReference@246b179d
			columnName: secondcolumn
			tableName: b
			type: null
		[2]:		
		com.foundationdb.sql.parser.ResultColumn@7a07c5b4
		exposedName: thirdcolumn
		name: thirdcolumn
		tableName: null
		isDefaultColumn: false
		type: null
		expression: 			
			com.foundationdb.sql.parser.ColumnReference@26a1ab54
			columnName: thirdcolumn
			tableName: b
			type: null
	fromList: 		
		com.foundationdb.sql.parser.FromList@3d646c37

		[0]:		
		com.foundationdb.sql.parser.FromBaseTable@41cf53f9
		tableName: tablea
		updateOrDelete: null
		null
		correlation Name: a
		a
		[1]:		
		com.foundationdb.sql.parser.FromBaseTable@5a10411
		tableName: tableb
		updateOrDelete: null
		null
		correlation Name: b
		b
	whereClause: 		
		com.foundationdb.sql.parser.AndNode@2ef1e4fa
		operator: and
		methodName: and
		type: null
		leftOperand: 			
			com.foundationdb.sql.parser.BinaryRelationalOperatorNode@306a30c7
			operator: =
			methodName: equals
			type: null
			leftOperand: 				
				com.foundationdb.sql.parser.ColumnReference@b81eda8
				columnName: firstcolumn
				tableName: a
				type: null
			rightOperand: 				
				com.foundationdb.sql.parser.ColumnReference@68de145
				columnName: secondcolumn
				tableName: b
				type: null
		rightOperand: 			
			com.foundationdb.sql.parser.BinaryRelationalOperatorNode@27fa135a
			operator: <
			methodName: lessThan
			type: null
			leftOperand: 				
				com.foundationdb.sql.parser.ColumnReference@46f7f36a
				columnName: thirdcolumn
				tableName: b
				type: null
			rightOperand: 				
				com.foundationdb.sql.parser.NumericConstantNode@421faab1
				value: 5
				type: INTEGER NOT NULL
orderByList: 	
	com.foundationdb.sql.parser.OrderByList@2b71fc7e
	allAscending: false
	[0]:	
	com.foundationdb.sql.parser.OrderByColumn@5ce65a89
	ascending: true
	nullsOrderedLow: false
	columnPosition: -1
	expression: 		
		com.foundationdb.sql.parser.ColumnReference@25f38edc
		columnName: thirdcolumn
		tableName: a
		type: null
	[1]:	
	com.foundationdb.sql.parser.OrderByColumn@1a86f2f1
	ascending: false
	nullsOrderedLow: false
	columnPosition: -1
	expression: 		
		com.foundationdb.sql.parser.ColumnReference@3eb07fd3
		columnName: secondcolumn
		tableName: a
		type: null

This tree offers a lot of information, which can be used programmatically as well. In the top of the output, we can see that the statement was a SELECT statement and that it was not DISTINCT. Then follows the ResultSet, which contains a list of the three ResultColumns, which have been specified in the SELECT clause. We can see the column names and the table names from which they are drawn. The next block provides the referenced tables (the FROM list) and their alias names. The WHERE – block contains the operands which have been used for filtering and last but not least, there is the list of ORDER BY clauses and their sorting directions.

The Visitor

In order to access the information shown above programmatically, we need to access the content of the node one by one. This can be achieved with the visitor pattern, which traverses all the nodes of the tree. The following listing shows how the visitor pattern can be used for accessing the list of columns from the SELECT clause.

	**
     * Return a list of columns of a SELECT clause
     * @param sql
     * @return
     */
	public ArrayList selectColumnsList(String sql){
        SQLParser parser = new SQLParser();
        BooleanNormalizer normalizer = new BooleanNormalizer(parser);
        StatementNode stmt = null;

        try {
            stmt = parser.parseStatement(sql);
            stmt = normalizer.normalize(stmt);
        } catch (StandardException e) {
            e.printStackTrace();
        }


        final ArrayList<ResultColumn> columns = new ArrayList&lt;ResultColumn&gt;();
        Visitor v = new Visitor() {

            @Override
            public boolean visitChildrenFirst(Visitable node) {
                if (node instanceof SelectNode)
                    return true;
                return false;
            }

            @Override
            public Visitable visit(Visitable node) throws StandardException {
                if (node instanceof ResultColumn) {
                    ResultColumn resultColumn = (ResultColumn) node;
                    columns.add(resultColumn);
                    System.out.println("Column " + columns.size()+ ": " + resultColumn.getName());
                }
                return null;
            }

            @Override
            public boolean stopTraversal() {
                // TODO Auto-generated method stub
                return false;
            }

            @Override
            public boolean skipChildren(Visitable node) throws StandardException {
                if (node instanceof FromList) {
                    return true;
                }
                return false;
            }
        };

        try {
            stmt.accept(v);
        } catch (StandardException e) {
            e.printStackTrace();
        }

        return columns;

    }

This code example, we define a visitor which traverses all the ResultColumn nodes. Every time the current node is an instance of ResultColumn, we add this node to our list of columns. The nodes are only visited, if they are children of a SELECT statement. This is our entry point into the tree. We leave the tree when we reach the FROM list. We then apply the visitor to the statement, which initiates the traversal. As a result, we receive a list of columns which have been used for the result set.

In order to get the list of ORDER BY columns, we can utilise a similar approach. The following functions gives an example:

/**
     * Return list of order by clauses
     * @param sqlText
     * @return
     */
	public OrderByList orderByColumns(String sqlText){

		SQLParser parser = new SQLParser();
		BooleanNormalizer normalizer = new BooleanNormalizer(parser);
		StatementNode stmt;
        OrderByList orderByList = null;
        try {
			stmt = parser.parseStatement(sqlText);
			stmt = normalizer.normalize(stmt);
            CursorNode node = (CursorNode) stmt;
            orderByList = node.getOrderByList();
            int i=0;
            for(OrderByColumn orderByColumn : orderByList){
                i++;
                String direction;
                if(orderByColumn.isAscending()){
                    direction="ASC";
                }else{
                    direction="DESC";
                }
                System.out.println("ORDER BY Column " +i+ ": " +orderByColumn.getExpression().getColumnName()+ " Direction: " + direction );

            }
        } catch (StandardException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
		return orderByList;


	}

This time, we retrieve the list of ORDER BY columns directly from the CurserNode. Similar principles can be used for manipulating SQL statements and apply a different sorting for instance.

An Interactive Map with Leaflet, GeoJSON, and jQuery Using Bootstrap

A Side Project – An Interactive Parking Map of Innsbruck

When I recently moved to Innsbruck, I noticed that there was no interactive map for the parking system available. The amount of time you can park your car depends on the zone your car is located in. There are 20 parking zones and they are defined by their bordering streets in the city.  Innsbruck is very dense and parking is always a hot topic. So I thought having an interactive map makes it easier to find zones where you can leave your car longer, also if your are not so familiar with the street names. The city of Innsbruck offers the GIS data at the open data portal of Austria, which made it quite easy to implement such a map. I used the following technologies for creating this map:

The source code is available at my Github account. The implementation is available here and also at the Austrian Open Data Portal.

Code Snippets

Explaining the whole source code would be a bit too much for this post and most of it is pretty self explanatory, but in the following section I would like to highlight a few things of the project.

Initializing

In the top of the HTML file, we load the JavaScript file which contains all the functions and variables for our implementation. The script is called parkraum.js (parkraum means parking space in German).

The initialization of the Javascript code is straight forward with jQuery. I structured the initialization into a few components, as you can see in the following example.

<script>
    $(document).ready(function() {
	initMap();
	placeZonesOnMap();
	loadScrolling();
	populateParkzoneDropdown();
    });
</script>

We first initialize the Map itself, then place the parking zones, initialize the page scrolling to make it more smoothly and then populate the drop down menu with the available parking zones.

Initialize the Map

The first step is the initialization of the map with Leaflet. Note that map is a global variable. The coordinates [47.2685694, 11.3932759] are the center of Innsbruck annd 14 is the zoom level. In order to offer the map on a public page, you need to register with mapbox, a service which provides the tiles for the map. Mapbox is free for 50k map views per month. Make sure to use your own key and show some attribution.

// Initialize map and add a legend
function initMap() {
    map = L.map('map').setView([47.2685694, 11.3932759], 14);
    tileLayer = L.tileLayer('https://api.tiles.mapbox.com/v4/{id}/{z}/{x}/{y}.png?access_token=YOUR_API_KEY, {
        maxZoom: 18,
        attribution: 'Map data &copy; <a href="http://openstreetmap.org">OpenStreetMap&lt;/a&gt; contributors, ' +
            '<a href="http://creativecommons.org/licenses/by-sa/2.0/">CC-BY-SA&lt;/a&gt;, ' +
            'Imagery © <a href="http://mapbox.com">Mapbox&lt;/a&gt;',
        id: 'mapbox.light'
    }).addTo(map);

    addLegend();
}

We also add a legend to the map, for indicating the parking area type with colors. There exist 4 types of parking areas and we just add little squares with that colors to the map. The legend improves the readability of the map.

/* Add a legend to the map. */
function addLegend(){
    var legend = L.control({position: 'bottomright'});
    legend.onAdd = function (map) {

    var div = L.DomUtil.create('div', 'info legend');
    div.innerHTML =
        '<i style="background:#72B2FF";>&lt;/i&gt;&lt;span style="font-weight: 600;"&gt;90 Minuten Kurzparkzone&lt;/span&gt;&lt;br&gt;' +
        '<i style="background:#BEE7FF";>&lt;/i&gt;&lt;span style="font-weight: 600;"&gt;180 Minuten Kurzparkzone&lt;/span&gt;&lt;br&gt;' +
        '<i style="background:#A3FF72";>&lt;/i&gt;&lt;span style="font-weight: 600;"&gt;Werktags Parkstraße&lt;/span&gt;&lt;br&gt;' +
        '<i style="background:#D8D0F4";>&lt;/i&gt;&lt;span style="font-weight: 600;"&gt;Parkstraße&lt;/span&gt;&lt;br&gt;';

    return div;
};

legend.addTo(map);
}

The final map with the legend looks like this:

Reading Data, Adding Layers

I obtained the shapefiles with the geographic information from the Austrian Open Data Portal. The data seems to be exported from ArcGIS and I manually converted it into GeoJSON, which is directly supported by Leaflet.js without any plugins. To do this, I just copied the polygon data into the GeoJSON structure. I also separated the quite large file into smaller junks, each parking zone in one file.

Below you can see an example how the JSON looks like for the parking zone C. The structure contains the short name (“C”), some additional information about the zone, attributes for the color, opacity and outline and of course the actual coordinates, which make up a polygon covering the parking area.

{
    "type": "Feature",
    "properties": {
        "parkzonenKuerzel": "C",
        "parkzoneInfo":"Kurzparkzone 90 min gebührenpflichtig, werktags Mo-Fr von 9-21 Uhr und Sa von 9-13 Uhr, ½ Stunde EUR 0.70, 1 Stunde EUR 1.40, 1½ Stunden EUR 2.10.",
        "style": {
            "weight": 2,
            "color": "#999",
            "opacity": 1,
            "fillColor": "#72B2FF",
            "fillOpacity": 0.5
        }
    },
    "geometry":{
        "type":"Polygon",
        "coordinates":[
            [
                [11.389460535000069,47.261162269000067],[11.389526044000036,47.261021078000056],
                ...
          ]
        ]
    }
}

I created a small object containing the name of a parking zone, the relative path of the JSON file and a place holder for layer information.

var parkzonen = [{
    parkzone: 'C',
    jsonFile: './data/zoneC.json',
    layer: ''
}, {
    parkzone: 'D',
    jsonFile: './data/zoneD.json',
    layer: ''
}
...
}];

This is globally available inside the JS file. The actual loading of the parking zones and the placement of the polygones on the map is happening in the following function. It uses jQuery to load the JSON files. Note that jQuery expects the files to be delivered by a Web server. So in order for this to work, you need to make sure that the files can be served by a Web server , also on your local development machine. You can try this very easy, if you execute the following python statement within the root directory of your local development directory:``` sudo http-server -p 80  .

function placeZonesOnMap() {
    for (var zone in parkzonen) {
        var parkzonenKuerzel = parkzonen[zone].parkzone;
        var jsonURL = parkzonen[zone].jsonFile;
        $.ajaxSetup({
            beforeSend: function(xhr) {
                if (xhr.overrideMimeType) {
                    xhr.overrideMimeType("application/json");
                }
            }
        });

        $.getJSON(jsonURL, function(data) {

            placeZoneOnMap(data);

        });

    }
}
// Place zone on map
function placeZoneOnMap(data) {
    var parkzonenKuerzel = data.properties.parkzonenKuerzel;


    for (var zone in parkzonen) {
        if(parkzonen[zone].parkzone===parkzonenKuerzel){
            var layer = addGeoJSONToMap(data);
            parkzonen[zone].layer= layer;
        }
    }
}

function addGeoJSONToMap(data){
    var layer = L.geoJson([data], {
        style: function(feature) {
            return feature.properties && feature.properties.style;
        },
        onEachFeature: onEachFeature,
    }).addTo(map);
    return layer;
}

function onEachFeature(feature, layer) {
    var popupContent = 'Parkzone: <span style="font-weight: 900; font-size: 150%;">' + feature.properties.parkzonenKuerzel + '&lt;/span&gt;&lt;/br&gt;' + feature.properties.parkzoneInfo;

    layer.bindPopup(popupContent);
    var label = L.marker(layer.getBounds().getCenter(), {
        icon: L.divIcon({
            className: 'label',
            html: '<span style="font-weight: 900; font-size: 200%;color:black;">' + feature.properties.parkzonenKuerzel+'&lt;/span&gt;',
            iconSize: [100, 40]
        })
    }).addTo(map);
}```


The for loop iterates over the object, where we stored all the parking zones, respectively the JSON file locations. We load the files, one by one, and place the polygons on the map. This of course works in an asynchronous fashion. After this step, the polygons become visible on the map. We also add a clickable info box on all parking zones, which then display additional information.

## Dropdown Selection for Marking and Highlighting a Parking Zone

Users are able to select one of the parking zones from a drop down list.  In the first step, we add all parking zones by iterating over the parking zones object. Once selected, the parking zone will change the color and therefore be highlighted. To do that, we remove the layer of the parking zone and add it again in a different color.

function populateParkzoneDropdown(){ $('#selectParkzone’).empty(); $('#selectParkzone’).append($(‘</option>').val(‘Bitte Auswählen’).html(‘Zonen’)); $.each(parkzonen, function(i, p) { $('#selectParkzone’).append($(‘</option>').val(p.parkzone).html(p.parkzone)); });

}

$("#selectParkzone”).change(function () { var selectedParkZone = $("#selectParkzone”).val(); changeParkzoneColor(selectedParkZone);

});

function changeParkzoneColor (selectParkzone){ resetMap();

for (var zone in parkzonen) {
    if(parkzonen[zone].parkzone==selectParkzone){
        var layer = parkzonen[zone].layer;

        map.removeLayer(layer);
        layer.setStyle({
            fillColor: 'red',
            fillOpacity: 0.7
        });
        map.addLayer(layer);
    }
}

}

// Reset all layers function resetMap(){     console.log(‘reset’);     map.eachLayer(function (layer) {         map.removeLayer(layer);     });     map.addLayer(tileLayer);     placeZonesOnMap(); }



## Show the Current Location

Showing the current location is also a nice feature. By clicking on a button, the map scrolls to the current location, which is transmitted by the browser. Note that this only works if you deliver your pages with HTTPS!

function currentPosition() { if (navigator.geolocation) { navigator.geolocation.getCurrentPosition(function(position) { latit = position.coords.latitude; longit = position.coords.longitude; //console.log(‘Current position: ' + latit + ' ' + longit); //alert(‘Current position: ' + latit + ' ' + longit); // this is just a marker placed in that position var abc = L.marker([position.coords.latitude, position.coords.longitude]).addTo(map); // move the map to have the location in its center map.panTo(new L.LatLng(latit, longit)); }); } }```

The single page app utilises bootstrap for rendering the content nicely and providing the navigation features. The following code snippet show how we can make all links scroll smoothly.

// use the first element that is "scrollable"
function scrollableElement(els) {
    for (var i = 0, argLength = arguments.length; i < argLength; i++) {
        var el = arguments[i],
            $scrollElement = $(el);
        if ($scrollElement.scrollTop() > 0) {
            return el;
        } else {
            $scrollElement.scrollTop(1);
            var isScrollable = $scrollElement.scrollTop() > 0;
            $scrollElement.scrollTop(0);
            if (isScrollable) {
                return el;
            }
        }
    }
    return [];
}

// Filter all links 
function filterPath(string) {
    return string
        .replace(/^\//, '')
        .replace(/(index|default).[a-zA-Z]{3,4}$/, '')
        .replace(/\/$/, '');
}

function loadScrolling() {
    var locationPath = filterPath(location.pathname);
    var scrollElem = scrollableElement('html', 'body');

    $('a[href*=\\#]').each(function() {
        var thisPath = filterPath(this.pathname) || locationPath;
        if (locationPath == thisPath &&
            (location.hostname == this.hostname || !this.hostname) &&
            this.hash.replace(/#/, '')) {
            var $target = $(this.hash),
                target = this.hash;
            if (target) {
                var targetOffset = $target.offset().top;
                $(this).click(function(event) {
                    event.preventDefault();
                    $(scrollElem).animate({
                        scrollTop: targetOffset
                    }, 400, function() {
                        location.hash = target;
                    });
                });
            }
        }
    });
}

WAMS Containerstandorte auf einer interaktiven Karte

Der Verein WAMS ist ein Sozialbetrieb mit dem Ziel Arbeitsplätze für Menschen zu schaffen, die aufgrund ihrer besonderen Lebenssituationen im konventionellen Arbeitsmarkt benachteiligt werden. Ein besonderes Augenmerk des Vereins liegt auch auf dem Recyclinggedanken und der Wiederverwertung von Ressourcen. Aus diesem Grund betreut und betreibt der Verein auch Altkleidersammelstellen und verwertet das gespendete Gewand. Eine Liste der Standorte dieser gelben Container befindet sich im Flyer von WAMS, der auf der Vereinshomepage bezogen werden kann. Für all jene, denen die Innsbrucker Straßennamen noch nicht allzu viel sagen oder deren geographisches Gedächtnis lückenhaft ist, habe ich die Standorte auf einer Karte eingetragen. Der Quellcode ist auf Github verfügbar und hier beschrieben.

Das Innsbrucker Parkzonensystem

Der Innsbrucker Parkraum ist in 20 Parkzonen unterteilt. In diesen Parkzonen ist das Parken zu unterschiedlichen Zeiten gebührenpflichtig. Innsbrucker mit Hauptwohnsitz können, sofern die Voraussetzungen gegeben sind, eine Anwohnerparkkarte beantragen. Der statische Parkzonenplan der Stadt Innsbruck befindet sich hier und einen interaktiven Parkzonenplan habe ich hier erstellt. Ein Klick auf das Bild bringt Sie dort hin. Sie finden diese Webapplikation auch im offiziellen Anwendungskatalog der Open Government Data Initiative.

Open Data

Die Daten für das Kartenoverlay basieren auf dem Datensatz Parkzonen in Innsbruck der Stadt Innsbruck. Die Daten habe ich vom Open Government Data Catalog hier bezogen und in das GeoJSON Format konvertiert. Datenstand ist der 03.02.2016, der eindeutige Datenidentifikator lautet 8ffd16df-e7b8-423b-8348-199e6a5bf0ca.

Open Source Software

Diese Seite wurde mit Open Source Software erstellt. Das Kartenmaterial kommt von Open Street Map, die Kartenbilder von Mapbox und die Zonenoverlays habe ich mit Leaflet JS realisiert. Das Layout und Design ist mit Bootstrap gemacht und die interaktiven Elemente sind mit jQuery implementiert.