In our project we need to port some Hive table data to our local RDBMS (Oracle). For tables with many columns (hundreds), it can be very tedious to write the Hive SQL and convert the ResultSet to a JPA entity object by hand.
Spring's JdbcTemplate provides a handy class, BeanPropertyRowMapper, which does the underscore-to-camel-case column name conversion for us. To leverage it, we just need to make sure the Hive table has the same column names as the RDBMS table. Then we can simply call:
hiveJdbcTemplate.query(sql, new BeanPropertyRowMapper<>(YourOwnEntity.class));
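For illustration, suppose the Hive (and Oracle) table has columns user_id and user_name (hypothetical names). BeanPropertyRowMapper matches each underscore column name against the camel-case bean properties, so the entity only needs matching property names with standard getters and setters:

```java
// Hypothetical entity: column user_id -> property userId, user_name -> userName.
// BeanPropertyRowMapper populates it via these setters.
public class UserEntity {
    private Long userId;
    private String userName;

    public Long getUserId() { return userId; }
    public void setUserId(Long userId) { this.userId = userId; }
    public String getUserName() { return userName; }
    public void setUserName(String userName) { this.userName = userName; }
}
```

With an entity like this, `hiveJdbcTemplate.query("select user_id, user_name from t", new BeanPropertyRowMapper<>(UserEntity.class))` would map each row to a UserEntity without any hand-written mapping code.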
However, you might find that the result size is correct but every entity field is null. That is because, beginning with Hive 0.13, a new property called hive.resultset.use.unique.column.names was introduced, with a default value of true. With the default, the Hive table name is prepended to all column names, which is a change in behavior from previous versions of Hive.
This change causes the all-null fields described above: Spring converts the target entity's field names to underscore format (col_name1, col_name2), but the ResultSetMetaData returned by the Hive JDBC driver reports the column labels as table.col_name1, table.col_name2, etc. No label ever matches, so every field is left null.
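The mismatch is easy to see in isolation. The helper below mimics (purely for illustration; it is not Spring's actual code) the underscore name that BeanPropertyRowMapper derives from a bean property, and compares it with the table-prefixed label Hive returns:

```java
public class ColumnLabelMismatchDemo {
    // Illustrative stand-in for the camelCase -> underscore conversion
    // BeanPropertyRowMapper performs on entity property names.
    static String underscoreName(String property) {
        StringBuilder sb = new StringBuilder();
        for (char c : property.toCharArray()) {
            if (Character.isUpperCase(c)) {
                sb.append('_').append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String expected = underscoreName("colName1"); // "col_name1"
        String hiveLabel = "table.col_name1";         // label reported by Hive JDBC
        // No match -> the rowmapper never calls setColName1 -> field stays null.
        System.out.println(expected.equals(hiveLabel));
    }
}
```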
To prevent the Hive table name from being prepended to the column names, add this property to the hive-site.xml file:
<property>
  <name>hive.resultset.use.unique.column.names</name>
  <value>false</value>
</property>
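If you cannot change hive-site.xml (for example, the cluster is shared), Hive generally also allows session-level configuration: most client-side properties can be overridden per connection with a SET statement, which could be executed through the same JdbcTemplate right after connecting. A sketch, assuming your HiveServer2 setup permits this override:

```sql
-- Session-level override; affects only the current Hive session.
set hive.resultset.use.unique.column.names=false;
```

After this, subsequent queries in that session should return plain column labels (col_name1 rather than table.col_name1), and BeanPropertyRowMapper will match them as expected.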