Complex data type in Hive: Struct

Struct – a complex data type in Hive which can store a set of fields of different data types. The elements of a struct are accessed using dot notation.

Create Table

While creating a table with Struct data type, we need to specify the ‘COLLECTION ITEMS TERMINATED BY’ character. This character will be used to specify different elements in an Struct. We will create a table containing a Struct data type.

CREATE TABLE Struct_test(
         id int,       
         weather_reading struct<temp:int, humidity:int, comment:string> 
) COMMENT 'This is a table stored as textfile'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '$'
STORED AS TEXTFILE;

Load Data

We have a file ‘/home/hadoop/weather‘ on our HDFS.
The file contains 4 rows and each row contains an id and data for Struct fields separated by ‘$’. Contents of the file:-

1,32$65$moderate
2,37$78$humid
3,43$55$hot
4,23$45$cold

Load the data into table:-

LOAD DATA INPATH '/home/hadoop/weather' overwrite into table struct_test;

Verify data

Now let us execute some queries on the Struct data type.

#Select the id and weather_reading struct from all the rows.
0: jdbc:hive2://localhost:10000> select id, weather_reading from struct_test;
+-----+-------------------------------------------------+--+
| id  |                 weather_reading                 |
+-----+-------------------------------------------------+--+
| 1   | {"temp":32,"humidity":65,"comment":"moderate"}  |
| 2   | {"temp":37,"humidity":78,"comment":"humid"}     |
| 3   | {"temp":43,"humidity":55,"comment":"hot"}       |
| 4   | {"temp":23,"humidity":45,"comment":"cold"}      |
+-----+-------------------------------------------------+--+

#Select the id and temp field from the weather_reading struct for all the rows
0: jdbc:hive2://localhost:10000> select id, weather_reading.temp from struct_test;
+-----+-------+--+
| id  | temp  |
+-----+-------+--+
| 1   | 32    |
| 2   | 37    |
| 3   | 43    |
| 4   | 23    |
+-----+-------+--+

#Select the id and humidity field from weather_reading struct for all the rows
0: jdbc:hive2://localhost:10000> select id, weather_reading.humidity from struct_test;
+-----+-----------+--+
| id  | humidity  |
+-----+-----------+--+
| 1   | 65        |
| 2   | 78        |
| 3   | 55        |
| 4   | 45        |
+-----+-----------+--+

#Select the id and comment field from weather_reading struct for all the rows
0: jdbc:hive2://localhost:10000> select id, weather_reading.comment from struct_test;
+-----+-----------+--+
| id  |  comment  |
+-----+-----------+--+
| 1   | moderate  |
| 2   | humid     |
| 3   | hot       |
| 4   | cold      |
+-----+-----------+--+


Leave a Reply

Your email address will not be published. Required fields are marked *