Struct – a complex data type in Hive which can store a set of fields of different data types. The elements of a struct are accessed using dot notation.
Create Table
While creating a table with Struct data type, we need to specify the ‘COLLECTION ITEMS TERMINATED BY’ character. This character will be used to specify different elements in an Struct. We will create a table containing a Struct data type.
CREATE TABLE Struct_test( id int, weather_reading struct<temp:int, humidity:int, comment:string> ) COMMENT 'This is a table stored as textfile' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '$' STORED AS TEXTFILE;
Load Data
We have a file ‘/home/hadoop/weather‘ on our HDFS.
The file contains 4 rows and each row contains an id and data for Struct fields separated by ‘$’. Contents of the file:-
1,32$65$moderate 2,37$78$humid 3,43$55$hot 4,23$45$cold
Load the data into table:-
LOAD DATA INPATH '/home/hadoop/weather' overwrite into table struct_test;
Verify data
Now let us execute some queries on the Struct data type.
#Select the id and weather_reading struct from all the rows. 0: jdbc:hive2://localhost:10000> select id, weather_reading from struct_test; +-----+-------------------------------------------------+--+ | id | weather_reading | +-----+-------------------------------------------------+--+ | 1 | {"temp":32,"humidity":65,"comment":"moderate"} | | 2 | {"temp":37,"humidity":78,"comment":"humid"} | | 3 | {"temp":43,"humidity":55,"comment":"hot"} | | 4 | {"temp":23,"humidity":45,"comment":"cold"} | +-----+-------------------------------------------------+--+ #Select the id and temp field from the weather_reading struct for all the rows 0: jdbc:hive2://localhost:10000> select id, weather_reading.temp from struct_test; +-----+-------+--+ | id | temp | +-----+-------+--+ | 1 | 32 | | 2 | 37 | | 3 | 43 | | 4 | 23 | +-----+-------+--+ #Select the id and humidity field from weather_reading struct for all the rows 0: jdbc:hive2://localhost:10000> select id, weather_reading.humidity from struct_test; +-----+-----------+--+ | id | humidity | +-----+-----------+--+ | 1 | 65 | | 2 | 78 | | 3 | 55 | | 4 | 45 | +-----+-----------+--+ #Select the id and comment field from weather_reading struct for all the rows 0: jdbc:hive2://localhost:10000> select id, weather_reading.comment from struct_test; +-----+-----------+--+ | id | comment | +-----+-----------+--+ | 1 | moderate | | 2 | humid | | 3 | hot | | 4 | cold | +-----+-----------+--+