Complex data type in Hive: Map

Map – a complex data type in Hive which can store Key-Value pairs. Values from a map can be accessed using the keys.

Create Table

While creating a table with Map data type, we need to specify the –

  1. ‘COLLECTION ITEMS TERMINATED BY’ character to specify different key-value pairs.
  2. ‘MAP KEYS TERMINATED BY’ character to specify key and value.

We will create a table containing an Map<int, string> data type –

CREATE TABLE Map_test(
         id int,       
         comments_map Map<int, string> 
) COMMENT 'This is a table stored as textfile'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '#'
MAP KEYS TERMINATED BY '@'
STORED AS TEXTFILE;

Load Data

We have a file ‘/home/hadoop/comments‘ on our HDFS.
The file contains 4 rows and each row contains an id and a map containing user_id and comment. Contents of the file:-

1,1@india is great#2@india won icc t20#3@jai hind
2,1@we are awesome#2@i like cricket
3,1@hurray we won#2@what a great match#3@watching cricket all day
4,1@hectic day#3@irctc rocks

Load the data into table:-

LOAD DATA INPATH '/home/hadoop/comments' overwrite into table map_test;

Verify data

Now let us execute some queries on the Map data type.

#Select id and all the comments stored in Map column.
select id, comments_map from map_test;
+-----+--------------------------------------------------------------------------+--+
| id  |                               comments_map                               |
+-----+--------------------------------------------------------------------------+--+
| 1   | {1:"india is great",2:"india won icc t20",3:"jai hind"}                  |
| 2   | {1:"we are awesome",2:"i like cricket"}                                  |
| 3   | {1:"hurray we won",2:"what a great match",3:"watching cricket all day"}  |
| 4   | {1:"hectic day",3:"irctc rocks"}                                         |
+-----+--------------------------------------------------------------------------+--+

#Select id and all the comments by user_id 1
select id, comments_map[1] as comments_by_user_1 from map_test;
+-----+---------------------+--+
| id  | comments_by_user_1  |
+-----+---------------------+--+
| 1   | india is great      |
| 2   | we are awesome      |
| 3   | hurray we won       |
| 4   | hectic day          |
+-----+---------------------+--+

#Select id and all the comments by user_id 2
select id, comments_map[2] as comments_by_user_2 from map_test;
+-----+---------------------+--+
| id  | comments_by_user_2  |
+-----+---------------------+--+
| 1   | india won icc t20   |
| 2   | i like cricket      |
| 3   | what a great match  |
| 4   | NULL                |
+-----+---------------------+--+

Leave a Reply

Your email address will not be published. Required fields are marked *