Different Approaches for Inserting Data Using Dynamic Partitioning into a Partitioned Hive Table

This post shows different ways to insert data into a partitioned Hive table using dynamic partitioning. To learn how to create partitioned tables in Hive, see the following posts: "Creating Partitioned Hive table and importing data" and "Creating Hive Table Partitioned by Multiple Columns and Importing Data". Dynamic Partitioning: Dynamic…

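As a rough illustration of what dynamic partitioning does, the Python sketch below models the routing only; it is not Hive code. It relies on the fact that Hive takes dynamic partition values from the trailing columns of the selected rows. The `department` partition column, the `insert_dynamic` helper, and the sample rows are hypothetical.

```python
from collections import defaultdict


def insert_dynamic(rows, partition_column="department"):
    """Group rows into partitions keyed by the value of their last field."""
    partitions = defaultdict(list)
    for row in rows:
        # The trailing field supplies the partition value; the rest is row data.
        *data, partition_value = row
        partitions[f"{partition_column}={partition_value}"].append(tuple(data))
    return dict(partitions)


# Hypothetical sample rows: (id, name, age, department)
rows = [
    (1, "Asha", 31, "HR"),
    (2, "Ben", 45, "IT"),
    (3, "Chen", 28, "HR"),
]
result = insert_dynamic(rows)
for partition, members in sorted(result.items()):
    print(partition, members)
```

The partition names follow Hive's `column=value` directory convention; one partition is created per distinct value found in the data.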

Different Approaches for Inserting Data Using Static Partitioning into a Partitioned Hive Table

This post shows different ways to insert data into a partitioned Hive table using static partitioning. To learn how to create partitioned tables in Hive, see the following posts: "Creating Partitioned Hive table and importing data" and "Creating Hive Table Partitioned by Multiple Columns and Importing Data". Static Partitioning: Static…

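With static partitioning, the partition values are written into the insert statement itself, so every inserted row lands in that one partition and the rows do not carry the partition columns. The Python sketch below models only that behaviour and is not Hive code; the `insert_static` helper, the `country` and `state` partition columns, and the sample rows are hypothetical.

```python
def insert_static(table, partition_spec, rows):
    """Append rows to the single partition named by partition_spec."""
    # Hive lays partitions out as nested column=value directories.
    path = "/".join(f"{col}={val}" for col, val in partition_spec.items())
    table.setdefault(path, []).extend(rows)
    return path


# Hypothetical table, modelled as {partition path: list of rows}.
employee = {}
path = insert_static(
    employee,
    {"country": "IN", "state": "KA"},
    [(1, "Asha", 31), (2, "Ben", 45)],
)
print(path, employee[path])
```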

Different Approaches for Inserting Data into a Hive Table

This post shows different ways to insert data into a Hive table. We have a table Employee in Hive with the following schema:

0: jdbc:hive2://localhost:10000> desc employee;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | bigint     |          |
| name      | string     |          |
| age      …


Hive Queries: Sort By, Order By, Cluster By, and Distribute By

In Hive queries, we can use SORT BY, ORDER BY, CLUSTER BY, and DISTRIBUTE BY to control the ordering and distribution of the output of a SELECT query. We will see this with an example. We have a table Employee in Hive, partitioned by Department.

0: jdbc:hive2://localhost:10000> desc employee;
+--------------------------+-----------------------+-----------------------+--+…

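The difference between the four clauses can be sketched in Python by modelling reducers as lists. This is a toy model under stated assumptions, not Hive code: ORDER BY yields one total order, DISTRIBUTE BY sends rows with the same key to the same reducer, SORT BY orders rows within each reducer only, and CLUSTER BY combines DISTRIBUTE BY and SORT BY on the same key. The CRC32-based reducer assignment is a stand-in for Hive's own hashing, and the sample rows are hypothetical.

```python
import zlib
from operator import itemgetter


def order_by(rows, key):
    """Total order over all rows (a single final ordering)."""
    return sorted(rows, key=key)


def distribute_by(rows, key, num_reducers):
    """Send rows with equal keys to the same reducer; no ordering is implied."""
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        # Toy hash standing in for Hive's partitioning function.
        index = zlib.crc32(str(key(row)).encode("utf-8")) % num_reducers
        reducers[index].append(row)
    return reducers


def sort_by(reducer_inputs, key):
    """Sort rows within each reducer; the combined output is not globally sorted."""
    return [sorted(part, key=key) for part in reducer_inputs]


def cluster_by(rows, key, num_reducers):
    """DISTRIBUTE BY and SORT BY on the same key."""
    return sort_by(distribute_by(rows, key, num_reducers), key)


# Hypothetical sample rows: (id, name, department)
employees = [
    (3, "Chen", "HR"),
    (1, "Asha", "IT"),
    (4, "Dara", "Sales"),
    (2, "Ben", "HR"),
]
by_id = itemgetter(0)
by_department = itemgetter(2)

print(order_by(employees, by_id))
print(cluster_by(employees, by_department, num_reducers=2))
```

In this model, only `order_by` guarantees a single sorted result; the other functions return one list per reducer, which mirrors why SORT BY output can look unsorted when more than one reducer is used.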

Set Up Hive 1.x

This post shows how to set up Hive 1.x. Download Hive: we will download hive-1.2.1 from https://hive.apache.org/downloads.html; the distribution file to download is apache-hive-1.2.1-bin.tar.gz. Extract the contents of the file to the directory /home/hadoopUser/hive with tar -xvf apache-hive-1.2.1-bin.tar.gz. Set Environment Variables: for Hive to work, we need to set $HADOOP_HOME or…


Start HiveServer2, Connect Through Beeline and Run Hive Queries

HiveServer2: HiveServer2 is an enhanced Hive server designed for multi-client concurrency and improved authentication. It also provides better support for clients connecting through JDBC and ODBC. Start HiveServer2: our Hive installation is under the directory /home/hadoop/hive. Go to the 'bin' directory under the Hive installation directory. To start the…
