My IT Learnings

Some Configuration Properties in Hive

rajesh • March 21, 2016 • bigdata

We will look at some of the configuration properties available in Hive.

Hive Warehouse Directory

hive.metastore.warehouse.dir
Location of the directory on HDFS that is used for storing the Hive warehouse data.
Default value: /user/hive/warehouse

0: jdbc:hive2://localhost:10000> show conf "hive.metastore.warehouse.dir";
+-----------------------+---------+-------------------------------------------------+--+
| default               | type    | desc                                            |
+-----------------------+---------+-------------------------------------------------+--+
| /user/hive/warehouse |…

Export and Import a Hive Table/Partition

rajesh • March 18, 2016 • bigdata

EXPORT

We use the EXPORT command to export the data of a table or partition to a specified output location. The EXPORT command writes the metadata along with the data to the output location.

EXPORT a table:
EXPORT table employee to '/home/hadoop/employee';

EXPORT a partition:
EXPORT table employee partition(department='BIGDATA') to '/home/hadoop/employee_bigdata';…
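The excerpt above covers only the EXPORT half of the title. As a minimal sketch of the matching IMPORT step, assuming the two export locations shown above already exist, the statements might look like this. The target table names employee_imported and employee_bigdata are hypothetical and are not taken from the original post.

```sql
-- Re-create the exported table, data and metadata, under a new name.
-- employee_imported is a hypothetical table name used for illustration.
IMPORT TABLE employee_imported FROM '/home/hadoop/employee';

-- Import the single exported partition into a hypothetical target table.
IMPORT TABLE employee_bigdata PARTITION (department='BIGDATA')
FROM '/home/hadoop/employee_bigdata';
```

Importing under a new name keeps the original employee table untouched, which is convenient when trying the command out on the same cluster.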

What is Hive

rajesh • March 18, 2016 • bigdata

Apache Hive is a data warehouse infrastructure for querying, analyzing and summarizing the data stored in Hadoop's HDFS. It provides an SQL-like language called HiveQL with schema on read and implicitly converts queries to MapReduce, Tez or Spark jobs.

Some of the Hive features:

Different storage formats for data in…

Data Storage Formats in Hive

rajesh • March 18, 2016 • bigdata

We will look at the different file formats for storing data in a Hive table. Using the right file format for a Hive table can save a lot of disk space and improve the performance of Hive queries.

TEXTFILE

The Textfile format stores data as plain text files. Textfile format enables rapid development…
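The storage format is chosen per table with the STORED AS clause. The sketch below creates one TEXTFILE table and one table in a columnar format. ORC is used here only as an assumed example of a more compact format, since the excerpt is cut off before naming the others, and the table and column names are hypothetical.

```sql
-- Plain-text table: each row is one comma-delimited line in a text file.
CREATE TABLE employee_text (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Same schema stored as ORC (assumed example of a columnar format).
CREATE TABLE employee_orc (id INT, name STRING)
STORED AS ORC;

-- Copy the rows across; Hive rewrites them in the target table's format.
INSERT INTO TABLE employee_orc SELECT id, name FROM employee_text;
```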

Managed and External Tables in Hive

rajesh • March 18, 2016 • bigdata

Hive allows us to create two types of tables: managed tables and external tables.

Managed Tables

Hive manages the table and its data. When a managed table is dropped, Hive deletes the table's data as well as the table metadata from the Hive metastore. When we create a Hive…
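The difference between the two table types shows up most clearly when a table is dropped. This is a minimal sketch with hypothetical table names, columns and HDFS location, none of which come from the original post.

```sql
-- Managed table: Hive owns the data, stored under its warehouse directory.
CREATE TABLE employee_managed (id INT, name STRING);

-- External table: Hive records only the metadata; the files stay at LOCATION.
-- '/data/employee_external' is a hypothetical HDFS path.
CREATE EXTERNAL TABLE employee_external (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/employee_external';

DROP TABLE employee_managed;   -- removes the metadata and the data
DROP TABLE employee_external;  -- removes the metadata only; files remain
```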

Partitioning in Hive

rajesh • March 18, 2016 • bigdata

Partitioning

We can use the partitioning feature of Hive to divide a table into partitions. Each partition of a table is associated with particular value(s) of the partition column(s). Partitioning allows Hive to run queries on a specific set of data in the table based on the value of partition…
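As a rough illustration of the idea, the sketch below partitions a table by department and then queries a single partition. The table name employee_part, its columns and the sample row are hypothetical. INSERT ... VALUES assumes Hive 0.14 or later.

```sql
-- The partition column is declared in PARTITIONED BY, not in the column list.
CREATE TABLE employee_part (id INT, name STRING)
PARTITIONED BY (department STRING);

-- Write a row into one specific partition (static partition insert).
INSERT INTO TABLE employee_part PARTITION (department='BIGDATA')
VALUES (1, 'alice');

-- Filtering on the partition column lets Hive read only that partition.
SELECT id, name FROM employee_part WHERE department = 'BIGDATA';
```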

Select Query with Joins in Hive

rajesh • March 18, 2016 • bigdata

Joins are used in a query to combine data from two or more tables based on the values of some columns. We will see how to write queries using joins in Hive.

Hive Tables

We have the following two tables in Hive.

Employee table containing data about employees:
0: jdbc:hive2://localhost:10000>…
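The excerpt is cut off before the table definitions, so the following is only a sketch of what an inner join in HiveQL can look like. The second table (department) and every column name are assumptions made for illustration, not the tables used in the original post.

```sql
-- Inner join: return each employee together with the matching department row.
-- employee.department_id and department.id are hypothetical columns.
SELECT e.name, d.department_name
FROM employee e
JOIN department d
  ON e.department_id = d.id;
```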

Select Query with Where clause in Hive

rajesh • March 17, 2016 • bigdata

Create, Use and Drop a Database in Hive

rajesh • March 17, 2016 • bigdata

We will see how to create, use and drop a database in Hive.

Create a Database

#List all the databases
0: jdbc:hive2://localhost:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
+----------------+--+

#Create a new Database
0: jdbc:hive2://localhost:10000> create database mydb;

#List all the databases
0: jdbc:hive2://localhost:10000> show databases;…

