Managed and External Tables in Hive

Hive allows us to create two types of tables:

  1. Managed tables
  2. External tables

Managed Tables

Hive manages both the table and its data. When a managed table is dropped, Hive deletes the table's data as well as the table metadata from the Hive metastore.

When we create a Hive table without the EXTERNAL keyword, it is a managed table by default.

By default, all managed tables are created inside the Hive warehouse directory. The property that controls this location is hive.metastore.warehouse.dir.
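As a minimal sketch of the managed-table life cycle described above, the HiveQL below creates a table, inspects it, and drops it. The table name employees_managed and its columns are hypothetical and chosen only for illustration; the comma delimiter and TEXTFILE format are likewise assumptions.

```sql
-- Print the current value of the warehouse directory property.
SET hive.metastore.warehouse.dir;

-- Hypothetical table: no EXTERNAL keyword, so Hive treats it as managed.
CREATE TABLE employees_managed (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- The "Table Type" row should read MANAGED_TABLE, and the "Location" row
-- should point to a directory under the warehouse directory.
DESCRIBE FORMATTED employees_managed;

-- Dropping a managed table removes both the metastore entry and the data.
DROP TABLE employees_managed;
```

Running DESCRIBE FORMATTED before the drop is a quick way to confirm where Hive placed the table's files.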

The following links explain how to create a managed Hive table and import data using different formats:

Creating Hive table using TEXTFILE format and importing data

Creating Hive table using SEQUENCEFILE format and importing data

Creating Hive table using ORC format and importing data

 

External Tables

External tables in Hive do not store their data in the Hive warehouse directory. For an external table, Hive stores only the table metadata in the Hive metastore. When creating the external table, we can point it at any directory on HDFS, and all files inside that directory are treated as table data.
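A short sketch of pointing an external table at an existing HDFS directory follows. The table name employees_external, its columns, the comma delimiter, and the path /data/employees are all hypothetical; substitute a directory that actually holds your files.

```sql
-- Hypothetical table over files that already live in /data/employees.
-- EXTERNAL plus LOCATION tells Hive to read the data where it is
-- rather than placing it in the warehouse directory.
CREATE EXTERNAL TABLE employees_external (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/employees';

-- The "Table Type" row should read EXTERNAL_TABLE, and the "Location" row
-- should show the directory given above.
DESCRIBE FORMATTED employees_external;
```

Every file in the directory is read as table data, so the directory should contain only files that match the declared columns and format.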

When an external table is dropped, only the table metadata is deleted from the Hive metastore. The directory containing the data remains intact.
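One way to see this behaviour is to drop an external table and then register the same directory again. The sketch below uses a hypothetical table web_logs over a hypothetical directory /data/web_logs, and assumes the table property external.table.purge has not been set to true (on versions that support it, that property makes a drop delete the data as well).

```sql
-- Hypothetical external table over an existing directory of text files.
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  line STRING
)
LOCATION '/data/web_logs';

-- Removes only the metastore entry; the files in /data/web_logs stay put.
DROP TABLE web_logs;

-- Registering the same directory again makes the existing files
-- queryable once more, without reloading anything.
CREATE EXTERNAL TABLE web_logs (
  line STRING
)
LOCATION '/data/web_logs';

SELECT COUNT(*) FROM web_logs;
```

If the count after re-creating the table matches the count before the drop, the data was untouched.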

External tables allow a user to manage data outside of Hive.

The following link explains how to create an external Hive table and import data:

Creating External Hive table and importing data

 
