Hive allows us to create two types of tables:
- Managed tables
- External tables
Managed Tables
Hive manages both the table and its data. When a managed table is dropped, Hive deletes the table's data as well as the table metadata from the Hive metastore.
When we create a Hive table, by default it is a managed table.
By default, all managed tables are created inside the Hive warehouse directory. The property that controls this location is hive.metastore.warehouse.dir.
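The behavior described above can be illustrated with a minimal HiveQL sketch. The table name employees_managed and its two columns are hypothetical, chosen only for illustration, and the warehouse path printed by the first statement depends on your installation.

```sql
-- Print the current value of the warehouse directory property.
SET hive.metastore.warehouse.dir;

-- No EXTERNAL keyword, so Hive creates a managed table.
-- employees_managed and its columns are hypothetical names.
CREATE TABLE employees_managed (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- The output includes the table type (MANAGED_TABLE) and the table's
-- location, which should fall under the warehouse directory.
DESCRIBE FORMATTED employees_managed;

-- Dropping a managed table removes its metadata and its data.
DROP TABLE employees_managed;
```

Running DESCRIBE FORMATTED before the drop is a quick way to confirm both the table type and where Hive placed the data.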
The following links explain how to create a managed Hive table and import data using different formats:
- Creating Hive table using TEXTFILE format and importing data
- Creating Hive table using SEQUENCEFILE format and importing data
- Creating Hive table using ORC format and importing data
External Tables
External tables in Hive do not store their data in the Hive warehouse directory. For an external table, Hive stores only the table's metadata in the Hive metastore. When creating the external table, any directory on HDFS can be specified as the location of the table data, and all files inside that directory are treated as table data.
When an external table is dropped, only the table metadata is deleted from the Hive metastore. The directory containing the data remains intact.
External tables allow a user to manage data outside of Hive.
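As a minimal sketch of the external-table behavior described above, the following HiveQL assumes that delimited text files already exist in an HDFS directory. The table name employees_ext, its columns, and the path /data/example/employees are hypothetical and should be replaced with your own.

```sql
-- EXTERNAL plus LOCATION points the table at an existing HDFS directory.
-- employees_ext, its columns, and the path are hypothetical.
CREATE EXTERNAL TABLE employees_ext (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/example/employees';

-- Every file inside the directory is read as table data.
SELECT * FROM employees_ext LIMIT 10;

-- Only the metastore entry is removed; the files under
-- /data/example/employees stay on HDFS.
DROP TABLE employees_ext;
```

Because the files survive the drop, the same directory can later be attached to a new external table or read by other tools.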
The following link explains how to create an external Hive table and import data:
Creating External Hive table and importing data