Loading data into an Oracle table from a file using sqlldr
We will see how to load data from a file into an Oracle table using sqlldr. Table We have a table EMPLOYEE in the Oracle database with the following schema:
SQL> desc EMPLOYEE;
 Name      Null?    Type
 --------- -------- --------------
 EMP_ID    NOT NULL NUMBER(38)
 EMP_NAME           VARCHAR2(500)
 EMP_AGE            NUMBER(38)
Data File We…
Java code to run a remote script on a remote host using SSH
SSH SSH (Secure Shell) provides support for secure remote login, secure file transfer, and secure TCP/IP and X11 forwarding. SSH uses a client-server model for: 1. establishing a secure connection between two parties, 2. authenticating the two parties, and 3. encrypting the data transmissions between the two parties. To…
Creating Hive table using ORC format and importing data
We will see how to create a table in Hive using ORC format and how to import data into the table. ORC format ORC (Optimized Row Columnar) file format provides a highly efficient way to store Hive data. Using ORC format improves performance when reading, writing, and processing data in…
Creating Hive table using SEQUENCEFILE format and importing data
We will see how to create a table in Hive using SEQUENCEFILE format and how to import data into the table. Create table CREATE TABLE Employee( ID BIGINT, NAME STRING, AGE INT, SALARY BIGINT ) COMMENT 'This is Employee table stored as sequencefile' ROW FORMAT DELIMITED FIELDS TERMINATED BY ','…
Creating Hive table using TEXTFILE format and importing data
We will see how to create a table in Hive using TEXTFILE format and how to import data into the table. TEXTFILE Textfile format stores data as plain text files. Textfile format enables rapid development due to its simplicity, but other file formats like ORC are much better when it comes…
Producer Consumer using Java BlockingQueue
BlockingQueue A BlockingQueue supports operations that: 1. wait for an element to become available in the queue when retrieving an element, and 2. wait for space to become available in the queue when storing an element. We will see an implementation of a simple Producer-Consumer using BlockingQueue. We will be using…
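The two blocking operations described above can be sketched as follows. This is a minimal illustration, not the code from the post itself: the class name `ProducerConsumerSketch`, the `run` helper, the use of `ArrayBlockingQueue`, and the `-1` end-of-stream marker are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerSketch {
    // Hypothetical end-of-stream marker; real items here are non-negative.
    private static final int POISON_PILL = -1;

    // Produces itemCount integers on one thread and consumes them on another.
    static List<Integer> run(int itemCount, int capacity) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(POISON_PILL); // tell the consumer to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int item = queue.take(); // blocks while the queue is empty
                    if (item == POISON_PILL) {
                        break;
                    }
                    consumed.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join(); // after join, the consumer's writes are visible here
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(5, 2)); // prints [0, 1, 2, 3, 4]
    }
}
```

With a capacity of 2 and 5 items, the producer is forced to wait in `put` whenever the consumer falls behind, and the consumer waits in `take` whenever the queue is empty, so neither thread needs explicit `wait`/`notify` calls.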
Java Code for Running Hive queries through JDBC
In this article we will see how to run Hive queries through JDBC. We are using apache-hive-1.0.1 and hiveserver2 is running on port 10000 on localhost. Jars Required To access Hive through JDBC we need to add the following jars to the classpath: guava-18.0.jar hive-common-1.0.0.jar hive-exec-0.13.0.jar hive-jdbc-1.0.0.jar hive-metastore-1.0.0.jar hive-serde-1.0.0.jar hive-service-1.0.0.jar…
HDFS File operations using Java APIs
In this article we will see how to perform file operations on HDFS using Java APIs. The Hadoop core jar must be added to the classpath. The class org.apache.hadoop.fs.FileSystem provides APIs for performing operations on HDFS. 1. Read a file from HDFS //Imports import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; //Code to…
Installing a Hadoop 1.x Cluster
We will see the step-by-step process to install and configure a multi-node Hadoop 1.x setup. Cluster & User Suppose we want to set up a cluster of 5 machines. 192.168.0.1 192.168.0.2 192.168.0.3 192.168.0.4 192.168.0.5 We want to set up the cluster with the machine 192.168.0.1 as master node and all the…
What is JDBC?
JDBC (Java Database Connectivity) provides a standard Java API (Application Programming Interface) for accessing different database systems. JDBC provides a simple, database-independent set of APIs, so the same APIs can be used to access a number of database systems from Java. Components of JDBC 1) Driver For connecting to the database we need…