Anvil

Installing Hive

This post lists the steps to install Apache Hive on a cluster that already has Hadoop installed. The server commands are for RHEL, the client ones for macOS. Items in purple text are placeholders that must be updated to reflect your circumstances.

Install Hive

Connect to the Hadoop name node (the host identifier is defined in ~/.ssh/config):

ssh namenode

Download and unpack the Hive binary distribution to an appropriate location on the name node:

wget http://mirror_site/apache-hive-n.n.n-bin.tar.gz -P ~/downloads
sudo tar zxvf ~/downloads/apache-hive-* -C /usr/local
sudo mv /usr/local/apache-hive-* /usr/local/hive

Add Hive to your path by appending the export lines to ~/.bash_profile:

nano ~/.bash_profile
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_USER_CLASSPATH_FIRST=true

Log out and back in.
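
After logging back in, it can be worth confirming that the profile changes took effect. The following is a minimal sketch, assuming the /usr/local/hive location used above; path_contains is a hypothetical helper name and not part of Hive.

```shell
# Hypothetical helper: succeed if directory $1 appears in the PATH-style string $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fall back to the install location used in this post if HIVE_HOME is unset.
hive_home="${HIVE_HOME:-/usr/local/hive}"

if path_contains "$hive_home/bin" "$PATH"; then
  echo "OK: $hive_home/bin is on PATH"
else
  echo "Missing: $hive_home/bin is not on PATH; check ~/.bash_profile"
fi
```

Wrapping both strings in colons means the match only succeeds on a whole PATH entry, so a directory such as /usr/local/hive/bin-old would not count.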

Start Hadoop:

start-dfs.sh && start-yarn.sh && mr-jobhistory-daemon.sh start historyserver
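
Before moving on, you may want to check that the daemons actually came up. This is a sketch only: it relies on the JDK's jps tool, assumes the NameNode, ResourceManager and JobHistoryServer all run on this host, and uses a hypothetical helper called missing_daemons.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
  running="$(cat)"
  for daemon in "$@"; do
    # jps prints one "<pid> <ClassName>" pair per line; match the name field exactly.
    if ! printf '%s\n' "$running" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}

# Assumption: these three daemons run on the name node after the commands above.
expected="NameNode ResourceManager JobHistoryServer"

if command -v jps >/dev/null 2>&1; then
  # $expected is deliberately unquoted so it splits into separate arguments.
  missing="$(jps | missing_daemons $expected)"
  if [ -z "$missing" ]; then
    echo "All expected Hadoop daemons are running"
  else
    echo "Not running: $missing"
  fi
else
  echo "jps not found; is the JDK bin directory on PATH?"
fi
```

Adjust the expected list if your cluster places the ResourceManager or history server on a different host.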

Configure the metastore database

Install a database to act as a local metastore:

sudo yum install mariadb-server mariadb

Start the database:

sudo systemctl start mariadb.service

Set the database (in this case MariaDB) to start automatically at boot:

sudo systemctl enable mariadb.service

Run the security configuration script, pressing Enter when prompted for the current root password (a new one is set as part of the script):

mysql_secure_installation

Install the MySQL connector (installs to /usr/share/java):

sudo yum install mysql-connector-java

Create a symbolic link for the connector in Hive’s lib directory:

sudo ln -s /usr/share/java/mysql-connector-java.jar $HIVE_HOME/lib/mysql-connector-java.jar
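
A symbolic link can be created even when its target does not exist, so a quick check that the link resolves may save some confusion later. The sketch below assumes the paths used above; link_resolves is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if $1 is a symbolic link whose target exists.
link_resolves() {
  [ -L "$1" ] && [ -e "$1" ]
}

# Fall back to the install location used in this post if HIVE_HOME is unset.
connector="${HIVE_HOME:-/usr/local/hive}/lib/mysql-connector-java.jar"

if link_resolves "$connector"; then
  echo "Connector link is in place: $connector"
else
  echo "Connector link is missing or broken: $connector"
fi
```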

Log in to MariaDB using the root password set earlier:

/usr/bin/mysql -u root -p

Create the Hive user:

CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'hive'@'localhost';
GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;

Check:

SHOW GRANTS FOR 'hive'@'localhost';

Create the metastore:

CREATE DATABASE metastore;
USE metastore;
SOURCE /usr/local/hive/scripts/metastore/upgrade/mysql/hive-schema-2.1.0.mysql.sql;

Configure Hive

sudo nano $HIVE_HOME/conf/hive-site.xml
<configuration>
	<property>
		<name>javax.jdo.option.ConnectionURL</name>
		<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
		<description>metadata is stored in a MySQL server</description>
	</property>
	<property>
		<name>javax.jdo.option.ConnectionDriverName</name>
		<value>com.mysql.jdbc.Driver</value>
		<description>MySQL JDBC driver class</description>
	</property>
	<property>
		<name>javax.jdo.option.ConnectionUserName</name>
		<value>hive</value>
		<description>user name for connecting to mysql server</description>
	</property>
	<property>
		<name>javax.jdo.option.ConnectionPassword</name>
		<value>hive</value>
		<description>password for connecting to mysql server</description>
	</property>
	<property>
		<name>hive.server2.enable.doAs</name>
		<value>false</value>
		<description>Run queries as the HiveServer2 process user rather than impersonating the connected user</description>
	</property>
</configuration>
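
Since a mistyped property name fails silently, a rough sanity check of the saved file can help. The following sketch does a plain text search rather than real XML parsing, assumes the file location used above, and uses a hypothetical helper called missing_properties.

```shell
# Hypothetical helper: print each property name (arguments 2..n) that has no
# matching <name> element in the file given as argument 1.
missing_properties() {
  file="$1"
  shift
  for prop in "$@"; do
    # -F treats the property name literally, so its dots are not wildcards.
    if ! grep -F -q "<name>$prop</name>" "$file" 2>/dev/null; then
      echo "$prop"
    fi
  done
}

# Fall back to the install location used in this post if HIVE_HOME is unset.
site="${HIVE_HOME:-/usr/local/hive}/conf/hive-site.xml"

missing="$(missing_properties "$site" \
  javax.jdo.option.ConnectionURL \
  javax.jdo.option.ConnectionDriverName \
  javax.jdo.option.ConnectionUserName \
  javax.jdo.option.ConnectionPassword)"

if [ -z "$missing" ]; then
  echo "All metastore connection properties are present in $site"
else
  echo "Missing from $site:"
  echo "$missing"
fi
```

If the file does not exist at all, every property is reported as missing.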

Connect to Hive

Try the legacy Hive CLI to see if Hive works:

hive

For HiveServer2 and Beeline (you may need to create a link to the standalone JDBC jar first):

sudo ln -s $HIVE_HOME/jdbc/hive-jdbc-2.1.0-standalone.jar $HIVE_HOME/lib/hive-jdbc-2.1.0-standalone.jar
hive --service metastore &
hive --service hiveserver2 &
beeline
!connect jdbc:hive2://localhost:10000 hive_username hive_password

Remotely:

beeline -u jdbc:hive2://name_node_ip:port -n hive_username -p hive_password
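
If you connect to more than one host, building the JDBC URL in one place keeps the format consistent. This is a small sketch, assuming the default port 10000 used for the local connection above; hive2_url is a hypothetical helper name, and the host and credentials are the placeholders from this post.

```shell
# Hypothetical helper: print a HiveServer2 JDBC URL for host $1 and port $2.
# The port defaults to 10000, the value used for the local connection above.
hive2_url() {
  printf 'jdbc:hive2://%s:%s\n' "$1" "${2:-10000}"
}

# name_node_ip, hive_username and hive_password are placeholders to replace.
url="$(hive2_url name_node_ip)"

# Print the command instead of running it, so it can be reviewed first.
echo "beeline -u $url -n hive_username -p hive_password"
```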