Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and analysis. Although initially developed by Facebook, Apache Hive is now used and developed by other companies such as Netflix. Amazon maintains a software fork of Apache Hive that is included in Amazon Elastic MapReduce on Amazon Web Services.
Steps for installation
1. First, download Apache Hive
You can download the Apache Hive 1.2.1 release from this mirror:
http://www.eu.apache.org/dist/hive/hive-1.2.1/apache-hive-1.2.1-bin.tar.gz
2. Untar the downloaded package using the following command:
sudo tar -xvzf apache-hive-1.2.1-bin.tar.gz
3. Rename the extracted folder to hive
sudo mv apache-hive-1.2.1-bin hive
4. Give the hive folder the appropriate ownership
sudo chown -R hduser:hdgroup hive
5. Next move the folder to /usr/local
sudo mv hive /usr/local/
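Steps 2 to 5 can also be collected into one small shell function. This is only a sketch: the function name install_hive, its two arguments, and the HIVE_OWNER variable are made up for this example, and it assumes the tarball unpacks to a folder named after the file (as apache-hive-1.2.1-bin.tar.gz does).

```shell
# Sketch of steps 2-5: untar, rename, set ownership, move.
# install_hive, its arguments and HIVE_OWNER are hypothetical names.
install_hive() {
    tarball="$1"                    # e.g. apache-hive-1.2.1-bin.tar.gz
    dest_parent="${2:-/usr/local}"  # parent folder that will contain "hive"

    # Refuse to overwrite an existing installation
    if [ -e "$dest_parent/hive" ]; then
        echo "$dest_parent/hive already exists" >&2
        return 1
    fi

    workdir=$(mktemp -d) || return 1

    # Step 2: untar into a scratch directory
    tar -xzf "$tarball" -C "$workdir" || return 1

    # Step 3: rename the extracted folder to "hive"
    extracted=$(basename "$tarball" .tar.gz)
    mv "$workdir/$extracted" "$workdir/hive" || return 1

    # Step 4: change ownership only if HIVE_OWNER is set (e.g. hduser:hdgroup)
    if [ -n "${HIVE_OWNER:-}" ]; then
        chown -R "$HIVE_OWNER" "$workdir/hive" || return 1
    fi

    # Step 5: move the folder into place and drop the scratch directory
    mv "$workdir/hive" "$dest_parent/" || return 1
    rmdir "$workdir"
}
```

Writing to /usr/local and changing ownership normally need root privileges, so the function would have to be called from a shell that has them.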
6. Now edit the ~/.bashrc file
vim ~/.bashrc
Add the following at the end
# Set Hive-related environment variables
export HIVE_HOME=/usr/local/hive
export HIVE_CONF=/usr/local/hive/conf
export HIVE_LIB=/usr/local/hive/lib
export HIVE_CLASSPATH=/usr/local/hive/lib
export PATH=$PATH:$HIVE_HOME/bin
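After reloading the file with `source ~/.bashrc`, it can help to confirm that the variables took effect. A common slip in this step is writing HIVE_HOME/bin without the leading `$`, which puts a literal, non-existent directory on PATH. The check below is a minimal sketch; check_hive_env is a hypothetical helper name, not part of Hive.

```shell
# check_hive_env is a hypothetical helper used only in this example.
check_hive_env() {
    # HIVE_HOME must be set and non-empty
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    # PATH must contain $HIVE_HOME/bin as one of its colon-separated entries
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*) ;;
        *)  echo "$HIVE_HOME/bin is not on PATH" >&2
            return 1 ;;
    esac
    echo "Hive environment looks OK: HIVE_HOME=$HIVE_HOME"
}
```

Calling `check_hive_env` in a new terminal prints a confirmation line, or reports on standard error which of the two settings is missing.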
7. Add a file called "hive-site.xml" under the /usr/local/hive/conf folder
Add the following contents to the hive-site.xml file
<Configuration><property>
</property>
<property>
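The property entries are not shown in full here. As a minimal sketch, and as an assumption rather than the author's actual configuration, the file could simply point Hive at the HDFS locations used in this guide. The helper name write_hive_site is hypothetical; hive.metastore.warehouse.dir and hive.exec.scratchdir are standard Hive properties, and the values chosen here are illustrative.

```shell
# write_hive_site is a hypothetical helper. The two properties are an
# assumption about what a minimal hive-site.xml could contain.
write_hive_site() {
    conf_dir="${1:-/usr/local/hive/conf}"
    # Quoted 'EOF' keeps the XML literal (no shell expansion inside)
    cat > "$conf_dir/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
</configuration>
EOF
}
```

Running `write_hive_site` with no argument writes to /usr/local/hive/conf; pass another directory to try it out somewhere harmless first.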
8. Now create two directories, tmp and warehouse
Create /tmp and /user/hive/warehouse on HDFS and give them group write permission
bin/hadoop dfs -mkdir /tmp
bin/hadoop dfs -mkdir /user/hive/warehouse
Give group write permission
bin/hadoop dfs -chmod g+w /tmp
bin/hadoop dfs -chmod g+w /user/hive/warehouse
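The HDFS commands above can be wrapped so that they are safe to run more than once. The sketch below is one possible way to do that: prepare_hive_dirs and the HADOOP_CMD variable are hypothetical names, and it uses the generic `fs` subcommand in place of `dfs`.

```shell
# prepare_hive_dirs is a hypothetical helper. HADOOP_CMD defaults to
# "hadoop" on PATH; set it to bin/hadoop to match the commands above.
prepare_hive_dirs() {
    hadoop_cmd="${HADOOP_CMD:-hadoop}"
    for dir in /tmp /user/hive/warehouse; do
        # "fs -test -d" exits 0 when the directory already exists
        if ! "$hadoop_cmd" fs -test -d "$dir"; then
            "$hadoop_cmd" fs -mkdir "$dir" || return 1
        fi
        # Group write permission, as in the commands above
        "$hadoop_cmd" fs -chmod g+w "$dir" || return 1
    done
}
```

On Hadoop 2.x, `-mkdir` does not create missing parent directories, so `-mkdir -p` may be needed for /user/hive/warehouse.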