
Hive Configuration Notes

Download Hive

wget https://mirror.sjtu.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
tar -xzf ./apache-hive-3.1.2-bin.tar.gz
mv apache-hive-3.1.2-bin apache-hive-3.1.2
sudo mkdir /opt/hive
sudo mv ./apache-hive-3.1.2 /opt/hive/
sudo chown -R wjadmin:hadoop /opt/hive/
sudo setfacl -R -m g:hadoop:rwx /opt/hive/
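Apache mirrors publish a checksum file next to each release, so it is worth verifying the tarball before extracting it. A minimal sketch of the pattern, demonstrated on a throwaway local file since no real download is assumed here (for the actual release, compare against the published .sha512 with sha512sum):

```shell
# Sketch: verify an archive against a known checksum before extracting.
# The file below is a local stand-in, not the real Hive tarball.
verify_checksum() {
    # $1 = file, $2 = expected hex digest
    actual=$(sha256sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

tmp=$(mktemp -d)
printf 'stand-in for apache-hive-3.1.2-bin.tar.gz\n' > "$tmp/archive.tar.gz"
expected=$(sha256sum "$tmp/archive.tar.gz" | awk '{print $1}')

verify_checksum "$tmp/archive.tar.gz" "$expected" && echo "checksum OK"
```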

Configure environment variables

export HIVE_HOME="/opt/hive/apache-hive-3.1.2"
export PATH=$HIVE_HOME/bin:$PATH
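Because $HIVE_HOME/bin is prepended, its binaries take precedence over any same-named command later in PATH. A quick local demonstration of that ordering, using a stub script rather than the real hive binary:

```shell
# Sketch: a directory prepended to PATH shadows later entries.
# "hive" here is a stub script, not the real binary.
tmp=$(mktemp -d)
cat > "$tmp/hive" <<'EOF'
#!/bin/sh
echo "stub hive"
EOF
chmod +x "$tmp/hive"

PATH="$tmp:$PATH"              # same pattern as PATH=$HIVE_HOME/bin:$PATH
resolved=$(command -v hive)
echo "hive now resolves to: $resolved"
```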

Configure Hive

HADOOP_HOME was already set up earlier when configuring Hadoop,
so the Hadoop commands below can be used directly.

$HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp/hive
$HADOOP_HOME/bin/hadoop fs -mkdir -p /hive/warehouse
$HADOOP_HOME/bin/hadoop fs -chmod -R a+rwx /tmp
$HADOOP_HOME/bin/hadoop fs -chmod -R g+w /hive
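`hadoop fs -chmod` uses the same symbolic and octal mode semantics as the local chmod, so the effect of the modes above is easy to inspect on a local directory. A small demo (733 is also the value hive.scratch.dir.permission is set to later in hive-site.xml):

```shell
# Sketch: the chmod modes used above, applied to a local directory;
# HDFS follows the same symbolic/octal rules.
tmp=$(mktemp -d)
mkdir "$tmp/scratch"

chmod a+rwx "$tmp/scratch"      # like: hadoop fs -chmod -R a+rwx /tmp
mode_all=$(stat -c '%a' "$tmp/scratch")
echo "a+rwx -> $mode_all"       # 777: everyone may read/write/traverse

chmod 733 "$tmp/scratch"        # the scratch-dir permission Hive expects
mode_733=$(stat -c '%a' "$tmp/scratch")
echo "733   -> $mode_733"       # owner rwx; group/other write+traverse only
```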

hive-env.sh

cp /opt/hive/apache-hive-3.1.2/conf/hive-env.sh.template /opt/hive/apache-hive-3.1.2/conf/hive-env.sh
sudo chown -R wjadmin:hadoop /opt/hive/
HADOOP_HOME="/opt/hadoop/hadoop-3.3.1"
JAVA_HOME="/opt/java/graalvm-ce-java8-20.3.2"
HIVE_HOME="/opt/hive/apache-hive-3.1.2"
export HADOOP_HOME
export JAVA_HOME
export HIVE_HOME

hive-site.xml

cp /opt/hive/apache-hive-3.1.2/conf/hive-default.xml.template /opt/hive/apache-hive-3.1.2/conf/hive-site.xml
sudo chown -R wjadmin:hadoop /opt/hive/
<configuration>
  <!-- HDFS root directory for Hive scratch/temp data -->
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
  <!-- Permissions set when Hive creates the scratch directory -->
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>733</value>
  </property>
  <!-- HDFS directory for the Hive warehouse (managed table data) -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
  </property>
  <!-- JDBC URL of the metastore database -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node98:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <!-- JDBC driver class -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <!-- Metastore database user -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>wjadmin</value>
  </property>
  <!-- Metastore database password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>xxxxxx</value>
  </property>
  <!-- Print column headers in CLI query results -->
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <!-- Show the current database name in the CLI prompt -->
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>192.168.131.198</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
</configuration>
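The generated hive-site.xml runs to thousands of lines, so after editing it helps to read a value back out and confirm the change took. A naive grep/sed sketch, demonstrated on a minimal inline copy of the relevant section (point it at the real conf/hive-site.xml to check the live file):

```shell
# Sketch: pull one <property> value out of a hive-site.xml-style file.
# Naive line-based parsing; assumes <name> and <value> sit on adjacent
# lines, as they do in the template.
get_prop() {
    # $1 = property name, $2 = config file
    grep -A1 "<name>$1</name>" "$2" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demo on a minimal inline copy of the relevant section
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node98:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
</configuration>
EOF

url=$(get_prop javax.jdo.option.ConnectionURL "$conf")
echo "ConnectionURL = $url"
```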

Create the database

mysql -u wjadmin -p
> CREATE DATABASE hive;

Install the MySQL Connector/J

wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.26.tar.gz
tar -xzf ./mysql-connector-java-8.0.26.tar.gz
cp ./mysql-connector-java-8.0.26/mysql-connector-java-8.0.26.jar /opt/hive/apache-hive-3.1.2/lib/
sudo chown -R wjadmin:hadoop /opt/hive/

Initialize Hive

schematool -dbType mysql -initSchema


Create systemd services

metastore

sudo vim /etc/systemd/system/hive-meta.service
[Unit] 
Description=Hive metastore 
After=network.target 
 
[Service] 
User=wjadmin
Group=hadoop 
ExecStart=/opt/hive/apache-hive-3.1.2/bin/hive --service metastore 
 
[Install] 
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl start hive-meta
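`systemctl start` returns as soon as the process is spawned, but the metastore can take a while before it actually listens on its thrift port (9083 by default). When scripting against it, a small wait loop avoids racing the startup; this sketch uses bash's /dev/tcp, and the host/port in the usage line are assumptions from this setup:

```shell
# Sketch: block until a TCP port accepts connections, with a timeout.
# Uses bash's /dev/tcp; the metastore's default thrift port is 9083.
wait_for_port() {
    # $1 = host, $2 = port, $3 = max seconds to wait
    i=0
    while [ "$i" -lt "$3" ]; do
        if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# e.g. after: sudo systemctl start hive-meta
# wait_for_port 127.0.0.1 9083 60 && echo "metastore is listening"
```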

hiveserver2

sudo vim /etc/systemd/system/hiveserver2.service
[Unit] 
Description=HiveServer2
After=network.target 
 
[Service] 
User=wjadmin
Group=hadoop 
ExecStart=/opt/hive/apache-hive-3.1.2/bin/hiveserver2
 
[Install] 
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl start hiveserver2

Verify

http://192.168.131.198:10002/
(Screenshot: the HiveServer2 web UI loads.)

Troubleshooting

Connection refused

This usually means hiveserver2 did not start successfully; check the hiveserver2 web UI first.
From the logs I found the problem was the Tez engine. Setting hive.execution.engine to mr in the config file works around it, but with a side effect: the first startup attempt always fails and is retried, adding about 60s to the startup time.
Reference: https://blog.csdn.net/weixin_44917271/article/details/110456020
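A scripted version of "check the web UI", so you don't need a browser each time: probe the HiveServer2 HTTP endpoint with curl. The URL in the usage line is this cluster's; adjust as needed.

```shell
# Sketch: check whether the HiveServer2 web UI answers at all.
# A connection-refused here usually means hiveserver2 is still starting
# (or crashed); check its logs for the actual error.
webui_up() {
    curl -fsS -o /dev/null --max-time 2 "$1"
}

# e.g.: webui_up http://192.168.131.198:10002/ && echo "web UI is up"
```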

User: xxxx is not allowed to impersonate anonymous

This is fixed by changing the Hadoop configuration (the hadoop.proxyuser.* settings in core-site.xml).
Reference: https://stackoverflow.com/questions/42234242/hiveserver2-failed-to-open-new-session-in-beeline

Beeline reports that /tmp is not writable

# Simplest fix: open it up to everyone (effectively 777)
$HADOOP_HOME/bin/hadoop fs -chmod -R a+rwx /tmp

Query execution reports no permission on the warehouse directory

Switch to the wjadmin user, which has the required permissions.