- Install Hadoop
- Overview
- Configure /etc/hosts
- Disable the firewall
- CentOS 7: update the yum repos
- Install Hadoop and Hive
- Download and extract Hadoop
- Set the Java 1.8 environment variables
- Set the Hadoop environment variables
- Edit etc/hadoop/core-site.xml
- Edit the HDFS configuration
- Edit the slaves file
- Initialization
- Format the filesystem on the master node
- Start the cluster
- Start all HDFS service processes on the master node
- Start all YARN service processes on the master node
- Troubleshooting
- Install Hive
- Install MySQL
- Create the Hive database
- Reset the initial password
- Log in again and create it
- Install Hive
- Download Hive 3.1.3
- Update the environment variables
- Edit the Hive configuration files
- Initialize the Hive metastore
- Start Hive
- Install Impala
- Download CDH 5, to be used as the yum repository
- Install the two packages for Impala's missing .so files
- Install nginx to serve the yum repository
- Generate the yum repo file
- Install Impala
- Configure Impala
- Configure JAVA_HOME
- Configure /etc/impala/conf
- Run & test
Install Hadoop
Overview
There are currently three servers running CentOS Stream 9.0; CentOS 7 is the better choice.
Note: CentOS must be installed with the English locale, otherwise installs will complain about the missing langpack-en package.
192.168.230.136/146/147
The first is the master; the other two are slaves.
Configure /etc/hosts
cat <<EOF >>/etc/hosts
192.168.230.136 master
192.168.230.146 node1
192.168.230.147 node2
EOF
Disable the firewall
systemctl stop firewalld
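The firewall will come back after a reboot; to keep it off permanently, also run the following on all three nodes:
systemctl disable firewalld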
CentOS 7: update the yum repos
wget -O /etc/yum.repos.d/ali.repo http://mirrors.aliyun.com/repo/Centos-7.repo
Install Hadoop and Hive
Download and extract Hadoop
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.10.2/hadoop-2.10.2.tar.gz
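The tarball then needs extracting to /opt, the install path assumed by the environment variables below:
tar zxf hadoop-2.10.2.tar.gz -C /opt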
Download JDK 1.8
The JDK is available from the file server on port 5080.
Set the Java 1.8 environment variables
echo 'export JAVA_HOME=/opt/jdk1.8.0_202' >>/etc/profile
echo 'export PATH=$PATH:/opt/jdk1.8.0_202/bin' >>/etc/profile
. /etc/profile
Set the Hadoop environment variables
[root@localhost hadoop-2.10.2]# echo 'export HADOOP_HOME=/opt/hadoop-2.10.2' >>/etc/profile
[root@localhost hadoop-2.10.2]# echo 'export PATH=$PATH:/opt/hadoop-2.10.2/bin' >>/etc/profile
. /etc/profile
Edit etc/hadoop/core-site.xml
The main contents of the file are as follows:
<configuration>
  <!-- The default filesystem, specified as a URI -->
  <property>
    <name>fs.defaultFS</name>
    <!-- Points at the namenode on the master machine -->
    <value>hdfs://master:9000</value>
  </property>
  <!-- Hadoop's temporary directory; the default is /tmp/hadoop-${user.name} -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp</value>
  </property>
</configuration>
Edit the HDFS configuration (etc/hadoop/hdfs-site.xml)
<configuration>
  <!-- The HDFS replication factor -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Host and port of the secondary namenode -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node1:50090</value>
  </property>
</configuration>
Edit the slaves file (etc/hadoop/slaves)
Add node1 and node2 and remove the original localhost entry.
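A minimal sketch of the resulting file (path per the install location above):
cat <<EOF >/opt/hadoop-2.10.2/etc/hadoop/slaves
node1
node2
EOF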
Initialization
Format the filesystem on the master node
hdfs namenode -format
24/09/15 17:01:28 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
24/09/15 17:01:28 INFO util.GSet: capacity = 2^15 = 32768 entries
24/09/15 17:01:28 INFO namenode.FSImage: Allocated new BlockPoolId: BP-571852059-192.168.230.136-1726390888193
24/09/15 17:01:28 INFO common.Storage: Storage directory /export/servers/hadoop-2.7.4/tmp/dfs/name has been successfully formatted.
24/09/15 17:01:28 INFO namenode.FSImageFormatProtobuf: Saving image file /export/servers/hadoop-2.7.4/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
24/09/15 17:01:28 INFO namenode.FSImageFormatProtobuf: Image file /export/servers/hadoop-2.7.4/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds .
24/09/15 17:01:28 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
24/09/15 17:01:28 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.
24/09/15 17:01:28 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.230.136
Start the cluster
Start all HDFS service processes on the master node
start-dfs.sh
Running this produced a "JAVA_HOME is not set" error.
Fix: in $HADOOP_HOME/libexec/hdfs-config.sh, add export JAVA_HOME=/opt/jdk1.8.0_202
near the top of that file, before hadoop-config.sh is sourced.
Note: adding it anywhere else has no effect.
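As a sketch, the same edit can be scripted by inserting the export right after the first line of the file:
sed -i '1a export JAVA_HOME=/opt/jdk1.8.0_202' /opt/hadoop-2.10.2/libexec/hdfs-config.sh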
Start all YARN service processes on the master node
start-yarn.sh
This started successfully.
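To verify, running jps on each node should show the expected daemons (based on the configuration above):
jps
# master: NameNode, ResourceManager
# node1:  DataNode, NodeManager, SecondaryNameNode
# node2:  DataNode, NodeManager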
Troubleshooting
- Copy all configuration files from the master to the other nodes to avoid configuration drift (see the sketch after this list).
- Don't forget to configure the slaves file.
- Hosts must use static IP addresses.
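A minimal sketch of the config sync (paths as installed above):
for h in node1 node2; do
  scp /opt/hadoop-2.10.2/etc/hadoop/{core-site.xml,hdfs-site.xml,slaves} $h:/opt/hadoop-2.10.2/etc/hadoop/
done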
Install Hive
Install MySQL
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm
Edit /etc/yum.repos.d/mysql-community.repo to enable the MySQL 5.7 repo; note that the install command needs --nogpgcheck appended.
yum install mysql-server --nogpgcheck
systemctl start mysqld
grep password /var/log/mysqld.log
2024-09-15T12:19:08.484269Z 1 [Note] A temporary password is generated for root@localhost: .JHqnJLUa94#
2024-09-15T12:19:21.186969Z 2 [Note] Access denied for user 'root'@'localhost' (using password: NO)
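Log in with the temporary password from the log (quote it, since it contains special characters):
mysql -uroot -p'.JHqnJLUa94#'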
Create the Hive database
Reset the initial password
mysql> create database hive charset utf8;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
mysql> set password=password('1qaz!QAZ');
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Log in again and create the database and user
mysql> create database hive charset utf8;
Query OK, 1 row affected (0.00 sec)
mysql> create user hive@localhost identified by '1qaz!QAZ';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
Don't forget to grant the database user its privileges:
grant all privileges on hive.* to hive@localhost;
Install Hive
Download Hive 3.1.3
wget https://mirrors.aliyun.com/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz -P /opt
cd /opt && tar zxf apache-hive-3.1.3-bin.tar.gz
Update the environment variables
Append (or merge) the following at the end of /etc/profile:
export HADOOP_HOME=/opt/hadoop-2.10.2
export HIVE_HOME=/opt/apache-hive-3.1.3-bin
export PATH=/opt/apache-hive-3.1.3-bin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/opt/jdk1.8.0_202/bin:/opt/hadoop-2.10.2/bin:$HADOOP_HOME/sbin
Then run . /etc/profile to apply it.
Edit the Hive configuration files
Edit Hive's hive-site.xml:
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>MySQL driver</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>database user name</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>1qaz!QAZ</value>
    <description>database password</description>
  </property>
  <!-- database section end -->
  <!-- other sections end -->
Edit Hadoop's core-site.xml
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
    <description>Groups whose members the superuser may impersonate</description>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
    <description>Hosts from which the superuser may proxy</description>
  </property>
  <property>
    <name>hadoop.proxyuser.root.users</name>
    <value>*</value>
  </property>
Download the MySQL driver into the $HIVE_HOME/lib directory
(/opt/apache-hive-3.1.3-bin/lib):
curl 'http://pan.itshine.cn:5080/?explorer/share/fileOut&shareID=64h6PiQQ&path=%7BshareItemLink%3A64h6PiQQ%7D%2F%E6%95%B0%E6%8D%AE%E5%BA%93%E5%AE%89%E8%A3%85%E5%8C%85%2Fmysql5%E9%A9%B1%E5%8A%A8%2Fmysql-connector-java-5.1.49-bin.jar' > './mysql-connector-java-5.1.49-bin.jar'
Initialize the Hive metastore
[root@master conf]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.10.2/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.mysql.sql
Error: Syntax error: Encountered "<EOF>" at line 1, column 64. (state=42X01,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
These were all configuration-file errors: note that the metastore connection URL in the log still points at Derby, which means javax.jdo.option.ConnectionURL had not been picked up from hive-site.xml.
The final hive-site.xml is as follows:
<configuration>
  <!-- Host HiveServer2 binds to -->
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>node1</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
  </property>
  <property>
    <name>hive.metastore.db.type</name>
    <value>mysql</value>
    <description>
      Expects one of [derby, oracle, mysql, mssql, postgres].
      Type of database used by the metastore. Information schema & JDBCStorageHandler depend on it.
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>1qaz!QAZ</value>
  </property>
</configuration>
Then run the schema tool again:
# schematool -dbType mysql -initSchema
....
jdbc:mysql://localhost:3306/hive> /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */
No rows affected (0.001 seconds)
0: jdbc:mysql://localhost:3306/hive> /*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */
No rows affected (0.001 seconds)
0: jdbc:mysql://localhost:3306/hive> !closeall
Closing: 0: jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8
beeline>
beeline> Initialization script completed
Sun Sep 15 21:10:13 CST 2024 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
schemaTool complete
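As a sanity check (using the credentials configured above), the metastore tables should now exist in MySQL:
mysql -uhive -p'1qaz!QAZ' hive -e 'show tables;' | head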
Start Hive
Start the metastore and hiveserver2 services in the background:
nohup bin/hive --service metastore &
nohup bin/hive --service hiveserver2 &
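Once both services are up, a quick connectivity check with beeline (host and port as set in hive-site.xml above; the login user here is an assumption):
beeline -u jdbc:hive2://node1:10000 -n root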
Install Impala
Download CDH 5, which will be used as the yum repository
curl 'http://pan.itshine.cn:5080/?explorer/share/file&hash=259dTd3iBsOzHLUkqX2Sh1KOVfnsPvwROb5XzGdPZuM6CHHpHyAnklZ8ESxozwuMi989UQ' > './cdh5.14.0-centos6.tar.gz'
## Do this on all servers; consider copying the file with scp.
Installing Impala reports two missing .so files, so install the following two packages:
curl 'http://pan.itshine.cn:5080/?explorer/share/file&hash=8e12BhKHTII9vSrr3G0Z7ir-ZkJW4YQ7d5yk-pqdq_wYbxqw2bhcvSAhA51ptYA6Pbc0VQ' > './cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64.rpm'
curl 'http://pan.itshine.cn:5080/?explorer/share/file&hash=898fjDM37_4UoZdz9rDnl5jpeKinnHEyFMCJNv_iMfctxNKzEtBfxQ-dVzGL5_gY9hYaqw' > './python-libs-2.6.6-66.el6_8.x86_64.rpm'
rpm -ivh --nodeps --force cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64.rpm
rpm -ivh --nodeps --force python-libs-2.6.6-66.el6_8.x86_64.rpm
Install nginx to serve the yum repository
cd /opt
tar xzf cdh5.14.0-centos6.tar.gz
yum -y install nginx
ln -s /opt/cdh/5.14.0 /usr/share/nginx/html/5.14.0
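nginx still needs to be started before the repo URL is reachable; a quick check (host name as in /etc/hosts):
systemctl enable --now nginx
curl -I http://node1/5.14.0/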
Generate the yum repo file
cat <<EOF >/etc/yum.repos.d/cdh.repo
[cdh]
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=http://node1/5.14.0
gpgcheck=0
enabled=1
EOF
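Refresh the yum metadata so the new repo is picked up:
yum clean all && yum makecache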
Install Impala
yum -y install impala*
Configure Impala
Configure JAVA_HOME
cat > /etc/default/bigtop-utils <<EOF
export JAVA_HOME=/opt/jdk1.8.0_202
EOF
Configure /etc/impala/conf
mkdir -p /etc/impala/conf
cp /opt/apache-hive-3.1.3-bin/conf/hive-site.xml /etc/impala/conf
cp /opt/hadoop-2.10.2/etc/hadoop/hdfs-site.xml /etc/impala/conf
cp /opt/hadoop-2.10.2/etc/hadoop/core-site.xml /etc/impala/conf
Run & test
service impala-server start
service impala-catalog start
service impala-state-store start
[root@localhost yum.repos.d]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to localhost.localdomain:21000
Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.11.0-cdh5.14.0 (d682065) built on Sat Jan 6 13:27:16 PST 2018)
To see live updates on a query's progress, run 'set LIVE_SUMMARY=1;'.
***********************************************************************************
[localhost.localdomain:21000] >
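A simple smoke test at the prompt (output omitted):
[localhost.localdomain:21000] > show databases;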
This completes the installation.