Howto Setup Apache Zookeeper Cluster on Multiple Nodes in Linux

[Category: System (Linux) | Publisher: 店小二05 | Date: 2016 | Author: 红领巾]

If you are running Apache ZooKeeper in your infrastructure, you should set it up to run in cluster mode. A ZooKeeper cluster is called an ensemble.

For the cluster to stay up and running, a majority of the nodes in the cluster must be up. So, it is recommended to run a ZooKeeper cluster on an odd number of servers, for example a cluster with 3 nodes or a cluster with 5 nodes.
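The majority rule above can be worked out with a little arithmetic. The following is a minimal sketch; the function names quorum_size and tolerated_failures are made up for this illustration and are not part of ZooKeeper.

```shell
# Smallest number of nodes that forms a majority in an ensemble of size $1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can be down while a majority is still up.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Compare a few ensemble sizes.
for n in 3 4 5 6; do
    echo "nodes=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

A 4-node ensemble tolerates one failure, the same as a 3-node ensemble, which is why adding a single node to an odd-sized cluster does not improve availability.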

In this tutorial, we’ll set up a 3-node ZooKeeper cluster on the following servers: node1, node2, and node3.

Java Pre-req

ZooKeeper requires Java to be installed on your system. JDK version 6 or above will work with ZooKeeper.
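If you want to script the version requirement, the following sketch may help. It assumes the usual Java version string formats ("1.8.0_91" for Java 8 and older, "11.0.2" for newer releases); the function names are hypothetical helpers for this illustration.

```shell
# Extract the major Java version from a version string.
# Java 8 and older report "1.<major>...", newer releases report "<major>...".
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Succeeds when the given version string is Java 6 or newer.
java_ok_for_zookeeper() {
    [ "$(java_major_version "$1")" -ge 6 ]
}

# Example with the version string used in this tutorial.
java_major_version "1.8.0_91"
```

On a real system, the version string could be taken from the first line of the java -version output (which is written to stderr), for example with: java -version 2>&1 | awk -F'"' 'NR==1 {print $2}'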

The following will install OpenJDK 8 on your system:

yum install java-1.8.0-openjdk

Verify that Java is installed properly.

# java -version
openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

Verify: Start zookeeper in Standalone Mode for Testing

Before starting the ZooKeeper cluster, first start ZooKeeper on each individual machine in a single-node configuration (without the cluster setup) to make sure it works properly.

This way, we’ll isolate any non-cluster related issues and fix them first on the individual nodes.

In this example, I’ve installed ZooKeeper under the /opt/zookeeper directory. This uses ZooKeeper 3.4.9, the latest version at the time of writing:

ZOOKEEPER_HOME=/opt/zookeeper

On node1, use ZooKeeper’s sample configuration file zoo_sample.cfg as a baseline.

cd $ZOOKEEPER_HOME/conf
cp zoo_sample.cfg zoo.cfg

From now on, we’ll use the zoo.cfg as our configuration file. We’ll modify this for our cluster setup later.

On node1, execute the following command to start the single node zookeeper.

cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
org.apache.zookeeper.server.quorum.QuorumPeerMain \
conf/zoo.cfg

In the above command:

- Specify the jar files that should be included when ZooKeeper is started. This includes the ZooKeeper jar file and the log4j, slf4j-log4j12 and slf4j-api jar files. All these jar files come with the ZooKeeper installation, and you don’t have to download them separately.
- QuorumPeerMain is the name of the main class that should be invoked to start ZooKeeper.
- conf/zoo.cfg is the ZooKeeper configuration file.

If everything goes well, you’ll get the following output on the screen. Each of the following lines begins with a timestamp followed by “[myid:] INFO ”.

[main:[email protected]] - Reading configuration from: conf/zoo.cfg
[main:[email protected]] - autopurge.snapRetainCount set to 3
[main:[email protected]] - autopurge.purgeInterval set to 0
[main:[email protected]] - Purge task is not scheduled.
[main:[email protected]] - Either no config or no quorum defined in config, running in standalone mode
[main:[email protected]] - Reading configuration from: conf/zoo.cfg
[main:[email protected]] - Starting server
[main:[email protected]] - Server environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
[main:[email protected]] - Server environment:host.name=node1.thegeekstuff.com
[main:[email protected]] - Server environment:java.version=1.8.0_91
[main:[email protected]] - Server environment:java.vendor=Oracle Corporation
[main:[email protected]] - Server environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el7_2.x86_64/jre
[main:[email protected]] - Server environment:java.class.path=zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf
[main:[email protected]] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[main:[email protected]] - Server environment:java.io.tmpdir=/tmp
[main:[email protected]] - Server environment:java.compiler=NA
[main:[email protected]] - Server environment:os.name=Linux
[main:[email protected]] - Server environment:os.arch=amd64
[main:[email protected]] - Server environment:os.version=3.10.0-327.18.2.el7.x86_64
[main:[email protected]] - Server environment:user.name=root
[main:[email protected]] - Server environment:user.home=/root
[main:[email protected]] - Server environment:user.dir=/opt/zookeeper
[main:[email protected]] - tickTime set to 2000
[main:[email protected]] - minSessionTimeout set to -1
[main:[email protected]] - maxSessionTimeout set to -1
[main:[email protected]] - binding to port 0.0.0.0/0.0.0.0:2181
[main:[email protected]] - Reading snapshot /tmp/zookeeper/version-2/snapshot.363

Note: Now that we know this works properly on a single node, press Ctrl-C to stop it.

Repeat the above test on node2 and node3 as well, to make sure ZooKeeper works in standalone mode on all the nodes.

Possible Errors and Solutions during Zookeeper Startup

During the above standalone mode zookeeper startup testing, you might encounter these errors:

Error 1: You might get the following “java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory” error

Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
at org.apache.zookeeper.server.quorum.QuorumPeerMain.<clinit>(QuorumPeerMain.java:64)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

Solution 1: The above error happens when the slf4j jar files are not in the classpath; the org.slf4j.LoggerFactory class itself is provided by the slf4j-api jar. Include these jar files as shown below during startup.

java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg

Error 2: You might get the following “ERROR [main:[email protected]] Invalid config, exiting abnormally” error

[myid:] - INFO [main:[email protected]] - Reading configuration from: zoo.cfg
[myid:] - ERROR [main:[email protected]] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing zoo.cfg
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:144)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:101)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.lang.IllegalArgumentException: zoo.cfg1 file is missing
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:128)
... 2 more

Solution 2: The above error happens when ZooKeeper can’t find its configuration file zoo.cfg. Make sure you’ve specified “conf/zoo.cfg” at the end of the command line, as shown below.

java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
org.apache.zookeeper.server.quorum.QuorumPeerMain \
conf/zoo.cfg
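Both of the errors above come down to a path that does not exist from the current directory. As a rough sketch, a small pre-start check can catch this before Java is launched; the function names missing_classpath_entries and preflight are hypothetical helpers, not part of ZooKeeper.

```shell
# Print every entry of a colon-separated classpath ($1) that does not exist on disk.
missing_classpath_entries() {
    echo "$1" | tr ':' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        [ -e "$entry" ] || echo "$entry"
    done
}

# Check the classpath ($1) and the config file ($2); returns non-zero on any problem.
preflight() {
    cp_list=$1
    cfg=$2
    status=0
    missing=$(missing_classpath_entries "$cp_list")
    if [ -n "$missing" ]; then
        echo "missing classpath entries:" >&2
        echo "$missing" >&2
        status=1
    fi
    if [ ! -f "$cfg" ]; then
        echo "missing config file: $cfg" >&2
        status=1
    fi
    return $status
}

# Intended usage from the ZooKeeper home directory (not executed here):
#   CP=zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf
#   preflight "$CP" conf/zoo.cfg && \
#       java -cp "$CP" org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg
```

Because the classpath entries are relative, the check only passes when it is run from the same directory that the java command will be run from.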

Error 3: You might get the following “Could not find or load main class” error message.

Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain

Solution 3: Make sure you are starting ZooKeeper from the ZooKeeper home directory, because the jar paths in the classpath are relative to it. For example, if you’ve installed ZooKeeper under /opt/zookeeper, start it as shown below:

export ZOOKEEPER_HOME=/opt/zookeeper
cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
org.apache.zookeeper.server.quorum.QuorumPeerMain \
conf/zoo.cfg

Setup Zookeeper Cluster: Modify zoo.cfg File

Append the following lines to your $ZOOKEEPER_HOME/conf/zoo.cfg file. These parameters are required for cluster setup.

initLimit=5
syncLimit=2
server.1=node1.thegeekstuff.com:2888:3888
server.2=node2.thegeekstuff.com:2888:3888
server.3=node3.thegeekstuff.com:2888:3888

In the above:

- initLimit: This is the timeout limit, which indicates how long a ZooKeeper node in the quorum has to connect to the leader.
- syncLimit: This specifies the limit on how far out-of-sync (i.e. out-of-date) the individual nodes can be from the leader.
- Both limits are expressed in units of tickTime. By default, tickTime is set to 2000 in zoo.cfg, which means 2000 milliseconds. So, when we set initLimit to 5, multiply that by tickTime to get the actual timeout: initLimit = 5 * 2000 = 10000 ms = 10 seconds, and syncLimit = 2 * 2000 = 4000 ms = 4 seconds.
- server.1, server.2 and server.3 list all three nodes. Instead of giving the full hostname, you can also specify the IP address of each node.
- Don’t change the “:2888:3888” at the end of each server line. Follower nodes use the first port (2888) to connect to the leader, and the second port (3888) is used for leader election.
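The tickTime arithmetic above can be checked against an actual configuration file. This is only a sketch: cfg_value and effective_limits are names invented for this example, and the demo writes a temporary file rather than touching a real zoo.cfg. It takes the last occurrence of a key, on the assumption that a value appended to the end of the file overrides an earlier one.

```shell
# Read a numeric "key=value" setting ($2) from a zoo.cfg-style file ($1).
# If the key appears more than once, the last occurrence is used.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | tail -n 1
}

# Print initLimit and syncLimit converted to milliseconds using tickTime.
effective_limits() {
    tick=$(cfg_value "$1" tickTime)
    init=$(cfg_value "$1" initLimit)
    sync=$(cfg_value "$1" syncLimit)
    echo "initLimit=$(( init * tick ))ms syncLimit=$(( sync * tick ))ms"
}

# Demo with a temporary file holding the values used in this tutorial.
demo_cfg=$(mktemp)
printf 'tickTime=2000\ninitLimit=5\nsyncLimit=2\n' > "$demo_cfg"
effective_limits "$demo_cfg"   # prints: initLimit=10000ms syncLimit=4000ms
rm -f "$demo_cfg"
```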

Also, by default, the dataDir in zoo.cfg points to the /tmp/zookeeper directory. Change this to a persistent location outside /tmp.

In zoo.cfg, set the dataDir to the following:

dataDir=/var/zookeeper

Make sure this directory is created.

mkdir /var/zookeeper

Note: Make the above zoo.cfg changes on all the nodes (i.e. node1, node2 and node3).
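Since the same server lines go into zoo.cfg on every node, one option is to generate them from a list of hosts instead of typing them on each machine. The sketch below assumes the default 2888 and 3888 ports used in this tutorial; server_lines is a hypothetical helper name.

```shell
# Print one "server.N=host:2888:3888" line per host argument, numbering from 1.
server_lines() {
    i=1
    for host in "$@"; do
        echo "server.$i=$host:2888:3888"
        i=$(( i + 1 ))
    done
}

# Print the lines for the three nodes of this tutorial.
server_lines node1.thegeekstuff.com node2.thegeekstuff.com node3.thegeekstuff.com

# To append them to a real configuration (not executed here):
#   server_lines node1.thegeekstuff.com node2.thegeekstuff.com node3.thegeekstuff.com \
#       >> "$ZOOKEEPER_HOME/conf/zoo.cfg"
```

The order of the host arguments decides the server numbers, so the same order must be used on every node.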

Create Unique Zookeeper Id on Individual Nodes

On node1, create a unique zookeeper id and store it in the “myid” file that should be located under the directory that is specified by the “dataDir” in zoo.cfg.

On node1, the unique id will be “1”, which will be stored in the /var/zookeeper/myid file.

# echo "1" > /var/zookeeper/myid
# cat /var/zookeeper/myid
1

On node2, the unique id will be “2”.

echo "2" > /var/zookeeper/myid

On node3, the unique id will be “3”.

echo "3" > /var/zookeeper/myid
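The id written to myid has to match the N of that node's server.N line in zoo.cfg. As a sketch of how to avoid mismatches, the id can be looked up from zoo.cfg by hostname; myid_for_host is a hypothetical helper, and the demo uses a temporary directory instead of /var/zookeeper.

```shell
# Print the N of the "server.N=host:port:port" line in the file $1 whose host equals $2.
myid_for_host() {
    awk -F'[=:]' -v host="$2" '
        /^server\.[0-9]+=/ && $2 == host { sub(/^server\./, "", $1); print $1 }
    ' "$1"
}

# Demo: build a throwaway config and derive the id for node2.
demo_dir=$(mktemp -d)
printf 'server.1=node1.thegeekstuff.com:2888:3888\nserver.2=node2.thegeekstuff.com:2888:3888\nserver.3=node3.thegeekstuff.com:2888:3888\n' > "$demo_dir/zoo.cfg"
myid_for_host "$demo_dir/zoo.cfg" node2.thegeekstuff.com > "$demo_dir/myid"
cat "$demo_dir/myid"   # prints: 2
rm -rf "$demo_dir"
```

On a real node this could be run as myid_for_host "$ZOOKEEPER_HOME/conf/zoo.cfg" "$(hostname -f)" > /var/zookeeper/myid, assuming hostname -f returns exactly the name used in the server lines.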

Note: If you don’t set the myid properly, when you start ZooKeeper you’ll get the following “/var/zookeeper/myid file is missing” error message:

[myid:] - INFO [main:[email protected]] - Reading configuration from: conf/zoo.cfg
[myid:] - INFO [main:[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[myid:] - INFO [main:[email protected]] - Resolved hostname: node2.thegeekstuff.com to address: /192.168.101.2
[myid:] - INFO [main:[email protected]] - Resolved hostname: node3.thegeekstuff.com to address: /192.168.101.3
[myid:] - WARN [main:[email protected]] - No server failure will be tolerated. You need at least 3 servers.
[myid:] - INFO [main:[email protected]] - Defaulting to majority quorums
[myid:] - ERROR [main:[email protected]] - Invalid config, exiting abnormally
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing conf/zoo.cfg
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:144)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:101)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.lang.IllegalArgumentException: /var/zookeeper/myid file is missing
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:362)
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:140)
... 2 more
Invalid config, exiting abnormally

Note: You’ll see “[myid:]” in the above output without any id number in it. Once you fix the problem, you’ll see “[myid:1]” in the log files on node1, “[myid:2]” on node2, and “[myid:3]” on node3. This is an easy and quick way to identify which ZooKeeper node a log message is from.

Start the Zookeeper Cluster

Now, to start the cluster, start ZooKeeper on each of the individual nodes, one by one, as shown below:

export ZOOKEEPER_HOME=/opt/zookeeper
cd $ZOOKEEPER_HOME
java -cp zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf \
org.apache.zookeeper.server.quorum.QuorumPeerMain \
conf/zoo.cfg

Note: A convenient approach is to put the above lines inside a zookeeper-start.sh script and use the nohup command to start it in the background, as shown below:

nohup zookeeper-start.sh &

Note: To stop the ZooKeeper cluster, on each of the individual nodes, use the grep command to locate the ZooKeeper process, and use the kill command to terminate it.
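One way to do the locate-and-kill step is sketched below. It assumes the ps -ef column layout, where the PID is the second column, and it matches on the QuorumPeerMain class name from the start command; zk_pids is a hypothetical helper name.

```shell
# Read "ps -ef" output on stdin and print the PID (second column) of every
# process whose command line mentions QuorumPeerMain. Lines containing "awk"
# are skipped so that the filter does not report itself.
zk_pids() {
    awk '/QuorumPeerMain/ && !/awk/ { print $2 }'
}

# Intended usage on a node (not executed here):
#   for pid in $(ps -ef | zk_pids); do
#       kill "$pid"
#   done
```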

On node1, at this stage, you’ll start getting messages like the following. You can ignore them for now.

They appear because only node1 is currently up. Once node2 and node3 are up, we’ll not see these messages anymore.

[myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Notification time out: 400
[myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Cannot open channel to 2 at election address /192.168.101.2:3888
[myid:1] - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:[email protected]] - Cannot open channel to 3 at election address /192.168.101.3:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

Note: Just by looking at the above log lines, we know they are from node1, as each line starts with “[myid:1]”.

After starting ZooKeeper on node2 and node3, we’ll see the following in the logs on each of the individual nodes, indicating that the ZooKeeper cluster is up and running.

Each of these lines is preceded by a timestamp, followed by “[myid:1] INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:”.

[email protected]] - Resolved hostname: node1.thegeekstuff.com to address: /192.168.101.1
[email protected]] - Notification time out: 6400
[/192.168.101.1:3888:[email protected]] - Received connection request /192.168.101.2:56214
[WorkerReceiver[myid=1]:[email protected]] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
[WorkerReceiver[myid=1]:[email protected]] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
[email protected]] - FOLLOWING
[email protected]] - TCP NoDelay set to: true
[email protected]] - Server environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
[email protected]] - Server environment:host.name=node1.thegeekstuff.com
[email protected]] - Server environment:java.version=1.8.0_91
[email protected]] - Server environment:java.vendor=Oracle Corporation
[email protected]] - Server environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el7_2.x86_64/jre
[email protected]] - Server environment:java.class.path=zookeeper-3.4.9.jar:lib/log4j-1.2.16.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:conf
[email protected]] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[email protected]] - Server environment:java.io.tmpdir=/tmp
[email protected]] - Server environment:java.compiler=NA
[email protected]] - Server environment:os.name=Linux
[email protected]] - Server environment:os.arch=amd64
[email protected]] - Server environment:os.version=3.10.0-327.18.2.el7.x86_64
[email protected]] - Server environment:user.name=root
[email protected]] - Server environment:user.home=/root
[email protected]] - Server environment:user.dir=/opt/zookeeper
[email protected]] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /var/zookeeper/version-2 snapdir /var/zookeeper/version-2
[email protected]] - FOLLOWING - LEADER ELECTION TOOK - 8869
[email protected]] - Resolved hostname: node2.thegeekstuff.com to address: /192.168.101.2
[email protected]] - Resolved hostname: node3.thegeekstuff.com to address: /192.168.101.3
[email protected]] - Getting a diff from the leader 0x0
[email protected]] - Snapshotting: 0x0 to /var/zookeeper/version-2/snapshot.0
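In the output above, the “FOLLOWING” line shows that node1 joined the cluster as a follower. The sketch below pulls the most recent such state line out of a log file. It assumes the output was redirected to a file, and that the leader and a node still in election log “LEADING” and “LOOKING” in the same way, which the output above does not show; zk_role_from_log is a hypothetical helper and the demo lines are shortened samples.

```shell
# Print the last state (LEADING, FOLLOWING or LOOKING) found in the log file $1.
# Only lines that end in " - <STATE>" are considered.
zk_role_from_log() {
    awk 'NF >= 2 && $(NF-1) == "-" && $NF ~ /^(LEADING|FOLLOWING|LOOKING)$/ { role = $NF }
         END { if (role != "") print role }' "$1"
}

# Demo with a temporary file of shortened sample lines.
demo_log=$(mktemp)
printf '%s\n' \
    '[myid:1] - INFO [sample] - LOOKING' \
    '[myid:1] - INFO [sample] - FOLLOWING' \
    '[myid:1] - INFO [sample] - FOLLOWING - LEADER ELECTION TOOK - 8869' > "$demo_log"
zk_role_from_log "$demo_log"   # prints: FOLLOWING
rm -f "$demo_log"
```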
