Overview:
Wondering why I hadn't posted an update on installing JGroups on EC2?
It was because I had three false starts and got nowhere.
Eventually, though, I found more documentation and was able to get it running.
I found this report on a JGroups performance test and the associated JBoss wiki page, Perftests.
That was enough information to understand how to get it working. It also helped that the more recent release, JGroups 2.5.1, comes with some sample configuration files.
Comments:
- The network bandwidth between EC2 nodes is the limiting factor.
- For 2 nodes: 4183.65 msgs/sec averaged over all receivers (throughput=4.18MB/sec) vs 60783.12 msgs/sec averaged over all receivers (throughput=60.78MB/sec)
- For 4 nodes: 3852.11 msgs/sec averaged over all receivers (throughput=3.85MB/sec) vs 60783.12 msgs/sec averaged over all receivers (throughput=60.78MB/sec)
The next step would be to test on the larger instances. If their network performance really is better than the default instance's, as their rating suggests, it will show up in the results.
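To put the gap in the Comments list into numbers, here is a small sketch of the arithmetic. It assumes the 60783.12 msgs/sec figure is the reference number from the performance report mentioned in the Overview, and it takes the 1000-byte message size from the test configuration shown in the sample output. The function names are mine, not part of JGroups.

```python
# Figures copied from the Comments list above.
MSG_SIZE_BYTES = 1000                  # msg_size in the test configuration
REFERENCE_RATE = 60783.12              # msgs/sec, the comparison figure (assumed to be from the report)
EC2_RATES = {2: 4183.65, 4: 3852.11}   # number of nodes -> msgs/sec measured on EC2


def throughput_mb_per_sec(msgs_per_sec, msg_size=MSG_SIZE_BYTES):
    """Convert a message rate to MB/sec, taking 1 MB as 1,000,000 bytes."""
    return msgs_per_sec * msg_size / 1_000_000


def slowdown(ec2_rate, reference_rate=REFERENCE_RATE):
    """How many times slower the EC2 run is than the comparison figure."""
    return reference_rate / ec2_rate


for nodes, rate in sorted(EC2_RATES.items()):
    print(f"{nodes} nodes: {throughput_mb_per_sec(rate):.2f} MB/sec, "
          f"{slowdown(rate):.1f}x slower than the comparison figure")
```

With 1000-byte messages the msgs/sec and MB/sec columns track each other exactly, which is why the two numbers on each line look the same apart from the decimal point.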
Install:
- wget http://easynews.dl.sourceforge.net/sourceforge/javagroups/JGroups-2.5.1.bin.zip
- unzip JGroups-2.5.1.bin.zip -d YourJavaLibDirectory
- cd YourJavaLibDirectory
- nslookup `hostname` to get your server's IP address.
- Edit JGroups-2.5.1.bin/config.txt and JGroups-2.5.1.bin/tcp.xml to add the hosts. See the sample files at the bottom of this post.
- java -cp JGroups-2.5.1.bin/concurrent.jar:JGroups-2.5.1.bin/jgroups-all.jar:JGroups-2.5.1.bin/commons-logging.jar org.jgroups.tests.perf.Test -receiver -config JGroups-2.5.1.bin/config.txt -props JGroups-2.5.1.bin/tcp.xml
- java -cp JGroups-2.5.1.bin/concurrent.jar:JGroups-2.5.1.bin/jgroups-all.jar:JGroups-2.5.1.bin/commons-logging.jar org.jgroups.tests.perf.Test -sender -config JGroups-2.5.1.bin/config.txt -props JGroups-2.5.1.bin/tcp.xml
- If the hosts are configured correctly, the test should run.
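The two java commands in the list differ only in the -sender / -receiver flag. Here is a minimal sketch that assembles them so the long classpath is written only once. The helper name perf_command is made up for illustration, and the sketch assumes a Unix-style ':' classpath separator and that the commands are run from YourJavaLibDirectory.

```python
JGROUPS_DIR = "JGroups-2.5.1.bin"
JARS = ["concurrent.jar", "jgroups-all.jar", "commons-logging.jar"]


def perf_command(role, jgroups_dir=JGROUPS_DIR):
    """Return the argument list for one perf test process.

    role is "sender" or "receiver", matching the two flags used in the steps.
    """
    if role not in ("sender", "receiver"):
        raise ValueError(f"unknown role: {role}")
    # Assumption: Unix classpath separator, as in the commands in the list.
    classpath = ":".join(f"{jgroups_dir}/{jar}" for jar in JARS)
    return [
        "java", "-cp", classpath,
        "org.jgroups.tests.perf.Test",
        f"-{role}",
        "-config", f"{jgroups_dir}/config.txt",
        "-props", f"{jgroups_dir}/tcp.xml",
    ]


for role in ("receiver", "sender"):
    print(" ".join(perf_command(role)))
```

The returned list can be passed straight to subprocess.run if you want to launch the process from a script instead of pasting the command.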
Results:
2 nodes:
-- results:
10.255.23.160:7800 (myself):
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4664ms, msgs/sec=4288.16, throughput=4.29MB
10.255.26.143:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4903ms, msgs/sec=4079.14, throughput=4.08MB
combined: 4183.65 msgs/sec averaged over all receivers (throughput=4.18MB/sec)
4 nodes (2 senders, 2 receivers):
-- results:
10.253.15.95:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5212ms, msgs/sec=3837.3, throughput=3.84MB
10.255.23.160:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5174ms, msgs/sec=3865.48, throughput=3.87MB
10.255.26.143:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5192ms, msgs/sec=3852.08, throughput=3.85MB
10.253.83.143:7800 (myself):
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5190ms, msgs/sec=3853.56, throughput=3.85MB
combined: 3852.11 msgs/sec averaged over all receivers (throughput=3.85MB/sec)
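As a sanity check on the numbers above, the per-node rate can be recomputed from num_msgs_received and time, and the "combined" figure from the average of the per-node rates. This is only a sketch of my own; the regular expression is written against the result lines as they appear in this post, not against any documented output format.

```python
import re

# Matches one result line as printed above (pattern is an assumption based on this post).
RESULT_LINE = re.compile(
    r"num_msgs_received=(?P<received>\d+).*?"
    r"time=(?P<ms>\d+)ms, msgs/sec=(?P<rate>[\d.]+)"
)


def parse_results(text):
    """Return (received, elapsed_ms, reported_rate) for each result line in text."""
    return [
        (int(m["received"]), int(m["ms"]), float(m["rate"]))
        for m in RESULT_LINE.finditer(text)
    ]


def combined_rate(rows):
    """Average the per-receiver msgs/sec figures."""
    return sum(rate for _, _, rate in rows) / len(rows)


TWO_NODE_RESULTS = """\
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4664ms, msgs/sec=4288.16, throughput=4.29MB
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4903ms, msgs/sec=4079.14, throughput=4.08MB
"""

rows = parse_results(TWO_NODE_RESULTS)
for received, ms, reported in rows:
    recomputed = received / (ms / 1000)  # messages per second
    print(f"reported {reported:.2f}, recomputed {recomputed:.2f} msgs/sec")
print(f"combined: {combined_rate(rows):.2f} msgs/sec")
```

Pasting the four result lines from the 4-node run into the same string gives the 4-node average in the same way.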
Sample Output:
2 Nodes:
----------------------- TEST -----------------------
Date: Mon Nov 05 04:30:54 EST 2007
Run by: root
mcast_port: 7500
log_interval: 1000
sender: true
props: JGroups-2.5.1.bin/tcp.xml
jmx: false
bind_addr: localhost
num_members: 2
msg_size: 1000
dump_transport_stats: false
start_port: 7800
topic: topic/testTopic
num_senders: 2
cluster: 10.255.23.160:7800,10.255.26.143:7801
num_msgs: 10000
transport: org.jgroups.tests.perf.transports.JGroupsTransport
config: JGroups-2.5.1.bin/config.txt
processing_delay: 0
mcast_addr: 228.1.2.3
JGroups version: 2.5.1
Nov 5, 2007 4:30:54 AM org.jgroups.JChannel init
INFO: JGroups version: 2.5.1
-------------------------------------------------------
GMS: address is 10.255.26.143:7800
-------------------------------------------------------
-- 10.255.26.143:7800 joined
-- waiting for 2 members to join
-- 10.255.23.160:7800 joined
-- READY (2 acks)
-- sending 10000 1KB messages
-- received 1000 messages
-- received 2000 messages
++ sent 1000
-- received 3000 messages
++ sent 2000
-- received 4000 messages
-- received 5000 messages
++ sent 3000
-- received 6000 messages
++ sent 4000
-- received 7000 messages
-- received 8000 messages
++ sent 5000
-- received 9000 messages
-- received 10000 messages
-- received 11000 messages
++ sent 6000
-- received 12000 messages
++ sent 7000
-- received 13000 messages
-- received 14000 messages
-- received 15000 messages
++ sent 8000
-- received 16000 messages
-- received 17000 messages
++ sent 9000
-- received 18000 messages
-- received 19000 messages
++ sent 10000
-- received 20000 messages
-- results:
10.255.23.160:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4664ms, msgs/sec=4288.16, throughput=4.29MB
10.255.26.143:7800 (myself):
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4903ms, msgs/sec=4079.14, throughput=4.08MB
combined: 4183.65 msgs/sec averaged over all receivers (throughput=4.18MB/sec)
4 Nodes:
Sender Node Output
----------------------- TEST -----------------------
Date: Mon Nov 05 04:30:54 EST 2007
Run by: root
mcast_port: 7500
log_interval: 1000
sender: true
props: JGroups-2.5.1.bin/tcp.xml
jmx: false
bind_addr: localhost
num_members: 2
msg_size: 1000
dump_transport_stats: false
start_port: 7800
topic: topic/testTopic
num_senders: 2
cluster: 10.255.23.160:7800,10.255.26.143:7801
num_msgs: 10000
transport: org.jgroups.tests.perf.transports.JGroupsTransport
config: JGroups-2.5.1.bin/config.txt
processing_delay: 0
mcast_addr: 228.1.2.3
JGroups version: 2.5.1
Nov 5, 2007 4:30:54 AM org.jgroups.JChannel init
INFO: JGroups version: 2.5.1
-------------------------------------------------------
GMS: address is 10.255.26.143:7800
-------------------------------------------------------
-- 10.255.26.143:7800 joined
-- waiting for 2 members to join
-- 10.255.23.160:7800 joined
-- READY (2 acks)
-- sending 10000 1KB messages
-- received 1000 messages
-- received 2000 messages
++ sent 1000
-- received 3000 messages
++ sent 2000
-- received 4000 messages
-- received 5000 messages
++ sent 3000
-- received 6000 messages
++ sent 4000
-- received 7000 messages
-- received 8000 messages
++ sent 5000
-- received 9000 messages
-- received 10000 messages
-- received 11000 messages
++ sent 6000
-- received 12000 messages
++ sent 7000
-- received 13000 messages
-- received 14000 messages
-- received 15000 messages
++ sent 8000
-- received 16000 messages
-- received 17000 messages
++ sent 9000
-- received 18000 messages
-- received 19000 messages
++ sent 10000
-- received 20000 messages
-- results:
10.255.23.160:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4664ms, msgs/sec=4288.16, throughput=4.29MB
10.255.26.143:7800 (myself):
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=4903ms, msgs/sec=4079.14, throughput=4.08MB
combined: 4183.65 msgs/sec averaged over all receivers (throughput=4.18MB/sec)
Receiver Node Output
----------------------- TEST -----------------------
Date: Mon Nov 05 04:46:12 EST 2007
Run by: root
mcast_port: 7500
log_interval: 1000
sender: false
props: JGroups-2.5.1.bin/tcp.xml
jmx: false
bind_addr: localhost
num_members: 4
msg_size: 1000
dump_transport_stats: false
start_port: 7800
topic: topic/testTopic
num_senders: 2
cluster: 10.255.23.160:7800,10.255.26.143:7801,10.253.83.143:7802,10.253.15.95:7803
num_msgs: 10000
transport: org.jgroups.tests.perf.transports.JGroupsTransport
config: JGroups-2.5.1.bin/config.txt
processing_delay: 0
mcast_addr: 228.1.2.3
JGroups version: 2.5.1
Nov 5, 2007 4:46:12 AM org.jgroups.JChannel init
INFO: JGroups version: 2.5.1
-------------------------------------------------------
GMS: address is 10.253.83.143:7800
-------------------------------------------------------
-- 10.253.15.95:7800 joined
-- 10.253.83.143:7800 joined
-- waiting for 4 members to join
-- 10.255.23.160:7800 joined
-- 10.255.26.143:7800 joined
-- READY (4 acks)
-- received 1000 messages
-- received 2000 messages
-- received 3000 messages
-- received 4000 messages
-- received 5000 messages
-- received 6000 messages
-- received 7000 messages
-- received 8000 messages
-- received 9000 messages
-- received 10000 messages
-- received 11000 messages
-- received 12000 messages
-- received 13000 messages
-- received 14000 messages
-- received 15000 messages
-- received 16000 messages
-- received 17000 messages
-- received 18000 messages
-- received 19000 messages
-- received 20000 messages
-- local results:
sender: 10.255.23.160:7800: num_msgs_expected=10000, num_msgs_received=10000 (loss rate=0.0%), received=10MB, time=5180ms, msgs/sec=1930.5, throughput=1.93MB
sender: 10.253.15.95:7800: num_msgs_expected=10000, num_msgs_received=10000 (loss rate=0.0%), received=10MB, time=4832ms, msgs/sec=2069.54, throughput=2.07MB
-- results:
10.253.15.95:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5212ms, msgs/sec=3837.3, throughput=3.84MB
10.255.23.160:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5174ms, msgs/sec=3865.48, throughput=3.87MB
10.255.26.143:7800:
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5192ms, msgs/sec=3852.08, throughput=3.85MB
10.253.83.143:7800 (myself):
num_msgs_expected=20000, num_msgs_received=20000 (loss rate=0.0%), received=20MB, time=5190ms, msgs/sec=3853.56, throughput=3.85MB
combined: 3852.11 msgs/sec averaged over all receivers (throughput=3.85MB/sec)
Sample config.txt file
############################
# only used by TCP Transport
############################
# List of hosts in the cluster. Since we don't specify ports, you cannot run multiple TcpTransports
# on the same machine: each member has to be run on a separate machine (this may be changed in a next version)
#cluster=127.0.0.1:7800,127.0.0.1:7801
# 2nodes # cluster=10.255.23.160:7800,10.255.26.143:7801
cluster=10.255.23.160:7800,10.255.26.143:7801,10.253.83.143:7802,10.253.15.95:7803
Sample hosts line in tcp.xml
initial_hosts="${jgroups.tcpping.initial_hosts:10.255.23.160[7800],10.255.26.143[7801],10.253.83.143[7802],10.253.15.95[7803]}"
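The cluster line in config.txt and the initial_hosts attribute in tcp.xml carry the same host list in two different notations (host:port versus host[port]), so it is easy for them to drift apart. A small sketch that generates both from one list follows. The function names are hypothetical, and incrementing the port by one per host is an assumption that simply mirrors the sample lines above.

```python
def cluster_line(hosts, start_port=7800):
    """Build the config.txt 'cluster=' line, one port per host counting up from start_port."""
    members = [f"{host}:{start_port + i}" for i, host in enumerate(hosts)]
    return "cluster=" + ",".join(members)


def initial_hosts_attr(hosts, start_port=7800):
    """Build the tcp.xml initial_hosts attribute for the same host list."""
    members = [f"{host}[{start_port + i}]" for i, host in enumerate(hosts)]
    return ('initial_hosts="${jgroups.tcpping.initial_hosts:'
            + ",".join(members) + '}"')


# The four EC2 nodes used in this post.
HOSTS = ["10.255.23.160", "10.255.26.143", "10.253.83.143", "10.253.15.95"]

print(cluster_line(HOSTS))
print(initial_hosts_attr(HOSTS))
```

Passing only the first two addresses produces the 2-node variant of both lines.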