As I mentioned in the road map, the plan is to run through the popular data mining and business intelligence software, installing and running any demos available.
This is partly to gain experience with the software, but also to demonstrate how the on-demand nature of Amazon's EC2 (Elastic Compute Cloud, beta) lets you use the tools when required and ramp the amount of computing resources up or down.
Pentaho is a popular open source Business Intelligence (BI) software suite. It has an active support group and good support forums with active Pentaho employee participation. You can download the software suite, or subsections of it, from the downloads area, or alternatively get them from the SourceForge site.
Like any good software vendor, open source or not, they provide a Pentaho 1.2.1 GA demo of their software so potential clients can get a good look and feel for the product.
I used my old faithful CentOS 4.4 Linux distro, which is essentially Red Hat Enterprise Linux 4 (RHEL4), running MySQL 5.1, as the base for installing the Pentaho demo.
The Pentaho BI Suite is built on Java and the demo uses JBoss, providing access to the various parts of the BI Suite (Reporting, Kettle ETL, Weka and Shark Workflow Engine).
So naturally it required a Java JDK. Given I had JDK 1.5.0_12 for Linux handy, I installed that.
Comments on the install and demo:
- I tried the Pentaho 1.6.0 Release Candidate 1 demo (pentaho_demo_mysql5-1.6.0-RC1.782.tar.gz) and the demo install failed with a bunch of Java class errors. I found this Pentaho forum post indicating similar issues. I haven't tried the 1.6.0 zip file to check whether it is indeed an issue with missing jar files.
- Once I reverted to the Pentaho 1.2.1 GA demo everything was sweet.
- To run the Pentaho Server on EC2 and use your browser you will need to update the /install_dir/pentaho-demo/jboss/server/default/deploy/pentaho.war/WEB-INF/web.xml and modify the base-url to be the hostname of your server.
- The log produced by start-pentaho.sh is very verbose, and it is actually very interesting to see the calls made to service the web requests.
- Set the EC2-security group to allow access to port 8080 unless you are using the default security group.
- Point your browser at http://yourEC2-DNS-hostname:8080
I have included a screenshot of the home page once the demo was up and running. As per normal, I have dumped the most relevant pieces of my work at the end of this post.
Have Fun
Paul
Get the Java 1.5 JDK and follow the instructions at Java 1.5 to install it:
cd /usr/local
sh /mnt/jdk-1_5_0_12-linux-i586.bin
Do you agree to the above license terms? [yes or no]
yes
Unpacking...
Checksumming...
0
0
Extracting...
UnZipSFX 5.42 of 14 January 2001, by Info-ZIP (Zip-Bugs@lists.wku.edu).
creating: jdk1.5.0_12/
creating: jdk1.5.0_12/jre/
creating: jdk1.5.0_12/jre/bin/
inflating: jdk1.5.0_12/jre/bin/java
inflating: jdk1.5.0_12/jre/bin/keytool
inflating: jdk1.5.0_12/jre/bin/policytool
...
Set up a bunch of symbolic links. Note: This gives the flexibility to change versions in the future.
ln -s /usr/local/jdk1.5.0_12/ java
cd bin
ln -s /usr/local/jdk1.5.0_12/bin/java java
Check java is working
java -version
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Client VM (build 1.5.0_12-b04, mixed mode, sharing)
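The reason for the symbolic links is that a later JDK can be swapped in by repointing one link rather than editing every script and profile. A minimal sketch of that idea follows; switch_jdk is a hypothetical helper name of my own, not something shipped with the JDK or Pentaho, and it assumes the new JDK has already been unpacked under /usr/local.

```shell
# switch_jdk TARGET_DIR LINK
# Repoint a version-neutral symlink (e.g. /usr/local/java) at a JDK directory.
# Hypothetical helper for illustration only.
switch_jdk() {
    # $1 = JDK directory to point at, $2 = symlink to create or replace
    if [ ! -d "$1" ]; then
        echo "switch_jdk: $1 is not a directory" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file instead of descending into it
    ln -sfn "$1" "$2"
}
```

For example, switch_jdk /usr/local/jdk1.5.0_12 /usr/local/java recreates the link made above. Without -n, rerunning ln against an existing directory link would create a new link inside the old JDK directory instead of replacing the link.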
Edit .bash_profile: add JAVA_HOME and add the Java and MySQL binaries to the PATH
[pentaho@domU-12-31-35-00-53-92 ~]$ source .bash_profile
Example of bash_profile:
cat .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
# JAVA_HOME must be set before it is referenced in PATH
JAVA_HOME=/usr/local/java/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:/usr/local/mysql/bin
export PATH JAVA_HOME
unset USERNAME
Check the JAVA and MySQL versions and path:
java -version
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Client VM (build 1.5.0_12-b04, mixed mode, sharing)
mysql -V
mysql Ver 14.13 Distrib 5.1.20-beta, for pc-linux-gnu (i686) using readline 5.0
Get the Pentaho 1.2.1 GA demo zip file (to be safe):
wget http://umn.dl.sourceforge.net/sourceforge/pentaho/pentaho_demo-1.2.1.625-GA.zip
unzip pentaho_demo-1.2.1.625-GA.zip -d /usr/local/pentaho
cd /usr/local/pentaho
chown -R root:pentaho .
ls -la
total 12
drwxr-xr-x 3 root pentaho 4096 Aug 25 03:01 .
drwxr-xr-x 15 root root 4096 Aug 25 03:00 ..
drwxr-xr-x 5 root pentaho 4096 Aug 25 03:01 pentaho-demo
Loading the sample data and checking what was created in MySQL database:
cd /usr/local/pentaho/pentaho-demo/data
mysql -u root -p < SampleDataDump_MySql.sql
mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.20-beta-log MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hibernate |
| mysql |
| quartz |
| sampledata |
| test |
+--------------------+
6 rows in set (0.00 sec)
mysql> use sampledata
Database changed
mysql> show tables;
+----------------------+
| Tables_in_sampledata |
+----------------------+
| CUSTOMERS |
| CUSTOMER_W_TER |
| DEPARTMENT_MANAGERS |
| EMPLOYEES |
| OFFICES |
| ORDERDETAILS |
| ORDERFACT |
| ORDERS |
| PAYMENTS |
| PRODUCTS |
| QUADRANT_ACTUALS |
| TIME |
| TRIAL_BALANCE |
+----------------------+
13 rows in set (0.00 sec)
mysql> show table status;
+---------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-------------------+----------+----------------+----------------------------------------------------------------------------------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+---------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-------------------+----------+----------------+----------------------------------------------------------------------------------+
| CUSTOMERS | InnoDB | 10 | Compact | 117 | 420 | 49152 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| CUSTOMER_W_TER | InnoDB | 10 | Compact | 103 | 477 | 49152 | 0 | 16384 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| DEPARTMENT_MANAGERS | InnoDB | 10 | Compact | 4 | 4096 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| EMPLOYEES | InnoDB | 10 | Compact | 23 | 712 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| OFFICES | InnoDB | 10 | Compact | 7 | 2340 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| ORDERDETAILS | InnoDB | 10 | Compact | 2913 | 61 | 180224 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| ORDERFACT | InnoDB | 10 | Compact | 3027 | 173 | 524288 | 0 | 131072 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB; (`PRODUCTCODE`) REFER `sampledata`.`PRODUCTS`(`PRODUCTCOD |
| ORDERS | InnoDB | 10 | Compact | 227 | 216 | 49152 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:59 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| PAYMENTS | InnoDB | 10 | Compact | 272 | 60 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:59 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| PRODUCTS | InnoDB | 10 | Compact | 91 | 720 | 65536 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:58 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| QUADRANT_ACTUALS | InnoDB | 10 | Compact | 148 | 110 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:59 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| TIME | InnoDB | 10 | Compact | 207 | 237 | 49152 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:59 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
| TRIAL_BALANCE | InnoDB | 10 | Compact | 22 | 744 | 16384 | 0 | 0 | 0 | NULL | 2007-08-25 03:04:59 | NULL | NULL | latin1_general_cs | NULL | | InnoDB free: 11264 kB |
+---------------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-------------------+----------+----------------+----------------------------------------------------------------------------------+
13 rows in set (0.01 sec)
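The interactive check above can also be scripted. The sketch below is only an illustration: missing_databases is a hypothetical helper name, and it assumes the database names are fed to it one per line, for example from the mysql client's -N (skip column names) and -e (execute statement) options.

```shell
# missing_databases NAME...
# Reads database names on stdin, one per line, and reports any of the
# expected NAMEs that are absent. Hypothetical helper for illustration only.
missing_databases() {
    present=$(cat)
    status=0
    for db in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$present" | grep -qxF "$db"; then
            echo "missing: $db"
            status=1
        fi
    done
    return $status
}

# Example (prompts for the MySQL root password):
#   mysql -u root -p -N -e 'show databases' | missing_databases hibernate quartz sampledata
```

The function exits non-zero if any expected database is missing, so it can gate later steps in a setup script.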
The Pentaho 1.2.1 GA demo does not set permissions correctly for Linux: the files lack execute permission.
chmod -R +x /usr/local/pentaho/pentaho-demo/
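The recursive chmod above marks every file in the tree as executable, which works but is broader than needed. A narrower sketch is shown here, on the assumption (which I have not verified against the demo) that only the *.sh scripts need the execute bit; make_scripts_executable is a hypothetical helper name.

```shell
# make_scripts_executable DIR
# Add execute permission to shell scripts only, leaving other files alone.
# Assumes only *.sh files need to be executable (unverified for the demo).
make_scripts_executable() {
    find "$1" -type f -name '*.sh' -exec chmod +x {} +
}

# Example:
#   make_scripts_executable /usr/local/pentaho/pentaho-demo
```

If the server then fails to start with a permission error, falling back to the blanket chmod -R +x is the safe option.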
Start the server in the background and redirect STDOUT and STDERR to a single log file
mkdir -p /usr/local/pentaho/pentaho-demo/logs
cd /usr/local/pentaho/pentaho-demo
./start-pentaho.sh > logs/pentaho_`date +%Y%m%d`.log 2>&1 &
[1] 3375
Check the output of the server log
tail -f /usr/local/pentaho/pentaho-demo/logs/pentaho_20070825.log
JAVA_HOME set to /usr/local/java/
JAVA is /usr/local/java//bin/java
=========================================================================
JBoss Bootstrap Environment
JBOSS_HOME: /usr/local/pentaho/pentaho-demo/jboss
JAVA: /usr/local/java//bin/java
JAVA_OPTS: -server -Xms128m -Xmx512m -XX:MaxPermSize=256m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true -Djava.io.tmpdir=/tmp/ -Dprogram.name=run.sh
CLASSPATH: /usr/local/pentaho/pentaho-demo/jboss/bin/run.jar:/usr/local/java//lib/tools.jar
=========================================================================
[Server@1a758cb]: [Thread[main,5,main]]: checkRunning(false) entered
[Server@1a758cb]: [Thread[main,5,main]]: checkRunning(false) exited
[Server@1a758cb]: Startup sequence initiated from main() method
[Server@1a758cb]: Loaded properties from [/usr/local/pentaho/pentaho-demo/data/server.properties]
[Server@1a758cb]: Initiating startup sequence...
[Server@1a758cb]: Server socket opened successfully in 6 ms.
04:26:13,863 INFO [Server] Starting JBoss (MX MicroKernel)...
04:26:13,865 INFO [Server] Release ID: JBoss [Zion] 4.0.4.GA (build: CVSTag=JBoss_4_0_4_GA date=200605151000)
04:26:13,866 INFO [Server] Home Dir: /usr/local/pentaho/pentaho-demo/jboss
04:26:13,947 INFO [Server] Home URL: file:/usr/local/pentaho/pentaho-demo/jboss/
04:26:13,949 INFO [Server] Patch URL: null
04:26:13,949 INFO [Server] Server Name: default
04:26:13,949 INFO [Server] Server Home Dir: /usr/local/pentaho/pentaho-demo/jboss/server/default
04:26:13,949 INFO [Server] Server Home URL: file:/usr/local/pentaho/pentaho-demo/jboss/server/default/
04:26:13,949 INFO [Server] Server Log Dir: /usr/local/pentaho/pentaho-demo/jboss/server/default/log
04:26:13,950 INFO [Server] Server Temp Dir: /usr/local/pentaho/pentaho-demo/jboss/server/default/tmp
04:26:13,950 INFO [Server] Root Deployment Filename: jboss-service.xml
04:26:14,735 INFO [ServerInfo] Java version: 1.5.0_12,Sun Microsystems Inc.
04:26:14,735 INFO [ServerInfo] Java VM: Java HotSpot(TM) Server VM 1.5.0_12-b04,Sun Microsystems Inc.
04:26:14,735 INFO [ServerInfo] OS-System: Linux 2.6.16-xenU,i386
04:26:16,761 INFO [Server] Core system initialized
[Server@1a758cb]: Database [index=0, id=0, db=file:sampledata/sampledata, alias=sampledata] opened sucessfully in 6835 ms.
[Server@1a758cb]: Database [index=1, id=1, db=file:shark/shark, alias=shark] opened sucessfully in 46 ms.
[Server@1a758cb]: Database [index=2, id=2, db=file:hibernate/hibernate, alias=hibernate] opened sucessfully in 13 ms.
[Server@1a758cb]: Database [index=3, id=3, db=file:quartz/quartz, alias=quartz] opened sucessfully in 79 ms.
[Server@1a758cb]: Startup sequence completed in 6984 ms.
...
04:29:36,794 INFO [STDOUT] Pentaho BI Platform server is ready. (1.2.1-625 GA)
04:29:44,906 INFO [TomcatDeployer] deploy, ctxPath=/sw-style, warUrl=.../deploy/sw-style.war/
04:29:45,985 INFO [Http11BaseProtocol] Starting Coyote HTTP/1.1 on http-0.0.0.0-8080
04:29:46,132 INFO [ChannelSocket] JK: ajp13 listening on /0.0.0.0:8009
04:29:46,276 INFO [JkMain] Jk running ID=0 time=0/163 config=null
04:29:46,294 INFO [Server] JBoss (MX MicroKernel) [4.0.4.GA (build: CVSTag=JBoss_4_0_4_GA date=200605151000)] Started in 3m:32s:342ms
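Startup took about three and a half minutes in the log above, so rather than watching tail -f it can be handy to have a script wait for the "Pentaho BI Platform server is ready" line. This is a rough sketch only: wait_for_ready is a hypothetical helper name, and the 600 second default timeout is an arbitrary assumption.

```shell
# wait_for_ready LOGFILE [TIMEOUT_SECONDS]
# Polls LOGFILE once a second until the ready message appears.
# Returns 0 when found, 1 if the timeout (default 600s, an arbitrary choice) expires.
wait_for_ready() {
    logfile=$1
    remaining=${2:-600}
    while :; do
        if grep -q "Pentaho BI Platform server is ready" "$logfile" 2>/dev/null; then
            return 0
        fi
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Example:
#   wait_for_ready /usr/local/pentaho/pentaho-demo/logs/pentaho_20070825.log && echo "server is up"
```

Note that in the log above the HTTP connector on port 8080 starts a few seconds after the ready message, so a browser request made immediately may still be refused.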
Edit the web.xml file to set the base-url:
vi /usr/local/pentaho/pentaho-demo/jboss/server/default/deploy/pentaho.war/WEB-INF/web.xml
Replace localhost:8080 with your external EC2 DNS name, e.g. ec2-67-202-2-78.z-2.compute-1.amazonaws.com:8080
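The same edit can be made without opening an editor. The sketch below assumes web.xml contains the literal text localhost:8080 wherever the hostname needs changing, as described above; set_base_url is a hypothetical helper name, and the hostname passed in must be a plain DNS name with no / or & characters, since it is substituted straight into a sed expression.

```shell
# set_base_url WEB_XML HOSTNAME
# Replace every localhost:8080 in WEB_XML with HOSTNAME:8080,
# keeping the untouched file as WEB_XML.orig. Hypothetical helper.
set_base_url() {
    webxml=$1
    host=$2
    if [ ! -f "$webxml" ]; then
        echo "set_base_url: $webxml not found" >&2
        return 1
    fi
    # Keep a backup, then rewrite the file from the backup
    cp -p "$webxml" "$webxml.orig" || return 1
    sed "s/localhost:8080/${host}:8080/g" "$webxml.orig" > "$webxml"
}

# Example:
#   set_base_url /usr/local/pentaho/pentaho-demo/jboss/server/default/deploy/pentaho.war/WEB-INF/web.xml \
#       ec2-67-202-2-78.z-2.compute-1.amazonaws.com
```

Because EC2 assigns a new public DNS name each time an instance is launched, the backup copy makes it easy to reapply the change from a clean file.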