Tips for beginners with Puppet (server automation)

Puppet is written in Ruby, so some of its syntax is Ruby-specific (I can’t say for sure, since I haven’t learned Ruby yet). I’ll try to keep this post updated so that beginners to Puppet and strangers to Ruby can get past the unwanted headaches.

Case-sensitivity

Case changes are one source of confusion. If you see this error:

"Could not find dependency Class[changeTimeZone] for Node[baseserver]"

And in your nodes.pp you have:

node baseServer {
  require changeTimeZone
}

Changing the names to all lowercase fixes the problem.
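To spot this kind of naming before Puppet complains, a plain text search can help. The following is a minimal sketch, not a Puppet parser: the function name find_mixed_case_names is made up for illustration, and it only looks at lines that start with node, class, require or include.

```shell
#!/bin/sh
# Print (with line numbers) every line of a manifest where node, class,
# require or include is followed by a name containing an uppercase letter.
# This is a simple pattern match; it does not understand Puppet syntax.
find_mixed_case_names() {
  # $1 = path to a manifest, e.g. nodes.pp
  grep -nE '^[[:space:]]*(node|class|require|include)[[:space:]]+[A-Za-z0-9_:]*[A-Z]' "$1"
}
```

Running "find_mixed_case_names nodes.pp" against the example above would list both the node line and the require line.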

Require a definition within a class

The syntax is

exec { "RunSomething":
  command => "abc",
  require => MyClass::MyFunction["def"],
}

Display return code of a shell command

If you need the return code of a command in a shell script, you can use “$?”:

# ifconfig | grep eth0 >/dev/null 2>&1
# echo $?
0 => good/found
# ifconfig | grep eth1111 >/dev/null 2>&1
# echo $?
1 => bad/not found
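Inside a script, the same check usually ends up in an if statement. Here is a minimal sketch of that; the function name report_match is hypothetical, and it reads the text to search from standard input so it can be fed by ifconfig or any other command.

```shell
#!/bin/sh
# Read text on stdin and report whether it contains the given pattern.
report_match() {
  # $1 = pattern to look for
  grep "$1" >/dev/null 2>&1
  status=$?   # save it right away; the next command overwrites $?
  if [ "$status" -eq 0 ]; then
    echo "found"
  else
    # grep returns non-zero when nothing matched (or on error)
    echo "not found"
  fi
  return "$status"
}
```

Usage would look like "ifconfig | report_match eth0". Saving $? into a variable matters because every later command, even echo, replaces it.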

haproxy and stunnel

This is a quick reference for installing haproxy and stunnel to support SSL forwarding (with the IP forwarding patch).

wget http://haproxy.1wt.eu/download/1.3/src/haproxy-1.3.22.tar.gz
tar -zxf haproxy-1.3.22.tar.gz
cd haproxy-1.3.22
make TARGET=linux26
cp haproxy /usr/sbin/haproxy
vi /etc/haproxy.cfg
...
vi /etc/init.d/haproxy
...
chmod +x /etc/init.d/haproxy 

useradd haproxy 
mkdir -p /var/chroot/haproxy 
chown haproxy:haproxy /var/chroot/haproxy 
chmod 700 /var/chroot/haproxy

service haproxy start
chkconfig --add haproxy 

vi /etc/sysconfig/syslog
SYSLOGD_OPTIONS="-m 0 -r"

vi /etc/syslog.conf
local0.* /var/log/haproxy.log
local1.* /var/log/haproxy-1.log

Stunnel with HAProxy patch

yum remove stunnel
yum install openssl-devel openssl

wget http://www.stunnel.org/download/stunnel/src/stunnel-4.22.tar.gz
tar -xzf stunnel-4.22.tar.gz

cd stunnel-4.22
wget http://haproxy.1wt.eu/download/patches/stunnel-4.22-xforwarded-for.diff
patch -p1 < stunnel-4.22-xforwarded-for.diff

./configure --disable-fips
make
make install
mkdir -p /etc/stunnel
vi /etc/stunnel/stunnel.conf
....
vi /etc/init.d/stunnel
....
vi /etc/stunnel/your.pem
....
ln -s /usr/local/bin/stunnel /usr/sbin/stunnel
chmod +x /etc/init.d/stunnel
service stunnel start
chkconfig --add stunnel 

Install APC automatically via script

If you install APC from a script, you may hit the interactive prompt asking about “apxs”. How do you bypass it? Use expect (“yum install expect”). This script solves the problem:

#!/usr/bin/expect
spawn pecl install apc
expect "Use apxs to set compile flag"
send "yes\r"
expect "install ok"
expect eof

Hadoop vs. MySQL

I just played with Hadoop, HBase, Hive and Pig via Cloudera’s guide (thanks to Cloudera for bringing these packages to CentOS) for a couple of days. Cloudera is going in the right direction by targeting enterprises. Hadoop is definitely on my watch list as it matures. But right now it’s very technical and not suitable for the general public. I’m also disappointed with its performance on a small test cluster (which I understand is an unfair test of what it’s designed for). For it to shine, you need both: the problem has to be big enough and the server farm has to be big enough. However, I think many companies test Hadoop on a small cluster before investing more time and money in it, and it’s the first impression that makes a lasting impact. As it matures, I expect overhead-reduction optimizations for small/low-end clusters.

Setting up MySQL is easy. Scaling it is not so easy, but there is plenty of related software and technology to help you. Don’t think you can just switch to Hadoop/HBase/Hive in a day, though. The selling point is there (no-limit scaling on commodity hardware at the core of the design), but there are many land mines you could step on if decisions are not evaluated carefully. Right now, I see Hadoop as one of the last resorts: you’re running into a wall, having exhausted the RDBMS options and the related software/technology that helps you scale, like memcache, message queues, load balancing, etc. You should not choose Hadoop just because you started a company and might get big in a couple of years. Of course there are exceptions, when you know your problem domain is only solvable in a distributed system. The popularity of Hadoop could change (or not) depending on whether the priority for Hadoop is to dominate both markets or to focus on the large farms.

You face complexity when dealing with Hadoop/HBase/Hive/HDFS (setting up, breaking things down into tasks, and setting up batch operations). For many, many applications, MySQL (or an RDBMS in general) ain’t going anywhere. I see smart companies using both for different parts of their operations. Unless Hadoop can do real-time, low-latency operations in distributed server farms effortlessly, there is no clear winner now, or ever. Maybe the trend toward real-time search (Twitter, Facebook) will speed this up.

Hive troubleshooting

I am playing with Hadoop and Hive via Cloudera RPMs. The development status is very active, meaning it could be hard to track down the errors or find help with a specific one.

Permission of /tmp in HDFS

FAILED: Unknown exception : org.apache.hadoop.fs.permission.AccessControlException: Permission denied: user=mathie, access=WRITE, inode="tmp":hadoop:supergroup:rwxrwxr-x

Solution: You need to turn on full write permissions for /tmp

sudo -u hadoop hadoop fs -chmod 777 /tmp

.hivehistory

[root@r2 tmp]# sudo -u hadoop hive
Hive history file=/tmp/hadoop/hive_job_log_hadoop_200911142019_988931842.txt
java.io.FileNotFoundException: /.hivehistory (Permission denied)
 at java.io.FileOutputStream.open(Native Method)
 at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
 at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
 at java.io.FileWriter.<init>(FileWriter.java:73)
 at jline.History.setHistoryFile(History.java:45)
 at jline.History.<init>(History.java:37)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:298)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
 at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

It means your $HOME variable is empty, so Hive tries to create /.hivehistory at the top level of the filesystem, which of course is not permitted. Solution: run Hive as a real user with a $HOME (“echo $HOME” to check), not via sudo.
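If Hive is started from a wrapper script, the “echo $HOME” check can be done up front. This is a minimal sketch under that assumption; the function name home_is_usable is made up for illustration.

```shell
#!/bin/sh
# Succeed only when HOME is set and points to a directory the current
# user can write to, so a history file can be created there.
home_is_usable() {
  [ -n "${HOME:-}" ] && [ -d "$HOME" ] && [ -w "$HOME" ]
}
```

A wrapper could then run something like "home_is_usable || { echo 'HOME is not usable' >&2; exit 1; }" before starting hive.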

/etc/hive/conf/hive-site.xml

hive> show tables;
FAILED: Error in metadata: javax.jdo.JDODataStoreException: SQL exception: Add classes to Catalog "", Schema "APP"
NestedThrowables:
java.sql.SQLNonTransientConnectionException: No current connection.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Time taken: 3.142 seconds

If you run into this problem even with the metastore in embedded mode, it looks like an issue with the Cloudera RPM for Hive: its configuration uses ${user.name}, which is not being replaced properly.

Solution: change “${user.name}” to a regular folder name and Hive works fine.
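The replacement can also be scripted. The sketch below is only an illustration: the function name replace_user_name is hypothetical, the replacement text is whatever folder name you pick, and it prints the result instead of editing the file so you can review it first.

```shell
#!/bin/sh
# Print a config file with every literal ${user.name} replaced by a
# fixed text. The original file is not modified.
replace_user_name() {
  # $1 = config file, $2 = replacement text (must not contain "|")
  sed 's|\${user\.name}|'"$2"'|g' "$1"
}
```

For example, "replace_user_name /etc/hive/conf/hive-site.xml hive > /tmp/hive-site.xml" writes a copy to review, where "hive" is just an example name.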

haproxy vs. LVS (layer 7 vs layer 4 load balancing)

We just deployed our first haproxy load balancer and are still running several LVS-TUN load balancers. haproxy is advertised as fast and lightweight, but that is in comparison with other layer-7 load balancers, not with a layer-4 load balancer like LVS.

Load Average / CPU Usage

haproxy still requires many more resources. On the same really basic server (Pentium 3/4 or something really slow), the LVS load average is always at or near zero, even with many incoming requests. haproxy’s load average is about 0.3 to 0.5.

Features

But the good thing about haproxy is that it has more features and is more flexible in terms of configuration. LVS-TUN is best when the ISP allows packets that carry the LB’s IP (spoofed packets). If you don’t have that option, haproxy is the next best thing. That is assuming you don’t need haproxy’s HTTP header inspection, which LVS does not have because it works at layer 4.

Bandwidth Utilization

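LVS-TUN only handles the incoming portion of each request, because the workers reply to the client directly. Its bandwidth requirement is therefore about half of what a load balancer carrying both directions needs (haproxy, and likewise LVS-NAT).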

SSL

LVS-TUN does it effortlessly because it does not deal with the content of the packets at all. haproxy can handle SSL in 2 ways:

  • Via the TCP option (haproxy acts as a layer-4 LB). Pros: easy. Cons: you won’t be able to get the client IP, which is a deal breaker for some apps.
  • Stunnel runs on the same machine as haproxy, processes the SSL, then forwards to haproxy as a standard request. Pros: the client IP is passed along with the patch provided on haproxy’s website. Cons: it could slow down the LB machine if there are many SSL requests, and for really secure data you need to set up SSL between haproxy and the workers.

Conclusion

Both haproxy and LVS have their own space. Use LVS-TUN when possible for the best performance and scalability. haproxy is best when you need header inspection and LVS-TUN is not possible with the ISP/network.

Monitor LSI MegaRAID under CentOS

The documentation is not very user friendly, but I guess at least it runs!

Basic Monitor Script

Sample Output

Checking RAID status on xxx
Controller a0:  MegaRAID SAS 8344ELP
No of Physical disks online : 4
Degraded : 0
Failed Disks : 0
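A summary like the sample above can be produced by filtering the adapter report. The following is a rough sketch, not the original monitor script: the function name summarize_raid is hypothetical, and the field labels it matches (“Disks”, “Degraded”, “Failed Disks”) are assumptions about what “MegaCli -AdpAllInfo” prints, so check them against your own output.

```shell
#!/bin/sh
# Read an adapter report on stdin and print a short status summary.
# The matched labels are assumptions; adjust them to your MegaCli output.
summarize_raid() {
  # $1 = controller label used in the summary (defaults to a0)
  awk -F':' -v ctrl="${1:-a0}" '
    # remove leading and trailing blanks from a string
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    { key = trim($1); val = trim($2) }
    key == "Product Name" { print "Controller " ctrl ":  " val }
    key == "Disks"        { print "No of Physical disks online : " val }
    key == "Degraded"     { print "Degraded : " val }
    key == "Failed Disks" { print "Failed Disks : " val }
  '
}
```

It would be used as "MegaCli -AdpAllInfo -a0 | summarize_raid a0", for example from cron with the output mailed when “Degraded” or “Failed Disks” is not 0.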

Upgrade Firmware

To determine the current firmware, run “MegaCli -AdpAllInfo -a0”

Product Name    : MegaRAID SAS 8344ELP
Serial No       : P00253390X
FW Package Build: 7.0.1-0064

                    Mfg. Data
                ================
Mfg. Date       : 09/27/06
Rework Date : 00/00/00
Revision No     : 8

                Image Versions In Flash:
                ================
Boot Block Version : R.2.3.15
BIOS Version       : MT33
MPT Version        : MPTFW-01.18.79.00-IT
FW Version         : 1.12.220-0560
WebBIOS Version    : 1.1-33g-e_11-Rel
Ctrl-R Version     : 1.04-019A
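If you only want the firmware package version from that report, for example to compare it with the version on the download page, a small filter is enough. This is a minimal sketch; the function name firmware_build is made up, and it relies on the “FW Package Build” line shown above.

```shell
#!/bin/sh
# Read an adapter report on stdin and print only the value of the
# "FW Package Build" line.
firmware_build() {
  sed -n 's/^FW Package Build[[:space:]]*:[[:space:]]*//p'
}
```

With the output above, "MegaCli -AdpAllInfo -a0 | firmware_build" should print the package build value.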

Check the LSI website for the current downloads, in this case:

http://www.lsi.com/storage_home/products_home/internal_raid/megaraid_sas/megaraid_sas_8344elp/index.html

- Download the firmware, unzip
- Run "MegaCli -adpfwflash -f SAS1068_FW_Image.rom -a0"

init.d script for gearmand

init.d script for stunnel on CentOS

You might need to modify some settings to suit your installation. I installed from source.

whereis stunnel
(might need to ln -s /usr/local/bin/stunnel /usr/sbin/stunnel)

vi /etc/init.d/stunnel