Tuesday, November 11, 2008

Adding a second hard drive under Linux/CentOS

Check out the new drive

In this example, I have two 500 GB drives, sda and sdb. sda is already in use and sdb is new.
[root@s3 mathie]# fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14       60801   488279610   8e  Linux LVM

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Partition the new drive

For my purposes, I want the whole of sdb to be /home. Your partitioning scheme might be different, but fdisk is easy to use.
[root@s3 mathie]# fdisk /dev/sdb

The number of cylinders for this disk is set to 60801.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)
Create a new partition
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-60801, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-60801, default 60801):
Using default value 60801
Check the new drive
Command (m for help): p

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       60801   488384001   83  Linux
Write and save
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Format the new partition(s)

[root@s3 mathie]# /sbin/mkfs -t ext3 /dev/sdb1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
61063168 inodes, 122096000 blocks
6104800 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3727 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

Mount the new drive

mkdir -p /home
mount /dev/sdb1 /home

# add this line to /etc/fstab to mount automatically at boot
/dev/sdb1               /home                   ext3    defaults        0 0
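
If the fstab line gets added by a setup script that may run more than once, it is worth checking for an existing entry first. A minimal sketch of that idea; the function names fstab_has_mount and add_fstab_entry are made up for this example, and it is best tried against a scratch copy before pointing it at /etc/fstab:

```shell
#!/bin/sh
# Append an fstab entry only if the mount point is not already listed.

fstab_has_mount() {
    # $1 = fstab file, $2 = mount point; comment lines are ignored
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$1"
}

add_fstab_entry() {
    # $1 = fstab file, $2 = device, $3 = mount point, $4 = filesystem type
    if fstab_has_mount "$1" "$3"; then
        echo "$3 already has an entry in $1, leaving it alone" >&2
        return 1
    fi
    printf '%s\t%s\t%s\tdefaults\t0 0\n' "$2" "$3" "$4" >> "$1"
}

# Example (as root, once the scratch run looks right):
#   add_fstab_entry /etc/fstab /dev/sdb1 /home ext3
#   mount -a    # confirms the new entry mounts cleanly before the next reboot
```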


Wednesday, October 29, 2008

Getting lighttpd RPM

There are many places you can download the latest lighttpd RPMs:
Primary: 
http://packages.sw.be/lighttpd/
http://dag.wieers.com/rpm/packages/lighttpd/

Secondary:
EL4: http://ftp.freshrpms.net/pub/freshrpms/redhat/testing/EL4/lighttpd/
EL5: http://www.kevindustries.com/media/kw/files/linux/lighttpd/RPMS/EL5/
EL5 x86_64: http://linuxwave.blogspot.com/2007/08/installing-lighttpd-in-centos-5-for.html
1.4.17: http://www.kevindustries.com/media/kw/files/linux/lighttpd/RPMS/


Friday, October 17, 2008

Stress test a new server

Got a new server and want to stress test its CPU and disk? Run Folding@Home or another distributed computing client, plus BitTorrent (download some Linux distros like Fedora or CentOS). You test the hardware and help the community at the same time.

Folding @ Home

wget http://www.stanford.edu/group/pandegroup/folding/release/FAH6.02-Linux.tgz
tar -zxf FAH6.02-Linux.tgz
./fah6 --config
echo './fah6 -smp -verbosity 9 $* &' > fah
chmod +x fah
./fah > /dev/null &

BitTorrent

Download RPM packages from:
http://dag.wieers.com/rpm/packages/bittorrent/
http://dag.wieers.com/rpm/packages/python-crypto/

nohup launchmany-console --saveas_style 1 --max_upload_rate 600 --display_interval 5 . > torrent.log &


Friday, October 03, 2008

How to automatically reboot after a kernel panic?

You should investigate the root cause when possible, but if the machine needs to be back up without much interruption, you can have it reboot automatically a set number of seconds after a panic. Put this into /etc/sysctl.conf:
kernel.panic = 60
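
The file is read at boot; "sysctl -p" loads it right away and /proc/sys/kernel/panic shows the live value. A small sketch for checking what a sysctl.conf-style file would set. The helper name sysctl_conf_value is made up, and the parsing is deliberately simple (key = value lines, # or ; comments):

```shell
#!/bin/sh
# Print the value a sysctl.conf-style file assigns to a key.
# If the key appears more than once, the last assignment is printed.
sysctl_conf_value() {
    # $1 = config file, $2 = key (e.g. kernel.panic)
    awk -F= -v key="$2" '
        /^[ \t]*[#;]/ { next }                     # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)           # key with whitespace removed
            if (k == key) {
                v = $2
                gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim the value
                val = v
            }
        }
        END { if (val != "") print val }' "$1"
}

# On a live box (as root):
#   sysctl_conf_value /etc/sysctl.conf kernel.panic
#   sysctl -p                      # load /etc/sysctl.conf now
#   cat /proc/sys/kernel/panic     # live value
```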


Sunday, September 21, 2008

pecl and memory limit error

If you run "pecl install [something]" and get this error: "Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate xxx bytes)", you want to change "/usr/bin/pecl" (or run "locate pecl" to see where it is) and specify a larger memory limit.
#!/bin/sh
exec /usr/bin/php -C -n -q -d include_path=/usr/share/pear \
    -d output_buffering=1 /usr/share/pear/peclcmd.php "$@"
becomes
#!/bin/sh
exec /usr/bin/php -C -n -q -d include_path=/usr/share/pear \
    -d memory_limit=16M -d output_buffering=1 /usr/share/pear/peclcmd.php "$@"


Friday, September 19, 2008

ip_conntrack and dropped packets

On busy servers the ip_conntrack table can fill up quickly and must be monitored, or you will get intermittent packet drops. Check /var/log/messages for errors such as "ip_conntrack: table full, dropping packet". A couple of kernel values can be checked and adjusted:
more /proc/sys/net/ipv4/netfilter/ip_conntrack_count
more /proc/sys/net/ipv4/netfilter/ip_conntrack_max
=> count should stay below max; if it's near the maximum value, increase max

more /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established
=> default 5 days, might want to lower it

echo 0 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_loose
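
To keep an eye on this from cron or a monitoring check, the count/max comparison can be scripted. A rough sketch; the function names and the 80% default threshold are assumptions for illustration, not anything the kernel prescribes:

```shell
#!/bin/sh
# Report how full the conntrack table is and warn above a threshold.

conntrack_pct() {
    # $1 = current count, $2 = max; prints the integer percentage in use
    echo $(( $1 * 100 / $2 ))
}

conntrack_warn() {
    # $1 = count, $2 = max, $3 = warning threshold in percent (default 80)
    pct=$(conntrack_pct "$1" "$2")
    if [ "$pct" -ge "${3:-80}" ]; then
        echo "WARNING: ip_conntrack at ${pct}% ($1/$2)"
        return 1
    fi
    echo "OK: ip_conntrack at ${pct}% ($1/$2)"
}

# On a live box:
#   dir=/proc/sys/net/ipv4/netfilter
#   conntrack_warn "$(cat $dir/ip_conntrack_count)" "$(cat $dir/ip_conntrack_max)"
```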


Thursday, September 18, 2008

mod_deflate bug

Hmm, another day, another bug: #33499. This time mod_deflate and PHP don't play well together: PHP generates a GIF/JPG file and mod_deflate still compresses it. Solution: use mod_filter (only available from Apache 2.1 on), or disable Apache's compression for PHP files and let PHP do the compression via ob_start('ob_gzhandler');


Wednesday, September 17, 2008

Get network stats for RRD graphing

This snippet displays Active, Passive and Established connections reported by "netstat --statistics" for saving into RRD or other monitoring tools.
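
The snippet itself is not reproduced here. One possible version, as a rough sketch: it pulls the three counters out of the Tcp section of the netstat output. The function name tcp_conn_stats and the colon-separated output are assumptions, and the wording of the netstat lines varies a little between net-tools versions, so the patterns are kept loose:

```shell
#!/bin/sh
# Read `netstat --statistics` output on stdin and print
# active:passive:established as one colon-separated line.
tcp_conn_stats() {
    awk '
        /active connections? openings?/  { active = $1 }
        /passive connections? openings?/ { passive = $1 }
        /connections established/        { established = $1 }
        END { printf "%d:%d:%d\n", active, passive, established }'
}

# Usage:
#   netstat --statistics | tcp_conn_stats
# With a hypothetical tcp.rrd holding three data sources in that order:
#   rrdtool update tcp.rrd "N:$(netstat --statistics | tcp_conn_stats)"
```

Active and passive openings are cumulative counters while established is a current value, so the matching RRD data sources would probably be COUNTER, COUNTER and GAUGE.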


Tuesday, September 16, 2008

Apache versus lighttpd

Both run on the same server: Apache/2.0.59 (port 80) and lighttpd 1.4.19 (port 8080). Two tests: dynamic and static files. To make things a little more realistic, the requests go from an EU client to a US server.

Serving a dynamic file

eu$ ab -n 1000 -c 10 "http://us.server/run-some-sql.php"
Server Software:        Apache
Server Port:            80
Document Length:        824 bytes
Time taken for tests:   36.51463 seconds
Total transferred:      1213118 bytes
HTML transferred:       847886 bytes
Requests per second:    27.74 [#/sec] (mean)
Time per request:       360.515 [ms] (mean)
Time per request:       36.051 [ms] (mean, across all concurrent requests)
Transfer rate:          32.84 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      158  158   0.3    158     162
Processing:   174  200  26.4    191     340
Waiting:      173  199  26.3    191     340
Total:        332  358  26.4    349     498
Server Software:        lighttpd/1.4.19
Server Port:            8080
Document Length:        921 bytes
Time taken for tests:   35.406200 seconds
Total transferred:      1202655 bytes
HTML transferred:       857071 bytes
Requests per second:    28.24 [#/sec] (mean)
Time per request:       354.062 [ms] (mean)
Time per request:       35.406 [ms] (mean, across all concurrent requests)
Transfer rate:          33.16 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      158  158   0.6    158     167
Processing:   172  193  29.0    183     383
Waiting:      172  192  29.0    183     383
Total:        330  351  29.1    341     541
Apache: 27.74 requests/sec
Lighttpd: 28.24 requests/sec

Serving a static file

eu$ ab -n 1000 -c 10 "http://us.server/img/some-image.gif"
Server Software:        Apache
Server Port:            80
Document Length:        14781 bytes
Time taken for tests:   63.858434 seconds
Total transferred:      15060000 bytes
HTML transferred:       14781000 bytes
Requests per second:    15.66 [#/sec] (mean)
Time per request:       638.584 [ms] (mean)
Time per request:       63.858 [ms] (mean, across all concurrent requests)
Transfer rate:          230.31 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      157  158   0.5    158     164
Processing:   476  478   4.6    478     549
Waiting:      158  159   4.0    159     228
Total:        634  636   4.6    636     707
Server Software:        lighttpd/1.4.19
Server Port:            8080
Document Length:        14781 bytes
Time taken for tests:   63.736261 seconds
Total transferred:      14992000 bytes
HTML transferred:       14781000 bytes
Requests per second:    15.69 [#/sec] (mean)
Time per request:       637.363 [ms] (mean)
Time per request:       63.736 [ms] (mean, across all concurrent requests)
Transfer rate:          229.70 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      157  158   0.4    158     160
Processing:   476  478   2.2    478     491
Waiting:      158  158   1.5    159     166
Total:        634  636   2.3    636     649
Apache: 15.66 requests/sec
Lighttpd: 15.69 requests/sec
Apache is very decent at a low concurrency level (about 10-20). Taking stability, features and modules into account, it's an excellent choice. Lighttpd can perform very well under high load, but it does suffer from an issue with PHP (current as of lighttpd 1.4.19 and PHP 5.1.6): its FastCGI backend became overloaded and returned 500 errors to clients. Bad lighty, or bad PHP! Hopefully this gets fixed in 1.5 or some future version of PHP.
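
When repeating this kind of comparison, it helps to pull the headline number out of each ab run and express the gap as a percentage. A small sketch with made-up helper names (ab_rps, rps_diff_pct); it only relies on the "Requests per second" line that ab prints:

```shell
#!/bin/sh
# Extract requests/sec from ab output and compare two runs.

ab_rps() {
    # reads ab output on stdin, prints the requests-per-second figure
    awk -F: '/^Requests per second/ { split($2, a, " "); print a[1] }'
}

rps_diff_pct() {
    # $1 = baseline req/s, $2 = other req/s; prints the difference in percent
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Usage (file names are placeholders):
#   ab -n 1000 -c 10 "http://us.server/run-some-sql.php" > apache.txt
#   rps_diff_pct "$(ab_rps < apache.txt)" "$(ab_rps < lighttpd.txt)"
```

With the numbers from the two tests, rps_diff_pct 27.74 28.24 gives 1.8 and rps_diff_pct 15.66 15.69 gives 0.2, so at this concurrency the two servers are within a couple of percent of each other.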


Counting TIME_WAIT with netstat

# netstat -tan | grep ':80 ' | awk '{print $6}' | sort | uniq -c
Sample Output:

     15 CLOSING
     26 ESTABLISHED
     31 FIN_WAIT1
      7 FIN_WAIT2
     14 LAST_ACK
      2 LISTEN
     24 SYN_RECV
   2428 TIME_WAIT
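
Note that grep ':80 ' also matches connections where port 80 is on the remote side (outgoing requests made by this box). A variant, as a sketch, that only counts sockets whose local address ends in the given port. The name count_states is made up, and it assumes the usual "netstat -tan" column order (local address in column 4, state in column 6):

```shell
#!/bin/sh
# Count TCP states for sockets bound to one local port.
count_states() {
    # $1 = local port; reads `netstat -tan` output on stdin
    awk -v port="$1" '
        $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) printf "%7d %s\n", n[s], s }' | sort -k2
}

# Usage:
#   netstat -tan | count_states 80
```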


Tuesday, September 09, 2008

What happens when you do "rm -rf /*"

Just for the fun of it. Here is what happens:
[root@s10 ~]# cd /
[root@s10 /]# dir
bin   dev  initrd  lost+found  misc  opt   sbin     srv  tmp  var
boot  etc  lib     media       mnt   proc  selinux  sys  usr
[root@s10 /]# rm -rf *
rm: cannot remove directory `boot': Device or resource busy
rm: cannot remove directory `dev/shm': Device or resource busy
rm: cannot remove `dev/pts/1': Operation not permitted
rm: `proc/asound/ICH' changed dev/ino: Operation not permitted
[root@s10 /]#
[root@s10 /]# dir
-bash: /usr/bin/dir: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory
[root@s10 /]# ll
-bash: ls: command not found
[root@s10 /]# reboot
-bash: /sbin/reboot: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory
Since the existing processes are still running, SSH still accepts connections, but you cannot sign in or run anything. Was it fun?!


Thursday, August 21, 2008

vmstat - Get an overview look at your server

Get an update every second
[root@s14 trungson]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0  41200  33324   2152 1489108    0    0     4    27    0     1  5  3 91  0
 2  0  41200  33500   2152 1489108    0    0     8     0 1838  3320  8  4 88  0
 1  0  41200  31452   2152 1489176    0    0    32     0 1787  3078  7  4 89  0
 2  0  41200  33260   2152 1489176    0    0     8     0 1788  2895  6  4 90  0
 1  0  41200  33068   2164 1489164    0    0    32   768 2038  3207  7  4 87  2
 2  0  41200  33132   2168 1489228    0    0    32     0 2082  4422 10  5 85  0
 2  0  41200  35628   2172 1489360    0    0   148     0 1924  3658  8  5 86  1
 0  0  41200  34596   2172 1489360    0    0    16     0 1904  3531  8  5 87  0
 4  0  41200  28636   2172 1489428    0    0   116     0 1922  3732  9  5 85  1
 0  0  41200  33036   2180 1489488    0    0     8   860 2127  3828  8  5 86  1
 1  0  41200  32844   2180 1489488    0    0    20     0 1784  3108  7  5 88  0
 0  0  41200  32780   2180 1489556    0    0    24     0 1850  3108  7  4 88  0
 2  0  41200  32844   2180 1489692    0    0   120     0 1915  3842  9  5 85  0
 2  0  41200  26508   2180 1489828    0    0    32   376 1976  3744  8  6 86  0
From the man page:
Procs
  r: The number of processes waiting for run time.
  b: The number of processes in uninterruptible sleep.
Memory
  swpd: the amount of virtual memory used.
  free: the amount of idle memory.
  buff: the amount of memory used as buffers.
  cache: the amount of memory used as cache.
  inact: the amount of inactive memory. (-a option)
  active: the amount of active memory. (-a option)
Swap
  si: Amount of memory swapped in from disk (/s).
  so: Amount of memory swapped to disk (/s).
IO
  bi: Blocks received from a block device (blocks/s).
  bo: Blocks sent to a block device (blocks/s).
System
  in: The number of interrupts per second, including the clock.
  cs: The number of context switches per second.
CPU
  These are percentages of total CPU time.
  us: Time spent running non-kernel code. (user time, including nice time)
  sy: Time spent running kernel code. (system time)
  id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.
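
For a quick summary instead of eyeballing the columns, the id and wa fields can be averaged over a run. A sketch under the assumption that the column layout matches the output above (id in column 15, wa in column 16); vmstat_avg is a made-up name. The first sample is skipped because vmstat's first report gives averages since the last reboot:

```shell
#!/bin/sh
# Average CPU idle and IO-wait over a vmstat run read from stdin.
vmstat_avg() {
    awk '
        $1 ~ /^[0-9]+$/ {                 # data rows only, headers are skipped
            seen++
            if (seen == 1) next           # first report = since-boot averages
            id += $15; wa += $16; n++
        }
        END {
            if (n) printf "samples=%d idle=%.1f%% iowait=%.1f%%\n", n, id / n, wa / n
        }'
}

# Usage:
#   vmstat 1 10 | vmstat_avg
```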


Wednesday, August 13, 2008

Use ethtool or mii-tool to detect problems with ethernet card

[root@s2 adserver]# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
                      100baseT/Half 100baseT/Full
                      1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Full
Advertised auto-negotiation: Yes
Speed: Unknown! (0)
Duplex: Half
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000033 (51)
Link detected: yes
You can also change the interface settings with ethtool.
[root@s2 adserver]# mii-tool
eth0: negotiated 10baseT-FD, link ok


Wednesday, July 16, 2008

Linux CentOS - Kernel panic

This looks like a memory error hit by the sim process. Does anyone have a better clue? The kernel version was 2.6.9-67.0.4.EL; we then rebooted and upgraded to 2.6.9-67.0.20.EL. Any kernel bug I should be aware of?
Jul 13 04:03:13 host syslogd 1.4.1: restart.
Jul 16 08:00:01 host kernel: swap_free: Unused swap offset entry 00010000
Jul 16 08:00:01 host kernel: swap_free: Unused swap offset entry 00010000
Jul 16 08:45:01 host kernel: Unable to handle kernel paging request at virtual address 313a3921
Jul 16 08:45:01 host kernel:  printing eip:
Jul 16 08:45:01 host kernel: c015eebb
Jul 16 08:45:01 host kernel: *pde = 00000000
Jul 16 08:45:01 host kernel: Oops: 0000 [#1]
Jul 16 08:45:01 host kernel: Modules linked in: ip_vs_wrr ip_vs md5 ipv6 ipt_TOS iptable_mangle ip_conntrack_ftp ip_conntrack_irc ipt_REJECT ipt_LOG ipt_limit iptable_filter ipt_multiport ipt_state ip_conntrack ip_tables autofs4 sunrpc dm_mirror dm_mod button battery ac parport_pc parport 8139too mii ext3 jbd
Jul 16 08:45:01 host kernel: CPU:    0
Jul 16 08:45:01 host kernel: EIP:    0060:[]    Not tainted VLI
Jul 16 08:45:01 host kernel: EFLAGS: 00010202   (2.6.9-67.0.4.EL)
Jul 16 08:45:01 host kernel: EIP is at find_vma+0x29/0x4d
Jul 16 08:45:01 host kernel: eax: 313a3919   ebx: 00c8479c   ecx: 313a3931   edx: c97ec6b4
Jul 16 08:45:01 host kernel: esi: de5b40a0   edi: c8929360   ebp: bff08518   esp: c85dcef4
Jul 16 08:45:01 host kernel: ds: 007b   es: 007b   ss: 0068
Jul 16 08:45:01 host kernel: Process sim (pid: 6909, threadinfo=c85dc000 task=c8929360)
Jul 16 08:45:01 host kernel: Stack: de5b40a0 de5b40d0 c011d901 00000000 00c8479c c85dcfc4 c032ebbf 00000007
Jul 16 08:45:01 host kernel:        0000000e 0000000b 00000000 00000000 00000000 00000000 00000000 00030001
Jul 16 08:45:01 host kernel:        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Jul 16 08:45:01 host kernel: Call Trace:
Jul 16 08:45:01 host kernel:  [] do_page_fault+0x114/0x4dc
Jul 16 08:45:01 host kernel:  [] do_page_fault+0x0/0x4dc
Jul 16 08:45:01 host kernel:  [] error_code+0x2f/0x38
Jul 16 08:45:01 host kernel:  [] schedule_tail+0xfd/0x106
Jul 16 08:45:01 host kernel:  [] do_page_fault+0x0/0x4dc
Jul 16 08:45:01 host kernel:  [] error_code+0x2f/0x38
Jul 16 08:45:01 host kernel: Code: 5d c3 56 89 c6 53 89 d3 31 d2 85 c0 74 3c 8b 50 08 85 d2 74 0a 39 5a 08 76 05 39 5a 04 76 2b 8b 4e 04 31 d2 85 c9 74 22 8d 41 e8 <39> 58 08 76 0c 39 58 04 89 c2 76 0c 8b 49 0c eb 03 8b 49 08 85
Jul 16 08:45:01 host kernel:  <0>Fatal exception: panic in 5 seconds
Jul 16 10:12:18 host syslogd 1.4.1: restart.


Monday, January 29, 2007

Mysterious 500 - Internal Server Error

This is a very generic error, but it means there is some critical issue on the server. We ran into it once because our codebase was getting heavier and the default memory_limit=8M in php.ini was no longer enough. Solution: increase this value.


Monday, April 24, 2006

LVS-Tun & ISPs

LVS is a software load balancing solution. It's free, open-source software built directly into the Linux kernel. The director (load balancer) can be in one DC while the real servers are in different DCs. The director only needs good bandwidth; a Pentium 4 or even a P3 is fine since it does Layer 4 switching (less overhead than Layer 7, e.g. HAProxy). Incoming traffic flows Client -> Director -> Worker. Return traffic flows Worker -> Client. As you can see, the director can sustain a much higher throughput since it only handles incoming requests; the return packets come directly from the workers.

We currently manage several LVS setups. One example: 3 directors and 12 real servers in over 5 different DCs across the US and Europe. It's quite easy to set up and manage.

Reference: LVS-Tun is an LVS original. It is based on LVS-DR and has the same high scalability/throughput of LVS-DR. LVS-Tun can be used with realservers that can tunnel (==IPIP encapsulation). The director encapsulates the request packet inside an IPIP packet before sending it to the realserver. The realserver must be able to decapsulate the IPIP packet. Initially only Linux could decapsulate IPIP packets, but recently FreeBSD and W2K can now do it too (hmm 2005, I think Microsoft has dropped support for IPIP). With LVS-DR, the realservers can have almost any OS.

Unlike LVS-DR, with LVS-Tun the realservers can be on a network remote from the director, and can each be on separate networks. Thus the realservers could be in different countries (e.g. a set of ftp mirror sites for a project). If this is the case, the realservers will be generating reply packets with VIP:port->CIP (where port is the LVS'ed service). Not being on the VIP network, the routers for the realservers will have to be programmed to accept outgoing packets with src_addr=VIP:port. Routers normally drop these packets as an anti-spoofing measure. If you aren't in control of the routers, you'll just have to inform the people who are, that packets from VIP:port are valid for your business. If they don't want to help you with your business, then you should find another provider who will. Read more here and here
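
For orientation, the director side of such a setup comes down to a handful of ipvsadm calls. A sketch that only prints the commands for review instead of running them; lvs_tun_commands is a made-up helper, the wrr scheduler and weight 1 are example choices, and all addresses are placeholders from the documentation ranges. The realserver side (VIP on a tunnel device, ARP handling) is not shown:

```shell
#!/bin/sh
# Print (not run) the director-side ipvsadm commands for an LVS-Tun service.
lvs_tun_commands() {
    # $1 = VIP, $2 = port, remaining arguments = realserver IPs
    vip=$1; port=$2; shift 2
    # virtual TCP service with the weighted round-robin scheduler
    echo "ipvsadm -A -t $vip:$port -s wrr"
    for rip in "$@"; do
        # -i selects IPIP tunnelling for this realserver
        echo "ipvsadm -a -t $vip:$port -r $rip:$port -i -w 1"
    done
}

# Example:
#   lvs_tun_commands 192.0.2.10 80 198.51.100.5 203.0.113.7
# Review the output, then run the commands as root on the director.
```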

To detect if the ISPs allow LVS-TUN, follow the tests on this page, more specifically, this test:

realserver# traceroute -s VIRTUAL_IP -n CLIENT_IP
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
Be patient and wait on the director to see something similar to the following
director# tcpdump -ln host CLIENT_IP
tcpdump: listening on eth0
19:20:20.310162 CLIENT_IP > VIRTUAL_IP: icmp: CLIENT_IP udp port 33483 unreachable
19:22:40.639844 CLIENT_IP > VIRTUAL_IP: icmp: CLIENT_IP udp port 33511 unreachable
19:22:45.641061 CLIENT_IP > VIRTUAL_IP: icmp: CLIENT_IP udp port 33512 unreachable
19:23:30.664315 CLIENT_IP > VIRTUAL_IP: icmp: CLIENT_IP udp port 33521 unreachable
If you don't see any response on the director, it may be that the realserver cannot get any packets out to the client because the ISP's router dropped them.

It is very important that ISPs see the demand for LVS-TUN setups and can distinguish them from malicious network attacks. Security is good, but it cannot be so strict or rigid that it leaves no flexibility or room for business growth. If you have experience setting up LVS-TUN with other ISPs or web hosting companies, please let me know so I can add them to the list.

ISPs that support LVS-TUN (allow outgoing spoofed-yet-valid packets from the realservers):

  1. LayeredTech: works at the Savvis building in Dallas; their other DC (DataBank) blocks this. Currently working with LT to unblock. Update: LT is very accommodating to their clients; they excluded our load balancer's IP address from the router filter list.
  2. Hivelocity: blocked but then unblocked, willing to make an exception.
  3. 1paket at Lambdanet in Germany
  4. SoftLayer: custom router setup
  5. WebNX in LA
ISPs that do NOT support LVS-TUN (drop these packets and are not willing to make an exception):
  1. ThePlanet: denied; not willing to make an exception in their network filter for this type of packet, as it is against their AUP
