Tuesday, August 31, 2010

Measuring and monitoring system performance through sysstat

Administering production boxes for a corporation is a daunting task. You have to know what is going on inside those servers, and for that you need good tools. One important package I want to talk about is called "sysstat", which bundles several tools that expose most of the information an administrator needs.

I do not issue any guarantee that this will work for you.

For this article I am using Arch Linux. sysstat does not come with the base installation, so I have to install it separately.

bhaskar@bhaskar-laptop_07:05:26_Tue Aug 31:~> sudo pacman -S sysstat
warning: sysstat-9.0.6.1-1 is up to date -- reinstalling
resolving dependencies...
looking for inter-conflicts...

Targets (1): sysstat-9.0.6.1-1

Total Download Size: 0.00 MB
Total Installed Size: 1.14 MB

Proceed with installation? [Y/n]


Here I answered N (no), because I installed it a long time ago. You can see that below:

bhaskar@bhaskar-laptop_07:08:31_Tue Aug 31:~> sudo pacman -Qi sysstat
Name : sysstat
Version : 9.0.6.1-1
URL : http://pagesperso-orange.fr/sebastien.godard/
Licenses : GPL
Groups : None
Provides : None
Depends On : glibc
Optional Deps : tk: to use isag
gnuplot: to use isag
Required By : None
Conflicts With : None
Replaces : None
Installed Size : 1168.00 K
Packager : Sergej Pupykin
Architecture : i686
Build Date : Mon 01 Mar 2010 03:51:14 AM IST
Install Date : Tue 02 Mar 2010 10:10:31 PM IST
Install Reason : Explicitly installed
Install Script : No
Description : A collection of performance monitoring tools



Here are the files the package installs on the system:

bhaskar@bhaskar-laptop_07:10:35_Tue Aug 31:~> sudo pacman -Ql sysstat
sysstat /etc/
sysstat /etc/cron.daily/
sysstat /etc/cron.daily/sysstat
sysstat /etc/cron.hourly/
sysstat /etc/cron.hourly/sysstat
sysstat /etc/rc.d/
sysstat /etc/rc.d/sysstat
sysstat /etc/sysconfig/
sysstat /etc/sysconfig/sysstat
sysstat /etc/sysconfig/sysstat.ioconf
sysstat /etc/sysstat/
sysstat /etc/sysstat/sysstat
sysstat /usr/
sysstat /usr/bin/
sysstat /usr/bin/iostat
sysstat /usr/bin/isag
sysstat /usr/bin/mpstat
sysstat /usr/bin/pidstat
sysstat /usr/bin/sadf
sysstat /usr/bin/sar
sysstat /usr/lib/
sysstat /usr/lib/sa/
sysstat /usr/lib/sa/sa1
sysstat /usr/lib/sa/sa2
sysstat /usr/lib/sa/sadc
sysstat /usr/share/
sysstat /usr/share/doc/
sysstat /usr/share/doc/sysstat-9.0.6.1/
sysstat /usr/share/doc/sysstat-9.0.6.1/CHANGES
sysstat /usr/share/doc/sysstat-9.0.6.1/COPYING
sysstat /usr/share/doc/sysstat-9.0.6.1/CREDITS
sysstat /usr/share/doc/sysstat-9.0.6.1/FAQ
sysstat /usr/share/doc/sysstat-9.0.6.1/README
sysstat /usr/share/doc/sysstat-9.0.6.1/sysstat-9.0.6.1.lsm
sysstat /usr/share/locale/
sysstat /usr/share/locale/af/
sysstat /usr/share/locale/af/LC_MESSAGES/
sysstat /usr/share/locale/af/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/da/
sysstat /usr/share/locale/da/LC_MESSAGES/
sysstat /usr/share/locale/da/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/de/
sysstat /usr/share/locale/de/LC_MESSAGES/
sysstat /usr/share/locale/de/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/es/
sysstat /usr/share/locale/es/LC_MESSAGES/
sysstat /usr/share/locale/es/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/fi/
sysstat /usr/share/locale/fi/LC_MESSAGES/
sysstat /usr/share/locale/fi/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/fr/
sysstat /usr/share/locale/fr/LC_MESSAGES/
sysstat /usr/share/locale/fr/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/id/
sysstat /usr/share/locale/id/LC_MESSAGES/
sysstat /usr/share/locale/id/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/it/
sysstat /usr/share/locale/it/LC_MESSAGES/
sysstat /usr/share/locale/it/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/ja/
sysstat /usr/share/locale/ja/LC_MESSAGES/
sysstat /usr/share/locale/ja/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/ky/
sysstat /usr/share/locale/ky/LC_MESSAGES/
sysstat /usr/share/locale/ky/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/lv/
sysstat /usr/share/locale/lv/LC_MESSAGES/
sysstat /usr/share/locale/lv/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/mt/
sysstat /usr/share/locale/mt/LC_MESSAGES/
sysstat /usr/share/locale/mt/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/nb/
sysstat /usr/share/locale/nb/LC_MESSAGES/
sysstat /usr/share/locale/nb/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/nl/
sysstat /usr/share/locale/nl/LC_MESSAGES/
sysstat /usr/share/locale/nl/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/nn/
sysstat /usr/share/locale/nn/LC_MESSAGES/
sysstat /usr/share/locale/nn/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/pl/
sysstat /usr/share/locale/pl/LC_MESSAGES/
sysstat /usr/share/locale/pl/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/pt/
sysstat /usr/share/locale/pt/LC_MESSAGES/
sysstat /usr/share/locale/pt/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/pt_BR/
sysstat /usr/share/locale/pt_BR/LC_MESSAGES/
sysstat /usr/share/locale/pt_BR/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/ro/
sysstat /usr/share/locale/ro/LC_MESSAGES/
sysstat /usr/share/locale/ro/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/ru/
sysstat /usr/share/locale/ru/LC_MESSAGES/
sysstat /usr/share/locale/ru/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/sk/
sysstat /usr/share/locale/sk/LC_MESSAGES/
sysstat /usr/share/locale/sk/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/sv/
sysstat /usr/share/locale/sv/LC_MESSAGES/
sysstat /usr/share/locale/sv/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/vi/
sysstat /usr/share/locale/vi/LC_MESSAGES/
sysstat /usr/share/locale/vi/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/zh_CN/
sysstat /usr/share/locale/zh_CN/LC_MESSAGES/
sysstat /usr/share/locale/zh_CN/LC_MESSAGES/sysstat.mo
sysstat /usr/share/locale/zh_TW/
sysstat /usr/share/locale/zh_TW/LC_MESSAGES/
sysstat /usr/share/locale/zh_TW/LC_MESSAGES/sysstat.mo
sysstat /usr/share/man/
sysstat /usr/share/man/man1/
sysstat /usr/share/man/man1/iostat.1.gz
sysstat /usr/share/man/man1/isag.1.gz
sysstat /usr/share/man/man1/mpstat.1.gz
sysstat /usr/share/man/man1/pidstat.1.gz
sysstat /usr/share/man/man1/sadf.1.gz
sysstat /usr/share/man/man1/sar.1.gz
sysstat /usr/share/man/man8/
sysstat /usr/share/man/man8/sa1.8.gz
sysstat /usr/share/man/man8/sa2.8.gz
sysstat /usr/share/man/man8/sadc.8.gz
sysstat /var/
sysstat /var/log/
sysstat /var/log/sa/


The package drops a script into /etc/cron.daily so that a summary is generated once a day, although you can change the schedule to suit yourself.

bhaskar@bhaskar-laptop_07:13:35_Tue Aug 31:/etc/cron.daily> cat sysstat
#!/bin/sh
# Generate a daily summary of process accounting. Since this will probably
# get kicked off in the morning, it would probably be better to run against
# the previous days data.
/usr/lib/sa/sa2 -A &


This package comes with many binaries, and all of them are very useful tools. I will explain them one by one. The first tool is called sar, and its output looks like this:

bhaskar@bhaskar-laptop_07:15:59_Tue Aug 31:~> sudo sar
Password:
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

06:27:36 AM LINUX RESTART

06:28:02 AM CPU %user %nice %system %iowait %steal %idle
06:38:02 AM all 8.57 0.00 1.80 6.59 0.00 83.04
06:48:02 AM all 22.57 0.00 5.11 5.40 0.00 66.92
06:58:02 AM all 16.56 0.00 5.58 3.81 0.00 74.05
07:08:02 AM all 7.46 0.00 2.79 4.52 0.00 85.23
Average: all 13.79 0.00 3.82 5.08 0.00 77.31


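A bit of explanation is required for the fields it shows, so I have listed them below.

sar is the system activity reporter.

%user is CPU time spent running user-space programs, such as MySQL or Apache, at normal priority; %nice is the same for programs running with a modified (nice) priority.
%system refers to time spent in the kernel.
%iowait is time the CPU sat idle while waiting for input/output, such as a disk read or write.
%steal is time a virtual CPU waited while the hypervisor served another guest; it stays at 0.00 on a physical machine like this one.
Finally, since the kernel accounts for 100% of the runnable time it can schedule, any unused time goes into %idle.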
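
As a rough illustration of reading these columns, the sketch below picks out the intervals where %iowait went above a chosen limit. The function name high_iowait and the limit of 5 are my own choices for illustration, and the field positions assume the 12-hour time layout shown in the output above; a 24-hour locale would shift every column by one.

```shell
#!/bin/sh
# high_iowait LIMIT: read sar CPU output on stdin, print intervals above LIMIT.
# Assumed layout: time, AM/PM, CPU, %user, %nice, %system, %iowait, %steal, %idle
high_iowait() {
    awk -v limit="$1" '$3 == "all" && $7 + 0 > limit + 0 { print $1, $2, $7 }'
}

# Sample rows copied from the sar output above.
sample='06:28:02 AM CPU %user %nice %system %iowait %steal %idle
06:38:02 AM all 8.57 0.00 1.80 6.59 0.00 83.04
06:48:02 AM all 22.57 0.00 5.11 5.40 0.00 66.92
06:58:02 AM all 16.56 0.00 5.58 3.81 0.00 74.05
07:08:02 AM all 7.46 0.00 2.79 4.52 0.00 85.23
Average: all 13.79 0.00 3.82 5.08 0.00 77.31'

printf '%s\n' "$sample" | high_iowait 5
```

On a live system the same function could be fed directly, for example with sar | high_iowait 5.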

It comes with two more programs related to sar, called sa1 and sa2. What do these fellows do for sar?

The sa1 script collects the data and stores it in sysstat's binary log file format, and sa2 writes a daily report from it in human-readable form.

The data files and reports are kept in the directory /var/log/sa, with the day of the month appended to the file name (saDD for the binary data, sarDD for the text report):

bhaskar@bhaskar-laptop_07:25:28_Tue Aug 31:/var/log/sa> ls
sa23 sa24 sa25 sa30 sa31 sar24 sar30
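
The naming scheme makes it easy to pick a particular day. Here is a small sketch, assuming the /var/log/sa location shown above (other distributions may use a different directory) and a helper name, sa_file, that I made up:

```shell
#!/bin/sh
# sa_file DD: path of the binary data file for day-of-month DD (two digits).
sa_file() {
    printf '/var/log/sa/sa%s\n' "$1"
}

today=$(sa_file "$(date +%d)")
echo "today's data file: $today"

# Replay memory statistics from the 30th, if sar and the file are available.
f=$(sa_file 30)
if command -v sar >/dev/null 2>&1 && [ -r "$f" ]; then
    sar -r -f "$f" || echo "could not read $f" >&2
fi
```

The -f option tells sar to read from the given data file instead of today's.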


The "-W" flag to sar shows swap-related activity on the system:

bhaskar@bhaskar-laptop_07:31:09_Tue Aug 31:~> sudo sar -W
Password:
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

06:27:36 AM LINUX RESTART

06:28:02 AM pswpin/s pswpout/s
06:38:02 AM 0.00 0.00
06:48:02 AM 0.00 0.00
06:58:02 AM 0.03 1.67
07:08:02 AM 0.01 3.20
07:18:02 AM 0.13 4.02
07:28:01 AM 0.12 0.36
Average: 0.05 1.54



The "-r" option to sar shows memory-related statistics for the system:

bhaskar@bhaskar-laptop_07:37:50_Tue Aug 31:~> sudo sar -r
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

06:27:36 AM LINUX RESTART

06:28:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
06:38:02 AM 204468 815896 79.96 31784 399596 1412880 43.21
06:48:02 AM 62120 958244 93.91 61984 407388 1566932 47.93
06:58:02 AM 40344 980020 96.05 93368 348428 1622948 49.64
07:08:02 AM 27240 993124 97.33 126952 312652 1576108 48.21
07:18:02 AM 72528 947836 92.89 96988 334604 1454800 44.50
07:28:01 AM 58732 961632 94.24 95268 344540 1486996 45.48
Average: 77572 942792 92.40 84391 357868 1520111 46.49
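
In this output kbmemfree and kbmemused always add up to the same total (1020364 kB), which suggests that kbmemused here includes buffers and page cache. Under that assumption, a rough figure for the memory actually held by applications is kbmemused - kbbuffers - kbcached. The sketch below does that arithmetic on two rows from the output above; the function name app_mem is hypothetical and the column positions assume the 12-hour time format.

```shell
#!/bin/sh
# app_mem: read "sar -r" output on stdin, print time and kbmemused minus buffers and cache.
# Assumed layout: time, AM/PM, kbmemfree, kbmemused, %memused, kbbuffers, kbcached, ...
app_mem() {
    awk '$2 ~ /^(AM|PM)$/ && $3 ~ /^[0-9]+$/ { print $1, $2, $4 - $6 - $7 }'
}

# Header and two data rows copied from the sar -r output above.
sample='06:28:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
06:38:02 AM 204468 815896 79.96 31784 399596 1412880 43.21
07:08:02 AM 27240 993124 97.33 126952 312652 1576108 48.21'

printf '%s\n' "$sample" | app_mem
```

So although %memused reads 97.33 at 07:08, only about 553520 kB of it is not buffers or cache by this estimate.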


The "-b" option shows I/O and transfer rate statistics (paging statistics are under "-B"):

bhaskar@bhaskar-laptop_07:41:19_Tue Aug 31:~> sudo sar -b
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

06:27:36 AM LINUX RESTART

06:28:02 AM tps rtps wtps bread/s bwrtn/s
06:38:02 AM 25.49 19.80 5.69 951.22 108.95
06:48:02 AM 35.10 20.35 14.76 259.20 337.45
06:58:02 AM 24.73 16.87 7.86 159.71 230.64
07:08:02 AM 53.55 42.64 10.91 356.85 273.27
07:18:02 AM 61.84 53.96 7.88 477.19 293.56
07:28:01 AM 5.12 1.15 3.96 49.08 74.69
07:38:01 AM 5.75 2.34 3.41 90.68 61.61
Average: 30.23 22.45 7.78 334.87 197.18



You can also import these reports into a spreadsheet such as Excel to work with them; the original package author's page shows how to do it. Kindly visit that page to see the options.

OK, now let's talk about another binary that comes with the package, called "pidstat". What does it do?

The pidstat command is used to monitor processes and threads currently being managed by the Linux kernel. It can also monitor the children of those processes and threads.

On my system it shows what is going on underneath:

bhaskar@bhaskar-laptop_07:41:22_Tue Aug 31:~> sudo pidstat -d 2
Password:
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

07:46:56 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:46:58 AM 990 0.00 11.82 0.00 kjournald
07:46:58 AM 3948 0.00 1.97 1.97 plugin-containe

07:46:58 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command

07:47:00 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:02 AM 990 0.00 6.00 0.00 kjournald
07:47:02 AM 3948 0.00 2.00 2.00 plugin-containe

07:47:02 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:04 AM 3835 0.00 2.00 2.00 firefox

07:47:04 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command

07:47:06 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:08 AM 990 0.00 8.00 0.00 kjournald
07:47:08 AM 3948 0.00 2.00 2.00 plugin-containe

07:47:08 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command

07:47:10 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:12 AM 990 0.00 12.00 0.00 kjournald
07:47:12 AM 3948 0.00 2.00 2.00 plugin-containe
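
To see who wrote the most over a run like this, the per-interval rates can be added up per command. This is only a sketch: the function name write_totals is made up, the kB figure is approximated as rate times interval length, and the columns assume the 12-hour layout of the pidstat -d output above.

```shell
#!/bin/sh
# write_totals SECS: read "pidstat -d" output on stdin and print the
# approximate kB written per command (kB_wr/s times SECS), largest first.
write_totals() {
    awk -v secs="$1" '
        $3 ~ /^[0-9]+$/ { total[$7] += $5 * secs }
        END { for (cmd in total) printf "%.2f %s\n", total[cmd], cmd }
    ' | LC_ALL=C sort -rn
}

# The first few intervals from the output above.
sample='07:46:56 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:46:58 AM 990 0.00 11.82 0.00 kjournald
07:46:58 AM 3948 0.00 1.97 1.97 plugin-containe
07:47:00 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:02 AM 990 0.00 6.00 0.00 kjournald
07:47:02 AM 3948 0.00 2.00 2.00 plugin-containe
07:47:02 AM PID kB_rd/s kB_wr/s kB_ccwr/s Command
07:47:04 AM 3835 0.00 2.00 2.00 firefox'

printf '%s\n' "$sample" | write_totals 2
```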


The "-d" option provides I/O statistics; the trailing 2 asks for a new report every two seconds.

Now let's get the memory utilisation statistics through the "-r" flag of this binary.

bhaskar@bhaskar-laptop_07:50:18_Tue Aug 31:~> sudo pidstat -r
Linux 2.6.34-ARCH (bhaskar-laptop) 08/31/2010 _i686_ (2 CPU)

07:51:41 AM PID minflt/s majflt/s VSZ RSS %MEM Command
07:51:41 AM 1 0.45 0.00 1752 620 0.06 init
07:51:41 AM 1023 0.54 0.00 2148 948 0.09 udevd
07:51:41 AM 2772 0.02 0.00 5080 428 0.04 syslog-ng
07:51:41 AM 2773 0.09 0.00 5396 1620 0.16 syslog-ng
07:51:41 AM 2804 0.03 0.00 3420 700 0.07 ntpd
07:51:41 AM 2805 0.25 0.00 8232 3848 0.38 named
07:51:41 AM 2824 0.04 0.00 2372 760 0.07 xinetd
07:51:41 AM 2834 0.23 0.00 18672 8492 0.83 httpd
07:51:41 AM 2874 0.05 0.00 18724 6760 0.66 httpd
07:51:41 AM 2875 0.15 0.00 18688 7660 0.75 httpd
07:51:41 AM 2876 0.15 0.00 18688 7660 0.75 httpd
07:51:41 AM 2877 0.14 0.00 18824 7844 0.77 httpd
07:51:41 AM 2878 0.15 0.00 18688 7660 0.75 httpd
07:51:41 AM 2879 0.15 0.00 18688 7660 0.75 httpd
07:51:41 AM 2917 0.13 0.00 8468 1888 0.19 master
07:51:41 AM 2931 0.12 0.00 8540 1760 0.17 pickup
07:51:41 AM 2932 0.12 0.00 8592 1776 0.17 qmgr
07:51:41 AM 2936 0.04 0.00 1800 628 0.06 crond
07:51:41 AM 2946 0.16 0.00 2788 1216 0.12 dbus-daemon
07:51:41 AM 2954 0.60 0.00 15052 3608 0.35 hald
07:51:41 AM 2955 0.19 0.00 3516 1172 0.11 hald-runner
07:51:41 AM 2984 0.08 0.00 3580 992 0.10 hald-addon-inpu
07:51:41 AM 2986 0.08 0.00 3580 988 0.10 hald-addon-rfki
07:51:41 AM 2987 0.08 0.00 3580 984 0.10 hald-addon-leds
07:51:41 AM 2996 0.07 0.00 3576 992 0.10 hald-addon-gene
07:51:41 AM 2998 0.08 0.00 3580 1000 0.10 hald-addon-stor
07:51:41 AM 3008 0.08 0.00 3244 1012 0.10 hald-addon-acpi
07:51:41 AM 3023 0.03 0.00 6216 688 0.07 rpcbind
07:51:41 AM 3027 0.02 0.00 3148 308 0.03 famd
07:51:41 AM 3036 0.02 0.00 1960 356 0.03 gpm
07:51:41 AM 3044 0.24 0.00 13912 2152 0.21 gdm-binary
07:51:41 AM 3060 3.17 0.04 143100 19616 1.92 dropbox
07:51:41 AM 3072 0.04 0.00 1752 536 0.05 agetty
07:51:41 AM 3073 0.04 0.00 1752 540 0.05 agetty
07:51:41 AM 3074 0.04 0.00 1752 540 0.05 agetty
07:51:41 AM 3075 0.04 0.00 1752 536 0.05 agetty
07:51:41 AM 3076 0.04 0.00 1752 532 0.05 agetty
07:51:41 AM 3077 0.04 0.00 1752 536 0.05 agetty
07:51:41 AM 3132 0.25 0.00 17028 2804 0.27 gdm-simple-slav
07:51:41 AM 3155 162.21 0.01 82076 35784 3.51 Xorg
07:51:41 AM 3194 0.28 0.00 18224 2236 0.22 console-kit-dae
07:51:41 AM 3195 0.03 0.00 3528 644 0.06 ntpd
07:51:41 AM 3272 0.03 0.00 3172 332 0.03 dbus-launch
07:51:41 AM 3277 0.24 0.00 5448 2540 0.25 upowerd
07:51:41 AM 3331 0.42 0.00 17664 5400 0.53 polkit-gnome-au
07:51:41 AM 3335 0.25 0.00 5640 2672 0.26 polkitd
07:51:41 AM 3336 0.16 0.00 14888 2064 0.20 gdm-session-wor
07:51:41 AM 3349 0.11 0.00 22176 1432 0.14 gnome-keyring-d
07:51:41 AM 3367 0.73 0.00 25064 5380 0.53 gnome-session
07:51:41 AM 3385 0.03 0.00 3172 332 0.03 dbus-launch
07:51:41 AM 3386 0.09 0.00 2616 1116 0.11 dbus-daemon
07:51:41 AM 3388 0.02 0.00 3540 224 0.02 ssh-agent
07:51:41 AM 3391 0.18 0.00 6776 2580 0.25 gconfd-2
07:51:41 AM 3396 0.69 0.01 22052 7000 0.69 gnome-settings-
07:51:41 AM 3401 0.13 0.00 6352 1708 0.17 gvfsd
07:51:41 AM 3404 0.79 0.00 55332 12088 1.18 metacity
07:51:41 AM 3405 0.90 0.01 42768 11040 1.08 gnome-panel
07:51:41 AM 3407 0.21 0.00 7932 2764 0.27 gvfs-gdu-volume
07:51:41 AM 3409 0.25 0.00 13512 2556 0.25 udisks-daemon
07:51:41 AM 3410 0.03 0.00 5000 512 0.05 udisks-daemon
07:51:41 AM 3425 0.11 0.00 38324 1492 0.15 gvfs-fuse-daemo
07:51:41 AM 3429 1.24 0.02 53156 8800 0.86 nautilus
07:51:41 AM 3431 0.25 0.00 41896 2352 0.23 bonobo-activati
07:51:41 AM 3442 27.99 0.00 20928 6888 0.68 multiload-apple
07:51:41 AM 3445 0.44 0.01 21036 7000 0.69 battstat-applet
07:51:41 AM 3446 0.67 0.00 50716 11744 1.15 gweather-applet
07:51:41 AM 3448 0.74 0.01 41420 11764 1.15 clock-applet
07:51:41 AM 3450 0.70 0.00 41736 11532 1.13 wnck-applet
07:51:41 AM 3451 0.42 0.00 20500 6644 0.65 notification-ar
07:51:41 AM 3452 0.29 0.00 16120 4172 0.41 polkit-gnome-au
07:51:41 AM 3454 0.35 0.00 17840 5308 0.52 gdu-notificatio
07:51:41 AM 3480 0.08 0.00 17236 1460 0.14 gnome-screensav
07:51:41 AM 3502 0.17 0.00 6856 2396 0.23 gvfsd-trash
07:51:41 AM 3509 0.18 0.00 4828 1888 0.19 system-tools-ba
07:51:41 AM 3516 0.13 0.00 6484 1740 0.17 gvfsd-burn
07:51:41 AM 3522 0.76 0.00 12724 10152 0.99 SystemToolsBack
07:51:41 AM 3528 0.76 0.01 47896 12500 1.23 gnome-terminal
07:51:41 AM 3529 0.05 0.00 1796 580 0.06 gnome-pty-helpe
07:51:41 AM 3531 0.15 0.00 4924 1884 0.18 bash
07:51:41 AM 3538 0.07 0.00 4412 844 0.08 screen
07:51:41 AM 3539 0.13 0.00 4676 1372 0.13 screen
07:51:41 AM 3540 0.16 0.00 4924 1892 0.19 bash
07:51:41 AM 3546 0.14 0.00 4000 900 0.09 su
07:51:41 AM 3547 0.20 0.00 4900 1876 0.18 bash
07:51:41 AM 3577 0.04 0.00 2144 844 0.08 udevd
07:51:41 AM 3578 0.03 0.00 2144 864 0.08 udevd
07:51:41 AM 3611 0.16 0.00 4924 1872 0.18 bash
07:51:41 AM 3617 0.26 0.01 4884 1796 0.18 bash
07:51:41 AM 3628 0.16 0.00 4924 1876 0.18 bash
07:51:41 AM 3635 0.21 0.01 5136 1164 0.11 bash
07:51:41 AM 3655 0.21 0.00 4924 1920 0.19 bash
07:51:41 AM 3674 0.14 0.00 4000 904 0.09 su
07:51:41 AM 3675 0.16 0.00 4900 1868 0.18 bash
07:51:41 AM 3684 0.27 0.01 5552 2740 0.27 bash
07:51:41 AM 3701 1.87 0.00 4924 1956 0.19 bash
07:51:41 AM 3708 0.13 0.00 4604 1372 0.13 thunderbird
07:51:41 AM 3721 0.10 0.00 4736 1420 0.14 run-mozilla.sh
07:51:41 AM 3725 8.07 0.08 272920 81716 8.01 thunderbird-bin
07:51:41 AM 3835 234.27 0.12 749812 283212 27.76 firefox
07:51:41 AM 3848 0.01 0.00 4608 448 0.04 firefox_cpu_lim
07:51:41 AM 3849 9.11 0.00 1916 588 0.06 cpulimit
07:51:41 AM 3948 9.40 0.03 112288 36312 3.56 plugin-containe
07:51:41 AM 3956 0.13 0.00 18688 7656 0.75 httpd
07:51:41 AM 3965 0.70 0.00 41128 10936 1.07 notification-da
07:51:41 AM 5130 0.13 0.00 1780 636 0.06 sadc
07:51:41 AM 5859 0.19 0.00 3656 812 0.08 pidstat
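
With a listing this long it helps to sort by resident memory. A minimal sketch, assuming the 12-hour column layout above (RSS in the seventh field, command name in the ninth); top_rss is a name I picked for illustration:

```shell
#!/bin/sh
# top_rss N: read "pidstat -r" output on stdin, print the N largest RSS values (kB) with command names.
top_rss() {
    awk '$3 ~ /^[0-9]+$/ { print $7, $9 }' | LC_ALL=C sort -k1,1nr | head -n "$1"
}

# A handful of rows copied from the output above.
sample='07:51:41 AM PID minflt/s majflt/s VSZ RSS %MEM Command
07:51:41 AM 2834 0.23 0.00 18672 8492 0.83 httpd
07:51:41 AM 3155 162.21 0.01 82076 35784 3.51 Xorg
07:51:41 AM 3725 8.07 0.08 272920 81716 8.01 thunderbird-bin
07:51:41 AM 3835 234.27 0.12 749812 283212 27.76 firefox
07:51:41 AM 3948 9.40 0.03 112288 36312 3.56 plugin-containe'

printf '%s\n' "$sample" | top_rss 3
```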


Now let's talk about two very important tools that give you different ways to view the data. They are called "sadc" and "sadf". I will cover them one after another below.

SADC:
It is the system activity data collector. It is normally run in the background as the backend to the sar command, but you can also run it manually. It writes the statistics it collects in binary format, one file per day, to /var/log/sa/saDD, where DD stands for the day of the month. As the man page says, it only reports local activity, meaning activity on the host where it runs.

I am putting a few examples here straight out of the manual page for clear understanding. Here we go:

/usr/lib/sa/sadc 1 10 /tmp/datafile
Write 10 records of one second intervals to the /tmp/datafile binary file.

/usr/lib/sa/sadc -C Backup_Start /tmp/datafile
Insert the comment Backup_Start into the file /tmp/datafile.


So let's move on to the next tool, called sadf.

SADF:
This tool displays the data collected by sar in different formats, which is wonderful, because you can feed your data into various other programs to get a lot more information out of it. It can produce XML as well as delimiter-separated output that is easy to load into a database or spreadsheet.

Once again I am putting examples straight out of the manual page for easy understanding. Here we go:

sadf -d /var/log/sa/sa21 -- -r -n DEV
Extract memory, swap space and network statistics from system activity file 'sa21', and display them in a format that can be ingested by a
database.

sadf -p -P 1
Extract CPU statistics for processor 1 (the second processor) from current daily data file, and display them in a format that can easily be
handled by a pattern processing command.
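
As far as I recall, sadf -d prints semicolon-separated rows preceded by a header line that starts with "#". Assuming that shape, the sketch below pulls one named column out as timestamp,value pairs ready for a spreadsheet. The function name sadf_column is made up, and the sample rows are hand-written in that assumed format rather than captured from a real run.

```shell
#!/bin/sh
# sadf_column NAME: read "sadf -d" style output on stdin, print "timestamp,value" for column NAME.
# Assumes a header such as "# hostname;interval;timestamp;CPU;%user;..." and the timestamp in field 3.
sadf_column() {
    awk -F';' -v want="$1" '
        /^#/ {
            sub(/^# */, "", $1)                       # drop the leading "# " from the header
            for (i = 1; i <= NF; i++) if ($i == want) col = i
            next
        }
        col { print $3 "," $col }
    '
}

# Hand-written sample in the assumed format (not real captured output).
sample='# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
bhaskar-laptop;600;2010-08-31 01:08:02 UTC;-1;8.57;0.00;1.80;6.59;0.00;83.04
bhaskar-laptop;600;2010-08-31 01:18:02 UTC;-1;22.57;0.00;5.11;5.40;0.00;66.92'

printf '%s\n' "$sample" | sadf_column %idle
```

Looking the column up by name in the header, rather than hard-coding a position, keeps the helper usable if the set of columns differs between sysstat versions.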



Hope this will help.

Cheers!
Bhaskar

Thursday, August 19, 2010

Few tricks and info about sudo

In a multi-admin environment, where more than one administrator controls the servers (as is often the case in big corporations), you need a mechanism that keeps the admins from stepping on each other's work and keeps track of who is running what. Sudo is that kind of tool, and it is quite indispensable in a multi-admin production environment.

I do not issue any guarantee that this will work for you.

Most GNU/Linux distributions come with sudo. If yours does not, please install it through the distribution's package manager; it should be in the repository.

Once it is installed, its configuration file is placed in /etc and is named sudoers. You need to edit it according to your requirements to get things going with this tool.

The tool for editing that file is called "visudo". It opens the file in an editor (vi by default) with a lock, meaning that while someone is editing, nobody else is allowed to change it, and it checks the syntax before saving. Clear? Right.

You need to call it like this:

root@bhaskar-laptop_08:37:05_Thu Aug 19:/home/bhaskar # visudo

and the file /etc/sudoers should open in the editor. You are actually editing a temporary copy, which is put in place only when you save, and the file stays locked in the meantime.

OK, now we need to visit a few entries for the sake of clarity about how it works. So here we go:

Suppose we want to allow specific users to use sudo on specific hosts. Did I confuse you with that last statement? Don't worry, I will explain it in detail. Read on:

A typical full-access entry reads jim ALL=(ALL) ALL. Reading the man page can easily leave you quite confused as to what those three ALLs mean. The first ALL refers to machines; the assumption is that this is a network-wide sudoers file. In the case of this machine (lnxserve) we could restrict jim like this:

jim lnxserve= /bin/kill, /usr/sbin/jim/

Let me explain: on the host named "lnxserve" there is a user called "jim", and he is entitled to run what is listed on the right side of the "=": /bin/kill and any command in the /usr/sbin/jim/ directory.

So what was the (ALL) for? Well, here's a clue:

jim lnxserve=(paul,linda) /bin/kill, /usr/sbin/jim/

Yes, this line brings another twist to the previous one. It says that on the machine called "lnxserve" the user "jim" is able to run the listed commands as paul or linda.
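
Before a rule like this goes anywhere near the real file, it can be syntax-checked on its own. The sketch below writes the rule to a scratch file (a temporary path of my choosing, not /etc/sudoers) and runs visudo in check-only mode against it; it assumes visudo is on the PATH and skips the check otherwise.

```shell
#!/bin/sh
# Stage the example rule in a scratch file and syntax-check it with visudo.
rules_file=$(mktemp)
trap 'rm -f "$rules_file"' EXIT

cat > "$rules_file" <<'EOF'
jim lnxserve=(paul,linda) /bin/kill, /usr/sbin/jim/
EOF

# -c checks syntax only; -f points visudo at a file other than /etc/sudoers.
if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$rules_file" || echo "syntax check failed for $rules_file" >&2
else
    echo "visudo not found; skipping syntax check" >&2
fi
```

A check like this only validates the grammar; whether the rule grants what you intended still has to be verified by hand, for instance with sudo -l as the user in question.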

That says that jim can (using sudo -u) run commands as paul or linda. Yes, this is sometimes necessary for various reasons in a production environment. I am not going into the details, because that might take another whole article.

This is perfect for giving jim the power to kill paul's or linda's processes without giving him anything else. There is one thing we need to add though: if we just left it like this, jim would be forced to type sudo -u paul or sudo -u linda every time. We can add a runas_default:

Defaults:jim timestamp_timeout=-1, env_delete+=BOOP, runas_default=linda

So jim now runs commands as linda by default. I am going to put some lines straight out of the man page for clarity:

To get a file listing of an unreadable directory:

$ sudo ls /usr/local/protected

To list the home directory of user yaz on a machine where the file system holding ~yaz is not exported as root:

$ sudo -u yaz ls ~yaz

To edit the index.html file as user www:

$ sudo -u www vi ~www/htdocs/index.html

To view system logs only accessible to root and users in the adm group:

$ sudo -g adm view /var/log/syslog

To run an editor as jim with a different primary group:

$ sudo -u jim -g audio vi ~jim/sound.txt

To shutdown a machine:

$ sudo shutdown -r +15 "quick reboot"

To make a usage listing of the directories in the /home partition. Note that this runs the commands in a sub-shell to make the cd and file redirection
work.

$ sudo sh -c "cd /home ; du -s * | sort -rn > USAGE"


Hope this will help.

Cheers!
Bhaskar

Monday, August 16, 2010

How to properly run fsck on (/) root or other partitions including LVM

Dealing with low-level things in the server architecture is an important issue. As a GNU/Linux administrator/NOC/Ops person you need a clear-cut understanding of what you are doing, because handling a production box requires a lot of common sense and in-depth knowledge of the platform/OS.

So without much ado let's play with it, or rather let me show you some simple tricks.

I do not issue any guarantee that this will work for you.

So the first question that comes to mind is: why on earth do you need to check the filesystem, especially the root (/) part of it? It sounds pretty dull and boring, but please don't ignore this. You know ignorance is a sin, so do not commit it.

A filesystem can get corrupted in various ways. A few common ones are:

1) The server was not shut down properly (although in most cases journaling will do the healing)

2) A sudden power cut brought your system down with a lot of processing going on

3) Somebody has done something special (in the bad sense) to corrupt the data on that particular partition.

It is a bad idea, and not recommended, to run fsck (yes, this is the built-in tool you need to use) on a mounted partition or drive. So don't do that.
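
A quick guard against that mistake is to look the device up in the mount table first. This is a minimal sketch: the helper name is_mounted is mine, it reads /proc/mounts by default, and the demonstration runs against a hand-written table so that it does not depend on the machine it is run on.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE is listed as a mount source.
# MOUNTS_FILE defaults to /proc/mounts; devices mounted under another name
# (UUID, /dev/mapper/...) are not detected by this simple comparison.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Demonstration against a hand-written mounts table (hypothetical contents).
mounts=$(mktemp)
trap 'rm -f "$mounts"' EXIT
printf '%s\n' '/dev/sda1 / ext3 rw 0 0' '/dev/sda2 /home ext3 rw 0 0' > "$mounts"

if is_mounted /dev/sda2 "$mounts"; then
    echo "/dev/sda2 is mounted: unmount it before running fsck"
else
    echo "/dev/sda2 is not mounted: safe to run fsck"
fi
```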

Now, running fsck on other partitions like /home, /var, /usr ...

The first and foremost thing to do is get into single user mode. How do you do that?

OK, once you type init 1 at the terminal prompt you will be taken to single user mode. From there simply unmount the partition as shown below:

root@bhaskar-laptop_08:16:36_Mon Aug 16:/home/bhaskar # init 1 ---> this will bring you to single user mode

root@bhaskar-laptop_08:16:36_Mon Aug 16:/home/bhaskar # umount /dev/sda2 ---> assuming this partition holds the /home section.

Now run the fsck:


root@bhaskar-laptop_08:18:02_Mon Aug 16:/home/bhaskar # fsck -yfv /dev/sda2


OK, let me explain the flags or switches I passed to fsck.

y------> it answers "yes" to every question, so any filesystem corruption that is detected gets fixed without manual intervention.

f-----------> this forces a check even if the filesystem is marked clean.

v--------> it gives you a verbose account on the terminal of what it is going through.
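
When the run finishes, the exit status of fsck says how it went. According to the fsck man page the status is a sum of bit flags, and the sketch below decodes it into words; the function name explain_fsck_status and the exact wording of the messages are mine.

```shell
#!/bin/sh
# explain_fsck_status CODE: turn an fsck exit status (a sum of bit flags) into text.
explain_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in "1:errors corrected" "2:reboot recommended" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}      # the number before the colon
        text=${pair#*:}      # the description after it
        if [ $((code & bit)) -ne 0 ]; then
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

explain_fsck_status 5
```

In practice it would be called right after the check, for example: fsck -yfv /dev/sda2; explain_fsck_status $?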

Now we have a major problem on our hands: we find out that the root (/) partition got corrupted for some reason. So we need to fix that issue to get the system back on track as soon as possible.

For this kind of problem it matters that you just cannot run fsck on a mounted filesystem, as I said earlier, because it can corrupt the data on it. So we need an installation CD/DVD to come to our rescue. The first installation CD/DVD will do the job, or you can get a SystemRescueCd for it.

Once you boot from one of those discs, type the text below at the prompt it presents:

#linux rescue nomount

Once you fire that one you land at a shell prompt, so you can begin work. The first thing we need to do is fire a mknod command. Now ask me why we need to do that?

Because we passed the nomount option in the last step, the rescue environment does not mount any filesystem or create any device files to operate on. If you try to run fsck now it will fail.

So to run fsck correctly on a filesystem we need to create a device file for it, and for that we need to run mknod. But to use mknod we need to know the major number and minor number of the device. Let's get those numbers... but wait, before that I need to tell you a few things about what the major and minor numbers of a device are and what they signify.

What are the major number and minor number?

Traditionally, the major number identifies the driver associated with the device. For example, /dev/null and /dev/zero are both managed by driver 1, whereas virtual consoles and serial terminals are managed by driver 4; similarly, both vcs1 and vcsa1 devices are managed by driver 7. Modern Linux kernels allow multiple drivers to share major numbers, but most devices that you will see are still organized on the one-major-one-driver principle.

The minor number is used by the kernel to determine exactly which device is being referred to. Depending on how your driver is written, you can either get a direct pointer to your device from the kernel, or you can use the minor number yourself as an index into a local array of devices. Either way, the kernel itself knows almost nothing about minor numbers beyond the fact that they refer to devices implemented by your driver.

So it's clear? Right. Let's move on: we need to find out the major number and minor number of the device to run mknod:

root@bhaskar-laptop_08:42:30_Mon Aug 16:/home/bhaskar # ls -al /dev/sda
brw-rw---- 1 root disk 8, 0 Aug 16 07:15 /dev/sda

You see it looks like this: the two comma-separated numbers just before the date (8, 0) are the major number and the minor number. Now create the device file:

#mknod /dev/sda b 8 0

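It will create the device file. Once that's done you are safe to run fsck on the particular partition holding your root (/) filesystem. Note that each partition needs its own device file with its own minor number; /dev/sda1, for example, is normally major 8, minor 1.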
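
If you noted down the ls -l line from the healthy system beforehand, the matching mknod command can be derived from it mechanically. The sketch below only prints the command rather than running it; the function name mknod_cmd_from_ls is invented, and it assumes the usual ls -l layout where the major number (with a trailing comma) is the fifth field and the minor number the sixth.

```shell
#!/bin/sh
# mknod_cmd_from_ls: read one "ls -l" line for a device node on stdin
# and print the mknod command that would recreate it.
mknod_cmd_from_ls() {
    awk '{
        type = substr($1, 1, 1)            # b = block device, c = character device
        major = $5; sub(/,$/, "", major)   # strip the comma after the major number
        minor = $6
        print "mknod", $NF, type, major, minor
    }'
}

# The line shown in the ls output above.
printf '%s\n' 'brw-rw---- 1 root disk 8, 0 Aug 16 07:15 /dev/sda' | mknod_cmd_from_ls
```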

#fsck -yfv /dev/YourRootPartition(sda,hda,....)

Now let's have some fun with LVM.

We need a few tools to work with that kind of partition. They are provided by the lvm package of the OS, or come built into most rescue CDs.

We need to find out the physical volume, volume group and logical volume where we are going to run fsck, right?

pvscan: scan the disks for physical volumes

root@bhaskar-laptop_08:44:06_Mon Aug 16:/home/bhaskar # pvscan
PV /dev/sda8 VG bhaskarlaptop lvm2 [46.15 GiB / 21.15 GiB free]
Total: 1 [46.15 GiB] / in use: 1 [46.15 GiB] / in no VG: 0 [0 ]

vgscan: scan for volume groups

root@bhaskar-laptop_08:50:24_Mon Aug 16:/home/bhaskar # vgscan
Reading all physical volumes. This may take a while...
Found volume group "bhaskarlaptop" using metadata type lvm2

lvscan: scan for logical volumes

root@bhaskar-laptop_08:52:00_Mon Aug 16:/home/bhaskar # lvscan
ACTIVE '/dev/bhaskarlaptop/data' [25.00 GiB] inherit

If it is not active, then you need to activate the specific logical volume like this:

#lvchange -ay "yourLogicalVolume"
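
With several logical volumes it is handy to let the lvscan listing tell you which ones still need activating. The sketch below turns such a listing into the corresponding lvchange commands and only prints them. The function name activation_cmds is made up, the second sample line (an inactive "backup" volume) is hypothetical, and I am assuming lvscan marks such volumes with the word "inactive" in the first column.

```shell
#!/bin/sh
# activation_cmds: read lvscan output on stdin, print an "lvchange -ay" line
# for every logical volume whose first column reads "inactive".
activation_cmds() {
    awk '$1 == "inactive" { gsub("\047", "", $2); print "lvchange -ay", $2 }'   # \047 is a single quote
}

# First line copied from the lvscan output above; the second line is a made-up example.
sample=$(cat <<'EOF'
  ACTIVE            '/dev/bhaskarlaptop/data' [25.00 GiB] inherit
  inactive          '/dev/bhaskarlaptop/backup' [10.00 GiB] inherit
EOF
)

printf '%s\n' "$sample" | activation_cmds
```

Printing the commands instead of executing them leaves room to review the list before anything is changed.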

The final step:

Run the fsck on logical volume:

#fsck -yfv /YourLogicalVolume


Hope this will help.

Cheers!
Bhaskar