Monday, 23 May 2016

NFS common errors and troubleshooting - Linux/Unix

I have seen some NFS errors/issues that crop up now and then for most Linux/Unix system admins, so I decided to put them all in one place. Hope this helps.

Environment: Linux/Unix

Error: "Server Not Responding"

Check that both the NFS server and the client are online and responding to RPC messages.

Use ping and traceroute to check whether they can reach each other; if not, check the NIC link status with ethtool and verify the IP configuration.

Sometimes heavy server or network load causes the RPC response to time out, producing this error message. Try increasing the timeout mount options.
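For example (a sketch; the server name and export path are placeholders): the timeo option is in tenths of a second, so timeo=600 waits 60 seconds before retransmitting, and retrans sets how many retries are made before the error is reported.

#mount -t nfs -o timeo=600,retrans=5 nfsserver:/export /mnt/nfs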

Error: "rpc mount export: RPC: Timed out " 

The NFS server or client was unable to resolve DNS. Check that forward and reverse DNS name resolution works, and verify your DNS servers or /etc/hosts entries.

 Error: "Access Denied" or "Permission Denied"

Check the export permissions for the NFS file systems.
#showmount -e nfsserver  ==> on the client
#exportfs -v  ==> on the server, to list what is exported and with which options

Also check that there are no syntax issues in /etc/exports (e.g. stray spaces, wrong permissions, typos, etc.).
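For reference, a valid entry looks like the line below (the path and network are placeholders). Watch out for a space between the host and the option list: "/export/data 192.168.10.0/24 (rw)" would export with default options to that network and read-write to the whole world.

/export/data   192.168.10.0/24(rw,sync,no_root_squash)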

Error: "RPC: Port mapper failure - RPC: Unable to receive"

NFS requires both the NFS service and the portmapper service to be running on both the client and the server. Verify with:

#rpcinfo -p
       or
#/etc/init.d/portmap status

If not, start the portmap service.
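The service name varies with the release (portmap on older systems, rpcbind on RHEL/CentOS 6 and later):

#/etc/init.d/portmap start   ==> older releases
#service rpcbind start       ==> RHEL/CentOS 6
#systemctl start rpcbind     ==> RHEL/CentOS 7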

Error: "NFS Stale File Handle"

An application opens an NFS file with the open() system call, the same way it opens a local file; the call returns a file descriptor (handle) that the program then uses in its I/O calls to identify the file.

When an NFS share is unshared, or the NFS server changes the file handle, any NFS client attempting further I/O on that share will receive the 'NFS Stale File Handle' error.

On the client:

# umount -f /nfsmount
If it cannot be unmounted and remounted, kill the processes that are using /nfsmount (see the sketch below).

or 

If the above options didn't work, you can reboot the client to clear the stale NFS handle.
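To find and kill the processes holding the mount point, fuser is handy (-m operates on the mount point, -k kills the processes found):

#fuser -vm /nfsmount   ==> list processes using the mount
#fuser -km /nfsmount   ==> kill them
#umount -f /nfsmount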

Error: "No route to host"

This can be reported when a client attempts to mount the NFS file system, even when the client can ping the server successfully.

This can be due to RPC messages being filtered by the host firewall, the client firewall, or a network switch. Verify the firewall rules, and check that port 2049 is reachable; stopping iptables temporarily is a quick way to rule the firewall out.
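A few quick checks (the server name is a placeholder):

#rpcinfo -p nfsserver          ==> list RPC services registered on the server
#telnet nfsserver 2049         ==> test whether the NFS port is reachable
#iptables -L -n | grep 2049    ==> look for rules filtering the NFS port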

Hope this helps everyone who uses NFS regularly; these are the errors I have come across most often in my experience.

Thanks for reading and sharing!

Sunday, 15 May 2016

CentOS/RHEL 7 kernel dump & debug

Applies : CentOS / RHEL / OEL 7 

Arch : x86_64

When kdump is enabled, a small amount of memory is reserved for a second kernel. If the system crashes, it boots into this second kernel, whose only purpose is to capture the core dump image. Being able to analyse the core dump helps significantly in determining the exact cause of the system failure.

Configuring kdump :

The kdump service comes with the kexec-tools package, which needs to be installed:

#yum install kexec-tools

Set the amount of memory to reserve for the crash kernel via the crashkernel=<size> kernel parameter:


# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=128M  vconsole.keymap=us rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
#

Regenerate the GRUB config and reboot for the kernel parameter to take effect:

# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-123.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-123.el7.x86_64.img
Warning: Please don't use old title `CentOS Linux, with Linux 3.10.0-123.el7.x86_64' for GRUB_DEFAULT, use `Advanced options for CentOS Linux>CentOS Linux, with Linux 3.10.0-123.el7.x86_64' (for versions before 2.00) or `gnulinux-advanced-1a06e03f-ad9b-44bf-a972-3a821fca1254>gnulinux-3.10.0-123.el7.x86_64-advanced-1a06e03f-ad9b-44bf-a972-3a821fca1254' (for 2.00 or later)
Found linux image: /boot/vmlinuz-0-rescue-ae1ddf63f5e04857b5e89cd8fcf1f9e1
Found initrd image: /boot/initramfs-0-rescue-ae1ddf63f5e04857b5e89cd8fcf1f9e1.img
done
#

Configure kdump in /etc/kdump.conf

By default the vmcore is stored in the /var/crash directory; if you want it dumped to a different partition, disk, or NFS share, it must be defined here:

ext3 /dev/sdd1
or
net nfs.yourdomain.com:/export/dump

Compress the vmcore file to reduce its size:
core_collector makedumpfile -c

Once the crash dump is captured, the root fs is mounted and /sbin/init is run; change the behaviour so that the system reboots instead:
default reboot
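Putting those directives together, a minimal /etc/kdump.conf could look like this (the dump path shown is just the default; adjust for your environment):

path /var/crash
core_collector makedumpfile -c
default reboot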

Start your kdump: 

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-123.el7.x86_64 root=UUID=1a06e03f-ad9b-44bf-a972-3a821fca1254 ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=128M vconsole.keymap=us rhgb quiet

# grep -v  '#' /etc/sysconfig/kdump | sed '/^$/d'
KDUMP_KERNELVER=""
KDUMP_COMMANDLINE=""
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug"
KEXEC_ARGS=""
KDUMP_BOOTDIR="/boot"
KDUMP_IMG="vmlinuz"
KDUMP_IMG_EXT=""
#

# systemctl enable kdump.service
# systemctl start kdump.service
# systemctl is-active kdump
active
#

Test your configuration (warning: the second command below deliberately crashes the kernel, so only do this on a test system):

# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger



You should see that a crash dump was generated. Now install the crash utility and the debug kernel packages to analyse it.

#yum install crash

For Oracle Linux I was able to download the debuginfo packages from https://oss.oracle.com/ol7/debuginfo/; check your kernel version and download the matching debug kernel packages.

#rpm -ivh kernel-debuginfo-common-x86_64-3.10.0-123.el7.x86_64.rpm \
               kernel-debuginfo-3.10.0-123.el7.x86_64.rpm \
               kernel-debug-debuginfo-3.10.0-123.el7.x86_64.rpm

# ls -lh /var/crash/127.0.0.1-2016.05.15-04\:50\:40/vmcore
-rw-------. 1 root root 168M May 15 04:51 /var/crash/127.0.0.1-2016.05.15-04:50:40/vmcore
#

# crash /var/crash/127.0.0.1-2016.05.15-04\:50\:40/vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux

WARNING: kernel version inconsistency between vmlinux and dumpfile

      KERNEL: /usr/lib/debug/lib/modules/3.10.0-123.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2016.05.15-04:50:40/vmcore
        CPUS: 1
        DATE: Sun May 15 04:50:38 2016
      UPTIME: 00:10:24
LOAD AVERAGE: 0.02, 0.07, 0.05
       TASKS: 104
    NODENAME: slnxcen01
     RELEASE: 3.10.0-123.el7.x86_64
     VERSION: #1 SMP Mon Jun 30 12:09:22 UTC 2014
     MACHINE: x86_64  (2294 Mhz)
      MEMORY: 1.4 GB
       PANIC: "Oops: 0002 [#1] SMP " (check log for details)
         PID: 2266
     COMMAND: "bash"
        TASK: ffff880055650b60  [THREAD_INFO: ffff880053fb2000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash>


crash> bt
PID: 2266   TASK: ffff880055650b60  CPU: 0   COMMAND: "bash"
 #0 [ffff880053fb3a98] machine_kexec at ffffffff81041181
 #1 [ffff880053fb3af0] crash_kexec at ffffffff810cf0e2
 #2 [ffff880053fb3bc0] oops_end at ffffffff815ea548
.
.
.
crash> files
PID: 2266   TASK: ffff880055650b60  CPU: 0   COMMAND: "bash"
ROOT: /    CWD: /root
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff880053c47a00 ffff8800563383c0 ffff880055bad2f0 CHR  /dev/tty1
  1 ffff8800542a9100 ffff88004dd4ff00 ffff88004dc0b750 REG  /proc/sysrq-trigger
.
.
.
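A few other standard crash commands worth trying on the same vmcore:

crash> log    ==> dump the kernel message buffer leading up to the panic
crash> ps     ==> list the tasks that existed at crash time
crash> sys    ==> redisplay the system summary shown at startup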
That concludes the article.

Sunday, 17 April 2016

Disaster Recovery using Relax-and-Recover (REAR) - Redhat Linux

Relax-and-Recover is very simple to use. I wanted to write it up anyway, because before I apply OS/application/security patches and the like, I want a complete backup of the server for the worst case, this being production critical.

We first need to install the rear package, which can be downloaded from the EPEL repository. Before I proceed, here is the environment:

Environment  : Oracle Linux 6 with Red Hat kernel (2.6.32-573.el6.x86_64)
rear version : rear-1.18-3.el6.x86_64
DR copy      : NFS storage (nfsserver.testlabs.com)

The rear package lives in the EPEL repository and can be downloaded from its mirrors; alternatively, just copy and paste the repo definition below:
hostname#cat > /etc/yum.repos.d/epel.repo

[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6

Ctrl-D

hostname# yum install rear 

Note: make sure 'genisoimage' and 'syslinux' are already installed, without which rear will not install.
hostname#yum install genisoimage syslinux

Let rear know which location it should back up to. It is defined as below:

hostname#cat >/etc/rear/local.conf 
OUTPUT=ISO
BACKUP=NETFS
BACKUP_URL="nfs://nfsserver.testlabs.com/dr/"

Ctrl-D
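Before running the backup it is worth confirming that the NFS export is visible from the client (a quick sanity check against the BACKUP_URL above):

hostname# showmount -e nfsserver.testlabs.com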


The resulting ISO image is what you boot from for DR recovery; it then pulls the backup archive from the NFS share in order to restore files/dirs.

hostname# rear -v mkbackup
Relax-and-Recover 1.18 / Git
Using log file: /var/log/rear/rear-hostname.log
Creating disk layout
Creating root filesystem layout
TIP: To login as root via ssh you need to set up /root/.ssh/authorized_keys or SSH_ROOT_PASSWORD in your configuration file
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-hostname.iso (74M)
Copying resulting files to nfs location
Encrypting disabled
Creating tar archive '/tmp/rear.NZP1vXar0Vmq5nr/outputfs/hostname/backup.tar.gz'
Archived 14 MiB [avg 3584 KiB/sec]
.
.
.
.

Archived 5644 MiB [avg 8268 KiB/sec]OK
Archived 5644 MiB in 700 seconds [avg 8256 KiB/sec]

All your system files are now backed up to the NFS server; you can confirm by logging in to the storage box.

nfsserver:/dr/hostname# pwd
/dr/hostname
nfsserver:/dr/hostname# ls
./                  README              backup.log          rear-hostname.iso
../                 VERSION             backup.tar.gz       rear.log
nfsserver:/dr/hostname#

When your server is unable to boot, libraries are corrupted, or anything else goes badly wrong, you can copy the ISO image from the NFS path, boot from it as a CD-ROM, and recover.

I tested this myself, and here is how it went.

I will deliberately corrupt the server (remove binaries, remove the boot files, and so on) and then restore it using the ISO image from the NFS location.

hostname# rm -rf /boot/*
hostname# ls -l /boot
total 0
hostname# 

hostname# rm -rf /bin/*
hostname# ls
-bash: /bin/ls: No such file or directory
hostname#

When you boot from the ISO image, choose 'Recover hostname' from the boot menu options.


After booting, you land in the RESCUE shell; check that you can reach your NFS server in order to restore the files.

RESCUE: rear recover



This starts copying all your data from the NFS share back to the client server. It might take minutes or hours depending on the amount of data. Once it has completed, just reboot your client machine and it will be operational.

hostname: # ls -l /boot/ | wc -l
15
hostname:

Now you have restored your system from the backup.

There is no excuse for not using this tool: it is very easy and simple. I would urge readers to keep a copy of the ISO image; it will save a lot of time and effort in the worst cases. Plan for the best, but prepare for the worst.

Thanks to all who read this post.

Sunday, 20 March 2016

Xen disk hot addition/removing from guests

I recently had to add a new disk to a guest for extra swap space; Xen allows you to hot add (and remove) disks to a guest domU while the system is running.

Let's take a look at how to add disks to the guests:

I will attach an image-based disk from the Xen dom0 to the guest, where it appears as an ordinary block device that can be partitioned, formatted, and mounted. xm block-attach is used to attach it while the guest is online.

First, create a 4 GB image file to hold the guest's new swap partition:
xendom0#dd if=/dev/zero of=testvm-swapdisk.img bs=1M count=4k

xm block-attach <Domain> <BackDev> <FrontDev> <Mode> [BackDomain]
    Domain   - guest domain to attach the disk to
    BackDev  - location of the backing device in dom0, prefixed with its type (file: for an image file, phy: for a physical device)
    FrontDev - the device name to assign to the new device in the domU
    Mode     - read-only (r) or read/write (w)

xendom0# xm block-attach testvm file:/path/to/testvm-swapdisk.img /dev/xvdd w   (use the full path to the image file)

On Guest :

testvm ~]# lsblk -i | tail -1
xvdd                        202:48   0    4G  0 disk
testvm ~]# fdisk /dev/xvdd      (create a single partition spanning the disk)
testvm ~]# mkswap /dev/xvdd1
testvm ~]# swapon /dev/xvdd1
testvm ~]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/dm-1                               partition       819196  0       -1
/dev/xvdd1                              partition       4192928 0       -2

Removing a disk from a guest:

If a disk is no longer needed by the guest, unmount it (or swapoff, for a swap device) and delete the partitions inside the guest; then, from the Xen dom0, detach the disk:

xendom0#xm block-detach testvm /dev/xvdd

Make sure you also edit the guest's Xen config file so the change is permanent and the disk is still available after the next reboot.
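For example, a sketch of the disk line in the guest config file (the image paths are placeholders for your repository layout):

disk = [ 'file:/path/to/testvm-root.img,xvda,w',
         'file:/path/to/testvm-swapdisk.img,xvdd,w' ]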

Sunday, 28 February 2016

Centralized Log Management using rsyslog CentOS 6/7

I am creating a centralized log server that stores all the logs from the clients. In order to do that, make sure you have enough disk space to hold the logs from all the clients. I will also configure log rotation to save space on the disk.

Environment - CentOS/Redhat 6.6
rsyslog version - 5.8.10

rsyslog is installed by default; in case it is not, use yum to install it.
#yum install rsyslog

It is helpful to read the man rsyslog.conf documentation. The config has mainly 3 parts:

1. Modules - rsyslog follows a modular design
2. Global directives - set global properties for rsyslog
3. Rules - what is to be logged and where

The destination log server receives all the logs (audit, sudo, su, history, kernel, etc.) sent by all the clients.

Let's configure server first,  

Edit, 
#vim /etc/rsyslog.conf 

# Make sure syslog reception is enabled for TCP and UDP communication.
$ModLoad imudp
$UDPServerRun 514

$ModLoad imtcp
$InputTCPServerRun 514

# Create a template so that each client's logs are written under the client's hostname, in a file named after the program being logged, at the destination path below. The facility.priority selectors decide which messages get written through this template.

$template TmplAuth,"/scratch/remote-sys-logs/%fromhost%/%PROGRAMNAME%.log"
authpriv.*   ?TmplAuth
*.info;mail.none;authpriv.none;cron.none;local6.*  ?TmplAuth

# Since I also need audit.log, I create a separate rule so that audit messages land in the same per-client destination folder.

$template TmplAudit,"/scratch/remote-sys-logs/%fromhost%/audit.log"
local6.*        ?TmplAudit

# Log all bash terminal commands and store them in the centralized location.

$template TmplCmds,"/scratch/remote-sys-logs/%fromhost%/hist.log"
local0.debug    ?TmplCmds

save and quit the file.

# mkdir /scratch/remote-sys-logs
# service rsyslog restart

Since the logs will grow large, I rotate them keeping two rotated files: the older one compressed, the most recent one left uncompressed (hence delaycompress below). Logs are kept for a maximum of 60 days, and rotated files get the date as a filename extension. Rotation runs from logrotate's daily cron job, and the postrotate hook signals syslogd to reopen its log files.

Edit, 
#vim /etc/logrotate.d/remote-sys-logs
/scratch/remote-sys-logs/*/*.log {
    daily
    dateext
    rotate 2
    compress
    delaycompress
    create 644 root root
    notifempty
    missingok
    maxage 60
    sharedscripts
    postrotate
     /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
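You can dry-run the rotation without touching any files to confirm the config parses cleanly (-d is debug mode and implies no changes):

#logrotate -d /etc/logrotate.d/remote-sys-logs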

Client: 

Edit,
#vim /etc/rsyslog.conf

# The imfile module is not loaded by default; add this entry so rsyslog can convert any standard text file into syslog messages.
$ModLoad imfile

# Add the input definition below to forward the audit log to the centralized log server; without it, audit events never reach the central log.


$InputFileName /var/log/audit/audit.log
$InputFileTag audit:
$InputFileStateFile audit.log
$InputFileSeverity info
$InputFileFacility local6
$InputRunFileMonitor

# forward all the logs to the centralized server. 

*.*                     @centrallogserver:514
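Note that a single @ forwards over UDP; if you want reliable delivery over the TCP listener configured on the server, use @@ instead:

*.*                     @@centrallogserver:514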

save and quit
#service rsyslog restart

In order to log all bash commands to the logger, make an entry in the global /etc/bashrc config file.


# Export PROMPT_COMMAND so that every new bash command is logged to the file via logger.
export PROMPT_COMMAND='RETRN_VAL=$?;logger -p local0.debug "$(whoami):[$$] $(history 1 | sed "s/^[ ]*[0-9]\+[ ]*//" ) [$RETRN_VAL]#"'

Exit and reopen the bash shell (or start a new one) so that all subsequent commands are logged to the centralized log server.

Hope this helps anyone who wants to build a central log server. Below are the references you can check to adapt this to your requirements.

Thanks for reading and re-sharing!

References:
https://en.wikipedia.org/wiki/Syslog - syslog facilities and priorities explained
man logrotate.conf
man rsyslog.conf

Monday, 18 January 2016

Rescue environment for a paravirtualized VM, Xen virtualization - Redhat 7 on OVM 3.3.3

Objective: how to enter a rescue environment from the dom0, mounting an ISO image while bypassing pygrub, and renaming the root volume group.

Environment: 
            : Oracle Virtual server 3.3.3 X86_64 (HVM)
            : Redhat 7.0 x86_64 (Redhat OS)

Recently I had to rename volume groups on one of my guest machines, a paravirtualized Redhat guest. Since the root partition (/) lived on an LV, I had to boot the guest into the rescue environment. To boot the guest directly from the kernel and initrd used for installation, I copied them to /OVS/Repositories/redhat7/vmlinuz and /OVS/Repositories/redhat7/initrd.img on my OVM hypervisor, then told the guest config to boot from them:

kernel='/OVS/Repositories/redhat7/vmlinuz'
ramdisk='/OVS/Repositories/redhat7/initrd.img'
extra="rescue method=/mnt" or extra="install=hd:xvdc rescue=1 xencons=tty"

There are two ways to reach the rescue environment:

1. If your ISO image is mounted on a temporary mount point in dom0 (mount -o loop redhat7.iso /mnt), point the rescue method at that location:

extra="rescue method=/mnt"

2. If the ISO image is attached to the guest OS as a block device and you know the device name, you can pass that device for rescue.

In my case the block device was xvdc.

extra="install=hd:xvdc rescue=1 xencons=tty"

These are passed as extra arguments to the kernel and tell the anaconda installer where to find the install files.


I had previously written a post on renaming VG/LV for CentOS 6 (http://goo.gl/M71G5a); there is not much difference here, except that this is GRUB2.

I took the LVs offline, renamed the VG, updated the entries in /etc/fstab and the GRUB configuration, and regenerated the GRUB config file.

- Scan the LVs with lvscan; if they are not offline, take them offline:


sh-4.2# lvchange -an /dev/<vgname>/swap
sh-4.2# lvchange -an /dev/<vgname>/root

- Change the VG name 
sh-4.2# lvm vgrename <old_vgname> <new_vgname>
Volume group "<old_vgname>" successfully renamed to "<new_vgname>"

- Make sure all your /etc/fstab entries point to the new volume group.
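A quick way to do that is a global substitution (a sketch; the VG names are placeholders, and -i.bak keeps a backup copy of the file first):

sh-4.2# sed -i.bak 's/old_vgname/new_vgname/g' /etc/fstab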

- Update the references to the old volume group in your GRUB configuration (on RHEL 7, the rd.lvm.lv= arguments in /etc/default/grub), then regenerate the GRUB config file:
sh-4.2# grub2-mkconfig -o /boot/grub2/grub.cfg

Once all that is done, remove the rescue entries from the guest config file, and then boot:

OVM# xm create -c redhat7.cfg


Thank you for reading and re-sharing.

Sunday, 10 January 2016

Security updates and installation using YUM - RHEL 5/6/7

Hello All, 

I came across a situation where I wanted to check, verify, and apply security updates on the different releases of RHEL, and I could not find everything in one place, so I thought of putting it all together. Keeping it all in one place helps, so I am sharing it publicly!
How security updates are handled across RHEL releases:

Install the security plugin (lets yum install security updates):
    RHEL 5: yum install yum-security
    RHEL 6: yum install yum-plugin-security
    RHEL 7: no plugin required, it is already part of yum

List all available errata without installing:
    RHEL 5: yum list-sec
    RHEL 6/7: yum updateinfo list available

List all available security updates without installing:
    RHEL 5: yum list-security --security
    RHEL 6/7: yum updateinfo list security all
              yum updateinfo list sec

List currently installed security updates:
    RHEL 5: yum list-sec
    RHEL 6/7: yum updateinfo list security installed

List all security updates with verbose descriptions:
    RHEL 5: yum list-sec

Apply all security updates from RHN:
    yum -y update --security

Update based on a CVE reference:
    yum update --cve <CVE>

View available advisories by severity:
    yum updateinfo list

Show detailed information about an advisory before applying it:
    yum updateinfo RHSA-2015:XXXX

Apply only one specific advisory:
    yum update --advisory=RHSA-2015:XXXX

More information can be found in man yum-security.
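As a quick worked example on RHEL 7, using only the commands above, first list the pending security errata and then apply just those:

#yum updateinfo list security all
#yum -y update --security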

First post of 2016 - wishing all of you a HAPPY NEW YEAR :)

Thanks