Level: Introductory
Vallard Benincosa (vallard@us.ibm.com), Certified Technical Sales Specialist, IBM
20 Jul 2008
Learn these 10 tricks and you'll be the most powerful Linux® systems
administrator in the universe...well, maybe not the universe, but you will
need these tips to play in the big leagues. Learn about SSH tunnels, VNC,
password recovery, console spying, and more. Examples accompany each trick,
so you can duplicate them on your own systems.
The best systems administrators are set apart by their efficiency.
And if an efficient
systems administrator can do a task in 10 minutes that would take another
mortal two hours to complete, then the efficient systems administrator
should be rewarded (paid more) because the company is saving time, and
time is money, right?
The trick is to prove your efficiency to management. While I won't attempt to cover
that trick in this article, I will give you 10 essential
gems from the lazy admin's bag of tricks. These tips will save you
time—and even if you don't get paid more money to be more efficient,
you'll at least have more time to play Halo.
Trick 1: Unmounting the unresponsive DVD drive
The newbie states that when he pushes the Eject button on the DVD drive of
a server running a certain Redmond-based operating system, it will eject
immediately. He then complains that, in most enterprise Linux servers, if a
process is running in that directory, then the ejection won't happen. For
too long as a Linux administrator, I would reboot the machine and get my
disk on the bounce if I couldn't figure out what was running and why it
wouldn't release the DVD drive. But this is inefficient.
Here's how you find the process that holds your DVD drive and eject it to
your heart's content: First, simulate it. Stick a disk in your DVD drive,
open up a terminal, and mount the DVD drive:
# mount /media/cdrom
# cd /media/cdrom
# while [ 1 ]; do echo "All your drives are belong to us!"; sleep 30; done
Now open up a second terminal and try to eject the DVD drive:
# eject
You'll get a message like:
umount: /media/cdrom: device is busy
Before you free it, let's find out who is using it.
# fuser /media/cdrom
You see the process was running and, indeed, it is our fault we cannot
eject the disk.
Now, if you are root, you can exercise your godlike powers and kill
processes:
# fuser -k /media/cdrom
Boom! Just like that, freedom. Now solemnly unmount the drive:
# eject
fuser is good.
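Before you reach for fuser -k, it can help to see which processes are about to be hit. Here is a minimal sketch, assuming a Linux /proc filesystem; holders_by_cwd is a made-up helper name, not part of fuser. It lists the PIDs whose working directory sits at or under a given directory. It only looks at working directories, while fuser also catches open files, so fuser remains the authoritative tool.

```shell
# holders_by_cwd DIR: print PIDs whose current working directory is DIR
# or somewhere below it. Hypothetical helper; relies on Linux /proc.
holders_by_cwd() {
    local dir link cwd pid
    # Resolve DIR to a physical path so it compares cleanly with /proc links.
    dir=$(cd "$1" && pwd -P) || return 1
    for link in /proc/[0-9]*/cwd; do
        # Links for other users' processes are unreadable unless you are root.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$dir"|"$dir"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}
```

Running holders_by_cwd /media/cdrom as root and passing the resulting PIDs to ps -p shows what you would be killing.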
Trick 2: Getting your screen back when it's hosed
Try this:
# cat /bin/cat
Behold! Your terminal looks like garbage. Everything you type looks like
you're looking into the Matrix. What do you do?
You type reset. But wait, you say, typing reset is too close to typing
reboot or shutdown. Your palms start to sweat, especially if you are
doing this on a production machine.
Rest assured: You can do it with the confidence that no machine will be
rebooted. Go ahead, do it:
# reset
Now your screen is back to normal. This is much better than closing the
window and then logging in again, especially if you just went through five
machines to SSH to this machine.
Trick 3: Collaboration with screen
David, the high-maintenance user from product engineering, calls: "I need
you to help me understand why I can't compile supercode.c on these new
machines you deployed."
"Fine," you say. "What machine are you on?"
David responds: "Posh." (Yes, this fictional company has
named its five production servers in honor of the Spice Girls.) "OK,"
you say. You exercise your godlike root powers and on another machine
become David:
# su - david
Then you go over to posh:
# ssh posh
Once you are there, you run:
# screen -S foo
Then you holler at David:
"Hey David, run the following command on your
terminal: # screen -x foo."
This will cause your and David's sessions to be joined together in the
holy Linux shell. You can type or he can type, but you'll both see what
the other is doing. This saves you from walking to the other floor and
lets you both have equal control. The benefit is that David can watch your
troubleshooting skills and see exactly how you solve problems.
At last you both see what the problem is: David's compile script
hard-coded an old directory that does not exist on this new server. You
mount it, recompile, solve the problem, and David goes back to work. You
then go back to whatever lazy activity you were doing before.
The one caveat to this trick is that you both need to be logged in as the
same user. Other cool things you can do with the screen command include
having multiple windows and split screens. Read the man pages for more on
that.
But I'll give you one last tip while you're in your screen session. To
detach from it and leave it open, type:
Ctrl-A D
(I mean, hold down the Ctrl key and strike the A key. Then push the D key.)
You can then reattach by running the
screen -x foo command again.
Trick 4: Getting back the root password
You forgot your root password. Nice work. Now you'll just have to
reinstall the entire machine. Sadly enough, I've seen more than a few
people do this. But it's surprisingly easy to get on the machine and
change the password. This doesn't work in all cases (like if you made a
GRUB password and forgot that too), but here's how you do it in a normal
case with a CentOS Linux example.
First reboot the system. When it reboots you'll come to the GRUB screen as
shown in Figure 1. Press an arrow key so that you stay on this screen
instead of proceeding all the way to a normal boot.
Figure 1. GRUB screen after reboot
Next, select the kernel that will boot with the arrow keys, and type
E to edit the kernel line. You'll then see something like
Figure 2:
Figure 2. Ready to edit the kernel line
Use the arrow key again to highlight the line that begins with
kernel, and press E to edit the kernel
parameters. When you get to the screen shown in Figure 3, simply append
the number 1 to the arguments as shown in
Figure 3:
Figure 3. Append the argument with the number 1
Then press Enter, B, and the kernel will boot up to
single-user mode. Once here you can run the
passwd command, changing password for user
root:
sh-3.00# passwd
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully
Now you can reboot, and the machine will boot up with your new password.
Trick 5: SSH back door
Many times I'll be at a site where I need remote support from someone who
is blocked on the outside by a company firewall. Few people realize that
if you can get out to the world through a firewall, then it is relatively
easy to open a hole so that the world can come in to you.
In its crudest form, this is called "poking a hole in the firewall." I'll
call it an SSH back door. To use it, you'll need a machine on the
Internet that you can use as an intermediary.
In our example, we'll call our machine blackbox.example.com. The machine
behind the company firewall is called ginger. Finally, the machine that
technical support is on will be called tech. Figure 4 explains how this is
set up.
Figure 4. Poking a hole in the firewall
Here's how to proceed:
- Check that what you're doing is allowed, but make sure you ask the
right people. Most people will cringe that you're opening the
firewall, but what they don't understand is that it is completely
encrypted. Furthermore, someone would need to hack your outside
machine before getting into your company. Instead, you may belong to
the school of "ask-for-forgiveness-instead-of-permission." Either way,
use your judgment and don't blame me if this doesn't go your way.
- SSH from ginger to blackbox.example.com with the
-R flag. I'll assume that you're the root user
on ginger and that tech will need the root user ID to help you with
the system. With the -R flag, you'll forward
connections made to port 2222 on blackbox to port 22 on ginger. This is
how you set up an SSH tunnel. Note that only SSH traffic can come into
ginger: You're not putting ginger out on the Internet naked.
You can do this with the following syntax:
~# ssh -R 2222:localhost:22 thedude@blackbox.example.com
Once you are into blackbox, you just need to stay logged
in. I usually enter a command like:
thedude@blackbox:~$ while [ 1 ]; do date; sleep 300; done
to keep the session busy, and then minimize the window.
- Now instruct your friends at tech to SSH as thedude into blackbox
without using any special SSH flags. You'll have to give them your
password:
root@tech:~# ssh thedude@blackbox.example.com
- Once tech is on the blackbox, they can SSH to ginger using the
following command:
thedude@blackbox:~$ ssh -p 2222 root@localhost
- Tech will then be prompted for a password. They should enter the root
password of ginger.
- Now you and support from tech can work together and solve the problem.
You may even want to use screen together! (See
Trick 3.)
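As a variation on step 2, the tunnel can be held open without parking a date loop in the remote shell. This is only a sketch, assuming an OpenSSH client on ginger: -N asks ssh not to run a remote command, and ServerAliveInterval sends periodic keepalive probes. The 60-second interval and the reverse_tunnel_cmd helper name are my own choices for illustration. The function prints the command rather than running it, so you can look it over first.

```shell
# reverse_tunnel_cmd REMOTE_PORT LOCAL_PORT USER@HOST
# Print (rather than run) an ssh command that holds a reverse tunnel open.
#   -N                         do not run a remote command, just forward
#   -o ServerAliveInterval=60  send a keepalive probe every 60 seconds
# Hypothetical helper; the 60-second interval is an arbitrary choice.
reverse_tunnel_cmd() {
    printf 'ssh -N -o ServerAliveInterval=60 -R %s:localhost:%s %s\n' "$1" "$2" "$3"
}

# The tunnel from the example: port 2222 on blackbox leads to port 22 on ginger.
reverse_tunnel_cmd 2222 22 thedude@blackbox.example.com
```

Paste the printed command on ginger when you are happy with it. Tech still connects exactly as described in the steps above.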
Trick 6: Remote VNC session through an SSH tunnel
VNC, or Virtual Network Computing, has been around a long time. I typically
find myself needing to use it when the remote server has some type of
graphical program that is only available on that server.
For example, suppose in Trick 5, ginger
is a storage server. Many storage devices come with a GUI program to
manage the storage controllers. Often these GUI management tools need a
direct connection to the storage through a network that is at times kept
in a private subnet. Therefore, the only way to access this GUI is to do
it from ginger.
You can try SSH'ing to ginger with the -X
option and launch it that way, but many times the bandwidth required is
too much and you'll get frustrated waiting. VNC is a much more network-friendly tool
and is readily available for nearly all operating systems.
Let's assume that the setup is the same as in Trick 5, but you want tech to
be able to get VNC access instead of SSH. In this case, you'll do something
similar but forward VNC ports instead. Here's what you do:
- Start a VNC server session on ginger. This is done by running
something like:
root@ginger:~# vncserver -geometry 1024x768 -depth 24 :99
The options tell the VNC server to start up with a resolution of
1024x768 and a pixel depth of 24 bits per pixel. If you are using a
really slow connection, a depth of 8 may be a better option. Using
:99 specifies the display number of the VNC server, which
determines its port. VNC ports start at 5900, so specifying
:99 means the server is accessible from
port 5999.
When you start the session, you'll be asked to
specify a password. The user ID will be the same user that you
launched the VNC server from. (In our case, this is root.)
- SSH from ginger to blackbox.example.com forwarding the port 5999 on
blackbox to ginger. This is done from ginger by running the
command:
root@ginger:~# ssh -R 5999:localhost:5999 thedude@blackbox.example.com
Once you run this command, you'll need to keep this SSH session open
in order to keep the port forwarded to ginger. At this point if you
were on blackbox, you could now access the VNC session on ginger by
just running:
thedude@blackbox:~$ vncviewer localhost:99
That would forward the port through SSH to ginger. But we're
interested in letting tech get VNC access to ginger. To accomplish
this, you'll need another tunnel.
- From tech, you open a tunnel via SSH to forward your port 5999 to port
5999 on blackbox. This would be done by running:
root@tech:~# ssh -L 5999:localhost:5999 thedude@blackbox.example.com
This time the SSH flag we used was -L, which
instead of pushing 5999 to blackbox, pulled from it. Once you are in
on blackbox, you'll need to leave this session open. Now you're ready
to VNC from tech!
- From tech, VNC to ginger by running the command:
root@tech:~# vncviewer localhost:99
Tech will now have a VNC session directly to ginger.
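The display-to-port arithmetic in these steps is easy to get wrong when you pick a different display number, so here is a small sketch of it. The vnc_port helper is a name I made up; it only adds the display number to the 5900 base described above.

```shell
# vnc_port DISPLAY: map a VNC display number (with or without the leading
# colon) to its TCP port, using the 5900 base. Hypothetical helper.
vnc_port() {
    local display=${1#:}
    echo $((5900 + display))
}

vnc_port :99   # the display used on ginger; prints 5999
vnc_port 2     # a lower display number; prints 5902
```

The result is the port number to put on both sides of the -R and -L forwards.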
While the effort might seem like a bit much to set up, it beats flying
across the country to fix the storage arrays. Also, if you practice this a
few times, it becomes quite easy.
Let me add a trick to this trick: If tech was running the Windows®
operating system and didn't have a command-line SSH client, then tech can
run PuTTY. PuTTY can be set to forward SSH ports by looking in the options
in the sidebar. If the port were 5902 instead of our example of 5999, then
you would enter something like in Figure 5.
Figure 5. PuTTY can forward SSH ports for tunneling
If this were set up, then tech could VNC to localhost:2 just as if tech
were running the Linux operating system.
Trick 7: Checking your bandwidth
Imagine this: Company A has a storage server named ginger and it is being
NFS-mounted by a client node named beckham. Company A has decided they
really want to get more bandwidth out of ginger because they have lots of
nodes they want to have NFS mount ginger's shared filesystem.
The most common and cheapest way to do this is to bond two Gigabit
Ethernet NICs together. This is cheapest because usually you have an extra
on-board NIC and an extra port on your switch somewhere.
So they do this. But now the question is: How much bandwidth do they
really have?
Gigabit Ethernet has a theoretical limit of 125MBps. Where does that
number come from? Well,
1Gb = 1000Mb; 1000Mb/8 = 125MB; "b" = "bits," "B" = "bytes"
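Here is the same arithmetic as a quick shell sketch, counting 1 Gb as 10^9 bits, which is the convention for network link rates. The variable names are mine. A tool that counts a megabyte as 2^20 bytes will show a smaller number for the same rate, so the block prints both.

```shell
# Gigabit Ethernet line rate in bits per second (decimal units assumed).
line_rate_bps=1000000000

# Convert to bytes per second, then to megabytes in decimal (10^6 bytes)
# and binary (2^20 bytes) units. Shell arithmetic truncates to integers.
bytes_per_sec=$((line_rate_bps / 8))
echo "decimal MB/s: $((bytes_per_sec / 1000000))"
echo "binary MiB/s: $((bytes_per_sec / 1048576))"
```

Keep the two units in mind when you compare a measured figure against the theoretical limit.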
But what is it that we actually see, and what is a good way to measure it?
One tool I suggest is iperf. You can grab iperf like this:
# wget http://dast.nlanr.net/Projects/Iperf2.0/iperf-2.0.2.tar.gz
You'll need to install it on a shared filesystem that both ginger and
beckham can see, or compile and install it on both nodes. I'll compile it in
the home directory of the bob user that is viewable on both nodes:
tar zxvf iperf*gz
cd iperf-2.0.2
./configure --prefix=/home/bob/perf
make
make install
On ginger, run:
# /home/bob/perf/bin/iperf -s -f M
This machine will act as the server and print out performance speeds in
MBps.
On the beckham node, run:
# /home/bob/perf/bin/iperf -c ginger -P 4 -f M -w 256k -t 60
You'll see output in both screens telling you what the speed is. On a
normal server with a Gigabit Ethernet adapter, you will probably see about
112MBps. This is normal as bandwidth is lost in the TCP stack and
physical cables. By connecting two servers back-to-back, each with two
bonded Ethernet cards, I got about 220MBps.
In reality, what you see with NFS on bonded networks is around
150-160MBps. Still, this gives you a good indication that your bandwidth
is going to be about what you'd expect. If you see something much less,
then you should check for a problem.
I recently ran into a case in which the bonding driver was used to bond
two NICs that used different drivers. The performance was extremely poor,
leading to about 20MBps in bandwidth, less than they would have gotten had
they not bonded the Ethernet cards together!
Trick 8: Command-line scripting and utilities
A Linux systems administrator becomes more efficient by using command-line
scripting with authority. This includes crafting loops and knowing how to
parse data using utilities like awk,
grep, and sed. There
are many cases where doing so takes fewer keystrokes and lessens the
likelihood of user errors.
For example, suppose you need to generate a new
/etc/hosts file for a Linux cluster that you are about to install. The
long way would be to add IP addresses in vi or your favorite text editor.
However, it can be done by taking the already existing /etc/hosts file and
appending the following to it by running this on the command line:
# P=1; for i in $(seq -w 200); do echo "192.168.99.$P n$i"; P=$(expr $P + 1);
done >>/etc/hosts
Two hundred host names, n001 through n200, will then be created with IP
addresses 192.168.99.1 through 192.168.99.200. Populating a file like this
by hand runs the risk of inadvertently creating duplicate IP addresses or
host names, so this is a good example of using the built-in command line
to eliminate user errors. Please note that this is done in the bash shell,
the default in most Linux distributions.
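Since that one-liner appends straight to /etc/hosts, I like to preview the output first. The sketch below is a variation, not the command above: it lets printf do the zero padding, so the separate counter and expr call go away. The gen_hosts function name is made up for this example.

```shell
# gen_hosts COUNT: print COUNT lines of the form "192.168.99.N nNNN".
# printf zero-pads the node name, so no separate counter is needed.
# Hypothetical helper; the subnet matches the example above.
gen_hosts() {
    local i
    for i in $(seq 1 "$1"); do
        printf '192.168.99.%d n%03d\n' "$i" "$i"
    done
}

# Preview the first few entries before touching /etc/hosts.
gen_hosts 200 | head -n 3
```

Once the preview looks right, gen_hosts 200 >>/etc/hosts does the real append.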
As another example, let's suppose you want to check that the memory size
is the same in each of the compute nodes in the Linux cluster. In most
cases of this sort, having a distributed or parallel shell would be the
best practice, but for the sake of illustration, here's a way to do
this using SSH.
Assume the SSH is set up to authenticate without a password. Then run:
# for num in $(seq -w 200); do ssh n$num free -tm | grep Mem | awk '{print $2}';
done | sort | uniq
A command line like this looks pretty terse. (It can be worse if you put
regular expressions in it.) Let's pick it apart and uncover the mystery.
First you're doing a loop through 001-200. This padding with 0s in the
front is done with the -w option to the
seq command. Then you substitute the
num variable to create the host you're going to
SSH to. Once you have the target host, give the command to it. In this
case, it's:
free -tm | grep Mem | awk '{print $2}'
That command says to:
- Use the free command to get the memory size in megabytes.
- Take the output of that command and use grep to get the line that has
the string Mem in it.
- Take that line and use awk to print the second field, which is the
total memory in the node.
This operation is performed on every node.
Once you have performed the command on every node, the entire output of
all 200 nodes is piped (|d) to the
sort command so that all the memory values are
sorted.
Finally, you eliminate duplicates with the uniq
command. This command will result in one of the following cases:
- If all the nodes, n001-n200, have the same memory size, then only one
number will be displayed. This is the size of memory as seen by each
operating system.
- If node memory size is different, you will see several memory size
values.
- Finally, if the SSH failed on a certain node, then you may see some
error messages.
This command isn't perfect. If you find that a value of memory is
different than what you expect, you won't know on which node it was or how
many nodes there were. Another command may need to be issued for that.
What this trick does give you, though, is a fast way to check for something and
quickly learn if something is wrong. This is its real value:
speed to do a quick-and-dirty check.
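If you do need to know which nodes are the odd ones out, one possible follow-up is to keep the node name next to each value and group by memory size. This is a sketch under the same assumptions as above: passwordless SSH and nodes named n001 through n200. The function names collect_mem and summarize_mem are made up, and BatchMode=yes is there only so a node that wants a password fails instead of prompting.

```shell
# collect_mem: print "node total_MB" for n001..n200.
# Assumes passwordless SSH; a node that fails yields an empty value.
collect_mem() {
    local num
    for num in $(seq -w 200); do
        printf 'n%s %s\n' "$num" \
            "$(ssh -o BatchMode=yes "n$num" free -m | awk '/^Mem/ {print $2}')"
    done
}

# summarize_mem: read "node total_MB" lines on stdin and print, for each
# memory size, how many nodes reported it and which ones.
summarize_mem() {
    awk '{ count[$2]++; nodes[$2] = nodes[$2] " " $1 }
         END { for (m in count)
                   printf "%s MB: %d node(s):%s\n", m, count[m], nodes[m] }' |
        sort -n
}

# Demonstration with sample data instead of a live cluster.
printf 'n001 2048\nn002 2048\nn003 1024\n' | summarize_mem
```

On a real cluster you would run collect_mem | summarize_mem. It is slower to type than the one-liner, but it answers the "which node?" question in the same pass.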
Trick 9: Spying on the console
Some software prints error messages to the console that may not
necessarily show up on your SSH session. Using the vcs devices can let you
examine these. From within an SSH session, run the following command on a
remote server: # cat /dev/vcs1. This will show
you what is on the first console. You can also look at the other virtual
terminals using 2, 3, etc. If a user is typing on the remote
system, you'll be able to see what he typed.
In most data farms, using a remote terminal server, KVM, or even Serial
Over LAN is the best way to view this information; it also provides the
additional benefit of out-of-band viewing capabilities. Using the vcs
device provides a fast in-band method that may be able to save you some
time from going to the machine room and looking at the console.
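One practical wrinkle: the vcs device typically hands back the screen as one continuous run of characters with no line breaks, so the output of cat can come out as a single long line. Here is a small sketch that wraps it back into rows with fold. The show_console name is made up, and the 80-column default is an assumption; use the real width of the remote console if it differs.

```shell
# show_console SOURCE [COLS]: wrap SOURCE (for example /dev/vcs1) at COLS
# characters per row. COLS defaults to 80, which is an assumption about
# the console width. Hypothetical helper.
show_console() {
    fold -w "${2:-80}" "$1"
}
```

As root on the remote server, show_console /dev/vcs1 prints the first console one screen row per line.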
Trick 10: Random system information collection
In Trick 8, you saw an example of using the command
line to get information about the total memory in the system. In this
trick, I'll offer up a few other methods to collect important information
from the system you may need to verify, troubleshoot, or give to remote
support.
First, let's gather information about the processor. This is easily done as
follows:
# cat /proc/cpuinfo
This command gives you information on the processor speed, quantity, and
model. Using grep in many cases can give you the desired value.
A check that I do quite often is to ascertain the quantity of processors
on the system. So, if I have purchased a dual processor quad-core server, I
can run:
# cat /proc/cpuinfo | grep processor | wc -l
I would then expect to see 8 as the value. If I don't, I call up the vendor
and tell them to send me another processor.
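The same count can be had from grep alone. The sketch below anchors the match at the start of the line, so the word "processor" appearing elsewhere in a line (in a model name, say) is not counted. The count_cpus name is made up, and the optional file argument is there only so the function can be tried on a saved copy of cpuinfo. Keep in mind that this counts logical processors as the kernel sees them.

```shell
# count_cpus [FILE]: count the logical processors listed in a
# cpuinfo-style file (default: /proc/cpuinfo). Hypothetical helper.
# Anchoring on ^processor avoids matching the word elsewhere in a line.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}
```

Running count_cpus with no argument on the server should agree with the wc -l pipeline above.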
Another piece of information I may require is disk information. You can
get this with the df command. I usually add the
-h flag so that I can see the output in
gigabytes or megabytes.
# df -h also shows how the disk was
partitioned.
And to end the list, here's a way to look at the firmware of your
system—a method to get the BIOS level and the firmware on the NIC.
To check the BIOS version, you can run the
dmidecode command. Unfortunately, you can't
easily grep for the information, so paging through
the output with less is a less efficient, but workable, way to do this. On my
Lenovo T61 laptop, the output looks like this:
# dmidecode | less
...
BIOS Information
Vendor: LENOVO
Version: 7LET52WW (1.22 )
Release Date: 08/27/2007
...
This is much more efficient than rebooting your machine and looking at the
POST output.
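A plain grep struggles here because field names such as Version can turn up in more than one block of dmidecode output. If paging gets tedious, one possible approach is to let awk look only inside the BIOS Information block. This is a sketch that assumes the block ends at the first blank line; bios_field is a made-up name. The demonstration feeds it the sample output shown above rather than a live dmidecode run.

```shell
# bios_field KEY: read dmidecode-style text on stdin and print the value
# of KEY from the "BIOS Information" block. Hypothetical helper; assumes
# the block ends at the first blank line.
bios_field() {
    awk -v key="$1" '
        /BIOS Information/ { inblock = 1; next }
        inblock && /^[ \t]*$/ { exit }          # blank line ends the block
        inblock {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            if (index(line, key ":") == 1) {
                sub(/^[^:]*:[ \t]*/, "", line)  # drop the "Key: " prefix
                print line
                exit
            }
        }
    '
}

# Demonstration using the sample output from the T61 above.
printf 'BIOS Information\nVendor: LENOVO\nVersion: 7LET52WW (1.22 )\nRelease Date: 08/27/2007\n' |
    bios_field Version
```

On a live system, run dmidecode | bios_field Version as root. dmidecode also has a -s option that prints single strings such as bios-version; check the man page on your system.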
To examine the driver and firmware versions of your Ethernet adapter, run
ethtool:
# ethtool -i eth0
driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.3-0
Conclusion
There are thousands of tricks you can learn from someone who's an expert
at the command line. The best ways to learn are to:
- Work with others. Share screen sessions and watch how others
work—you'll see new approaches to doing things. You may need to swallow
your pride and let other people drive, but often you can learn a lot.
- Read the man pages. Seriously; reading man pages, even on commands you
know like the back of your hand, can provide amazing insights. For
example, did you know you can do network programming with
awk?
- Solve problems. As the system administrator, you are always solving
problems whether they are created by you or by others. This is called
experience, and experience makes you better and more efficient.
I hope at least one of these tricks helped you learn something you didn't
know. Essential tricks like these make you more efficient and add to your
experience, but most importantly, tricks give you more free time to do
more interesting things, like playing video games. And the best
administrators are lazy because they don't like to work. They find the
fastest way to do a task and finish it quickly so they can continue in their
lazy pursuits.
About the author
Vallard Benincosa is a lazy Linux Certified IT professional working for
the IBM Linux Clusters team. He lives in Portland, OR, with his wife and
two kids.