System Administration

Advanced Linux system administration

Summary
Derek Balling reviews some steps to advanced system administration that he learned at the O'Reilly Open Source Software Convention. (3,500 words)
By Derek Balling at Linuxworld.com

Bryan Andregg, the senior systems administrator and network administrator for Red Hat Software, is probably in a unique position to give a class on Linux system administration. As the copy from the O'Reilly Open Source Software Convention's Linux conference notes reads, "Mr. Andregg has probably made every mistake in the book and is happy to share them so that others don't do the same." At last month's O'Reilly conference, Andregg gave a full-day seminar on the ins and outs of Linux system administration, from the simple to the decidedly advanced. After all, a problem that may be elementary to one person may be hideously complex to another, only because the latter person hasn't yet had to deal with it, hasn't yet had to think about it, or never really twisted it around in a manner that makes it easy to tackle. Andregg assumed nothing about the skill level of his students, and in so doing was able to ease into some very complex issues for Linux administration.

After putting some simple issues aside, the first thing Andregg really dove into was the Filesystem Hierarchy Standard, or FHS. The FHS is a standard directory structure that was designed in the hope that all Linux distributions would put their binaries, libraries, configuration files, etc. in the same place. In this way, installed software would always know where to find the things it was looking for, and where to install itself correctly.

Andregg went through the list of directories defined by the FHS and described what should be in each directory and subdirectory thereof. For example, Andregg described the differences between /bin and /sbin, /usr/bin/ and /usr/sbin, etc. For some, this may seem like a very silly thing to have to describe, but for others -- possibly new to Linux administration or fresh from formatting over a Windows NT installation -- the differences are perhaps more subtle and necessitate explanation.

The next important topic was hard disk partitioning. This is an important issue that generates a variety of opinions; if you get five system administrators in a room together and ask them to describe their preferred partitioning scheme for a server and a workstation, chances are that you will have five distinctly different methodologies for each of the two systems. Andregg's partitioning scheme for servers is fairly consistent with what many people would recommend: assign separate partitions to /, /home, /var/log, /usr, /var, and to any server directories that are needed (such as /home/httpd, /home/ftp, /var/spool/news, etc.). The client partitioning scheme Andregg presents notably lacks a /home partition. Andregg, like many others, firmly believes that the user's home directory should be stored on a centralized server and mounted remotely by the client workstation. He recommends partitions for /, /usr, /opt, /var, /usr/src, and a swap partition on client machines.

This scheme allows, in the case of /home, all the user-owned files to be completely separate from the operating system. If a complete reinstall of the OS were for some reason required, the user's files would be intact and wouldn't even have to be touched. /var/log is made into its own partition because logs tend to rack up fast, especially if something interesting is happening. By moving the logs to their own partition, you can eliminate the chance of a potential log overflow affecting other applications. /var, by definition, is supposed to be its own mountpoint, because it is variable in size. This is where things like queues and spools are stored. Since their size is unpredictable in nature, it is best to move them to a partition where they cannot affect other applications. /usr's mountpoint tends to be just something of a personal thing, even though many do it. Ostensibly, it is there to allow the root partition to be as close to exact in size as it needs to be, with all the room for growth being in /usr, on its own partition.

Both the FHS and disk partitioning schemes have separate /var directories so that the files in /var/log don't fill your system partitions and cause problems. Despite having all that room to work with, your log files still will get large and need to be rotated, with old data removed to make room for new data. Linux systems -- servers especially -- can generate copious amounts of logging data, and one somewhat daunting task administrators face is how to sift through the large amount of noise in the logs looking for the important lines.

Rolling over the log
Andregg has created an application called Logrok (see the Resources section below for links to this and other resources mentioned in this article) which is designed to make life easier when it comes to getting that useful info from the logs. Logrok can be told to look for certain patterns, ignore others, and report the results to any number of people. One important thing to remember when using any log-parsing utility like this, though, is that you need to ease into it. Filter nothing, and then find things you can safely ignore. If you just tell Logrok (or any other log-parsing utility) the things you want to see and tell it to ignore everything else, you will undoubtedly miss errors you didn't know you needed to watch for. It is far more useful to tell it specific things to ignore, and then look for other things you can ignore in the subsequent reports it generates for you. This is more of a trial-and-error process and may take a bit longer, but it gives the administrator a far greater chance to see the problems he or she needs to see, instead of having them shunted aside into /dev/null by an overzealous filter.
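
The ignore-first approach can be sketched in a few lines of Python. This is only an illustration of the idea, not Logrok's configuration syntax or behavior; the function name, the sample log lines, and the patterns are all invented for the example.

```python
import re

def unignored_lines(log_lines, ignore_patterns):
    """Return every log line that matches none of the ignore patterns."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [line for line in log_lines
            if not any(rx.search(line) for rx in compiled)]

# Hypothetical sample data, not real Logrok input.
sample_log = [
    "Sep 01 09:00:01 host sshd[411]: session opened for user derek",
    "Sep 01 09:00:07 host kernel: EXT2-fs error (device sda5): bad block",
    "Sep 01 09:01:00 host CROND[502]: (root) CMD (run-parts /etc/cron.hourly)",
]

# Start with an empty list, then add patterns only for lines known to be noise.
ignore = [r"sshd\[\d+\]: session opened", r"CROND\[\d+\]"]

report = unignored_lines(sample_log, ignore)
for line in report:
    print(line)
```

Because the filter only removes what it has been told is harmless, an unexpected line such as the kernel error still reaches the report.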

Upgrades and installation management
No session about Linux system administration would be complete without a discussion of package managers. Admittedly, Andregg's discussion was to a certain degree centered specifically on the Red Hat package manager (RPM). But given that Red Hat is his employer, that's understandable: RPMs are what he deals with every day, and what he's most familiar with.

Package managers of any kind can have their functions broken down into a few simple actions: install, erase, upgrade, build, query, and verify.

During installation, a package manager selects the installation medium (disk, FTP, Web, etc.), then tries to find conflicts. A conflict can occur if, for instance, the new package wants to own files that are already owned by another package. For example, if package foo were installed and it owned the file /etc/foo.rc, and package bar also wanted to own file /etc/foo.rc, there would be a conflict when the bar package is installed. Services can also be in conflict. For instance, the Sendmail package could tell the system that it is providing SMTP. If, later, the administrator attempted to install qmail via the package manager, it would note that it wanted to provide SMTP, but that Sendmail was already doing so. Even if the packages did not share common files, the package manager would know that the system could not have two SMTP daemons and it would flag this as a conflict. At that point, the package manager wouldn't actually perform the installation; instead, it would generate an error and advise the administrator to either correct the problem (usually by removing one of the conflicting applications), or allow it to be forced or overridden.
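
The two kinds of conflict described above can be modeled with a short Python sketch. It follows the article's simplified description rather than RPM's actual dependency logic; the package dictionaries, the "provides" field, and the qmail file path are assumptions made for the example.

```python
def find_conflicts(installed, candidate):
    """List the reasons a candidate package cannot be installed cleanly."""
    conflicts = []
    for pkg in installed:
        # A file may be owned by only one package.
        for path in sorted(set(pkg["files"]) & set(candidate["files"])):
            conflicts.append(f"file {path} is already owned by {pkg['name']}")
        # In this simplified model, a service may be provided only once.
        for service in sorted(set(pkg["provides"]) & set(candidate["provides"])):
            conflicts.append(f"{service} is already provided by {pkg['name']}")
    return conflicts

# Hypothetical package database.
installed = [
    {"name": "foo", "files": ["/etc/foo.rc"], "provides": []},
    {"name": "sendmail", "files": ["/usr/sbin/sendmail"], "provides": ["smtp"]},
]
bar = {"name": "bar", "files": ["/etc/foo.rc"], "provides": []}
qmail = {"name": "qmail", "files": ["/var/qmail/bin/qmail-smtpd"], "provides": ["smtp"]}

for candidate in (bar, qmail):
    for reason in find_conflicts(installed, candidate):
        print(f"cannot install {candidate['name']}: {reason}")
```

An empty result means the installation can proceed; anything else is reported to the administrator, who can fix the conflict or force the install.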

When the time comes to uninstall a package, the system must also compare the files that are found on the system with the MD5 checksums of the files it originally installed. If the user changed any files, such as configuration files, mail spools, etc., the package manager will note this to the user and preserve the changed files, forcing the user to delete them manually. This is a safety measure used to ensure that user data is never accidentally destroyed, unless it is the stock data which can easily be replaced from the original package.
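
A minimal sketch of that checksum comparison follows, using Python's standard hashlib module. File contents are held in memory so the example is self-contained; the function names, paths, and file contents are hypothetical, and a real package manager would read the files from disk and consult its own database.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of some file contents as a hex string."""
    return hashlib.md5(data).hexdigest()

def plan_removal(recorded_md5, current_contents):
    """Split a package's files into those safe to delete and those to preserve.

    recorded_md5: path -> hex digest recorded at install time
    current_contents: path -> bytes currently on disk (absent if already gone)
    """
    delete, preserve = [], []
    for path, digest in sorted(recorded_md5.items()):
        data = current_contents.get(path)
        if data is None:
            continue  # nothing left to remove
        if md5_hex(data) == digest:
            delete.append(path)    # unchanged stock file
        else:
            preserve.append(path)  # modified since install
    return delete, preserve

# Hypothetical package contents at install time and now.
recorded = {
    "/etc/foo.rc": md5_hex(b"relay=localhost\n"),
    "/usr/bin/foo": md5_hex(b"stock binary"),
}
on_disk = {
    "/etc/foo.rc": b"relay=mail.example.com\n",  # edited by the administrator
    "/usr/bin/foo": b"stock binary",
}

to_delete, to_preserve = plan_removal(recorded, on_disk)
print("delete:", to_delete)
print("preserve:", to_preserve)
```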

Password authentication with PAM
Another thing Andregg touched on was the subject of pluggable authentication modules (PAM). PAM was designed as an answer to a problem many administrators have faced when converting from storing passwords in /etc/passwd to using shadow passwords. In a PAM-less world, any application which tries to read the password (screen locks, user maintenance utilities, etc.) would first need to be modified to use the shadow password and then recompiled. PAM provides a different access method, allowing PAM modules to be installed for password files, SecurID cards, or just about any authentication scheme you can think of. This allows applications to simply query PAM for authentication information, and PAM then handles the problem of comparing that data to what is acceptable. Using PAM also assists in deployment of alternate authentication technologies, such as SecurID cards or other token-based systems.
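
The indirection PAM provides can be illustrated with a toy model in Python. This is not the PAM API or its configuration format; the module type, the backends, and the credentials are invented, and secrets are compared in plain text only to keep the sketch short.

```python
from typing import Callable, Dict, List

# In this toy model, an authentication module is any callable
# taking (user, secret) and returning True or False.
AuthModule = Callable[[str, str], bool]

def table_module(table: Dict[str, str]) -> AuthModule:
    """Build a module that checks credentials against an in-memory table."""
    def check(user: str, secret: str) -> bool:
        return user in table and table[user] == secret
    return check

def authenticate(stack: List[AuthModule], user: str, secret: str) -> bool:
    """Succeed if any module in the configured stack accepts the credentials."""
    return any(module(user, secret) for module in stack)

def unlock_screen(stack: List[AuthModule], user: str, secret: str) -> str:
    """A stand-in application: it knows nothing about where secrets live."""
    return "unlocked" if authenticate(stack, user, secret) else "denied"

# Hypothetical backends standing in for a password file and a token system.
password_backend = table_module({"derek": "hunter2"})
token_backend = table_module({"derek": "492817"})

before = [password_backend]
after = [password_backend, token_backend]

print(unlock_screen(before, "derek", "492817"))  # token not yet accepted
print(unlock_screen(after, "derek", "492817"))   # accepted once the module is added
```

The point of the sketch is that unlock_screen never changes: adding a new authentication scheme means changing the stack, not recompiling the application.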

Scaling admin tasks: Authentication and configuration
Scaling Linux systems to 1,000 users is a radically different situation than scaling up to meet the needs of 20 or 30 users. Authentication over large password files can be cumbersome, especially if, as Andregg puts it, it's 9:00 a.m. and everyone is coming to their desks and signing on at the same time. For this, he recommends a distributed architecture of Network Information System (NIS, formerly known as Yellow Pages) servers. In this manner, one master server might have several slave servers assigned to it, dividing the authentication workload among them.

One problem that can crop up in large installations is that the sheer quantity of home directories will cause performance degradation. Due to the way Unix files and directories are stored, accessing a single directory among a thousand will cause the entire list of directories to be read. Depending on the operation being performed, this can have a severe negative impact on performance.

Taking into account the needs of your environment, the problem might be easily solved by something as simple as breaking up the home directories by user ID. For example, you might set up 26 top-level home directories and move /home/dballing and /home/derek into /home/d/dballing and /home/d/derek respectively. Another possible solution to the problem is to set up separate servers for different departments' needs. Perhaps the sales force can be moved to one server, and the developers to another, causing the number of directories to be reduced to something easily manageable. It's possible that some combination of the two will be the best answer for you.
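
The first-letter layout is easy to express as a function. The sketch below assumes login names are used as directory names and that the shard is simply the lowercased first character; the function name is made up for the example.

```python
import posixpath

def sharded_home(username: str, base: str = "/home") -> str:
    """Map a login name to a home directory bucketed by its first letter."""
    if not username:
        raise ValueError("username must not be empty")
    return posixpath.join(base, username[0].lower(), username)

for user in ("dballing", "derek"):
    print(user, "->", sharded_home(user))
```

With 26 buckets, a thousand accounts spread evenly would leave roughly 40 entries per directory, although real login names are rarely distributed that evenly.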

With 1,000 users come 1,000 workstations -- all of which need to be configured. Andregg asserts that workstations should be generic. After all, your home directory is stored on a file server somewhere; why should your actual workstation be configured any differently? Andregg's claim is rooted in the presumption that, if a workstation machine dies, the administrator should be able to replace it silently with an identically configured workstation from a standby pool, and the user shouldn't be able to tell the difference. To this end, he recommends Kickstart, a utility for creating a floppy disk which will contain a predefined installation configuration, and installs that configuration onto a machine. The installation can be configured to obtain its IP address automatically via the network, and to retrieve the installation files it needs via the local network -- eliminating the need to run around the office with a CD-ROM.

Once you have the system installed, the next challenge is to decide how you want to mount your network devices, dynamically or statically. With dynamic automounting, you are locked into the principle of not committing resources until you demand them. Automounting network drives also makes the task of dealing with missing mounts (when the network drive you've mounted is inaccessible due to network problems) easier. One disadvantage to automounting is that tab-completion will not work on network drives unless the network device is actually hard mounted.

Perhaps what I found most amusing, as a mail administrator myself, was Andregg's very simple m4 file for creating a workstation's Sendmail configuration:

include(`/usr/lib/sendmail-cf/m4/cf.m4')
FEATURE(nocanonify)
FEATURE(nullclient,`mail.redhat.com')

Using this configuration, the local workstation, when given mail to handle, will do absolutely nothing except forward it on to mail.redhat.com (or any other local mail server you configure there) for that machine to handle. It was nice to see someone else who had the exact same theory on how client workstations should behave when it comes to mail, since that config is virtually identical to what we use at my job.

Keeping workstation configurations in sync
Once you have the client workstations deployed, how do you keep them in sync with each other? Andregg's answer is simple: the Concurrent Versions System, or CVS. You can store the configuration files for a generic workstation in a CVS repository, and configure the client machines so that they resync to the CVS repository on a regular basis. In this way, even if your users decide they want to tinker with their configurations, the changes will get overwritten by the data from the CVS repository unless the user can convince the CVS administrator to check in the change.

While this may seem a bit harsh to users, imagine how cruel it would be to force administrators to worry about 1,000, 2,000, or 10,000 different configurations on machines. Users must be instructed to make their preferences known in their /home directory as per-user changes rather than as global defaults. In this way, a user can get her custom settings even as her box remains identical to everyone else's for ease of administration.

There are times when it might become necessary to take a more active role in file distribution, perhaps pushing configuration changes out to the clients rather than waiting passively for the clients to reboot and update themselves. For this, Andregg recommends rdist or rsync, which allow the server to push files out to the client workstations. There are important security considerations to dwell upon here, however, because using these utilities means allowing remote root logins to the client workstations (since only root will be able to maintain many of the files you want to push out).

To ensure that client configuration files remain the same, the next step is to configure the only thing that must be different from workstation to workstation: the IP address. Thankfully, this is very easily done these days, using the Dynamic Host Configuration Protocol (DHCP). DHCP allows the client workstation to, upon booting, send a broadcast message to the network, asking essentially, "Who am I and how do I talk to the world?" A DHCP server on that network would then respond to the workstation, dynamically assigning it an IP address, gateway, DNS servers, etc. DHCP also has advantages insofar as it makes the sometimes necessary and painful process of readdressing (from one ISP to another, for example) into a simple task. You simply tell the DHCP server to stop assigning addresses in the old address space, give it ranges in your new address space, and wait for machines to reboot or for the leases on their assigned IP addresses to expire.

Playing it safe: Security
The next problem an administrator may face is that of the road warrior, the travelling salesman or executive on the road. How do you allow her or him to access the internal networks and still make sure that no one can read confidential data going to and coming from the user's laptop? Andregg suggests Crypto IP Encapsulation (CIPE), which creates a virtual private network (VPN) between two remote networks by encrypting packets on the way out and wrapping them inside another IP packet, which will be decrypted on the other end.

To protect the company network from the outside world, Andregg recommends using Linux's native firewalling code (via either the older ipfwadm system, or the IP Chains included in the 2.2 kernels and beyond). IP firewalling allows the user to create highly configurable chains, or steps, to which each inbound (or outbound) packet will be subjected. The packets can be compared for their contents, their source or destination ports, their eventual destination, or any of a number of other criteria. After a packet has been identified, it can be accepted, rejected, modified, passed to another chain of rules to be processed, or just dropped on the floor and forgotten. Using IP firewalling allows the Linux administrator unlimited control over which machines can access which services and hosts. Using the IP masquerading features, machines on the internal network could be given address space that is not routable on the public Internet (and yet they could still access the Web). Direct attacks on these machines would then be impossible, because a remote user couldn't directly address a packet to them.
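
The way a packet travels through chains of rules can be sketched as a small Python model. The verdict names are borrowed from ipchains, but this is a toy evaluator, not the kernel's implementation; the rule format, chain names, addresses, and default policy are assumptions made for the example.

```python
VERDICTS = {"ACCEPT", "REJECT", "DENY"}

def matches(criteria, packet):
    """A rule matches when every listed field equals the packet's value."""
    return all(packet.get(field) == value for field, value in criteria.items())

def walk(chains, name, packet):
    """Return the first verdict reached in a chain, or None if it falls through."""
    for criteria, target in chains[name]:
        if not matches(criteria, packet):
            continue
        if target in VERDICTS:
            return target
        # Otherwise the target names another chain: jump into it, and
        # resume with the next rule here if that chain reaches no verdict.
        verdict = walk(chains, target, packet)
        if verdict is not None:
            return verdict
    return None

def filter_packet(chains, packet, start="input", policy="DENY"):
    """Apply the starting chain, falling back to its default policy."""
    verdict = walk(chains, start, packet)
    return verdict if verdict is not None else policy

# Hypothetical rule set: (criteria, target) pairs per chain.
chains = {
    "input": [
        ({"src": "10.0.0.5"}, "ACCEPT"),
        ({"proto": "tcp"}, "tcp-services"),
    ],
    "tcp-services": [
        ({"dport": 25}, "ACCEPT"),
        ({"dport": 23}, "REJECT"),
    ],
}

packet = {"src": "192.0.2.9", "proto": "tcp", "dport": 23}
print(filter_packet(chains, packet))
```

Rule order matters in this model: the first matching rule that yields a verdict wins, and anything that matches nothing is handled by the default policy.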

RAID
The last, often confusing, issue Andregg tackled was that of RAID storage on Linux. He described the many different types of RAID systems and their respective virtues and costs, gave examples of configuration files which would represent the various levels of RAID storage, and described what kernel options need to be enabled at compile time so that the kernel could actually handle the RAID devices that were being created.

Table 1. Various forms of RAID compared
RAID type: Linear
Description: Concatenation of several disk partitions to create a single large partition.
Benefits: Allows the user to add more space to a partition as needed by adding partitions to the linear array.
Downsides: Not the most efficient use of disk space, even when the benefits of ext2fs are thrown into the mix. No data integrity benefit over standard hard disk configuration.

RAID type: RAID 0
Description: Similar to Linear, except that partitions are striped, with some data being stored on each partition.
Benefits: Improves performance because disk access is distributed among several partitions.
Downsides: No data integrity benefit. Unable to grow the logical partition beyond its original designated size.

RAID type: RAID 1
Description: Mirroring. An identical workspace is created where any writes to one logical volume are also performed on the second logical volume.
Benefits: Data integrity. If any drive in the first logical volume fails, the second logical volume will assume control. When the first volume is repaired or replaced, it will be rebuilt from the secondary.
Downsides: Tremendous disk capacity issues. You need twice the space you actually use, so that an identical mirror can be created.

RAID type: RAID 0+1
Description: Combines striping and mirroring. Takes your drives and creates a RAID 0 array out of them. Then takes that and mirrors it to another drive array.
Benefits: Provides both the performance boost of striping and the data integrity of mirroring.
Downsides: Same disk capacity issues as RAID 1.

RAID type: RAID 4
Description: Block interleaved parity. Data is striped across a drive array, with the last drive in the array containing checksums of the data on the previous drives. If any drive dies, the missing data can be extrapolated from the remaining data plus the checksum. Allows the option of spare disks, which would allow the RAID system to notify you of a deceased drive and then immediately start rebuilding it on the fly so that there is decreased chance of another failure causing data loss.
Benefits: Provides both striping performance and data integrity.
Downsides: One drive array is sacrificed to data integrity, others to (optional) hot spares.

RAID type: RAID 5
Description: Block interleaved distributed parity. The same as RAID 4, except that the parity checksums don't all reside on one drive array.
Benefits: Same benefits as RAID 4, plus removes some I/O bottlenecking on the parity disk array.
Downsides: Same as RAID 4.
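
The parity used by RAID 4 and RAID 5 is a bytewise XOR of the data blocks, which is why one missing block can be recomputed from the survivors. The Python sketch below shows the arithmetic on three tiny in-memory "drives"; the block contents and the helper name are invented for the example, and it says nothing about how the kernel's RAID driver lays data out on disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per drive, plus a parity block.
data = [b"LINU", b"XRAI", b"D4!!"]
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the surviving data with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same calculation works whichever single block is lost, including the parity block itself, but it cannot recover from two simultaneous failures.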

Better performance through tweaking
Before he ended the class, Andregg touched very briefly on a couple of ways that administrators can tweak their systems for better performance. For disk space, he recommended a number of ideas that would be beneficial.

Because of the way the ext2fs filesystem works, there are generally a number of backup copies of the superblock stored throughout the drive. However, on a large drive, you may end up committing far more space than you either want or need to those backups. If the first twenty backups of the superblock fail, chances are the rest aren't going to help you much anyway. For large arrays, he recommends using the sparse-super flag when creating the ext2fs filesystem.
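
To give a feel for the savings: with the sparse superblock feature, ext2 keeps superblock copies only in block groups 0 and 1 and in groups whose numbers are powers of 3, 5, or 7, instead of in every group. The Python sketch below simply counts them; the function names and the figure of 100 block groups are assumptions chosen for the example.

```python
def is_power_of(n, base):
    """True when n equals base raised to some non-negative integer power."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1

def sparse_superblock_groups(total_groups):
    """Block groups that hold a superblock copy when sparse superblocks are on."""
    return [g for g in range(total_groups)
            if g in (0, 1) or any(is_power_of(g, b) for b in (3, 5, 7))]

# Hypothetical filesystem with 100 block groups.
groups = sparse_superblock_groups(100)
print(groups)
print(len(groups), "copies instead of", 100)
```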

One other thing you can do to your disks to help keep the system running at peak performance is to change the reserved blocks values. By default, ext2fs reserves a certain amount of the drive for the root (or other) users. This is either a set number of blocks or a percentage of the drive. On a large drive, it may be completely unnecessary to reserve potentially hundreds of megabytes as breathing room for the root user. Using the tune2fs utility, an administrator can reduce this reserved space to something he or she feels more comfortable with.
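
The arithmetic is worth doing before changing anything. The sketch below assumes the usual ext2 default of 5 percent reserved and a made-up filesystem of 2,000,000 blocks of 4,096 bytes; the function name is hypothetical.

```python
def reserved_bytes(total_blocks, block_size, reserved_percent):
    """Space held back for root, given a whole-number reserved percentage."""
    return total_blocks * block_size * reserved_percent // 100

# Hypothetical filesystem: 2,000,000 blocks of 4,096 bytes (about 8.2 GB).
for percent in (5, 1):
    held_back = reserved_bytes(2_000_000, 4096, percent)
    print(f"{percent}% reserved = {held_back / 2**20:.1f} MiB")
```

On a filesystem of this size, dropping the reservation from 5 percent to 1 percent frees a few hundred megabytes for ordinary users; tune2fs accepts the new percentage through its -m option.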

Andregg has given this talk at a number of conferences, including LISA and USENIX. Judging by the response the attendees gave him, I suspect he will continue to keep the tutorial current and present it, or its descendant, at future conferences. Even if you are an experienced sysadmin, I can firmly recommend his talk -- Andregg's presentation is top-notch, he will keep you interested even while discussing those topics that you already know, and you will probably pick up any number of new tidbits.


About the author
Derek Balling works for Yahoo! in Santa Clara, CA. He is also the author of the open source monitoring project Pong3. When he can find spare time, you will often find him ignoring the real world and playing Half-Life far more than any person ought to be allowed to.






Resources Additional resources:

URL: http://www.linuxworld.com/lw-1999-09/lw-09-admin.html
Last modified: Monday, October 04, 1999