Business Automation Bulletin 95.5 / Published Bimonthly / April/May 1995


I KNOW I LEFT MY DATA HERE SOMEWHERE

The Lowdown on Data Storage


This Bulletin is the fourth in a series covering new developments in computer technology and their value to business. The first two examined computers and their components (i.e., chips, RAM, disks, monitors, etc.) and the third was about printers. This one reviews data storage: not PC disk drives, which were discussed previously, but the high-capacity storage systems used in networks and firm-wide computer systems. Future installments will cover communications, networking and computer/telephone integration.

There are three types of storage systems to consider: magnetic disks, optical disks and tape. Each has an important place in business computing, as described below.

MAGNETIC DISK SYSTEMS

Magnetic disk drives have been the staple of all computer storage systems for 25 years, and this won't change any time soon. Some experts used to think other technologies might one day supplant magnetic disks, but this never happened because disk drives improved so rapidly. Although nearly every other part of the computer has seen revolutionary change, storage has evolved instead. In short, disk drives just keep going on, getting bigger, faster and more reliable, while their price keeps coming down.

This rapid cost/performance gain has had a major impact on computing. Beyond just hardware, it has fueled a major change in software. Disk space is so cheap that software developers now give little, if any, consideration to how much disk capacity their programs consume. There's one small drawback: disk speeds have improved, but not as much as cost and capacity. There are ways to compensate for this lag in speed, however, as described in the following sections.

System storage vs. PC storage

Clearly, the most efficient way to use disk capacity is to store all the shared data centrally, rather than having every user keep a copy on his or her own PC. In fact, this is the most critical function of a "file server" on a network, or the central computer in a multi-user system. Besides efficiency, another benefit of central storage is the ease and simplicity of backing up the data files centrally.

Network buyers are sometimes concerned that it'll take longer to access centrally stored data than to access data on the users' PCs, but such fears are seldom justified. File servers usually have the fastest disk interfaces (called "fast wide SCSI") and are designed to keep the most frequently used data instantly available in RAM. That way it doesn't have to be read repeatedly from disk, which takes approximately 1000 times as long. PCs use this technique too (called "caching"), but not as efficiently as a well-designed server. In addition, many new central storage systems use a technique called "RAID" (see below) that dramatically improves retrieval speed. As a result, retrieving data from a network file server is usually faster (even counting the transmission between the server and the user) than retrieving it from the disks on the users' PCs.
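To make the caching idea concrete, here is a minimal sketch in Python. It assumes a "least recently used" eviction rule, which is one common policy and not necessarily what any particular server uses. The class name, the cost constants and the block numbers are all made up for illustration; only the rough 1000-to-1 ratio comes from the text above.

```python
from collections import OrderedDict

RAM_COST = 1       # relative time for a read served from RAM
DISK_COST = 1000   # rough figure from the text: disk takes ~1000 times as long


class BlockCache:
    """Hypothetical cache that keeps the most recently used blocks in RAM."""

    def __init__(self, capacity):
        self.capacity = capacity      # number of blocks that fit in RAM (>= 1)
        self.blocks = OrderedDict()   # block_id -> data, oldest first
        self.elapsed = 0              # total relative time spent reading

    def read(self, block_id):
        if block_id in self.blocks:
            # Cache hit: serve from RAM and mark the block as recently used.
            self.blocks.move_to_end(block_id)
            self.elapsed += RAM_COST
            return self.blocks[block_id]
        # Cache miss: pay for a (simulated) disk read, then remember the block.
        self.elapsed += DISK_COST
        data = f"data-{block_id}"
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return data


cache = BlockCache(capacity=2)
for block in [1, 2, 1, 1, 2, 1]:
    cache.read(block)
print(cache.elapsed)  # 2004 here, versus 6000 if every read went to disk
```

Only the first read of each block pays the disk price. The more often users ask for the same data, the bigger the payoff, which is why a busy shared server tends to benefit more from caching than a single PC.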

RAID: What it is and why it helps

RAID stands for "Redundant Array of Inexpensive Disks" and it's the storage technology of the future. RAID systems have multiple disk drives (usually five or more), arranged so that the data is spread across several of them. That way, when retrieving a block of data, the RAID reads its parts simultaneously, which is much faster than reading from a single drive.
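The spreading of data across drives (often called "striping") can be sketched as follows. This is an illustration only: the function names and the 4-byte chunk size are assumptions, and a real RAID controller does this in hardware with far larger chunks.

```python
def stripe(data, n_drives, chunk=4):
    """Deal fixed-size chunks of data round-robin across n_drives."""
    drives = [[] for _ in range(n_drives)]
    for i in range(0, len(data), chunk):
        drives[(i // chunk) % n_drives].append(data[i:i + chunk])
    return drives


def unstripe(drives):
    """Reassemble the original data by reading the drives in rotation."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for d in drives:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


data = b"ABCDEFGHIJKLMNOPQRST"
drives = stripe(data, n_drives=4)
assert unstripe(drives) == data
```

In this sketch each "drive" is just a list. The speed gain in a real array comes from the fact that all the drives can fetch their chunks at the same moment, so a four-drive stripe can deliver a large block in roughly a quarter of the time.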

RAID storage systems have another major benefit too: reliability. They spread the data across several drives, as described above, but they reserve one drive for redundancy. This drive holds special "check" values computed from the live data on the other drives. Then, if one of the drives with live data crashes, the RAID keeps going by reading the "check" values and undoing the prior calculation "on the fly" to reconstruct the missing data from the drive that crashed. Also, most RAID systems are "hot-swappable", which means that a failed drive can be replaced (and all its data can be reconstructed) while the RAID system keeps on running.
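The text doesn't say how the "check" values are computed. A common choice in parity-based RAID is the exclusive-OR (XOR) of the data drives, and the sketch below assumes that scheme. The drive contents and the function name are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding one 4-byte block.
data_drives = [b"INVO", b"ICES", b"1995"]

# The redundancy drive stores the XOR of all the data drives (assumed scheme).
check_drive = xor_blocks(data_drives)

# Simulate the crash of the second data drive: XOR the survivors with the
# check values and the missing block falls out.
survivors = [data_drives[0], data_drives[2], check_drive]
rebuilt = xor_blocks(survivors)
assert rebuilt == b"ICES"
```

XOR works here because applying it twice with the same value cancels out. That is the "undoing the prior calculation" step, and it is cheap enough to do while the system keeps serving requests.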

The lone drawback of RAID storage is cost. Someday RAIDs will cost little more than single- or dual-drive systems with the same capacity. For now, however, there's enough of a price differential (e.g., a 10-gigabyte RAID might cost 20% to 50% more than two 5-GB drives) that buyers really have to need the speed or the increased reliability to justify it.

OPTICAL STORAGE

There are two main types of optical storage, CD-ROM and magneto-optical. They don't have the speed or capacity of magnetic disks, but both have a much lower cost per megabyte of storage space.

CD-ROMs

Most CD-ROMs used for business today (as opposed to consumer titles) are prerecorded reference volumes (such as legal or school libraries). Because reference material is consulted only sporadically, it's usually most cost-effective to configure the CD-ROMs just like normal network disk drives. That way everyone can share access to them, rather than giving each user his own CD-ROM drive and passing disks around as they're needed. Some combination of permanent drives (mounted with the most popular disks) plus a "CD-ROM jukebox" (holding less frequently used titles) is the best configuration for most firms. Jukeboxes hold from 6 to 50 individual disks and can "play" up to three individual CD-ROMs at a time.

CD-ROM drives have varying speeds, with the fastest (so far) able to transfer data from the drive to the server at nearly one megabyte per second. This is called 6X speed because it's six times as fast as the original CD-ROM drives. The slowest drives still available, 2X, are fine for most reference use, but 4X drives (mandatory for graphics and video) don't cost much more and are preferred for network use.
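The "X" ratings translate into transfer times with simple arithmetic. The sketch below assumes the usual 150 kilobytes per second for an original (1X) drive, a figure that does not appear in the text above, and it ignores seek time and network overhead. The function names are illustrative.

```python
BASE_KB_PER_SEC = 150  # assumed 1X CD-ROM rate; not stated in the Bulletin


def transfer_rate_kb(speed_multiple):
    """Sustained transfer rate in KB/s for a 2X, 4X, 6X... drive."""
    return BASE_KB_PER_SEC * speed_multiple


def seconds_to_read(size_kb, speed_multiple):
    """Best-case time to read size_kb kilobytes, ignoring seeks."""
    return size_kb / transfer_rate_kb(speed_multiple)


for speed in (2, 4, 6):
    rate = transfer_rate_kb(speed)
    secs = seconds_to_read(9000, speed)
    print(f"{speed}X: {rate} KB/s, {secs:.1f} s for a 9000 KB file")
```

On these assumptions a 6X drive moves 900 KB per second, which squares with the "nearly one megabyte per second" figure, while a 2X drive takes three times as long for the same file.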

Magneto-optical

Magneto-optical disks, like CD-ROMs, have more storage capacity and lower prices than magnetic disks. The difference is that they're re-recordable, just like magnetic media. Unlike CD-ROMs, however, magneto-optical disks come in a carrier, and they can't be read by the same drives.

The primary application for magneto-optical disks is data archiving. They're best for situations where occasional rapid look-ups are needed, making tape impractical. [Note: Affordably priced recordable CD-ROM drives have recently come out. They can also be used for archiving but, unlike magneto-optical disks, they can only be recorded on once, not re-recorded and overwritten like magnetic disks.] Magneto-opticals are configured just like CD-ROMs, as either single drives or jukeboxes, and can be accessed in the same way.

TAPE STORAGE

As has always been true, magnetic tape is the least expensive way to store data. That's why it's still the most common archival medium. Tape's biggest drawback is that it's "sequential", which means tapes have to be read from front to back. You can't skip around picking data from all over them like you can with disks. Tapes also degrade over time, depending on the environment they're stored in. This can make it difficult--or impossible--to read data stored on old tapes.

Today, most tape drives use cartridges instead of the bulky old reel-to-reel tapes. And, yes, there are "tape storage systems" available for them too, like jukeboxes for CD-ROMs, only much slower.

STORAGE OPTIMIZATION

The "hottest" concept in data storage today is an automated archiving procedure called Hierarchical Storage Management (HSM). With HSM, a software program tracks the usage of all your data files and optimizes where they are stored. Frequently used files are kept on magnetic disk for instant retrieval. Seldom-used data is moved to optical disk, where storage is cheaper, but it's still readily available. Data that's rarely used is moved automatically once again . . . to a tape.

There are two really innovative parts about this approach:

This approach is particularly useful for big networks (and "mainframes") because it takes over the burden of managing the storage, a very time-consuming task. This lets companies get away with a smaller network management staff. Be wary, however: HSM is new, still a bit flaky, and difficult to install properly.
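The tiering rule at the heart of HSM can be sketched in a few lines. This is a simplified illustration, not how any actual HSM product works: the 30-day and 365-day cutoffs, the file names and the function names are all hypothetical, since the text gives no specific thresholds.

```python
from datetime import date

# Hypothetical cutoffs; real HSM software would make these configurable.
DISK_DAYS = 30       # used within a month: keep on magnetic disk
OPTICAL_DAYS = 365   # used within a year: keep on optical disk


def choose_tier(last_used, today):
    """Pick a storage tier from the number of days since a file was last used."""
    idle_days = (today - last_used).days
    if idle_days <= DISK_DAYS:
        return "magnetic disk"
    if idle_days <= OPTICAL_DAYS:
        return "optical disk"
    return "tape"


def plan_migration(files, today):
    """files maps name -> (current tier, last-used date); returns needed moves."""
    moves = {}
    for name, (tier, last_used) in files.items():
        target = choose_tier(last_used, today)
        if target != tier:
            moves[name] = target
    return moves


files = {
    "payroll.dbf": ("magnetic disk", date(1995, 4, 20)),
    "q3_1994.wk1": ("magnetic disk", date(1994, 11, 1)),
    "taxes_1992.doc": ("optical disk", date(1993, 4, 15)),
}
moves = plan_migration(files, today=date(1995, 5, 1))
print(moves)
```

Run on a schedule, a rule like this moves idle files down the hierarchy without anyone having to decide file by file. A complete system would also bring a file back to magnetic disk when someone asks for it, which this sketch handles only in the sense that a freshly used file would be assigned to the disk tier on the next pass.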




Mail to: brooks@bizauto.com with any questions or comments
Copyright © 1995 Business Automation Associates, Inc.