March 25, 2012

Tape Recovery - How Tape Works

The use of tape for data storage and data recovery in the computer industry goes back many decades. Tape provided a solid and robust means of storing code and data, along with a far lower cost of ownership than the available hard disk options.

Today the cost of hard disk has plummeted, but tape storage is still considered to be the best available form of long-term archival storage in terms of price and resilience.

Those of us who were born long enough ago can remember treading gingerly, and speaking in low tones, when passing the floor of the computer building where the hard disks were housed. Disks were unreliable, low capacity and expensive to run, whereas recovering data from tapes was fast enough and likely to work.

The idea of "near-line" storage developed, and still exists in the world of mainframe, AS/400 and large-scale Unix computing. Years ago a request to recover a file would result in a message popping up on the computer operator's screen to fetch the open reel tape labelled Kv19473D and load it on drive 15. The data was recovered from the tape after only a short delay to the user.

These days the operator has been replaced by some form of robotic tape library, and the open reel tape by a tape cartridge that can be handled mechanically (for example IBM 3590 and TS1120, STK 9840, 9940 and T10000, and of course LTO Ultrium and DLT). This process developed into Hierarchical Storage Management, also called HSM, and allows for "infinite" storage (as infinite as you can afford space, tape drives and media).

With smaller systems, and this includes some systems that would look pretty large today such as the MicroVAX, there was a more procedural use of tape data storage. Partly this was due to the cost of robotic equipment, but mostly it was because the rise of the mini and micro computer coincided with the arrival of lower cost, more reliable hard disks and the idea of client/server computing. The daily tape backup became a source of data only required when a failure occurred, and a way to avoid requiring hard drive data recovery work.

Attempts at introducing HSM into this market, using intermediate storage such as optical disk and using tape for longer term archive, came and went throughout the 90's, but largely tape was used as a backup and recovery medium.

How Tape Storage differs from Disk

Setting aside the physical differences and the low-level recording technologies used, the general concepts are no different between magnetic tape and hard disk. Each uses magnetism to encode data on a suitably receptive recording medium.

The real differences are in implementation and usage, and reflect the major physical differences between the two.

The short answer to "what is the difference" is that disk is a random access medium and tape is a sequential one. To go into greater depth, disks are generally pre-formatted with a known number of recordable "sectors" whereas tape is written on-the-fly.

The sequential access nature of tape reflects its physical character: it is long and narrow, and to get to some data at the far end the drive has to traverse the length of the tape. With disk recording, to recover any recorded sector all the drive must do is position the read head to the right track and wait for the data to spin past. So it is an access time of small fractions of a second versus anything up to a couple of minutes; you wouldn't get far implementing random access on tape.

The issue of formatting though is far from clear. Early open reel tapes, Exabytes and quarter-inch cartridges (the older version of SLR often known as streamers) had erase mechanisms that cleaned the tape ahead of the write head so recording was always to blank tape.

The smaller quarter-inch cartridges, DC2000 and more recently Travan, ADR and Ditto, were formatted with sectors (usually during manufacture). The very first DC2000 drives ran from the diskette controller in a PC and operated like diskettes. So in principle they were random access, but the practical access time meant that few people would live long enough to use them in that manner for a significant volume of data.

Newer tape formats (SDLT, LTO Ultrium, 3590, 3570 and many others), whilst not being pre-formatted with data sectors, do have a lot of servo data written to them during manufacture, and if they are erased they become useless. This servo tracking data is used to assist in the data alignment process now that the recording densities have increased and there is little or no space left unused.

One, often unwelcome, feature of tape storage is the idea of "the last thing you wrote is the last thing you can recover". With a hard disk each sector is uniquely addressable. If data is written to sector 79 it has no impact upon sectors 78 and 80. With tape, as soon as recording finishes the drive determines that the last thing written is the new end-of-data. So if you have a tape containing 400GB and write 2MB to the start of it, there is just under 400GB sitting on the tape that cannot be accessed without recourse to a tape data recovery service.
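The end-of-data behaviour described above can be sketched as a toy model. This is only an illustration, not how any real drive firmware works; the class name SimpleTape and its methods are made up for the example.

```python
class SimpleTape:
    """Toy model of a tape: a list of blocks plus an end-of-data (EOD) position."""

    def __init__(self):
        self.blocks = []  # everything physically present on the tape
        self.eod = 0      # index just past the last block the drive will return

    def write(self, position, new_blocks):
        """Write blocks starting at `position`; EOD moves to the end of this write."""
        if position > self.eod:
            raise ValueError("cannot start writing beyond end-of-data")
        end = position + len(new_blocks)
        # Overwrite in place; older blocks beyond `end` stay on the tape untouched.
        self.blocks[position:end] = new_blocks
        self.eod = end

    def readable(self):
        """Blocks the drive will hand back through normal reads."""
        return self.blocks[:self.eod]

    def stranded(self):
        """Blocks still on the tape but sitting beyond the new end-of-data."""
        return self.blocks[self.eod:]


tape = SimpleTape()
tape.write(0, [f"old-{i}" for i in range(400)])  # a full backup
tape.write(0, ["new-0", "new-1"])                # a short write at the start
print(len(tape.readable()), len(tape.stranded()))
```

In this model the two overwritten blocks are gone for good, while the remaining old blocks are intact but out of reach of a normal read, which is the situation a recovery service is asked to deal with.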

In data recovery parlance this is over-writing or re-initialisation. Don't be fooled into thinking that there is any chance of getting back the data that has actually been over-written; that is the stuff of science fiction. The remaining but inaccessible data, however, can often be recovered from the tape.

The advantage that tape gives is that each file is normally stored contiguously and there are none of the frailties of file allocation tables involved when accessing the data.

This is all generally true, but there are no rules. Some tape recording formats (Legato Networker, NetBackup and ARCserve among them) take data from multiple sources and interleave it on the tape (sometimes known as multiplexing or multi-streaming). As said earlier there is nothing to stop the development of random-access tape, but the shape is wrong and it would never catch on.

There are, however, compromises. IBM 3570 and STK 9840 attempt to split the difference between the two styles of recording. They use a tape cassette, so the tape is on two reels within the case, rather than like DLT and Ultrium where there is a single spool and the tape is transferred to a take-up reel within the drive. The "start" of tape is actually in the middle, so at load time the tape is half way from either end, and the data is stored on multiple tracks so that the drive can position over and along the tape to locate data. So a nod towards random access and a faster access time than your average tape, though the time to recover data from any particular file is still generally considerably longer than with disk.

Tape Storage Concepts

We can set aside the actual recording technique and put the clock back to the 9-track ½-inch open reel tape for the concepts involved in tape data storage and tape data recovery. This type of tape was dominant during the 1980's and to an extent the drives that followed had to imitate its methodology in order to replace it. This means that an Ultrium drive, a DLT drive and a DAT drive all take data and give it back exactly as the open reel drive did, even though they use radically different recording formats.

With the open reel tape, data was transferred to the drive as a sequence of data buffer loads called blocks. The drive would encode each of these with its own identification and error correction data, and with a gap in between each one. This inter-block gap is why you might sometimes hear people saying that they "used a larger block size to increase capacity". On open reel tapes the gap was of a fixed size, so the smaller the block size the greater the number of blocks required to store a given amount of data. The greater the number of blocks, the greater the number of gaps, and so capacity was lost. Then again, with older tapes the larger the block size the more chance of hitting an unusable area of tape, so the whole thing was a bit hit-and-miss.
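The effect of the fixed gap on capacity can be shown with some rough arithmetic. The figures here are assumptions for illustration only: a 2400 foot tape at 6250 bytes per inch with a 0.3 inch gap between blocks, and no allowance for leaders, labels or error handling.

```python
def usable_capacity(block_size, density_bpi=6250, gap_inches=0.3, tape_feet=2400):
    """Approximate user bytes that fit on an open reel tape for a given block size.

    All default values are illustrative assumptions, not taken from a drive manual.
    """
    tape_inches = tape_feet * 12
    # Each block occupies its own recorded length plus one fixed inter-block gap.
    inches_per_block = block_size / density_bpi + gap_inches
    block_count = int(tape_inches // inches_per_block)
    return block_count * block_size


for size in (512, 4096, 32768):
    print(size, usable_capacity(size))
```

With small blocks most of the tape is spent on gaps, which is why moving from 512 byte to 32KB blocks gives several times the usable capacity in this sketch.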

With modern drives the data block is merely what you send to the drive, and what you get back. Internally it is a matter of encoding and has little to do with how data is actually stored.

The above had a couple of exceptions, notably the earlier Exabyte 8mm helical scan drives. These split data into 1024 byte sections when writing to tape and would not share a 1024 byte storage unit between user data blocks. The consequence of this was that if you wrote 1025 byte blocks to a tape then each was written as 2048 bytes and the capacity of the tape was halved. There are exceptions to all rules.
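The Exabyte rounding described above is simple ceiling arithmetic. A minimal sketch, with a made-up function name:

```python
def on_tape_bytes(block_size, unit=1024):
    """Bytes consumed on tape when each user block is rounded up to whole units."""
    units = -(-block_size // unit)  # ceiling division using integers only
    return units * unit


for size in (1024, 1025, 2048):
    print(size, on_tape_bytes(size), size / on_tape_bytes(size))
```

A 1024 byte block uses the tape fully, whereas a 1025 byte block occupies 2048 bytes, so only about half of what is written is user data.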

Tape concepts

So, tape drives write to theoretically blank tape, have no pre-formatting, and if you write data at the start you have lost everything that you have overwritten and anything after the point at which you stop writing.

There is nothing either right or wrong with any of this, it is just the way they are. What tape gives you is high volume, low cost per gigabyte storage that you can drop on the floor, pick up and read afterwards. Don't try that with a hard disk and then expect to be able to actually recover your data.

File Marks aka Tape Marks

These are a sub-divider that you won't find on a disk. A file mark is a data pattern written by the drive and used to allow spacing to a particular position on a tape. If you want to recover data from backup set 3, the backup software doesn't read through backup sets 1 and 2 first; it skips file marks and then starts to read and recover data once it has found set 3.
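Positioning by file marks can be sketched with a tape modelled as a simple list. This is a simplification: a real drive spaces over file marks itself without passing the intervening data to the host, and the names FILE_MARK and read_backup_set are invented for the example.

```python
FILE_MARK = object()  # stand-in for the marker pattern the drive writes


def read_backup_set(tape, set_number):
    """Return the data blocks of backup set `set_number` (counting from 1)."""
    marks_to_skip = set_number - 1
    pos = 0
    # Space forward over file marks; the blocks passed over are never read.
    while marks_to_skip > 0:
        if pos >= len(tape):
            raise ValueError("ran out of tape before reaching the requested set")
        if tape[pos] is FILE_MARK:
            marks_to_skip -= 1
        pos += 1
    # Now read blocks until the next file mark, or the end of the recording.
    data = []
    while pos < len(tape) and tape[pos] is not FILE_MARK:
        data.append(tape[pos])
        pos += 1
    return data


tape = ["a1", "a2", FILE_MARK, "b1", FILE_MARK, "c1", "c2", FILE_MARK]
print(read_backup_set(tape, 3))
```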

With 4mm DAT there is an additional type of file mark known as the set mark. This allows there to be two distinct types of data marker, though only Sytos Plus, SBackup and a few proprietary formats ever made use of this feature.

Helical scan drives, AIT, Exabyte and DAT, write file marks so that they can be found during high speed search operations. Normally, as with a video recorder, the tape moves slowly during reading. It would take 2 to 3 hours to position down the tape at reading speed, so they kick the drive into fast seek and can then get to the next file mark in a fraction of the time. In video terms this is a "fast-forward" and enables fast access when recovering data from the tape.

Don't be fooled by the name though. They sound like small, insignificant markers when actually they can be several megabytes in size on some types of tape.

End of Recorded Media

When reading from a tape you might encounter a condition called "End of Recorded Media", sometimes reported as "Blank Check". On older drives, when recording completed the drive would erase a length of tape afterwards. Subsequent reading attempts would run into this length of blank tape and know they had reached the end. Modern drives write a data pattern, similar in size to a file mark, that denotes the end of recording. Data recovery via normal means stops at this point; there is no way past, and specialist recovery methods and technology need to be employed to gain access to this lost data.

Mainframe, and some midrange, systems did not rely upon the drive reporting that the end of data had been reached but relied upon their own devices. IBM systems would write a double file mark, HP systems used a triple file mark. These patterns denoted logical end of data.

These systems will still rely on their logical mechanism for saying "that's it", but the drive will still do its own thing. The moment recording stops the end-of-data (EOD) is written, and that is that without professional data recovery assistance.
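A logical end of data of this kind can be found by looking for a run of consecutive file marks. A minimal sketch, assuming the tape contents have already been read into a list and using a made-up FILE_MARK placeholder string:

```python
FILE_MARK = "<FM>"  # hypothetical placeholder for a file mark in the record list


def logical_end_of_data(records, marks_required=2):
    """Index of the first run of `marks_required` file marks, or None if absent."""
    run = 0
    for index, record in enumerate(records):
        if record == FILE_MARK:
            run += 1
            if run == marks_required:
                return index - marks_required + 1
        else:
            run = 0  # a data block breaks the run
    return None


records = ["label", FILE_MARK, "d1", "d2", FILE_MARK, FILE_MARK, "stale"]
print(logical_end_of_data(records))                    # double file mark convention
print(logical_end_of_data(records, marks_required=3))  # triple file mark convention
```

In this example the double file mark is found after the second data block, while a system looking for a triple file mark would not see a logical end at all.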

Block Modes

Variable Block Mode

Disks are typically formatted with recordable sectors each of 512 bytes. IBM for the AS/400 use 520 or 522 bytes. Tapes, of course, have to be different.

Modern tape drives can write in either fixed block or variable block mode. This is to enable them to plug into systems that have differing pedigrees.

Mainframe systems, for example IBM 380/390 and AS/400 (OK it is not a mainframe but it behaves like one), write data in chunks that are the precise size for their purpose. The label block at the start of an IBM labelled tape was defined as being 80 bytes long, so an 80 byte block was written to the tape. Since 80 byte blocks were not a practical proposition when dealing with open reel tapes, the actual data was written in larger chunks, limited in size only by the available memory in either the system or the tape drive formatter.

Fixed Mode Recording

Smaller systems and cheaper drives tended to deal with data pretty much as they did with disk. It did not matter how big the data was, it would be written out in equal sized chunks, and the drives available in this market segment obliged. The early quarter-inch cartridge drives would only write data in 512 byte sections. Smaller Unix systems and PC systems have a tradition of recording in this manner and still do. The only real difference between disk and tape here is that the tape block sizes for fixed mode recording have typically extended to 64KB or higher.

Later drives have been designed to be backwards compatible with this more primitive format and with the more expensive drives that operate in variable mode, or to be plugged in as a direct replacement for these drives, and so can operate as either fixed or variable block recording devices.

Block Numbering

Early drives relied on skipping file marks to position along tape, but later tape devices introduced the idea of block numbering. So each tape block has a unique number starting at 0.

This partly explains why the tape block sizes used have increased over time. The SCSI specification describes the block number using 3 bytes, a maximum of 16,777,215 blocks. With 512 byte blocks this would mean that the maximum capacity of tape would be in the region of 9GB, not very helpful if writing to an 800GB (compressed capacity) Ultrium 3 data cartridge.
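The arithmetic behind that limit is easy to check. This sketch takes the 3-byte block number at face value and assumes decimal gigabytes (1GB = 1,000,000,000 bytes); the function names are made up for the example.

```python
MAX_BLOCKS = 2**24 - 1  # largest value a 3-byte block number can hold: 16,777,215


def max_capacity_bytes(block_size):
    """Largest amount of data addressable with a 3-byte block number."""
    return MAX_BLOCKS * block_size


def min_block_size(capacity_bytes):
    """Smallest block size (in bytes) that can address the whole capacity."""
    return -(-capacity_bytes // MAX_BLOCKS)  # ceiling division


print(max_capacity_bytes(512))        # 8589934080 bytes, a little under 9GB
print(min_block_size(800 * 10**9))    # 47684 bytes per block for 800GB
print(max_capacity_bytes(64 * 1024))  # 64KB blocks comfortably cover 800GB
```

So under these assumptions a block size somewhere above 46KB is needed before every block on an 800GB tape can be numbered, which fits with fixed block sizes of 64KB or more becoming typical.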

Recording Techniques

Three basic tape storage formats have developed since the late 1980s.

Multi-track parallel
Helical Scan
Serpentine

The ground between parallel and serpentine formats has narrowed more recently, though, with drives having elements of both formats.

½" open reel - also known as 9-track parallel

The drive records 9 tracks of data at once to the tape surface. Recording begins at the physical start of the tape (PBOT) and ends at the physical end of the tape (PEOT). This format developed from the punch card idea with the eight bit byte and a parity bit. So this is one byte at a time recording.

The capacity of these tapes is tiny by today's standards. NRZI recording format managed an incredible 23MB at 800bpi on a 2400 foot tape. In its heyday, with a hefty 6250 bits-per-inch, the capacity rose to an impressive 180MB.
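These capacity figures follow directly from the tape length and recording density. Because the 9 tracks are recorded in parallel (eight data bits plus parity), each bit position along the tape holds one byte. The sketch below is an upper bound only; it ignores inter-block gaps, leaders and labels, and the function name is made up.

```python
def nine_track_capacity_bytes(bits_per_inch, tape_feet=2400):
    """Upper-bound capacity of a 9-track tape, one byte per bit position."""
    tape_inches = tape_feet * 12
    return bits_per_inch * tape_inches


print(nine_track_capacity_bytes(800))   # 23040000, roughly 23MB
print(nine_track_capacity_bytes(6250))  # 180000000, i.e. 180MB
```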

Helical Scan

We are all more familiar with helical scan than we might realise. It is a technology that was developed for video recording (VHS and Video8) and sound recording (DAT).

The tape is wrapped around a cylinder that contains the read and write heads. The tape moves slowly whilst the cylinder spins quickly, with each rotation allowing data to be written and then read back to check (read-after-write).

The name helical scan comes from the pattern described by the head passing along a slowly moving tape, "describing a section of a helix". (It is probably a more marketable name than "diagonal data".)

Exabyte Corporation took the Sony Video8 8mm recorder, added a SCSI interface and some additional checking and came up with a 2GB data storage format which was way ahead of its rivals, albeit briefly.

HP and Sony adapted developments in the audio market with 4mm media called Digital Audio Tape, added additional error correction and came up with DDS DAT. Sony later created AIT, an 8mm helical scan format, and even one of the STK mainframe drives used this technique.

Serpentine

The name arises from the pattern of the recording going forwards and backwards for a number of tracks, apparently a bit snake-like in character (according to some imaginative marketing person).

Early drives had a pair of recording heads, one for forwards recording and one for reverse. The drive would write forwards until physical end-of-tape (PEOT), reverse until physical beginning-of-tape (PBOT), then re-position the heads and repeat the process. Early drives recorded 4 tracks; the most recent write hundreds and overlap with the parallel formats by recording several tracks simultaneously.
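The back-and-forth layout can be sketched by mapping a logical position in the data stream to a track and a physical distance along the tape. This is a simplified single-head model with invented names; real formats write groups of tracks at once and add servo and formatting overhead.

```python
def physical_position(logical_offset, track_length):
    """Map a logical block offset to (track, distance from the start of tape).

    Even-numbered tracks run forwards from the start of tape, odd-numbered
    tracks run in reverse from the end of tape.
    """
    track = logical_offset // track_length
    along = logical_offset % track_length
    if track % 2 == 1:
        along = track_length - 1 - along  # reverse pass counts back from the far end
    return track, along


# With 100 blocks per track, block 0 and block 199 sit at the same point along
# the tape, on neighbouring tracks.
print(physical_position(0, 100), physical_position(199, 100))
```

One consequence shown by the model is that a single physical spot on the tape holds data from widely separated points in the recording, one for every track that passes over it.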

Equally, parallel format recording drives now record along the tape forwards and then in reverse, so they have become practically serpentine.

In the data recovery context there is the issue that physical damage impacts multiple places in the recording since the drive passes over each area of tape. Of course this is only an issue if the tape snaps or becomes crumpled, and there is an argument as to how likely this is compared with helical scan devices, which have a much more complicated tape path. We have no intention of entering the fray between exponents of each style of recording.

Conclusion

Tape still has a major part to play in data safety and the long term archival of important information. As a data recovery specialist I see both failed hard disk drives and damaged tapes, and whilst tape recovery comes with its own set of challenges that can make it a tortuous process, seldom is a tape a complete failure and the data recovery success rate is well over 95%.
