Knowledge Store: StorageNBackup

Wednesday, February 20, 2013

SAN BASICS QUIZ

1. This is a technology for transmitting data between computer devices at data rates of up to 10 Gbps. It is especially suited for connecting computer servers to shared storage devices and for interconnecting storage controllers and drives.

What is it?

2. This is a type of SAN switch that provides features such as storage virtualization, quality of service (QoS), remote mirroring, data sharing, protocol conversion and advanced security. These switches are an important part of storage area management, a methodology that is gaining in importance as networks become increasingly complex and expensive to deploy, operate and maintain.

What is it?

3. This is a general term for several approaches to using the Internet Protocol (IP) in a SAN, usually over Gigabit Ethernet. Proponents claim that it offers a number of benefits over Fibre Channel, and will promote the widespread adoption of SANs that was predicted when they were first introduced.

What is it?

4. This type of backup can be conducted through a SAN or with a tape device directly attached to the storage subsystem. Some of the advantages of this type of backup include shorter backup and recovery times and less disruption to other systems and applications.

What is it?

5. This is a transmission technology based on the Ethernet frame format and protocol used in local area networks (LANs). It provides a data rate of 1 billion bits per second (one gigabit).

What is this?

6. Storage networks must partition their physical disks into logical entities so that host servers can access storage area network storage. This term can be used to refer to an entire physical disk, or a subset of a larger physical disk or disk volume.

What is this?

7. This is a circuit board and/or integrated circuit adapter that provides input/output (I/O) processing and physical connectivity between a server and a storage device.

What is this?

8. This is a standard designed to enable Fibre Channel communications to run directly over Ethernet. It makes it possible to move Fibre Channel traffic across existing high-speed Ethernet infrastructures and extend the reach and capability of SANs.

What is this?

9. This is an IP-based storage networking standard for linking data storage facilities, developed by the Internet Engineering Task Force (IETF). These types of SANs are popular in SMBs because they are often easier to set up and less expensive than Fibre Channel SANs.

What is this?

10. This is a method of optimizing the efficiency with which the available space is utilized in a SAN. Added benefits of using this include reduced consumption of electrical energy, smaller hardware space requirements and reduced heat generation as compared with traditional networked storage systems.

What is it?

SAN switching: How to configure a SAN switch

What is involved in configuring a SAN switch?

The answer to this question depends on whether you're starting a new SAN fabric or adding a switch to an existing one. If you're starting a new fabric, the configuration of the switch is much easier. All switches have a default setup and IP address. You'll need to connect to the IP address using a browser or command line and carry out some changes on the switch so it's configured correctly for the environment. For new switches, the only changes you must make are to configure the IP address, subnet mask and default gateway to allow you to connect to it via a browser or whatever transport you choose. All the other default settings will work for the new fabric.

The first switch in the fabric is called the principal switch and this switch holds the master database for fabric configuration. When other switches are added to the fabric, they download that information from the principal switch. All switches also have a domain ID which can be statically configured or allocated from the principal switch.

I tend to configure switches with static domain IDs so I can guarantee a particular domain ID will never be allocated to two different switches. If two switches have been allocated the same ID this could cause fabric segmentation, outages in the fabric and denial of service to logical unit numbers (LUNs).
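A check for clashing static domain IDs can be sketched in a few lines of Python. This is only an illustration of the idea, run against a planning list before a switch joins the fabric. The switch names and the `find_domain_id_conflicts` helper are hypothetical and do not correspond to any vendor tool.

```python
from collections import defaultdict

def find_domain_id_conflicts(switches):
    """Return {domain_id: [switch names]} for IDs assigned to more than one switch.

    `switches` is a hypothetical planning table mapping switch name -> domain ID.
    """
    by_id = defaultdict(list)
    for name, domain_id in switches.items():
        by_id[domain_id].append(name)
    return {d: sorted(names) for d, names in by_id.items() if len(names) > 1}

# Hypothetical plan: two edge switches were accidentally given the same ID.
planned = {"sw-core-1": 1, "sw-edge-1": 2, "sw-edge-2": 2}
conflicts = find_domain_id_conflicts(planned)
print(conflicts)
```

An empty result means every planned switch has a unique domain ID.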

For best practice you should also ensure unused switch ports are disabled. This will prevent unauthorised devices logging into the fabric and causing disruption to traffic. This should be done following initial port testing on a switch you're about to add to the fabric but before you add it to the fabric.

How do I decide what topology to implement?

Before starting on SAN topology it's important to say that when redundancy is required the known best practice is to implement two SAN fabrics and have devices connected to both of them. This ensures that if a host is connected to both fabrics it will still be able to operate effectively if there's a switch, HBA [host bus adapter] or even an entire fabric failure. In these answers I'm going to assume that if redundancy is required then two identical SAN fabrics will be implemented.

There are a number of topologies that can be used when configuring a fabric, although there are three I'd recommend, depending on the size of the fabric and number of switches.

The single switch fabric has one switch. Director-class switches can be purchased with hundreds of ports, although they're expensive compared to low-capacity switches such as those with 32 ports. The single switch SAN offers the lowest possible latency between the host and its associated storage as all devices are connected to the single switch.

As most SANs grow over time, it's more likely that an organisation with a small SAN -- possibly a single switch SAN -- will add more switches as the number of devices grows. As you add devices this brings us to the second type of SAN, which is a mesh fabric. This type of fabric is one where every switch in the fabric is connected to every other switch in the fabric. The connections are via ISLs or inter-switch links. In this configuration the host will have to go through a maximum of one ISL to get to the storage it uses. When using a mesh configuration it's favourable to group a host and its storage on the same switch so that a host will not have to traverse an ISL to get to its storage. As the mesh grows, the number of ISLs on a single switch grows at the rate of one for every additional switch. After a certain point there's little benefit to adding extra switches as many of the additional ports are required for ISLs.
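The diminishing return of a growing mesh can be made concrete with some arithmetic. The sketch below assumes a single ISL between every pair of switches (no trunking) and identical 32-port switches. Both are simplifying assumptions, and `full_mesh_budget` is a hypothetical helper name.

```python
def full_mesh_budget(num_switches, ports_per_switch):
    """Port budget for a full mesh with one ISL between every pair of switches."""
    isl_ports_per_switch = num_switches - 1                # one link to each other switch
    total_isls = num_switches * (num_switches - 1) // 2    # each ISL joins two switches
    device_ports = num_switches * max(ports_per_switch - isl_ports_per_switch, 0)
    return isl_ports_per_switch, total_isls, device_ports

for n in (2, 4, 8, 16, 24):
    isl_ports, isls, device_ports = full_mesh_budget(n, 32)
    print(f"{n:2d} switches: {isl_ports:2d} ISL ports each, "
          f"{isls:3d} ISLs, {device_ports:3d} device ports")
```

Under these assumptions, 16 switches leave 272 ports for hosts and storage, while 24 switches leave only 216, because each added switch takes one more port from every existing switch.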

When you get to a large number of ports this is where the third type of fabric comes in, which is called core-edge. This configuration uses a large switch at the core of the fabric to which you would generally attach storage. Hosts are attached to smaller edge switches which are also attached to the core via ISLs. This topology can grow to hundreds or thousands of ports while ensuring hosts only have to traverse a maximum of two switches to access storage. Hosts that require very low latency or very high throughput can be connected to the core.

What is zoning and masking, and why is it important?

Zoning is a procedure that takes place on the SAN fabric and ensures devices can only communicate with those that they need to. Masking takes place on storage arrays and ensures that only particular World Wide Names [WWNs] can communicate with LUNs on that array. If the correct masking is applied to the storage array then there's no absolute necessity to configure zoning on the SAN, although using both zoning and masking is always recommended.

There are two distinct methods of zoning that can be applied to a SAN: World Wide Name zoning and port zoning.

WWN zoning groups a number of WWNs in a zone and allows them to communicate with each other. The switch port that each device is connected to is irrelevant when WWN zoning is configured. One advantage of this type of zoning is that when a port is suspected to be faulty a device can be connected to another port without the need for fabric reconfiguration. A disadvantage is that if an HBA fails in a server the fabric will need to be reconfigured for the host to reattach to its storage. WWN zoning is also sometimes called 'soft zoning.'

Port zoning groups particular ports on a switch or number of switches together, allowing any device connected to those ports to communicate with each other. An advantage of port zoning is that you don't need to reconfigure a zone when an HBA is changed. A disadvantage is that any device can be attached into the zone and communicate with any device in the zone.
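The trade-off between the two zoning methods can be illustrated with a small model. This is a simplified sketch, not how any switch implements zoning: the WWNs, port numbers and the `Device` class are invented for illustration, and a zone is reduced to a plain set of members.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Device:
    wwn: str            # port World Wide Name (hypothetical values below)
    switch_port: tuple  # (domain_id, port_number) the device is plugged into

def in_wwn_zone(zone, device):
    """WWN zoning: membership follows the device, wherever it is plugged in."""
    return device.wwn in zone

def in_port_zone(zone, device):
    """Port zoning: membership follows the switch port, whatever is plugged in."""
    return device.switch_port in zone

host = Device("10:00:00:00:c9:aa:aa:aa", (1, 4))
array = Device("50:06:01:60:bb:bb:bb:bb", (1, 12))
wwn_zone = {host.wwn, array.wwn}
port_zone = {host.switch_port, array.switch_port}

# Case 1: the host is moved to another port because port 4 is suspected faulty.
moved = Device(host.wwn, (1, 7))
# Case 2: the host's HBA is replaced, so a new WWN appears on the same port.
replaced = Device("10:00:00:00:c9:cc:cc:cc", (1, 4))

print("moved host:   WWN zone", in_wwn_zone(wwn_zone, moved),
      "| port zone", in_port_zone(port_zone, moved))
print("replaced HBA: WWN zone", in_wwn_zone(wwn_zone, replaced),
      "| port zone", in_port_zone(port_zone, replaced))
```

The moved host stays in the WWN zone but falls out of the port zone, while the replaced HBA does the opposite, which is the pair of advantages and disadvantages described above.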

My opinion is that neither is particularly superior to the other, and what I find is that the type of zoning used is generally determined by what a particular consultant or organisation has done in the past.

What do I need to know about fan-in and fan-out?

The fan-in ratio denotes the number of hosts connected to a port on a storage array. There are many methods that have been used to determine the optimum number of hosts connected to a storage port, but in my experience there are no hard and fast rules to determine an absolute number.

My recommendation would always be to assess the throughput of each host you want to connect to a port, determine the maximum throughput of that port, and add hosts such that the total throughput is slightly higher than the throughput of that port. It's very important, however, to ensure you have good utilisation statistics available to detect any time period where the port is heavily utilised and could be causing a bottleneck to your SAN fabric.

There are a number of reasons why it's difficult to give a host count as an optimum fan-out ratio. These include: differing port speeds -- a 4 Gbps port can obviously handle twice the throughput of a 2 Gbps port and will allow you to add roughly double the number of hosts; and multipathing -- if a host has two HBAs, traffic will either be aggregated down those two HBAs in an active-active mode or all the traffic will go down one HBA and nothing down the other if the connection is active-passive.

These scenarios will have a big impact on how many hosts you can add to a particular port. In normal operating circumstances in a multipathing environment, you can connect double the number of HBAs to a particular port, as each will be doing half the work of its host. If, however, there's an issue with the SAN and a device has failed over from its active port to its passive port, the remaining ports may be required to carry out twice the standard workload. This can cause poor performance if you oversubscribe hosts to storage ports.
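The sizing approach described above can be sketched as a rough calculation. The 1.1 oversubscription factor is an assumption standing in for "slightly higher", and the per-host throughput of 0.5 Gbps is invented for the example. Real figures should come from your own utilisation statistics.

```python
import math

def max_hosts_per_port(port_gbps, host_gbps, paths_per_host=1, oversubscription=1.1):
    """Hosts that fit on one storage port when total load may slightly exceed it.

    With active-active multipathing each path is assumed to carry an equal
    share of the host's throughput.
    """
    per_path_load = host_gbps / paths_per_host
    return math.floor(port_gbps * oversubscription / per_path_load)

def port_load_after_failover(hosts, host_gbps):
    """Worst case: every host on the port has lost its other path."""
    return hosts * host_gbps

single_path = max_hosts_per_port(4, 0.5)                   # one path per host
dual_path = max_hosts_per_port(4, 0.5, paths_per_host=2)   # active-active, two paths
print("single-path hosts per 4 Gbps port:", single_path)
print("dual-path hosts per 4 Gbps port:", dual_path)
print("load if all dual-path hosts fail over (Gbps):",
      port_load_after_failover(dual_path, 0.5))
```

The last figure shows why sizing for the normal multipath case alone is risky: after a failover the surviving port may be asked to carry roughly twice what it was sized for.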

What are the main steps when configuring a storage-area network (SAN) switch?

What are the main steps in a SAN switch configuration?

Determining the steps in a storage-area network (SAN) switch configuration depends on whether you are building a new SAN fabric or adding a switch to an existing one.

If this is the first switch in the SAN fabric, it will be the principal switch. You will need to configure IP details so that you can communicate with the switch via a Web browser or a command line interface. Once this is complete you should check each port for device connectivity and then disable all switch ports to ensure that rogue devices cannot join the fabric.

When you add devices to the switch you will need to determine the required port speed and topology. Disk and host devices should normally be configured with a fabric topology, while tape devices can be either fabric or loop. Once the correct devices are added to the switch you will need to configure zones, which are groups of ports or worldwide names (WWNs) that allow devices to talk to each other. Once zones are configured, devices should be able to communicate correctly.

Tuesday, February 19, 2013

Storage Networking : Configuring Disk Arrays

It involves some tedium, but configuring disk arrays is the most critical part of building a SAN. Here's what you need to know.

The most critical, sometimes tedious, part of setting up a SAN is configuring each individual disk array. In this Storage Networking 101, we'll delve into best practices and cover the general concepts you must know before configuring SAN-attached storage.

There are three general steps when configuring a disk array:

First, you create a RAID set. It can be any type of RAID the array supports; we'll assume RAID-5 for this article so that we can talk about hot spares.
Second, you can either slice up the RAID set to present multiple LUNs to a host, or you can create "RAID Groups," as most vendors call them. This is a completely optional step, but it can make your life easier.
Third, you must assign LUNs to a host.

Create a RAID Set

The first step can be done many ways. Say you have an array that holds 14 disks per tray, and you have four trays. One option is to create two (or more) RAID-5 volumes on each tray. You can then assign part or all of each RAID-5 volume to various hosts. The advantage to this method is that you will know which hosts use what specific disks. If the array and its three additional trays were purchased at the same time, it actually makes more sense to allocate the RAID sets vertically, one disk per tray, so that a single tray failure doesn't take out the RAID volume. With only four trays this means you'll have three disks' worth of usable space per 4-disk RAID-5 volume: probably not a good use of space.
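The space cost of the two layouts can be compared with a short calculation. It relies on the fact that a RAID-5 set gives up one disk's worth of capacity to parity. The helper name is hypothetical, hot spares are ignored, and the horizontal layout is taken to be exactly two 7-disk sets per tray.

```python
def raid5_usable_disks(num_sets, disks_per_set):
    """Usable capacity, in disks, of several equal RAID-5 sets (one parity disk each)."""
    if disks_per_set < 3:
        raise ValueError("RAID-5 needs at least three disks per set")
    return num_sets * (disks_per_set - 1)

TRAYS, DISKS_PER_TRAY = 4, 14
total = TRAYS * DISKS_PER_TRAY

# Horizontal: two 7-disk sets inside each tray.
horizontal = raid5_usable_disks(num_sets=TRAYS * 2, disks_per_set=DISKS_PER_TRAY // 2)
# Vertical: each set takes one disk from every tray, so it survives a tray failure.
vertical = raid5_usable_disks(num_sets=DISKS_PER_TRAY, disks_per_set=TRAYS)

print(f"horizontal: {horizontal}/{total} disks usable ({horizontal / total:.0%})")
print(f"vertical:   {vertical}/{total} disks usable ({vertical / total:.0%})")
```

With four trays, the vertical layout trades six disks of capacity for tolerance of a whole-tray failure.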

More often, people will create huge RAID-5 sets on the arrays. There's a balance between performance and resiliency that needs to be found. More disks mean better performance, but they also mean that two disk failures at once could take out all of your data. Surprisingly, simultaneous multiple-disk failures are quite common. When the array starts rebuilding data onto a previously unused disk, it frequently fails.

Configure RAID Groups

The second step causes quite a bit of confusion. Regardless of how you've configured the RAID sets in the array, you'll need to bind some amount of storage to a LUN before a host can use it. The LUN can be an entire RAID-5 set (not recommended), or it can be a portion. The partitioning method ensures that you aren't giving too large a volume to a host. There are many reasons for this:

Some file systems cannot handle a 1TB or larger volume
Your backup system probably won't be able to back up a file system that's larger than a single tape
The important one: more LUNs presented to the host (seen as individual disks by the OS) means that separate I/O queues will be used

Back to the second step: RAID Groups. A 1TB RAID-5 set partitioned into 100GB chunks, for example, will provide 10 LUNs to deal with. If you don't care what nodes use what disks, you can just throw these LUNs into a group with other LUNs. I prefer to keep one RAID group per host, but others see that as limiting flexibility. Some hosts need a dedicated set of disks, where you know that only one host will be accessing the disks. A high-traffic database server, for example, should not have to contend with other servers for I/O bandwidth and disk seeks. If it truly doesn't matter to you, simply create a bunch of LUNs, and assign them to random groups.

It is also important to create and assign "hot spare" coverage. Spare disks that are left inside the array are "hot" spares. They can be "global," so that any RAID volume can use them in the event of a failure, or they can be assigned to specific RAID volumes. Either way, ensure you have a hot spare if you can afford the lost space. If not, be sure to monitor the array closely, because you'll need to replace any failed disk immediately.

This is where it gets tricky. Different storage arrays will have different terminology, and different processes for assigning LUNs or groups of LUNs to a host.

Assign Your LUNs

Step three, "assign LUNs to a host," means that you're going to map WWNs to LUNs on the array. If you didn't, then any host zoned properly could see all the volumes on the array, and pandemonium would ensue. Be cautious about certain cheaper storage arrays, too. They may not even have this feature by default, until you purchase a license to enable it. While the purveyors of limited-use technology call this feature "WWN Masking" or "SAN-Share," the market leaders in the SAN space realize that it's required functionality.

The most common approach is to create a "storage group," which will contain "hosts" and "LUNs" (or RAID groups with many LUNs). Whatever diverging terminology is used, the universal concept is that you need to create a host entry. This is done by manually entering in a WWN, or connecting the host and zoning it appropriately so that the array can see it. Most arrays will notice the new initiator and ask you to assign it a name. Once your hosts, and all their initiator addresses, are known to the array, it can be configured to present LUNs to the host.
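The storage group concept can be modelled in a few lines. This is a toy sketch of the idea only: the `DiskArray` class, its method names, the host names and the WWNs are all invented, and real arrays expose this through their own management tools with their own terminology.

```python
class DiskArray:
    """Toy model of LUN masking through storage groups."""

    def __init__(self):
        self.initiators = {}      # initiator WWN -> host name
        self.storage_groups = {}  # group name -> {"hosts": set, "luns": set}

    def register_host(self, host, wwns):
        """Create a host entry from its initiator (HBA) WWNs."""
        for wwn in wwns:
            self.initiators[wwn] = host

    def create_storage_group(self, name, hosts, luns):
        self.storage_groups[name] = {"hosts": set(hosts), "luns": set(luns)}

    def visible_luns(self, wwn):
        """LUNs presented to an initiator; unknown WWNs see nothing."""
        host = self.initiators.get(wwn)
        if host is None:
            return set()
        luns = set()
        for group in self.storage_groups.values():
            if host in group["hosts"]:
                luns |= group["luns"]
        return luns

array = DiskArray()
array.register_host("db01", ["10:00:00:00:c9:aa:aa:01", "10:00:00:00:c9:aa:aa:02"])
array.register_host("web01", ["10:00:00:00:c9:bb:bb:01"])
array.create_storage_group("db-group", hosts=["db01"], luns=[0, 1, 2])
array.create_storage_group("web-group", hosts=["web01"], luns=[3])

print(sorted(array.visible_luns("10:00:00:00:c9:aa:aa:02")))  # db01's second HBA
print(sorted(array.visible_luns("10:00:00:00:c9:ff:ff:ff")))  # unregistered initiator
```

Both of a host's HBAs resolve to the same host entry, so either path sees the same LUNs, while an initiator the array does not know about is presented with nothing.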

One final note about array configuration. You'll be connecting two HBAs to two different fabrics, and the array will have one controller in each fabric. The host needs to be configured for multipathing, so that either target on the array can disappear and everything will continue to function. We'll dedicate an entire article to host configuration, including multipathing and volume managers, but be aware that the disk array side often needs configuring too. The majority of disk arrays require that you specify what type of host is being connected, and what type of multipathing will be used. Without multipathing, LUNs need to be assigned to specific controllers, so that the appropriate hosts can see them.

Once LUNs are assigned to a host, they should be immediately available to the operating system, viewed as distinct disks.

Think about this for a moment. You've taken individual disks, and combined them into RAID volumes. Then, you've probably partitioned them into smaller LUNs, which is handled by the disk array's controllers. Now the host has ownership of a LUN, comprised of possibly 10 different disks, but each LUN is smaller than individual disks. The host OS can choose to stripe together multiple LUNs, or even partition individual LUNs further.

SAN Zoning

What is SAN zoning?

The basic premise of zoning is to control who can see what in a SAN. There are a number of approaches, broken down according to server, storage and switch. I will also talk about initiators and targets. On any server -- even NT -- there are various mechanisms to control what devices an application can see and whether or not the application can talk to another device. At the lowest level, an HBA's firmware and/or driver has a masking capability to control whether or not the server can see other devices. In addition, the operating system can be configured to control which devices it tries to mount as a storage volume. Finally, many people use extra-layered software for volume management, clustering and file system sharing, which can also control application access.

For storage zoning, if you ignore JBODs and the earlier RAID subsystems, most disk arrays offer a form of selective presentation. The array is configured with a list of which servers can access which LUNs on which ports and quite simply ignores or rejects access requests from devices that are not in those lists. In terms of switch zoning, most if not all Fibre Channel switches support some form of zoning to control which devices on which ports can access other devices or ports (I will talk about this in more detail). One other category that controls access is virtualization, but I will save that discussion for another day.

What type of SAN zoning should you use?

My simple advice is, broadly speaking, to use a little of each of these approaches. Control what devices/LUNs are mounted on the server using some operating system or software capability (i.e., do not use a mount-all approach). Use selective presentation on the storage array, and use zoning in the fabric. Why do I say this? Using a network analogy, you do not want a PC to hack into your files on your corporate systems. To prevent someone from doing this, you have access control lists on the files in the file systems and on the shares, and you have firewalls, security gateways, packet filtering, etc. Each of these elements does a complementary and slightly different job in protecting your data.

How exactly does zoning work?

I have answered the question in its broadest sense. Now to be a bit more technically precise. In very simple terms, when a node comes up and connects to a fabric, the first really useful thing it does is a fabric logon. This is how the device gets its 24-bit address, which will be used for routing in the fabric (SID or DID usually refer to a source address or destination address of this form). The device already has its World Wide Name, or several, as each port on a node or device will have a unique port WWN, usually programmed in hardware. There is also a node WWN that identifies the node or device, and should show up the same on each port. The next step occurs when a device logs on to the name server service in the SAN and registers itself. The SAN builds up a database of all the devices in the fabric using a mapping of the node and port WWNs to the 24-bit address as well as the capabilities of each device. This includes whether the device is an FCP device -- one that talks SCSI commands over Fibre Channel.

Finally, a server will ask the name server to send back a list of what other FCP devices it can see in the fabric. This is where zoning kicks in. The name server only returns a list of those FCP devices that are in the same zone (or a common zone). In other words, I only find out about the devices I am supposed to know about.

The server, therefore, has a list of the 24-bit addresses of all the devices it is supposed to be able to see. It will then typically do a port logon to each one in turn to try and find out what sort of FCP/SCSI device it is. This is similar to normal SCSI where the SCSI controller/server does a scan of the bus and queries the properties of each device it can see on the bus.
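The sequence above (fabric logon, name server registration, zone-filtered query) can be sketched as a toy model. The `Fabric` class and its method names are hypothetical, and the WWNs are invented. The 24-bit address is built as domain, area and port bytes, with the area simply counted up per login, which is a simplification.

```python
class Fabric:
    """Toy single-switch fabric with a name server and WWN zones."""

    def __init__(self, domain_id):
        self.domain_id = domain_id
        self.addresses = {}    # port WWN -> 24-bit address from fabric logon
        self.name_server = {}  # port WWN -> {"fcid": int, "fcp": bool}
        self.zones = {}        # zone name -> set of port WWNs
        self._next_area = 0

    def fabric_login(self, port_wwn):
        """Assign a 24-bit address: domain (8 bits) | area (8 bits) | port (8 bits)."""
        fcid = (self.domain_id << 16) | (self._next_area << 8)
        self._next_area += 1
        self.addresses[port_wwn] = fcid
        return fcid

    def register(self, port_wwn, fcp=True):
        """Record the device and its capabilities in the name server database."""
        self.name_server[port_wwn] = {"fcid": self.addresses[port_wwn], "fcp": fcp}

    def query_fcp_devices(self, requester_wwn):
        """Return addresses of FCP devices that share a zone with the requester."""
        visible = set()
        for members in self.zones.values():
            if requester_wwn in members:
                visible |= members
        visible.discard(requester_wwn)
        return sorted(entry["fcid"] for wwn, entry in self.name_server.items()
                      if wwn in visible and entry["fcp"])

fabric = Fabric(domain_id=1)
host, array_a, array_b = "10:00:00:00:c9:00:00:01", "50:06:01:60:00:00:00:0a", "50:06:01:60:00:00:00:0b"
for wwn in (host, array_a, array_b):
    fabric.fabric_login(wwn)
    fabric.register(wwn)
fabric.zones["host_to_array_a"] = {host, array_a}

targets = fabric.query_fcp_devices(host)
print([f"{fcid:06x}" for fcid in targets])
```

Although both arrays are registered, the host only learns the address of the array in its zone. A real server would then do a port logon to each returned address.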

That, in a nutshell, is zoning.