Storage literally 'on the edge' gives new meaning to fault tolerance

With storage operations literally living on a cliff in California, preparedness keeps Sundt Construction's storage secure.

When rolling blackouts hit California in 2001, "we looked pretty good because we were well prepared," said Robb Good, vice president and director of information systems at Sundt Construction Inc., based in San Diego. Later, after September 11, "when executives called me to ask whether we could survive disruption of our corporate IT site, they were very pleased to learn that we could," Good said.

The foresight demonstrated in Sundt's IT plans stems in part from serendipity. Good explains that, well before September 11, Sundt had completely reorganized its IT functions from a centralized operation to a highly distributed environment. "We started to realize this could be a healthy thing if we approached it from the right perspective," Good said. One goal became identifying which applications were truly mission-critical. Once those applications were identified, Sundt slowly started distributing them so that no single location would be home to all mission-critical apps. Along the way, the company also discovered that it was vulnerable for another reason: the main IT operation was located not only at the top of a 200-foot cliff but directly over a fault line!

"We realized that if we lost that one office, it could shut us down for 60 days," explained Good. Fortunately, Good had clear marching orders. "Management doesn't understand information technology; they just want us to make sure there are no problems," he said.

"We dropped our hub-and-spoke private WAN cloud and implemented a fully meshed VPN," said Good. This further enhanced the IT department's ability to guarantee availability of services, since one office was no longer the hub. Sundt also began converting several applications into a thin-client Citrix environment. "Our Prolog application was one of the first programs we converted," he said.
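The difference between the two topologies can be sketched in a few lines of Python. This is only an illustration of the general idea, not a model of Sundt's actual network: the site names and the helper functions `hub_and_spoke`, `full_mesh` and `surviving_links` are hypothetical, and each link stands for one site-to-site connection.

```python
from itertools import combinations


def hub_and_spoke(sites):
    """Return the set of links when every site connects only to the first site (the hub)."""
    hub, *spokes = sites
    return {frozenset((hub, spoke)) for spoke in spokes}


def full_mesh(sites):
    """Return the set of links when every site connects directly to every other site."""
    return {frozenset(pair) for pair in combinations(sites, 2)}


def surviving_links(links, lost_site):
    """Return the links that still work after one site goes offline."""
    return {link for link in links if lost_site not in link}


# Hypothetical four-office network; the first entry plays the hub role.
sites = ["head_office", "office_b", "office_c", "office_d"]

star = hub_and_spoke(sites)
mesh = full_mesh(sites)

# Losing the hub strands every spoke; the mesh keeps the other offices connected.
star_after = surviving_links(star, "head_office")
mesh_after = surviving_links(mesh, "head_office")
```

In this sketch the star has three links and none survive the loss of the hub, while the mesh has six links and the three among the remaining offices keep working. The price of a full mesh is more tunnels to manage: n(n-1)/2 for n sites, instead of n-1.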

"With the dot-bomb failures, we started picking up really good barely used equipment for pennies on the dollar," Good said. Indeed, he added, "We were acquiring two or three servers for every one new server I had budgeted." This influx of hardware allowed Sundt to expand the Citrix and SQL environment (clustering SQL servers and creating redundant Citrix farms). In total, Sundt's network embraced seven main offices and 28 remote sites connected via VPN. Applications supported include Prolog Manager, Microsoft's SQL Server 2000, several internally written custom SQL applications, Novell GroupWise 6.0, other Microsoft 2000 servers and video conferencing.

There are also 500 desktop PCs running Windows 2000 with Office 2000. The storage infrastructure consists of four servers running Windows 2000 and SQL Server 2000 -- with two located in the Southern California data center and two in the Arizona data center.

When California was facing the threat of rolling blackouts, the effort to distribute resources was extended to include immediate failover protection. At first, Good said, "We attempted to use SQL replication to copy the Prolog data to our Tucson location. However, we quickly discovered that Prolog was not truly SQL compliant and our tables were becoming corrupt."

Good started researching a product from NSI Software called Double-Take. He downloaded the company's white paper and discussed its potential as a solution with his staff. This eventually led Sundt to implement Double-Take, which provides continuous real-time replication and automatic failover.

Sundt's challenge was to provide an immediate, high-availability storage failover solution for Prolog, which was hosted out of Sundt's data center-on-a-fault in Southern California.

To ensure business continuity, companies typically replicate the backup server data to a third server in a remote location. Sundt took a slightly different approach that would make the most of existing IT resources and minimize cost. In Sundt's network, each backup server also acts as the remote disaster recovery server. For instance, the backup server in Southern California is the disaster recovery server for the Arizona data center and vice versa. The data centers replicate data using the Double-Take solution, which works asynchronously, with minimal impact on existing network and communication resources.

Each of the two server groups comprises a main SQL server and a backup SQL server. The main server is the production server from which users access information. Data is replicated to the backup server continuously, creating a real-time mirror. If the production server fails, the backup server automatically handles all user requests for the data. This instant failover keeps the data available 24/7.
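The production/backup arrangement can be illustrated with a small Python sketch. It is a toy model of asynchronous replication with automatic failover in general, not Double-Take's implementation or API: the `Server` and `ServerGroup` classes, their methods and the server names are all hypothetical.

```python
from collections import deque


class Server:
    """A stand-in for one SQL server: a name, a key-value store, and a health flag."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ServerGroup:
    """One production server plus one backup that mirrors it asynchronously."""

    def __init__(self, production, backup):
        self.production = production
        self.backup = backup
        self.pending = deque()  # writes applied on production, not yet on the backup

    def active(self):
        """Return the server that should handle requests right now."""
        if self.production.alive:
            return self.production
        if self.backup.alive:
            return self.backup
        raise RuntimeError("no server available")

    def write(self, key, value):
        """Apply a write to the active server; queue it if it still needs mirroring."""
        server = self.active()
        server.data[key] = value
        if server is self.production:
            self.pending.append((key, value))

    def replicate(self):
        """Drain queued writes to the backup in their original order."""
        while self.pending and self.backup.alive:
            key, value = self.pending.popleft()
            self.backup.data[key] = value

    def read(self, key):
        """Read from whichever server is currently active."""
        return self.active().data.get(key)


# Hypothetical pairing: production in Southern California, backup in Arizona.
group = ServerGroup(Server("socal-sql"), Server("arizona-sql"))
group.write("job-1", "estimate v1")
group.replicate()                     # the backup now mirrors production
group.production.alive = False        # simulate losing the production server
failover_value = group.read("job-1")  # served by the backup: "estimate v1"
```

Because the mirroring in this sketch happens after the write is accepted, any write still sitting in `pending` when the production server fails does not reach the backup. The more continuously the queue is drained, the smaller that window becomes.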

"By the time the September 11th events had taken place, we were well into our DR project for this program," Good said. "The combination of the Double-Take product and our extensive Citrix environment is what allowed us to create a fully redundant hot-site for this application."
