Minimising downtime through resilience

When it comes to mission-critical applications, resilience is the key to success. Anand Subbiah, Secure Logiq Middle East Regional Director, explains why resilience is preferable to recovery and how to ensure you achieve it.

When it comes to safety and security, resilience is key. You need your surveillance system to be working 24/7, and you need the confidence that if something does go wrong, there will be no impact on your security operations. This is particularly true in mission-critical applications, where downtime cannot be accommodated.

Take, for example, the surveillance system for a stadium. This is a crucial life safety application that is used to ensure fan safety on match days. If the solution isn’t working properly, or parts of it have a technical issue, the security team can’t properly monitor and protect the crowd, so the stadium cannot open its gates. It is the same concept with shopping malls or airports. For some of these venues, if they lose 25% of their surveillance cameras, they are required to close their doors, costing operators millions of dollars per hour.

The best way to tackle any potential issue is to be proactive and design redundant systems with resilience built in. This can be achieved through a combined hardware and software approach, where smart and informed choices help you to build a fail-safe system.

Hardware first

Choosing the right hardware is vital for every mission-critical application. When it comes to security, resilience is always preferable to recovery, and this is where robust computing can help. A fault-tolerant computing approach (99.999% uptime) will comprise two identical servers running as an always-on redundant pair. Depending on the application, these servers could be in the same physical space, or networked together and separated in different locations. With automatic real-time asynchronous switching between the servers, the result is that all of your critical applications will be available at all times.
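To put the 99.999% figure in context, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only: the function names are hypothetical, and the redundant-pair formula assumes the two servers fail independently and that either one can carry the full load, which real deployments may not fully satisfy.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def annual_downtime_minutes(availability: float) -> float:
    """Maximum downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def pair_availability(single: float) -> float:
    """Availability of a redundant pair, ASSUMING independent failures
    and that either server alone can run the workload."""
    return 1.0 - (1.0 - single) ** 2


five_nines = annual_downtime_minutes(0.99999)  # about 5.26 minutes per year
three_nines = annual_downtime_minutes(0.999)   # about 525.6 minutes per year
paired = pair_availability(0.999)              # about 0.999999 under the assumptions

print(f"99.999% uptime allows {five_nines:.2f} minutes of downtime per year")
print(f"99.9% uptime allows {three_nines:.1f} minutes of downtime per year")
print(f"Two 99.9% servers as a pair: {paired:.6f}")
```

Under these assumptions, "five nines" leaves a little over five minutes of downtime per year, which is why a single server with a recovery procedure rarely reaches it.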

The unique regulatory requirements found in the Middle East are also an important factor when it comes to hardware decisions. Retention time for captured footage is typically longer in the region than anywhere else in the world, while CCTV camera counts for individual systems are typically higher too. With more data to store for longer periods, high-capacity solutions such as SAN (storage area network) become an important tool.
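A rough sizing calculation shows how quickly camera count and retention time drive capacity. The sketch below is a simplification under stated assumptions: continuous recording at a constant bitrate, no overhead for parity or file systems, and example figures (1,000 cameras, 4 Mbps, 30 and 180 days) that are purely hypothetical and not taken from any regulation.

```python
def storage_terabytes(cameras: int, bitrate_mbps: float, retention_days: int) -> float:
    """Raw capacity in decimal terabytes for continuous recording.

    Assumes every camera streams at a constant bitrate, 24 hours a day.
    """
    bytes_per_second = cameras * bitrate_mbps * 1_000_000 / 8
    total_bytes = bytes_per_second * retention_days * 24 * 60 * 60
    return total_bytes / 1e12


# Hypothetical example figures, for illustration only
short_retention = storage_terabytes(cameras=1000, bitrate_mbps=4.0, retention_days=30)
long_retention = storage_terabytes(cameras=1000, bitrate_mbps=4.0, retention_days=180)

print(f"30 days:  {short_retention:,.0f} TB")
print(f"180 days: {long_retention:,.0f} TB")
```

Because capacity scales linearly with both camera count and retention days, a sixfold increase in retention means six times the raw storage before any protection overhead is added.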

There are also factors such as erasure coding solutions to consider. RAID has been the traditional way of approaching data protection, but it is not the only answer. There are similar solutions available that have been created to do the same job as RAID, but in a more distributed manner, dispersing the parity across numerous drives. This kind of solution can provide data protection at higher capacities and ensures rebuilds are faster. The result is that your mission-critical application is restored to peak performance sooner.
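The idea of parity can be illustrated with the simplest possible case: a single XOR parity block that allows one lost block to be rebuilt from the survivors. This is a teaching sketch only. It is not how any particular RAID controller or distributed protection scheme is implemented, and the block contents and the `xor_blocks` helper are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different drive
data = [b"CAM1", b"CAM2", b"CAM3"]

# The parity block is the XOR of all data blocks
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild it from the
# surviving data blocks plus the parity block
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Distributed schemes spread data and parity across many drives, so a rebuild can read from and write to many drives in parallel rather than being limited by a single replacement drive.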

Software solutions

If the hardware is the engine of your security system, then to get the best performance from it you need to tune it with the right software. Software that manages resources for the ‘always on’ applications common in security systems ensures that if there is a failure, application operations are not interrupted. Downtime is eliminated and data is not lost in the event of a component failure, a server failure or a localised disaster.

With a good understanding of the application, specialist solutions can be designed by linking two servers together via a virtualisation platform that pairs protected virtual machines to create a single operating environment. The software replicates the entire application environment, including data in memory, so if one physical machine fails, the application continues to run on the other without interruption or data loss. If a hardware component fails, the software should be able to substitute the healthy component from the second system until the failed component is repaired or replaced.

A further important factor is the ability to mirror the archive data so that you can meet your regulatory requirements even if your surveillance system suffers a failure. You should also look for software that allows storage and even GPUs to be replicated, giving VMS resilience even when failover is not an option.

A further way that software can enhance resilience is via system health monitoring. This is a proactive approach that monitors critical hardware components and raises an alert if something unexpected occurs. Using this kind of monitoring tool means that any issues detected can be dealt with before the system malfunctions, ensuring the systems remain functional when you need them.
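In its simplest form, health monitoring compares readings from hardware components against thresholds and raises an alert when a limit is exceeded. The sketch below is a generic illustration and does not represent any specific product: the metric names, threshold values and the `find_alerts` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    component: str  # e.g. a server or drive identifier
    metric: str     # name of the measured value
    value: float


# Hypothetical upper limits; real values depend on the hardware in use
MAX_THRESHOLDS = {
    "cpu_temp_c": 85.0,
    "disk_reallocated_sectors": 10.0,
}


def find_alerts(readings: list[Reading]) -> list[str]:
    """Return an alert message for every reading above its threshold."""
    alerts = []
    for r in readings:
        limit = MAX_THRESHOLDS.get(r.metric)
        if limit is not None and r.value > limit:
            alerts.append(f"{r.component}: {r.metric}={r.value} exceeds {limit}")
    return alerts


readings = [
    Reading("server-a", "cpu_temp_c", 62.0),
    Reading("server-b", "disk_reallocated_sectors", 24.0),
]
alerts = find_alerts(readings)
for message in alerts:
    print(message)
```

A rising count of reallocated sectors, for example, can be flagged while the drive is still working, so it can be replaced in a planned window rather than after a failure.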

With good design, and the best software components to tune your hardware and storage engine, elegant solutions can be found to meet the always-on requirements of mission-critical systems. To achieve this, you need a video specialist who understands IT, rather than an IT hardware provider who hasn’t got a clue about video.

Recommended solutions

Hardware:

Seagate SAN storage tuned for security systems with Secure Logiq servers. While many manufacturers have SAN offerings, it is the ability to tune the solution specifically for security applications and provide a full end-to-end offering that is vital for mission-critical applications.

Seagate ADAPT – Autonomic Distributed Allocation Protection Technology. ADAPT functions like RAID in that it is a data protection scheme, but it disperses the parity out to a number of drives. This structure enables the Seagate RAID controllers to take advantage of the combined performance of all those drives versus being tied to a single drive.

Software:

Stratus everRun and Secure Logiq Logiqal CORE software will both add resilience to your mission critical systems through virtualisation.

Secure Logiq Logiqal Healthcheck Pro is a surveillance-centric cloud-based hardware monitoring and alerting utility that will monitor the key components of the Secure Logiq hardware in your system.