White Paper
Symlabs Virtual Directory Server has been specially designed to provide high availability for existing LDAP directory deployments with uptime requirements of "five nines" (99.999%) or more. To support such stringent requirements, special care must be taken with the architecture and deployment. The following "best practices" show how existing customers have successfully implemented a 99.999% available LDAP directory infrastructure by including Symlabs Virtual Directory Server. As organizations move to more complex identity management and virtual directory solutions, this level of underlying reliability becomes a critical success factor.
When considering 99.999% uptime, the following assumptions are made:
The 99.999% uptime does not include any scheduled downtime. This availability means that within a calendar year, the overall downtime is barely over five minutes. That is an extremely tight window for any scheduled downtime in any case!
Symlabs Virtual Directory Server itself is a front-end to the LDAP directory service. It uses (connects to, and exchanges data with) LDAP directory servers. The combination of the back-end LDAP directory servers and the network between the machines running Virtual Directory Servers must be at least 99.999% available.
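The size of the downtime budget mentioned above can be checked with a little arithmetic. The following is a minimal sketch, assuming a non-leap year of 365 days; the function name is illustrative and not part of any Symlabs product.

```python
# Yearly downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per year")
```

For 99.999% this gives roughly 5.26 minutes per year, which is the "barely over five minutes" figure quoted above.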
There are two distinct "best practice" scenarios for deploying Symlabs Virtual Directory Server:
Multiple independent machines running individual instances of Virtual Directory Server. This is the preferred and default configuration whenever Virtual Directory Server is stateless, i.e. does NOT keep data on local storage. The majority of Symlabs Virtual Directory Server deployments are stateless. Our customers have run this configuration successfully with Virtual Directory Server on Sun SPARC with Solaris 8, 9 and 10, as well as on Solaris 9/Intel and Red Hat Enterprise Linux 4.
A clustered configuration. In this case, Symlabs Virtual Directory Server runs on a cluster of machines. To date, Symlabs customers have successfully deployed Virtual Directory Server for Sun SPARC using Sun Cluster and Veritas Cluster under Solaris 9.
Figure 1: Typical Deployment of a highly available Virtual Directory Server Installation
In order to guarantee 99.999% uptime for Virtual Directory Server, multiple instances must run simultaneously on multiple independent machines. The entire combination of machines, network, and operating system must fulfill 99.999% uptime requirements. This is the norm for NEBS-compliant systems running Solaris 8, 9, and 10 in the telecommunications industry, for example.
A hardware load-balancer is used to provide fail-over between the multiple instances running Symlabs Virtual Directory Server. Each instance running Virtual Directory Server must be sized accordingly in order to handle the full traffic expected at any time. The hardware load-balancers must periodically health-check all Virtual Directory Server instances using one of the following methods:
HTTP GET - Symlabs Virtual Directory Server can listen on any arbitrary port and provide an immediate HTTP response based on its internal status.
LDAP Search - Symlabs Virtual Directory Server can answer a special LDAP search on a special suffix (i.e., a suffix that is outside the regular back-end directory suffix).
For 99.999% uptime, a timeout of approximately 10 seconds should be configured for the load-balancer, regardless of which method is used.
Note: Most load-balancers also support simplistic health-check mechanisms, either based on the "ping" availability of a server, or a simple TCP connect. This health-check mechanism must NOT be used, because it is not reliable enough to verify whether any Virtual Directory Server instance is truly running. Instead, one of the two previously listed methods should be used.
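To illustrate the difference between an application-level health check and a plain TCP connect, the following minimal sketch performs an HTTP GET probe with the 10 second timeout suggested above. The URL, the port, and the assumption that a healthy instance answers with HTTP status 200 are hypothetical; the actual status endpoint and response format depend on how Virtual Directory Server and the load-balancer are configured.

```python
import http.client
import urllib.request


def http_health_check(url: str, timeout: float = 10.0) -> bool:
    """Return True only if the instance answers the probe with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Assumption: a healthy instance replies with status 200.
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib raises HTTPError, an OSError subclass, for 4xx/5xx).
        return False


if __name__ == "__main__":
    # Hypothetical host name and port, for illustration only.
    print(http_health_check("http://vds-1.example.com:8080/", timeout=10.0))
```

Unlike a "ping" or TCP connect, this probe only succeeds when the server process actually produces a response, so a hung instance that still accepts connections is reported as unhealthy once the timeout expires.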
Symlabs Virtual Directory Server contains the most comprehensive functionality to provide high availability to LDAP back-end servers of all virtual directory or LDAP proxy implementations. High availability is provided by routing around problems encountered, using proactive health checking and fail-over. When configured properly, this functionality supports 99.999% uptime requirements and potentially higher, as required by many organizations that utilize large multiple LDAP and virtual directory infrastructures to offer services governed by SLAs (Service Level Agreements).
Symlabs Virtual Directory Server must be configured to proactively monitor back-end servers for availability. The following health checks are possible:
LDAP SEARCH or LDAP BIND - Virtual Directory Server issues a configurable LDAP SEARCH or BIND request to a back-end server. If this request is successful, the server is deemed to be up and running. If the request returns an error, the server is deemed to be down. If the request is not answered within a configurable time-out, the request can be retried a configurable number of times, and if the request succeeds during the retries, the server is deemed to be up. Otherwise, the server is deemed to be down.
Optionally, the replication queue to the server can also be checked. If the queue is not found to be within a configurable threshold, the server is deemed to be down (since the server is lagging behind too much to be considered as "good").
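The decision rules above can be summarized in a short sketch. This is not the product's implementation; the function and parameter names are hypothetical, and the probe is modeled as a callable that returns True on success, returns False on an LDAP error, and raises TimeoutError when no answer arrives in time.

```python
from typing import Callable, Optional


def backend_is_up(
    probe: Callable[[], bool],
    max_probes: int,
    replication_lag: Optional[int] = None,
    max_lag: Optional[int] = None,
) -> bool:
    """Decide whether a back-end server should be treated as up."""
    for _ in range(max_probes):
        try:
            if not probe():
                return False  # an explicit error marks the server down at once
        except TimeoutError:
            continue  # no answer in time: retry until the probes are used up
        # The probe succeeded; optionally check the replication queue as well.
        if replication_lag is not None and max_lag is not None:
            return replication_lag <= max_lag
        return True
    return False  # every probe timed out
```

Here `max_probes` counts the total number of attempts per health-check cycle, and the replication check is skipped unless both a measured queue length and a threshold are supplied.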
When configuring health checking to back-end servers, a careful balance must be found for the maximum time required for an unsuccessful health-check and a subsequent fail-over. On one hand, the maximum yearly downtime to meet 99.999% availability is just over five minutes, so the failure detection and fail-over time should be as small as possible. On the other hand, "false negatives" should be avoided in order to prevent "flip-flops", i.e. fail-overs based on a slow server that is mistaken to be down because of a time-out threshold that is set too low.
Symlabs Virtual Directory Server supports multiple fail-over methods that can be harnessed depending on the LDAP back-end infrastructure:
LDAP master server implementations
LDAP replica server implementations
Multi-site LDAP server configurations
The following diagrams illustrate how Symlabs Virtual Directory Server performs in each of these situations:
The recommended fail-over algorithm for LDAP master servers (write requests) is fail-over with fail-back to the first available server. Multiple LDAP master servers are configured in a list ordered by decreasing preference. Virtual Directory Server always prefers the first available server in the list, i.e. the available server with the highest preference. This avoids simultaneous write requests to different master servers as far as possible, which minimizes the replication collisions that can theoretically occur when such writes are issued. If the server with the highest preference becomes unavailable, requests fail over to the server with the next highest preference. When the original server (with a higher preference) becomes available again, all new requests are routed to it again, and Virtual Directory Server stops using the server with the lower preference.
Figure 2: Fail-over and fail-back for master servers
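The fail-over and fail-back behavior for master servers amounts to always choosing the first available entry in the preference list. The following is a minimal sketch of that selection rule; the host names and the function name are hypothetical.

```python
from typing import Optional, Sequence, Set


def select_master(priority_list: Sequence[str], available: Set[str]) -> Optional[str]:
    """Return the highest-preference master that is currently available."""
    for server in priority_list:  # ordered from highest to lowest preference
        if server in available:
            return server
    return None  # no master is reachable


masters = ["master-a", "master-b", "master-c"]  # hypothetical host names
print(select_master(masters, {"master-a", "master-b", "master-c"}))  # master-a
print(select_master(masters, {"master-b", "master-c"}))              # master-b (fail-over)
print(select_master(masters, {"master-a", "master-b", "master-c"}))  # master-a (fail-back)
```

Because the choice is recomputed from the list for every new request, fail-back needs no separate logic: as soon as the preferred server is marked available again, it is selected.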
Replica Server Implementations
The recommended fail-over mechanism for LDAP replica servers (read requests) is load-balancing using a connection pool, combined with fail-over.
Figure 3: Load-balancing and fail-over for connection pools to replica servers
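As an illustration of load-balancing combined with fail-over for read requests, the sketch below rotates through the replicas and skips any that are currently marked unavailable. Round-robin is an assumption made for this example; the source does not state which balancing policy the connection pool uses, and the class and host names are hypothetical.

```python
import itertools
from typing import Optional, Sequence, Set


class ReplicaBalancer:
    """Round-robin selection over replicas, skipping unavailable ones."""

    def __init__(self, replicas: Sequence[str]) -> None:
        self._replicas = list(replicas)
        self._cursor = itertools.cycle(range(len(self._replicas)))

    def next_replica(self, available: Set[str]) -> Optional[str]:
        # Try each replica at most once per call, starting after the last pick.
        for _ in range(len(self._replicas)):
            candidate = self._replicas[next(self._cursor)]
            if candidate in available:
                return candidate
        return None  # no replica is available


balancer = ReplicaBalancer(["replica-1", "replica-2", "replica-3"])
everything_up = {"replica-1", "replica-2", "replica-3"}
print([balancer.next_replica(everything_up) for _ in range(3)])
```

When a replica fails, its share of the read traffic is spread over the remaining replicas, and it rejoins the rotation as soon as it is marked available again.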
Multi-Site Configurations
The recommended fail-over method for multi-site configurations is load-balancing with weighted preferences. Symlabs Virtual Directory Server will balance the load between local back-end servers. If too few local servers are available, fail-overs to remote back-end servers (with a lower preference) will occur. Automatic fail-back will happen when local servers become available again.
Figure 4: Load-Balancing and Fail-over for multi-site configurations
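The multi-site behavior can be pictured as balancing across the most preferred tier of servers and adding lower-preference tiers only when too few servers are available. The sketch below is one possible reading of "weighted preferences": the numeric preference values, the `min_servers` threshold, and all names are assumptions made for illustration.

```python
from typing import Dict, List, Set


def eligible_servers(
    preferences: Dict[str, int], available: Set[str], min_servers: int
) -> List[str]:
    """Return the servers to balance across.

    preferences maps a server name to its preference (higher is more
    preferred). Lower-preference tiers are added only while fewer than
    min_servers servers have been collected.
    """
    chosen: List[str] = []
    for level in sorted(set(preferences.values()), reverse=True):
        tier = sorted(
            name for name, pref in preferences.items()
            if pref == level and name in available
        )
        chosen.extend(tier)
        if len(chosen) >= min_servers:
            break
    return chosen


# Hypothetical topology: two local and two remote back-end servers.
sites = {"local-1": 100, "local-2": 100, "remote-1": 50, "remote-2": 50}
print(eligible_servers(sites, {"local-1", "local-2", "remote-1", "remote-2"}, 2))
print(eligible_servers(sites, {"local-1", "remote-1", "remote-2"}, 2))
```

In the first call only the local servers are used. In the second call one local server is down, so the remote servers join the set. Fail-back is automatic, because the set is recomputed from the current availability.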
The concept of "Five Nines", i.e. 99.999% uptime, implies that some failures will still occur. The following sections list potential failure scenarios and how they are handled. Estimated times for temporary outages are also given, based on past experience with Symlabs Virtual Directory Server in combination with common LDAP servers.
Failures on the host system where Virtual Directory Server is running are typically mitigated by running Virtual Directory Server on a redundant infrastructure. This is done in combination with a hardware load-balancer that is used only to provide fail-over between the multiple Symlabs Virtual Directory Server instances, as described above.
Symlabs Virtual Directory Server can transparently route around problems encountered when back-end LDAP Directory servers become unavailable. This is true in any situation, for example where the IP address hosting the back-end LDAP service is completely unreachable, the back-end server connects but is hung, or there is no LDAP service listening on a port.
Since Virtual Directory Server makes proactive checks to all back-end servers for their current service health, there may be a small delay between the time when a service is going down, and the time when Virtual Directory Server notices this and takes appropriate action. This delay can be controlled by configuring the following health-checking parameters:
Parameter | Description
Timeout for each health probe | The maximum number of seconds that Virtual Directory Server should wait for a back-end server to respond. If this timeout is exceeded, the health probe is considered to have failed.
Maximum number of health probes to each server | Virtual Directory Server can probe a server again within the same health-check cycle if the previous probe has failed.
Health triggering interval | How often a health check should occur.
The maximum delay time between a back-end server becoming unavailable, and Symlabs Virtual Directory Server noticing this is:
Health triggering interval + (Maximum number of health probes to each server * Timeout for each health probe)
Note that the delay time is not dependent on the number of servers monitored - if the number of servers that needs to be monitored increases, this does not increase the delay time in any meaningful way.
Therefore, if a health check is triggered every 45 seconds, the timeout for each probe is 15 seconds, and up to two health probes are sent, the maximum delay before a problem is detected is 45 + 2 * 15 = 75 seconds. If a health check is run every 20 seconds, with a maximum of one health probe and a 15 second timeout, the maximum delay is 20 + 1 * 15 = 35 seconds. Although at first glance it would seem obvious to reduce the timeouts and increase the frequency of health checks, care must be taken to give the health probes enough time and avoid "false negatives".
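The two examples above can be reproduced with a few lines of code, which may be convenient when comparing candidate settings. This is a minimal sketch; the function and parameter names are illustrative and do not correspond to actual configuration keys.

```python
def max_detection_delay(interval_s: float, max_probes: int, probe_timeout_s: float) -> float:
    """Worst-case seconds between a back-end failure and its detection."""
    return interval_s + max_probes * probe_timeout_s


print(max_detection_delay(45, 2, 15))  # 75
print(max_detection_delay(20, 1, 15))  # 35
```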
In most configurations, Symlabs Virtual Directory Server takes only a few seconds to start. In fact, most known configurations start in less than 5 seconds, and many of them start in just over a second. Start-up time typically depends on the following factors:
The complexity of Virtual Directory Server's configuration
Symlabs Virtual Directory Server, upon start, compiles part of its configuration into a proprietary virtual machine code that is then executed within Virtual Directory Server's virtual machine. Compilation is usually very fast - tens of thousands of lines are usually compiled in less than a second. A very complex deployment scenario with many different triggered scripts took only about 2 seconds on an UltraSPARC machine with 4 CPUs. Therefore, compiling even very extensive, complex configurations is not expected to affect start-up time in any meaningful way.
Time required for opening connection pools
Many configurations define connection pools to one or more servers. These are typically pre-initialized and pre-authenticated, and this has to happen at start-up. Symlabs Virtual Directory Server will attempt to open all connections for its connection pools asynchronously, which means that many connections can be opened in parallel. However, depending on the latency of the back-end LDAP servers, this may or may not be an issue. Experience has shown that Virtual Directory Server usually opens and pre-authenticates (using LDAP BIND) more than 20 connections per second, and even more in many cases, if the back-end servers can keep up. However, if connections to back-end LDAP servers have an unusually high latency, or the servers are unreachable, this can dramatically affect start-up time, as described below.
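To show why opening pool connections in parallel keeps start-up time close to the latency of a single connection, the sketch below simulates it with Python's asyncio. No real LDAP connection is made: `open_connection` only sleeps to stand in for the connect and BIND round trips, and the server name, pool size, and latency are hypothetical values.

```python
import asyncio
import time


async def open_connection(server: str, latency_s: float) -> str:
    """Stand-in for connect plus LDAP BIND; it only simulates the latency."""
    await asyncio.sleep(latency_s)
    return f"connection to {server}"


async def open_pool(server: str, size: int, latency_s: float) -> list:
    # Start every connection attempt at once and wait for all of them.
    attempts = (open_connection(server, latency_s) for _ in range(size))
    return await asyncio.gather(*attempts)


if __name__ == "__main__":
    started = time.perf_counter()
    pool = asyncio.run(open_pool("ldap-backend-1", size=20, latency_s=0.1))
    elapsed = time.perf_counter() - started
    print(f"opened {len(pool)} connections in {elapsed:.2f}s")
```

Opened one after another, 20 connections at 0.1 seconds each would take about 2 seconds. Opened concurrently, the total stays close to the latency of the slowest single connection, which is why back-end latency, rather than pool size, tends to dominate.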
Even though Symlabs Virtual Directory Server is able to start up very quickly, failures that occur with back-end LDAP servers can prolong the start-up time, or rather the time until Virtual Directory Server is able to fulfill all requests efficiently.
When back-end servers are unreachable, start-up may only complete once the next scheduled health check has completed. The worst case scenario here is when TCP connections to the back-end servers time out, because in this case the health probe needs to wait until the time-out has expired. If the back-end server is reachable, but there is no LDAP service listening on it, then Virtual Directory Server will receive a TCP "Connection Refused" error straight away, and will mark the back-end as failed immediately without having to wait until the time-out. Therefore, the maximum start-up time can be considered a few seconds more than the time that may be required to detect a failure (the calculation is given above).
Symlabs Virtual Directory Server is the fastest and most flexible virtual directory product in the industry. With appropriate care to implement a suitable supporting architecture, it is capable of delivering the extremely high availability required for the most demanding environments. With built-in health monitoring and support for multiple failure recovery modes, Virtual Directory Server can dramatically improve the infrastructure that is typically deployed for a wide variety of applications. From customer access for wireless networks to identity management in financial services, Symlabs Virtual Directory Server is gaining recognition as a key component of high performance LDAP systems.
Symlabs has established itself as a leading provider of high performance components for LDAP and Identity Management solutions. The company's products include the fastest virtual directory server in the industry, and a broad, flexible, powerful suite of federation components compliant with Liberty Alliance and other standards. They integrate seamlessly into the identity management infrastructure provided by Symlabs' strategic partners Novell, HP, Oracle, Microsoft, IBM, and Sun. Companies such as network operators, financial services providers, and large enterprises that require industry-leading speed, flexibility, and reliability will find Symlabs products to be a perfect fit. For more information about Symlabs, please visit www.symlabs.com.