LDAP



There is a new project on the block: freeIPA. This is an effort to shore up the existing identity infrastructure, such as Kerberos, LDAP, Samba and RADIUS, and make it all work together out of the box. For version 1 we’ll be concentrating on the I for identity, and in later versions we’ll be adding the very important policy and audit capabilities. If this kind of thing interests you enough to want to contribute, we have plenty to do.

Project blurb:

FreeIPA (so far) is an integrated solution combining:

* Linux (currently Fedora)
* Fedora directory server
* FreeRADIUS
* MIT Kerberos
* NTP
* DNS
* Samba
* Web and commandline provisioning and administration tools

The goal of this version is to allow an administrator to quickly install, set up, and administer one or more servers for centralized authentication and identity management.

Motivation

For efficiency, compliance and risk mitigation, organizations need to centrally manage and correlate vital security information including

* Identity (machine, user, virtual machines, groups, authentication credentials)
* Policy (configuration settings, access control information)
* Audit (events, logs, analysis thereof)

Because of its vital importance and the way it is interrelated, we think identity, policy, and audit information should be open, interoperable, and manageable. Our focus is on making identity, policy, and audit easy to centrally manage for the Linux and Unix world. Of course, we will need to interoperate well with Windows and much more.

We are looking to take concrete and useful steps and so have chosen initially to focus on Identity solutions for the Unix/Linux world with some support for Windows login.

We intend to tackle centralized management of policy and audit information next.

I haven’t blogged in a while, and the reason for that is really quite simple: when it comes to blogs, code comes first. Actually, that is probably better written as: when it comes to %x, code comes first.

A while ago I wrote about some of the issues that some people have with multi-master replication in a directory server. Something that comes up quite a bit on the Fedora directory server discussion lists is a request to automatically generate Unix uid and gid values in the uidNumber and gidNumber attributes of the posixAccount objectclass. As the “LDUP considered harmful” document points out in section 4.2, Allocation of serial numbers, this is hard to do in a multi-master replication environment because two or more masters could allocate the same serial number at roughly the same time without the ability to detect the clash until it is too late to prevent it. That would, of course, be bad.

I recently added the FDS solution to this problem - a general purpose serial number allocation plugin which is modestly called the distributed numeric assignment plugin, or “dna” for short. It will generate unique serial numbers in an MMR environment, including uidNumber and gidNumber. I wanted to solve this problem because allocating serial numbers is a reasonable and quite common thing to want to do, the directory server should probably do it for you, and as I mentioned, this subject does come up with reasonable frequency on the Fedora directory server discussion list. Essentially there are two main approaches to this problem in the wild:

  1. Have a single master do the allocation and then replicate the result to the other masters. This does not really solve the problem because such a system has two major undesirable properties: there is a single point of failure at the allocator, and there is a replication delay between creating or modifying an entry and having that entry become “whole” by having its serial numbers catch up with it.
  2. Have all masters get in a huddle and divvy out blocks of serial numbers per master. While this allows masters to independently allocate serial numbers (a good goal I’d say), it does mean that the masters must cooperate in order for one to get a new block of numbers. Perhaps that is ok for some systems, but it does require that all masters have at least indirect access to all other masters in order for such a protocol to work, and it wouldn’t be pretty, likely having lots of network chatter. That all seems a little too coupled for a loosely coupled replication scheme anyway.

A third approach is to combine those two by having a single master do the divvying. That produces a system with a single point of failure but gives the admin some time to get the allocating server back up before the system grinds to a halt one server at a time. So at least you know in advance you are doomed.

Yet another approach might be to divvy up large blocks of the number space among the masters. So large, in fact, that you bet on never creating more serial numbers at any one master than are available in the block. This is feasible: 2 billion or so could be split quite a few ways before you get close to the probability of overflow, if we are talking about one allocation per user entry for instance. However, once the space has been split between your masters, what happens when you add a master? You’ve already allocated your number space, so how do you reset it? Keep spare blocks? How many spare blocks is enough? How big a block is enough?
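To put rough numbers on that bet, here is a minimal Python sketch. It assumes the number space tops out at 2**31 - 1 (my reading of “2 billion or so”), and the names NUMBER_SPACE and split_blocks are made up for illustration; none of this is directory server configuration.

```python
# Illustrative only: the ceiling and the helper name are assumptions.
NUMBER_SPACE = 2**31 - 1  # "2 billion or so"


def split_blocks(block_count):
    """Split 1..NUMBER_SPACE into equal, contiguous (first, last) blocks."""
    size = NUMBER_SPACE // block_count
    return [(i * size + 1, (i + 1) * size) for i in range(block_count)]


blocks = split_blocks(4)
for first, last in blocks:
    # Print each block's bounds and how many numbers it holds.
    print(first, last, last - first + 1)
```

Four masters get over half a billion numbers each, which is plenty. The catch is visible in the call itself: block_count has to be chosen up front, so a fifth master only fits if spare blocks were set aside.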

Of course, if one were to implement a multi-master attribute uniqueness scheme then that could be relied upon to reject non-unique serial numbers. Such a scheme would also be very network chatty and not in the spirit of loosely coupled replication. In any case, attempting to add a serial number and waiting for a rejection from across the network before trying again with the next serial number in line doesn’t sound too hot in the performance department to me.

Needless to say, my solution involved none of this. Actually, the answer I came up with is quite simple - don’t allocate a block, allocate a sequence. So, for example, master 1 allocates the sequence 1, 4, 7, and so on, while master 2 allocates 2, 5, 8. There are only two masters in that example, but astute readers will recognize that a third master could be added without any reconfiguration of the first two. Add a fourth? Now you need to reset the existing masters, giving them a starting number higher than previously allocated and a new sequence interval equal to or higher than the number of masters. This does of course produce fragmented sequences, so that if you were to combine the lists of numbers from all masters there would be some numbers missing towards the end of the list. Typically though, systems that rely upon this kind of feature value “unique” over “goes up in ones.” That fact also means that a typical deployment would make the sequence interval quite high in order to avoid the possibility that sequence configuration would need to be reset as masters are added.
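The idea is small enough to sketch in a few lines of Python. This is only an illustration of the arithmetic, not the plugin's code: SequenceAllocator and reset_masters are hypothetical names, and the real plugin's configuration and storage are not shown.

```python
from dataclasses import dataclass


@dataclass
class SequenceAllocator:
    """One master's allocator: hands out start, start + interval, ..."""
    next_value: int  # next number this master will hand out
    interval: int    # step size; must be >= the number of masters

    def allocate(self):
        value = self.next_value
        self.next_value += self.interval
        return value


def reset_masters(highest_allocated, master_count):
    """Restart every master above the highest number used so far."""
    base = highest_allocated + 1
    return [SequenceAllocator(base + i, master_count) for i in range(master_count)]


# Interval 3 with two masters leaves a free slot for a third.
master1 = SequenceAllocator(next_value=1, interval=3)
master2 = SequenceAllocator(next_value=2, interval=3)
ids1 = [master1.allocate() for _ in range(3)]  # [1, 4, 7]
ids2 = [master2.allocate() for _ in range(3)]  # [2, 5, 8]

# The third master slots in without touching the first two.
master3 = SequenceAllocator(next_value=3, interval=3)
ids3 = [master3.allocate() for _ in range(2)]  # [3, 6]

# A fourth master forces a reset: new start above 8, interval of at least 4.
four = reset_masters(highest_allocated=max(ids1 + ids2 + ids3), master_count=4)
ids4 = [m.allocate() for m in four]  # [9, 10, 11, 12]
print(ids1, ids2, ids3, ids4)
```

No master ever looks at another master's state, which is the whole point. The only shared knowledge is the interval and each master's starting offset.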

The major advantages of this scheme are independent serial number allocation, economical use of the number space, no single point of failure, no network chatter typical of cooperative schemes, and a warm fuzzy feeling inside every time you add a new user to your system. It’s a win/win I think.

Oh, and if you really want to, you can configure the plugin to use “blocks” and allocate serial numbers monotonically. That would also be the typical single master deployment configuration.

Jim Yang of Identyx has recently been busy cooking up a little virtual directory coolness for Fedora directory server. That is, Penrose is now integrated with FDS. Now that is what I call an Xmas present.

If you are looking for a software-based hot failover solution for directory server, consider using multi-master replication. Really. Even if you are one of those who cannot get along with the characteristics of loosely-consistent multi-master replication models, MMR can provide the best of both worlds if configured appropriately. Here’s the secret: despite allowing up to four masters, not one of those masters will force your clients to use it. In other words, you use one of the masters as your write master, which is assigned the live virtual IP. Set up your heartbeat script to flip the virtual IP to one of the other masters when a problem is detected. Then you can work on bringing up the failed master. You won’t need to script any complicated consistency recovery mechanisms since the multi-master replication logic will recover consistency for you - that is what it is designed to do.

Using this topology you will have all the benefits of a single master plus up to three ready to go backup masters.
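The decision the heartbeat script has to make on each beat can be sketched in Python as below. This is a toy under stated assumptions: pick_vip_holder, the master names, and the is_healthy callback are all hypothetical. A real check would probably be an LDAP bind or search against each master, and actually moving the virtual IP is platform specific and left out.

```python
def pick_vip_holder(masters, current, is_healthy):
    """Return the master that should hold the virtual IP after one heartbeat.

    Stays on the current write master while it is healthy; otherwise
    fails over to the first healthy backup, or None if all are down.
    """
    if is_healthy(current):
        return current
    for candidate in masters:
        if candidate != current and is_healthy(candidate):
            return candidate
    return None


masters = ["master1", "master2", "master3", "master4"]  # hypothetical names
down = {"master1"}  # pretend the write master just failed


def is_healthy(name):
    # Stand-in for a real health probe.
    return name not in down


holder = pick_vip_holder(masters, "master1", is_healthy)
print(holder)
```

Note that the function does not fail back automatically when the old master returns. It only moves the virtual IP when the current holder is unhealthy, which keeps clients on a single write master at any one time.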

I know there is some talk about how bad for you multi-master replication is, and I am also aware that the source of the arguments generally comes from this document, an IETF informational draft that expired 3 years ago. Since MMR is a feature of Fedora directory server I thought I would point out the response to that document on the FDS site.

Here it is:

This is in response to http://www.watersprings.org/pub/id/draft-zeilenga-ldup-harmful-02.txt

Lightweight Directory Update Protocol (LDUP) was the now defunct IETF LDAP replication working group. Many years of work went into the attempt to create a standard LDAP multi-master replication protocol, but little came of it. The Fedora DS MMR protocol is based on this work.

The author of the paper is technically correct. Any loosely consistent replication model can lead to inconsistencies, including single master replication models. I won’t go into too many details, but if you really want to know about different replication consistency models, read this - http://www2.parc.com/csl/projects/bayou/pubs/uist-97/Bayou.pdf

In general, the only way to ensure absolute consistency is to use something like two phase commit, used by some RDBMS products. In this case, your write operation does not return a response until that write operation has been successfully propagated to all systems in the replication topology (or rolled back from all in the case of failure).

There is no way for any LDAP loosely coupled replication to guarantee “read your writes” consistency. As an example, consider a single master case with one or more read only replicas. End user clients will typically be pointed to one or more read only replicas to use for searches, for load balancing or failover purposes. If the client makes a write request, it will typically be sent a referral to the master, where the write operation will be performed. The write operation will return immediately to the client, without waiting for that write operation to be propagated to the replicas. If the client immediately performs a search request on a replica (which it has been configured to do), that data may or may not be available, depending on the replication schedule, speed of the master, write performance of the replica, etc.
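The single master example can be played out with a toy model in Python. The Master and Replica classes, the entry name, and the mail value are all invented for illustration. They model only the timing of the write and the later replication, not LDAP itself.

```python
class Replica:
    """Read only copy that clients search."""

    def __init__(self):
        self.data = {}

    def search(self, dn):
        return self.data.get(dn)


class Master:
    """Accepts writes and replicates them some time later."""

    def __init__(self):
        self.data = {}
        self.pending = []  # changes not yet sent to the replica

    def write(self, dn, value):
        # Returns to the client without waiting for the replica.
        self.data[dn] = value
        self.pending.append((dn, value))

    def replicate_to(self, replica):
        # Runs whenever the replication schedule allows.
        for dn, value in self.pending:
            replica.data[dn] = value
        self.pending.clear()


master, replica = Master(), Replica()
master.write("uid=alice", "alice@example.com")  # client was referred to the master
before = replica.search("uid=alice")  # immediate search on the replica
master.replicate_to(replica)          # replication catches up
after = replica.search("uid=alice")
print(before, after)
```

The first search comes back empty even though the write succeeded, and the second one finds the entry. Nothing went wrong in between; the client simply read before replication caught up.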

Any kind of loosely coupled replication breaks ACID:

  • Consistency - the “view” of the data is different depending on which server you talk to
  • Isolation - the update is visible on the master before it is visible on a consumer
  • Durability - it’s possible that the change could be lost or refused due to an error condition on a replica

However, the “LDUP harmful” document states the following:

It is noted that X.500 replication (shadowing) model allows for
transient inconsistencies to exist between the master and shadow
copies of directory information.  As applications which update
information operate upon the master copy, any inconsistencies in
shadow copies are not evident to these applications.

This would be fine except that almost no real world deployments follow this model. Replicas are used for load balancing, failover, and data locality (e.g. putting a copy of the corporate data in a remote office). Therefore, in almost every LDAP deployment, clients _use_ the “shadow copies” directly. In almost every case, load balancing, failover, data locality, “no single point of failure” are the most salient concerns of network architects, and absolute data consistency is secondary.

The “LDUP harmful” document then goes on to give specific examples of where MMR can lead to inconsistencies. In almost every case, the MMR protocol can handle the inconsistency in a logical manner, or flag the inconsistency for operator intervention (with the operational attribute nsds5ReplConflict).

So, knowing that, you have to make your own trade-off between convenience and absolute consistency. MMR gives you the ability to have data locality with writes and no single point of write failure, at the cost of extra administrative overhead - monitoring, looking for conflicts (e.g. search for nsds5ReplConflict=*), and manually resolving them. MMR has been deployed for years (starting in 3/2001 with iPlanet DS 5.0), and in the vast majority of cases, data inconsistency just doesn’t happen.
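As a sketch of what that monitoring search might look like, the Python below only assembles an OpenLDAP ldapsearch command line around the nsds5ReplConflict=* filter. The host, suffix and bind DN are placeholders, the helper name is made up, and nothing is executed, so treat it as a starting point rather than a tested procedure.

```python
import shlex


def conflict_search_command(uri, base_dn, bind_dn):
    """Build an ldapsearch argv that lists entries flagged as conflicts."""
    return [
        "ldapsearch",
        "-x",               # simple authentication
        "-H", uri,          # server to query
        "-D", bind_dn,      # identity to bind as
        "-W",               # prompt for the password
        "-b", base_dn,      # search base
        "(nsds5ReplConflict=*)",
        "nsds5ReplConflict",  # operational attribute, so request it by name
    ]


cmd = conflict_search_command(
    "ldap://ds1.example.com",       # placeholder host
    "dc=example,dc=com",            # placeholder suffix
    "cn=Directory Manager",         # placeholder bind DN
)
print(shlex.join(cmd))
```

Running the printed command from cron and alerting on any returned entry would be one way to cover the “looking for conflicts” part of the overhead.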

In other words, FDS gives you the choice of using multi-master replication in your deployment topology. Don’t like it? Then nobody is going to make you use it. Need it? Then it is there. When it comes to complex functionality such as MMR, it is comforting to know that it has the robustness imbued by many years of bug fixing and deployment in demanding environments. It’s ready to use, no pioneering required.

Recently Carla Schroder wrote a three part series of articles on using Fedora directory server. She has some nice things to say about the server:

FDS scales nicely from tiny test systems to huge enterprise systems, which comes as no surprise to anyone who knows its history. It began life as the Netscape directory server (NDS), then became the iPlanet directory server, and then the SunONE directory server. You’ll find all of these ancestral LDAP servers still in service, handling very large loads with ease. To quote the FDS Web site: “The Fedora directory server is hardened by real world use, full featured, scales like a banshee, and already handles many of the largest LDAP deployments in the world.” So you could start your LDAP education with FDS, and stick with it as your needs grow.

Check out the articles here:

File under open source LDAP.

Brian K. Jones has blogged about his experience with the Fedora directory server. He likes it.

In contrast, the Fedora directory server has huge, enormous, steaming wads of documentation, a wiki that has a huge amount of more task-specific documentation written by those in the community who waded through one project or another and lived to tell about it, a mailing list that is extremely user-friendly, and even an IRC channel where you can talk directly to some of the folks writing the code, who are an immense help, and whose wisdom often makes it into the published wiki documentation for all to benefit from.

So, in short, Fedora directory server is a blazing fast directory server that supports multimaster replication (should you choose to use it), hot backups and restores, access control changes (and many other changes) without a server restart, running multiple instances on a single machine, and it stores its entire configuration in the directory itself, making it completely manageable using the LDAP protocol itself. If you’re a GUI fan, there’s also a graphical interface that lets you do everything from adding new users, to adding new objectclass and attribute definitions, to managing certificates and viewing logs. What’s more, with its PassSync utility, it can synchronize passwords easily with an Active directory server.

That about sums things up I’d say. Though, if you would like full enterprise support I recommend the Red Hat badged version.