Thu 15 Feb 2007
I haven’t blogged in a while, and the reason for that is really quite simple: when it comes to blogs, code comes first. Actually, that is probably better written as: when it comes to %x, code comes first.
A while ago I wrote about some of the issues that some people have with multi-master replication in a directory server. Something that comes up quite a bit on the Fedora directory server discussion lists is a request to automatically generate Unix uid and gid values in the uidNumber and gidNumber attributes of the posixAccount objectclass. As the “ldup considered harmful” document points out in section 4.2, “Allocation of serial numbers,” this is hard to do in a multi-master replication environment because two or more masters could allocate the same serial number at roughly the same time without the ability to detect the clash until it is too late to prevent it. That would, of course, be bad.
I recently added the FDS solution to this problem - a general purpose serial number allocation plugin which is modestly called the distributed numeric assignment plugin, or “dna” for short. It will generate unique serial numbers in an MMR environment, including uidNumber and gidNumber. I wanted to solve this problem because allocating serial numbers is a reasonable and quite common thing to want to do, the directory server should probably do it for you, and as I mentioned, this subject does come up with reasonable frequency on the Fedora directory server discussion list. Essentially there are two main approaches to this problem in the wild:
- Have a single master do the allocation and then replicate the result to the other masters. This does not really solve the problem, because such a system has two major undesirable properties: there is a single point of failure at the allocator, and there is a replication delay between creating or modifying an entry and having that entry become “whole” by having its serial numbers catch up with it.
- Have all masters get in a huddle and divvy out blocks of serial numbers per master. While this allows masters to independently allocate serial numbers (a good goal I’d say), it does mean that the masters must cooperate in order for one to get a new block of numbers. Perhaps that is ok for some systems, but it does require that all masters have at least indirect access to all other masters in order for such a protocol to work, and it wouldn’t be pretty, likely having lots of network chatter. That all seems a little too coupled for a loosely coupled replication scheme anyway.
A third approach is to combine those two by having a single master do the divvying. That produces a system with a single point of failure but gives the admin some time to get the allocating server back up before the system grinds to a halt one server at a time. So at least you know in advance you are doomed.
Yet another approach might be to divvy up large blocks of the number space among the masters. So large, in fact, that you bet on never creating more serial numbers at any one master than are available in the block. This is feasible: 2 billion or so could be split quite a few ways before you get close to the probability of overflow, if we are talking about one allocation per user entry for instance. However, once the space has been split between your masters, what happens when you add a master? You’ve already allocated your number space, so how do you reset it? Keep spare blocks? How many spare blocks is enough? How big a block is enough?
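To put rough numbers on the big-block idea, here is a minimal sketch. It assumes the “2 billion or so” ceiling is 2**31 - 1 and picks 16 blocks out of thin air; the function name is mine and has nothing to do with any directory server.

```python
def split_number_space(low, high, block_count):
    """Split the inclusive range [low, high] into equal contiguous blocks.

    Any remainder at the top of the range is simply left unused.
    """
    size = (high - low + 1) // block_count
    return [(low + i * size, low + (i + 1) * size - 1)
            for i in range(block_count)]


# Assumption: uidNumber is capped at 2**31 - 1, and we guess at 16 blocks.
blocks = split_number_space(1, 2**31 - 1, 16)
block_size = blocks[0][1] - blocks[0][0] + 1
print(blocks[0], block_size)
```

With four masters in use that leaves twelve spare blocks of roughly 134 million numbers each, which sounds generous until you remember that the 16 was a guess made on day one.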
Of course, if one were to implement a multi-master attribute uniqueness scheme then that could be relied upon to reject non-unique serial numbers. Such a scheme would also be very network chatty and not in the spirit of loosely coupled replication. In any case, attempting to add a serial number and waiting for a rejection from across the network before trying again with the next serial number in line doesn’t sound too hot in the performance department to me.
Needless to say, my solution involved none of this. Actually, the answer I came up with is quite simple - don’t allocate a block, allocate a sequence. So, for example, with a sequence interval of three, master 1 allocates the sequence 1, 4, 7, and so on, while master 2 allocates 2, 5, 8. There are only two masters in that example, but astute readers will recognize that a third master (allocating 3, 6, 9) could be added without any reconfiguration of the first two. Add a fourth? Now you need to reset the existing masters, giving them a starting number higher than any previously allocated and a new sequence interval equal to or higher than the number of masters. This does of course produce fragmented sequences, so that if you were to combine the lists of numbers from all masters there would be some numbers missing towards the end of the list. Typically though, systems that rely upon this kind of feature value “unique” over “goes up in ones.” That fact also means that a typical deployment would make the sequence interval quite high in order to avoid the possibility that the sequence configuration would need to be reset as masters are added.
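The sequence idea is small enough to sketch in a few lines. This is only an illustration of the arithmetic, not the plugin’s code: the class, the `rebase` helper and their names are hypothetical, and the real plugin keeps its state in the directory rather than in memory.

```python
from dataclasses import dataclass


@dataclass
class SequenceAllocator:
    """Hands out next_value, next_value + interval, ... with no coordination."""
    next_value: int
    interval: int

    def allocate(self) -> int:
        value = self.next_value
        self.next_value += self.interval
        return value


def rebase(allocators, master_count, interval):
    """Hypothetical reset: restart every master above anything already used."""
    if interval < master_count:
        raise ValueError("interval must be at least the number of masters")
    # Every value handed out so far is below its allocator's next_value.
    base = max(a.next_value for a in allocators)
    return [SequenceAllocator(base + i, interval) for i in range(master_count)]


master1 = SequenceAllocator(1, 3)
master2 = SequenceAllocator(2, 3)
seq1 = [master1.allocate() for _ in range(3)]   # 1, 4, 7
seq2 = [master2.allocate() for _ in range(3)]   # 2, 5, 8

# A third master joins later without touching the first two.
master3 = SequenceAllocator(3, 3)
seq3 = [master3.allocate() for _ in range(2)]   # 3, 6

# A fourth master forces a reset to a higher start and a wider interval.
masters = rebase([master1, master2, master3], master_count=4, interval=4)
after_reset = [m.allocate() for m in masters]   # 11, 12, 13, 14
```

After the reset the numbers 9 and 10 are never handed out, which is the fragmentation mentioned above.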
The major advantages of this scheme are independent serial number allocation, economical use of the number space, no single point of failure, no network chatter typical of cooperative schemes, and a warm fuzzy feeling inside every time you add a new user to your system. It’s a win/win I think.
Oh, and if you really want to, you can configure the plugin to use “blocks” and allocate serial numbers monotonically. That would also be the typical single master deployment configuration.
4 Responses to “Serial numbers and MMR”
March 9th, 2007 at 7:51 pm
I’m not sure how I came across your site via google. I found your uidNumber blog interesting. I came to similar conclusions as you did with your three options. However, I ended up using a fourth scenario. I have found that in most medium to large companies the HR system typically uses some scheme to assign a unique ‘employee number’ within the HR application (SAP, CYBORG, etc.). Typically this info is then injected/synced/etc into a general ldap directory (AD,SUN,Novell,RHDS,etc).
My solution was to use an algorithm which would calculate the uidNumber based on the employee number. This always ensures that regardless of which master server you update, it will always generate the same uidNumber for a given employee number.
The onus now hangs with the HR system to ensure that two or more employees do not share the same employee number which is highly unlikely and would get resolved very quickly if it did - people like to get paid after all. This approach in my opinion is more desirable because it ensures that uidNumbers can be pre-generated and assigned to every person even if they do not use or need a unix account.
Identity management is always the key in securing the IT infrastructure. It is mind boggling the number of times I have come to corporate environments where users have different uidNumbers on different hosts. Even when sysadmins seem to try to get a handle on the problem, they invent ridiculous scripts to keep password/shadow files, NIS and LDAP in sync.
For some reason, people don’t seem to realize that unix only cares about the uidNumber and not the uid. This is especially true in large companies that grew by acquiring other smaller companies. Rather than fix the problem, regional groups tend to only focus on their infrastructure and miss the bigger identity problem.
I think that most directory vendors have missed and continue to miss the boat on this. M$ AD does this but it is only useful in a purely M$ environment because they use a 128 bit label and not a uidNumber. AD doesn’t solve the problem in a mixed unix/pc environment. Although, I suppose one could use the lower 32 bits of the 128 bit label and assign that as the uidNumber for unix. You could do this with most LDAPs that use the UUID attribute for a DN. Hmmm, I wonder what the collision rate would be?
Out of curiosity, I once even used a CRC32 function to yield a unique uidNumber for a given DN. I generated a ldif file that contained 20 million DNs and did not get one collision with this.
Most modern unix systems support 32 bit uidNumbers. The “plugins” provided by most ldap vendors usually only do uidNumbers up to 65535 and most of them warn that plugins will impact performance and none of them will support any third party plugins - I don’t blame them.
How difficult is it for a directory vendor (commercial or opensource) to use an algorithm to auto generate a unique 32 bit uidNumber, or at least one in a given range, based on the DN or another unique attribute?
An algorithm that calculates the same uidNumber for a given DN or another unique attribute is always a better solution - IMHO.
I suppose that at some point when all the unixes are ready to support 128bit uidNumbers, then you can simply use the UUID that most LDAP servers implement as the uidNumber.
I would be very interested in your thoughts on this. Thanks.
March 12th, 2007 at 1:05 pm
I have heard of people calculating uidNumber, but for my tastes 32 bits are really not enough to be happy about the risk of collisions. Generating a duplicate uidNumber is a straight up and down security risk which is not something a vendor is in a position to sanction. In any particular deployment the risk can be assessed in context.
Calculating based on an already unique and known property is an interesting idea. As long as precautions are taken to automatically deal with collisions, such as a backup randomizing algorithm, then I am sure many would be happy with that. However in a multi-master environment it is not possible to deal with the collision before it actually collides, and that is why it was important to me to create a scheme that ensures zero collisions.
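For anyone wanting to weigh that risk in their own deployment, here is a rough sketch of the calculation. It uses `zlib.crc32` from the Python standard library as a stand-in for the CRC32 function mentioned in the comment, and the usual birthday approximation, which assumes the derived numbers behave as if uniformly random over 32 bits. That is an assumption, not a property CRC32 guarantees for real DNs. The function names are mine.

```python
import math
import zlib


def uid_from_dn(dn: str) -> int:
    """Derive a 32-bit number from a DN: deterministic, NOT collision-free."""
    return zlib.crc32(dn.encode("utf-8"))


def collision_probability(n: int, space: int = 2**32) -> float:
    """Birthday approximation: chance that n random values are not all distinct."""
    return -math.expm1(-n * (n - 1) / (2 * space))


def expected_colliding_pairs(n: int, space: int = 2**32) -> float:
    """Expected number of pairs sharing a value among n random values."""
    return n * (n - 1) / (2 * space)


for entries in (10_000, 100_000, 20_000_000):
    print(entries,
          round(collision_probability(entries), 4),
          round(expected_colliding_pairs(entries), 1))
```

Under that uniform model the chance of at least one clash is a little over 1% at ten thousand entries and well over half at a hundred thousand, and at 20 million entries it predicts tens of thousands of colliding pairs. Every master computes the same number for the same DN, which is the attraction, but two different DNs can still land on the same number.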
March 12th, 2007 at 6:11 pm
Sun directory server 6.0 replication
I read with interest Pete Browley’s blog posting on replication with OpenLDAP. A uidUniqueness plugin exists in Sun directory server 5.2 and 6.0. Of course uidUniqueness replication will be available in OpenDS and if Pete would like to p…