October 2006


If you are looking for a software-based hot failover solution for a directory server, consider using multi-master replication. Really. Even if you are one of those who cannot get along with the characteristics of loosely consistent multi-master replication models, MMR can provide the best of both worlds if configured appropriately. Here’s the secret: despite allowing up to four masters, not one of those masters will force your clients to use it. In other words, you designate one of the masters as your write master and assign it the live virtual IP. Set up your heartbeat script to flip the virtual IP to one of the other masters when a problem is detected. Then you can work on bringing the failed master back up. You won’t need to script any complicated consistency recovery mechanisms, since the multi-master replication logic will recover consistency for you; that is what it is designed to do.

Using this topology you will have all the benefits of a single master, plus up to three ready-to-go backup masters.
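As a rough illustration of the decision a heartbeat script has to make, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the host names are hypothetical, the health check is passed in as a plain function, and actually moving the virtual IP is left to whatever clustering tool you use.

```python
from typing import Callable, Optional, Sequence


def choose_vip_holder(
    masters: Sequence[str],
    current: str,
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Pick the master that should hold the virtual IP.

    The current holder keeps the address while it is healthy, so the
    address does not flip needlessly. Otherwise the first healthy master
    in list order takes over. Returns None if no master is healthy.
    """
    if is_healthy(current):
        return current
    for candidate in masters:
        if candidate != current and is_healthy(candidate):
            return candidate
    return None


# Hypothetical host names; a canned table stands in for a real health probe.
masters = ["ds1", "ds2", "ds3", "ds4"]
health = {"ds1": False, "ds2": False, "ds3": True, "ds4": True}
new_holder = choose_vip_holder(masters, "ds1", lambda host: health[host])
print(new_holder)  # ds3
```

Keeping the current holder whenever it is healthy is a deliberate choice in this sketch: it keeps all writes on one master at a time, which is the point of the topology.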

Over the last few months there have been a number of highly publicized thefts of databases containing the identity data of thousands of people, in some cases millions. To some this might give the impression that the problem is getting worse quickly. Well, I suppose that is part of the story, but a greater factor is that until recently these thefts were simply kept under wraps. What you don’t know can’t be raised in your defence. One might suppose that this change of heart in reporting these thefts is due to some realization that it is the right thing to do. But no, actually it has more to do with newly enacted state laws requiring that people be informed when their data has been stolen or may have been stolen, and no doubt companies in states without those laws consider reporting thefts in order to head off new laws. My question is, do these laws go far enough?

In the wake of the Enron scandal, public companies have been required to, among other things, enforce and monitor much stricter rules governing access to data and reporting of that access. That is data pertinent to the running of the business. The focus of the rules is to protect shareholders who have a financial stake in the company. But what about the members of the public who have their data compiled into these vast databases without any say-so or control? What protections do they have? I don’t believe it is enough to force reporting of stolen identity data, embarrassing though it may be. Without responsibility the report merely equates to “You’re screwed, sucks to be you.” If you have any doubt that that is all it amounts to, then consider these facts, which can be accessed at the Identity Theft Resource Center:

1. Victims now spend an average of 600 hours recovering from this crime, often over a period of years. Three years ago the average was 175 hours of time*, representing an increase of about 243%.

2. Based on 600 hours times the indicated victim wages, this equals nearly $16,000 in lost potential or realized income.

3. While victims are finding out about the crime more quickly, it is taking far longer than ever before to clear their records and recover from the situation.

4. Even after the thief stops using the information, victims struggle with the impact of identity theft. That might include increased insurance or credit card fees, inability to find a job, higher interest rates and battling collection agencies and issuers who refuse to clear records despite substantiating evidence of the crime. This “tail” may continue for more than 10 years after the crime was first discovered.

5. Based on the ITRC study, today the business community loses between $40,000 - $92,000 per name in fraudulent charges, based on reported fraud losses seen by surveyed victims. While this conflicts with other findings by other groups, there was a wide range of responses by the ITRC study respondents. The answer is that we may never know the true financial impact of this crime due to mis-classification of identity theft crime definitions by the business community and by victims.

6. The emotional impact on victims is likened to that felt by victims of more violent crime, including rape, violent assault and repeated battering. Some victims feel dirty, defiled, ashamed and embarrassed, and undeserving of assistance. Others report a split with a significant other or spouse and of being unsupported by family members.

7. Today victims spend an average of $1,400 in out-of-pocket expenses, an increase of 85% from years past.

8. Approximately 85% of victims found out about the crime due to an adverse situation - denied credit or employment, notification by police or collection agencies, receipt of credit cards or bills never ordered, etc. Only 15% found out through a positive action taken by a business group that verified a submitted application or a reported change of address.

9. Victims report a lack of responsiveness from those entities to whom they turned for help similar to results reported in 2000*. These include police, collection agencies, credit issuers, utility companies and financial institutions.

Sucks to be you.

Under these rules, which have only been in force for a few years, I have been notified 3 times that my data may have been stolen. In each case my recompense was a free year-long subscription for monthly credit activity reports. I guess that does mean I get to know it sucks to be me potentially much sooner, but well, it would still suck to be me. There is no element of financial responsibility attached to the database compilers’ lack of adequate security. The reason is really quite simple - it’s no skin off their nose if I go under at the hands of identity thieves. Well, the only way that can be changed is by law.

I’m talking about the kind of responsibility that business understands - fiscal. How about a scheme like this: next time an employee of your company thinks it is a good idea to carry your entire database of identity data around on a laptop that subsequently gets stolen, your company is obliged to foot the bill for any and all identity-related fraud, including all incidental costs, for all the people with information in the database for, say, the next five years. Of course, given that the details have been stolen, it would be that company’s burden to prove in any one case that it was not their leak that resulted in the crime. With something like that in place, you better believe those who safeguard the data will be paying a lot more attention to the guarding part than simply the compiling part. I should imagine that there would also be some motivation to stop relying on the pathetically idiotic proofs of identity in common use now, such as social security numbers and the like.

At the end of the day, if the costs incurred by the victims of stolen identity data, both fiscal and in pure inconvenience, are never accounted for, then, as history shows us, there is insufficient motivation to treat the data with the care that the public deserves. It’s time to make those who profit from data accumulation pay for the cost of the breaches. Make it a cost of doing business, not a cost of living.

I finally got around to checking out the support for information cards at Opinity. So off I go to grab Chuck Mortimore’s excellent proof of concept identity selector, install it on Firefox (an obscure browser used by long-haired beardy folk) on Linux (ditto), create myself an information card and go a-cruising over to Opinity, click on the information card graphic on the registration screen, select my information card and I am greeted with:

You should use IE7 or above version to do this

Thanks for the heads up. Oh wait, Microsoft haven’t gotten around to releasing Internet Explorer 7.0 on Linux yet - I’m still waiting for the update. Did I get warped back to the 90’s? This had better not be taken as an exemplar for the first wave of implementation of information card support for the web 2.0 crowd. Or the web 2.0 crowd might find the first movers not movin because they can’t get in.

I suggest an alternative method of making sure the user does the right thing for themselves: upon receipt of an information card, use it; otherwise remain calm and ajax in Bob to explain what’s up. But whatever you do, don’t ever require a certain browser, browser version, or by extension, operating system.
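To make that concrete, here is a minimal sketch of the server-side decision, written in Python purely for illustration. It assumes the selector posts its token in a form field called `xmlToken`, a name commonly seen in samples but not something I have confirmed for Opinity; the function name and return values are hypothetical.

```python
from typing import Mapping


def registration_action(form: Mapping[str, str]) -> str:
    """Decide what to do with a submitted registration form.

    The decision depends only on whether a token actually arrived.
    The User-Agent header is never consulted, so any browser with a
    working identity selector gets through.
    """
    # "xmlToken" is an assumed field name, not a confirmed one.
    token = form.get("xmlToken", "").strip()
    if token:
        return "process_card"
    # No token: explain what went wrong instead of demanding a browser.
    return "show_help"


print(registration_action({"xmlToken": "<EncryptedData>...</EncryptedData>"}))
print(registration_action({}))
```

The first call takes the card path and the second falls back to the explanation, whatever browser or operating system sent the request.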

Hopefully this example won’t last long.

I know there is some talk about how bad for you multi-master replication is, and I am also aware that the source of the arguments generally comes from this document, an IETF informational draft that expired three years ago. Since MMR is a feature of Fedora Directory Server, I thought I would point out the response to that document on the FDS site.

Here it is:

This is in response to http://www.watersprings.org/pub/id/draft-zeilenga-ldup-harmful-02.txt

LDUP (LDAP Duplication/Replication/Update Protocols) was the now-defunct IETF LDAP replication working group. Many years of work went into the attempt to create a standard LDAP multi-master replication protocol, but little came of it. The Fedora DS MMR protocol is based on this work.

The author of the paper is technically correct. Any loosely consistent replication model can lead to inconsistencies, including single master replication models. I won’t go into too many details, but if you really want to know about different replication consistency models, read this - http://www2.parc.com/csl/projects/bayou/pubs/uist-97/Bayou.pdf

In general, the only way to ensure absolute consistency is to use something like two-phase commit, which some RDBMS products use. In this case, your write operation does not return a response until that write operation has been successfully propagated to all systems in the replication topology (or rolled back from all in the case of failure).
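For readers who have not met two-phase commit, a toy sketch in Python follows. It is an in-memory illustration only: the `Participant` class and `two_phase_write` function are hypothetical names, and a real implementation also has to cope with timeouts, crashes between the phases, and durable logging.

```python
from typing import Dict, List


class Participant:
    """A toy replica that can stage, commit, or discard a write."""

    def __init__(self, name: str, can_prepare: bool = True) -> None:
        self.name = name
        self.can_prepare = can_prepare  # False simulates a failing replica
        self.data: Dict[str, str] = {}
        self.staged: Dict[str, str] = {}

    def prepare(self, key: str, value: str) -> bool:
        """Phase 1: stage the write and vote yes, or vote no."""
        if not self.can_prepare:
            return False
        self.staged[key] = value
        return True

    def commit(self, key: str) -> None:
        """Phase 2: make the staged write visible."""
        self.data[key] = self.staged.pop(key)

    def rollback(self, key: str) -> None:
        """Discard a staged write, if any."""
        self.staged.pop(key, None)


def two_phase_write(participants: List[Participant], key: str, value: str) -> bool:
    """Apply the write everywhere or nowhere; return True on success."""
    prepared: List[Participant] = []
    for p in participants:
        if p.prepare(key, value):
            prepared.append(p)
        else:
            # One "no" vote aborts the write on everyone who staged it.
            for q in prepared:
                q.rollback(key)
            return False
    for p in participants:
        p.commit(key)
    return True
```

The caller gets its answer only after every participant has voted, which is exactly the latency and availability cost that loosely coupled replication avoids.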

There is no way for any loosely coupled LDAP replication to guarantee “read your writes” consistency. As an example, consider a single master case with one or more read-only replicas. End user clients will typically be pointed to one or more read-only replicas to use for searches, for load balancing or failover purposes. If the client makes a write request, it will typically be sent a referral to the master, where the write operation will be performed. The write operation will return immediately to the client, without waiting for that write operation to be propagated to the replicas. If the client immediately performs a search request on a replica (as it has been configured to do), that data may or may not be available, depending on the replication schedule, speed of the master, write performance of the replica, etc.
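The single master scenario just described can be simulated in a few lines of Python. This is a toy model under stated assumptions: the `Master` and `Replica` classes are hypothetical, replication is triggered by hand to stand in for the replication schedule, and the DN is made up.

```python
from typing import Dict, List, Optional, Tuple


class Replica:
    """A read-only copy that clients search."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def search(self, dn: str) -> Optional[str]:
        return self.data.get(dn)


class Master:
    """Accepts writes and ships them to replicas some time later."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.pending: List[Tuple[str, str]] = []  # changes not yet shipped

    def write(self, dn: str, value: str) -> None:
        # Returns to the client at once, without waiting for replicas.
        self.data[dn] = value
        self.pending.append((dn, value))

    def replicate(self, replicas: List[Replica]) -> None:
        # Stands in for the replication schedule firing.
        for dn, value in self.pending:
            for replica in replicas:
                replica.data[dn] = value
        self.pending.clear()


master, replica = Master(), Replica()
dn = "uid=jdoe,ou=people,dc=example,dc=com"  # hypothetical entry

master.write(dn, "555-0100")     # client follows the referral and writes
stale = replica.search(dn)       # immediate search on the replica: None
master.replicate([replica])      # replication catches up
fresh = replica.search(dn)       # now the write is visible: "555-0100"
```

Between the write and the replication run, the client cannot read its own write from the replica, even though nothing has gone wrong.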

Any kind of loosely coupled replication breaks ACID:

  • Consistency - the “view” of the data is different depending on which server you talk to
  • Isolation - the update is visible on the master before it is visible on a consumer
  • Durability - it’s possible that the change could be lost or refused due to an error condition on a replica

However, the “LDUP harmful” document states the following:

It is noted that X.500 replication (shadowing) model allows for
transient inconsistencies to exist between the master and shadow
copies of directory information.  As applications which update
information operate upon the master copy, any inconsistencies in
shadow copies are not evident to these applications.

This would be fine except that almost no real-world deployments follow this model. Replicas are used for load balancing, failover, and data locality (e.g. putting a copy of the corporate data in a remote office). Therefore, in almost every LDAP deployment, clients _use_ the “shadow copies” directly. In almost every case, load balancing, failover, data locality, and “no single point of failure” are the most salient concerns of network architects, and absolute data consistency is secondary.

The “LDUP harmful” document then goes on to give specific examples of where MMR can lead to inconsistencies. In almost every case, the MMR protocol can handle the inconsistency in a logical manner, or flag the inconsistency for operator intervention (with the operational attribute nsds5ReplConflict).

So, knowing that, you have to make your own trade-off between convenience and absolute consistency. MMR gives you the ability to have data locality with writes and no single point of write failure, at the cost of extra administrative overhead: monitoring, looking for conflicts (e.g. searching for nsds5ReplConflict=*), and manually resolving them. MMR has been deployed for years (starting in March 2001 with iPlanet DS 5.0), and in the vast majority of cases, data inconsistency just doesn’t happen.
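As an illustration of the monitoring chore, here is a minimal Python sketch that picks conflict entries out of search results. It assumes the results arrive as `(dn, attributes)` pairs with byte-string values, the shape that python-ldap's `search_s` returns, but it runs on canned data so no server is needed. The sample entries and the conflict value are made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[str, Dict[str, List[bytes]]]

CONFLICT_ATTR = "nsds5ReplConflict"
# The filter mentioned in the text, for use with a real search.
CONFLICT_FILTER = "(nsds5ReplConflict=*)"


def find_conflicts(entries: Sequence[Entry]) -> List[Tuple[str, str]]:
    """Return (dn, conflict description) for each flagged entry."""
    conflicts: List[Tuple[str, str]] = []
    for dn, attrs in entries:
        for name, values in attrs.items():
            # LDAP attribute names are case-insensitive.
            if name.lower() == CONFLICT_ATTR.lower() and values:
                conflicts.append((dn, values[0].decode("utf-8", "replace")))
                break
    return conflicts


# Made-up sample data standing in for real search results.
sample: List[Entry] = [
    ("uid=jdoe,ou=people,dc=example,dc=com", {"cn": [b"John Doe"]}),
    (
        "nsuniqueid=0a1b2c3d-0000-0000-0000-000000000000+uid=jdoe,"
        "ou=people,dc=example,dc=com",
        {
            "cn": [b"John Doe"],
            "nsds5ReplConflict": [
                b"namingConflict uid=jdoe,ou=people,dc=example,dc=com"
            ],
        },
    ),
]

for dn, reason in find_conflicts(sample):
    print(dn, "->", reason)
```

Run something like this from cron against each master, and an operator only needs to look when the list comes back non-empty.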

In other words, FDS gives you the choice of using multi-master replication in your deployment topology. Don’t like it? Then nobody is going to make you use it. Need it? Then it is there. When it comes to complex functionality such as MMR, it is comforting to know that it has the robustness imbued by many years of bug fixing and deployment in demanding environments. It’s ready to use, no pioneering required.