2013-06-14

OSX Diff Tools

I’ve been happily using SourceGear’s DiffMerge as a diff tool for years. It’s a great cross-platform tool with an ugly UI. Every now and then I check out the competition to see if there’s anything better, but I haven’t found anything that includes a similar feature set at a comparable price point. It’s hard to beat free.

Today all of that changed. DiffMerge 4 is nagware and costs $39. This isn’t necessarily a bad thing - it’s only fair that SourceGear be compensated for their work - but it means that the app loses its main advantage over the competition and gives me an excuse to shop around for a replacement.

There aren’t that many diff tools available for the Mac. Many of them can be culled immediately from the set of contenders for making basic mistakes.

Mistake #1 is using cross-platform, un-Maclike UIs. Violators are DiffMerge, Meld, P4Merge and KDiff3. Meld is the worst offender as it requires the installation of XQuartz, so I couldn’t even try that one out.

Mistake #2 is not supporting 3-way merges, which makes the tools useless with Mercurial and Git. Apple’s FileMerge is the only tool I looked at that doesn’t support this essential functionality.

Mistake #3 is a lack of new releases. If I’m going to have to invest money and time in a new diff tool I want something that is still under active development. Changes hasn’t seen any significant updates since 2009, despite being taken over by a new company. DiffFork hasn’t been updated since 2011.

Mistake #4 is stupid pricing. Most of the tools are priced at $39 or cheaper. Araxis Merge is three times that price for the cut-down version and six times for the full version.

Mistake #5, which I’m editing in at a later date, is Java. I don’t like Java for reasons far too numerous to go into here. Out goes DeltaWalker

By default, I’m left with just one choice: Kaleidoscope. I’ve mentioned it before and complained that it couldn’t “compete on price, features and speed” with DiffMerge. At $69 it’s more expensive than DiffMerge, but the difference is now between $39 and $69, rather than “free” and “not free”. Kaleidoscope offers a vastly superior UI, and having tried out many of the other options, I realise that it’s actually one of the faster diff tools out there.

It has a number of features that I immediately appreciated:

  • It can jump directly to each change in a file.
  • It shows the list of changed files in a pane of the diff window, rather than a separate window.
  • It shows new files in addition to changed files.
  • Each change is highlighted in a different colour.
  • It has a dark theme.
  • Double-clicking on a folder or file opens it in a new tab.

I’m a little annoyed that I missed out on version 2’s introductory discount price, but I downloaded it from the AppStore anyway.

So long, DiffMerge.

2013-05-31

Patching NSManagedObjects

Previous posts discussed how to create a diff of an NSManagedObject pair. Now we’ll look at how to apply that diff as a patch.

Creating a diff gives us a data structure that looks like this:

(
    {
        attributes: {
            new: {
                city: "Sometown"
            },
            old: {
                city: "Townsville"
            }
        },
        entityName = "Address",
        guid: "123"
    }
)

The diff tells us that the Address entity with the GUID “123” changed its city attribute from “Townsville” to “Sometown”.

The patcher has to be able to do a number of things with a patch like this:

  • It must update the CoreData entity’s attributes to match the new values for each change in the patch.
  • It must be able to traverse both the nested patch structure and the CoreData graph and apply patches to sub-entities.
  • It must be able to apply patch arrays to sets of sub-entities (one-to-many relationships).
  • It must be able to delete entities from the CoreData graph if the patch indicates a entity was deleted.
  • It must be able to create new entities within the CoreData graph if the patch indicates a new entity was created.

The last requirement is tricky. Creating an entity is rarely a case of just calling insertNewObjectForEntityForName: inManagedObjectContext:. Typically, each entity will have associated sub-entities that should be created at the same time, so the patcher will inevitably use the delegate pattern to call back to an external object that can create new entity instances.

It would probably be handy if the patcher could allow manual patching for certain entity types, so that the developer could override its default behaviour for complex or unusual object types. For example, we might receive a patch for a Distance entity that includes two attributes: value and units. If we respect the semantics of the type we can’t update either attribute in isolation. We would expect the patcher to only overwrite one attribute if only one attribute has changed, but we should really overwrite both values to maintain the integrity of the data. (In contrast to the earlier posts about diffing, the current version of the differ in SZManagedObject emits the entire set of old attributes if it detects a change, precisely to cater for this situation.) Allowing for the ability to manually patch objects and override the patcher’s default behaviour lets us do this.

Something else I’m going to bake in is the ability to manually convert types for date and binary attributes. I want to be able to send diffs over the wire as JSON, but JSON doesn’t like date and binary data. They typically get sent as ISO strings or JavaScript date object constructors (for dates) and base64-encoded strings (for binary). A patch will therefore not necessarily have the correct data types for these two attribute types. Whenever they are encountered in the entity I’ll check the type of data in the patch, and if they disagree I’ll call out to a delegate to perform a conversion process.

Patching an entity requires two steps: attributes then relationships. Patching attributes is easy:

  • If there are old values but no new values, the entity has been deleted.
  • Loop through keys in the new dictionary and apply the values to the entity.

Patching relationships is really just a matter of creating new entities if necessary and using recursion to patch the entire graph:

  • If the relationship is one-to-many, patch the relationship as a set.
  • If not:
    • If no sub-entity with the required identifier can be found, call out to the delegate to create it.
    • Recursively patch the entity’s attributes and relationships.

Eventually we’ll hit a point where there are no more sub-patches, which means we have finished patching.

Patching one-to-many relationships is a little more complex, but it isn’t too hard:

  • Loop through sub-patches.
  • If the matching sub-entity does not exist and there is no old data in the patch, call out to the delegate to create it.
  • If there are no new values in the patch, delete the sub-entity.
  • If there are new values, old values and an object to update, recursively update it.

We now have a fully patched object graph.

Again, check out the SZManagedObject project on BitBucket for the sourcecode.

2013-05-29

SZManagedObject

As promised, here’s the BitBucket repository that contains the full NSManagedObject diff/patch functionality discussed in the last post:

The repository doesn’t include a demo yet, but the comments are extensive and the APIs are simple. I’ll revisit this soon with a post that discusses the patching functionality.

2013-05-28

Diffing NSManagedObjects Part 2

When I last discussed comparing two NSManagedObjects and producing a diff, I said that I’d come back to a few issues later. It turned out to be much later than expected; this post has been floating around on my hard disk for 6 months.

This time around I’ll deal with two of those issues:

  • What do you do if you want to include a related object in the diff if it has its deletion rule set to nullify?
  • What do you do if you want to exclude an attribute or relationship from the diff?

The answers are “whitelist” and “blacklist”.

In the previous post, we used deletion rules to signify object ownership. This works well, but means that we can’t automatically include crucial information from a relationship if the deletion rule indicates a lack of ownership.

For example, imagine we’re a dogfood company and have a Dog class and a Person class:

  • Person (User info: @{ @“id”: @“guid” })

    • Attributes:
      • guid
      • name
      • age
    • Relationships:
      • dog (Dog) (Delete: cascade) (To one)
  • Dog (User info: @{ @“id”: @“guid” })

    • Attributes:
      • guid
      • name
    • Relationships:
      • owner (Person) (Delete: nullify) (To one)

If we delete a person we’ll also delete his dog. We don’t care about the dog if we no longer track the person, presumably because we can’t sell anything directly to a dog. On the other hand, if we delete the dog we don’t necessarily want to delete the person. The person could get a new dog.

If a dog changes owner, it’d be handy if we could represent that information in a diff. Unfortunately, the deletion rule will prevent the automated diff methods from examining this relationship. We need to whitelist this relationship to ensure that it is included in the diff.

The whitelist looks like this:

NSDictionary *whitelistedRelationships = @{
@"Dog": @[ @"owner" ]
};

Each key in the dictionary represents the name of a class. Each value represents an array of properties of that class. In this case, the whitelist says, “If the diff creator encounters an instance of the Dog class, include the owner relationship in the diff regardless of the deletion rule”.

There are two issues here. Firstly, we’re hard-coding information that should really be metadata in the data model. Unfortunately, there’s not much we can do about that if we want to be able to produce a variety of different diffs that can be started in multiple places within the object graph.

The second issue is that we’ve re-introduced a cycle into the graph. The diff creator will repeatedly walk from the Dog to its owner and back, eventually causing a stack overflow. To get around that, we need to introduce the concept of a blacklist. Anything in a blacklist will not be examined. Here’s what the blacklist would look like in this situation:

NSDictionary *blacklistedProperties = @{
@"Owner": @[ @"dog" ]
};

The dictionary format is that same as the blacklist: each key is the name of a class, whilst each value is an array of attribute and relationship names that should be ignored by the diff creator. This particular blacklist says, “Do not traverse the dog relationship of any Owner instances”. This would be counter-productive if we were starting the diff from the owner, due to the direction of the relationship established by the deletion rule, but if we start from the dog it is very useful.

Now that we’ve decided upon workable solutions, we just need to implement them. We’ll start off with a method that can retrieve all of the property names for an NSManagedObject instance that exist in a blacklist or whitelist:

+ (NSArray *)propertyNamesForManagedObject:(NSManagedObject *)managedObject inList:(NSDictionary *)list {

    NSMutableArray *output = [NSMutableArray array];

    for (NSString *className in [list allKeys]) {
        Class class = NSClassFromString(className);

        if ([managedObject isKindOfClass:class]) {
            [output addObjectsFromArray:list[className]];
        }
    }

    return output;
}

We’ve introduced a new metaprogramming technique here: getting a pointer to a metaclass using the NSClassFromString() function. As the name suggests, we can get a class definition from a string representing its name. We can use that to determine the properties of the class.

This function will eventually give us a list of all property names of object managedObject that match a name in the list dictionary. For example, if we passed it an Owner object and the blacklist defined above, we’d get the following output:

@[ @"dog" ]

We know that the dog property of our Owner object is blacklisted and shouldn’t be traversed.

Next we’ll add a little helper method to determine if an array contains a string:

+ (BOOL)isString:(NSString *)string inList:(NSArray *)list {

    for (NSString *item in list) {
        if ([string isEqualToString:item]) {
            return YES;
        }
    }

    return NO;
}

Perhaps a set would have been a better data structure here; I might revisit that.

Now we can modify the diff method so that it checks the blacklist before attempting to examine an attribute or relationship:

...

NSArray *attributeNames = newObject == nil ? [[[oldObject entity] attributesByName] allKeys] : [[[newObject entity] attributesByName] allKeys];

NSArray *blacklist = [self propertyNamesForManagedObject:newObject ? newObject : oldObject inList:blacklistedProperties];

for (NSString *name in attributeNames) {

    if ([self isString:name inList:blacklist]) continue;

    ...

}

We do something similar to ensure that we traverse whitelisted relationships:

...

NSArray *whitelist = [self propertyNamesForManagedObject:newObject ? newObject : oldObject inList:whitelistedRelationships];

for (NSString *name in relationships) {

    NSRelationshipDescription *relationship = [relationships objectForKey:name];

    if (![self isString:name inList:whitelist]) {
        if (relationship.deleteRule != NSCascadeDeleteRule) continue;
    }

    ...
}

It’s a simple check - if the relationship is whitelisted, ignore the deletion rule.

At this point, the code is far too big to post here. I’ll create a BitBucket repository with the diff/patching system as a library with a demo project at some point. The explanation of the patching code will come later.

2012-11-27

Diffing NSManagedObjects

I’m currently working on a system to calculate the differences between two CoreData objects. Why would you want to do that, you ask? In my case, I’m trying to transmit the changes made to the object to a web server using the least amount of data and I don’t want to bother trying to store the changes as some kind of activity log. As discussed previously, I’m storing two copies of my objects: one that represents the last-known server state, and one that is changed by the user.

Here’s an imaginary CoreData model that we’ll use for all examples:

  • Person

    • firstName (string)
    • lastName (string)
    • age (int)
    • address (Address)
  • Address

    • street (string)
    • city (string)
    • state (string)
    • postalCode (string)

The “Person” entity has a related “Address” object, which gives the address at which the person lives.

In my first attempt, I came up with a particularly brittle solution and created the diff manually. The method takes in two parameters (the new and old objects that represent its different states) and computes the diff like this:


NSMutableDictionary *diff = [NSMutableDictionary dictionary];

if (![newObj.firstName isEqualToString:oldObj.firstName]) {
    diff[@"firstName"] = @{ @"old": oldObj.firstName, @"new": newObj.firstName };
}

Each attribute is compared and, if inequalities are found, the new and old values are stored in the dictionary. We can then transmit this minimal amount of information and the server can figure out what changed and how.

This solution is tied to the structure of the data model. Any changes in the data model mean we have to change the diff method. A better way would be to exploit features of the CoreData API that allow us to use a metaprogramming approach that has no foreknowledge of the data model and figure out the whole thing out on the fly.

CoreData has a number of useful features that allow us to do this. Firstly, all NSManagedObject instances are composed of two types of data: attributes and relationships. Attributes are basic properties of an object, such as ints, strings, doubles, etc. Relationships connect different NSManagedObjects together. To demonstrate this on the example model:

  • Person

    • Attributes:
      • firstName (string)
      • lastName (string)
      • age (int)
    • Relationships:
      • address (Address)
  • Address

    • Attributes:
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)

Each NSManagedObject has an associated “entity” instance that contains metadata about its attributes and relationships. If we wanted to list all of the names of the attributes for a Person instance, we could do this:


NSArray *attributeNames = [[[person entity] attributesByName] allKeys];

for (NSString *name in attributeNames) {
    NSLog(@"%@", name);
}

The output would be as follows:

age
firstName
lastName

We can do the same with relationship names:


NSArray *relationshipNames = [[[person entity] relationshipsByName] allKeys];

for (NSString *name in relationshipNames) {
    NSLog(@"%@", name);
}

Here’s the output:

address

We’re now able to list all of the properties of an NSManagedObject. Armed with this, and adding in a little Objective-C metaprogramming, we can write a function that can identify differences between two managed objects. Assuming we have two valid NSManagedObject pointers, here’s a simple diff function:


- (NSDictionary *)diffNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {
    
    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    NSArray *attributeNames = [[[newObject entity] attributesByName] allKeys];

    for (NSString *name in attributeNames) {

        SEL selector = NSSelectorFromString(name);

        id newValue = [newObject performSelector:selector];
        id oldValue = [oldObject performSelector:selector];

        if (![newValue isEqual:oldValue]) {
            diff[name] = @{ @"new": newValue, @"old": oldValue };
        }
    }

    return diff;
}

One of the metaprogramming tricks we’re using is “NSSelectorFromString()”, which can be thought of as giving us a function pointer, though that isn’t really at all accurate. Since entity attributes all map to properties of the NSManagedObject instance, we can treat them as selectors and call them using the “performSelector:” method. Once we’ve extracted the new and old values we can compare them and, if they are different, add them to the diff dictionary.

Let’s see what happens if we run it:


Person *person1 = [self createPerson];
person1.firstName = @"Joe";
person1.lastName = @"McJoe";
person1.age = @23;

Person *person2 = [self createPerson];
person2.firstName = @"Bob";
person2.lastName = @"McJoe";
person2.age = @23;

NSDictionary *diff = [self diffNewObject:person1 withOldObject:person2];

NSLog(@"%@", diff);

The output we get is this:

{
    firstName = {
        new = Joe;
        old = Bob;
    };
};

Now the attributes of the entities in our data model can change without the diff method breaking.

The method is still a little brittle, however. What happens if we’re trying to diff two objects of different types that share some of the same properties? What happens if some of the properties are nil? (NSDictionaries cannot store nil values.) What if one of the objects is nil?

Here’s an updated version of the method that accounts for all of these possibilities:


- (NSDictionary *)diffNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {
    
    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    NSArray *attributeNames = newObject == nil ? [[[oldObject entity] attributesByName] allKeys] : [[[newObject entity] attributesByName] allKeys];

    for (NSString *name in attributeNames) {

        SEL selector = NSSelectorFromString(name);

        id newValue = nil;
        id oldValue = nil;

        if (newObject != nil && [newObject respondsToSelector:selector]) newValue = [newObject performSelector:selector];
        if (oldObject != nil && [oldObject respondsToSelector:selector]) oldValue = [oldObject performSelector:selector];

        newValue = newValue ? newValue : [NSNull null];
        oldValue = oldValue ? oldValue : [NSNull null];

        if (![newValue isEqual:oldValue]) {
            diff[name] = @{ @"new": newValue, @"old": oldValue };
        }
    }

    return diff;
}

There are a few changes:

  • We ensure that we get the attribute list from a non-nil object;
  • We ensure that the objects respond to the selectors before trying to call them;
  • We replace nil values with NSNull.

Unfortunately, finding the diff between two objects’ attributes is the easy part. Producing a diff of the relationships is rather harder. We made a mistake in our object model, so let’s correct that now and see one of the reasons why diffing relationships is hard:

  • Person

    • Attributes:
      • firstName (string)
      • lastName (string)
      • age (int)
    • Relationships:
      • address (Address)
  • Address

    • Attributes:
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)
    • Relationships:
      • person (Person)

Apple recommend including an inverse of all relationships. Our Person entity has a relationship to the Address entity, so our Address entity should include the inverse relationship back to the Person entity.

Hopefully the problem we’ve introduced is apparent. We’ve created a cycle in what was previously an acyclic graph. If we try and walk the graph from a Person to an Address, we’ll find that we need to walk back to the Person. And back to the Address. And back to the Person. And now we’ve recursed too far and blown the stack.

CoreData includes another feature that can help mitigate this problem. It isn’t a perfect solution, but it should suffice in most situations. Each relationship includes a deletion rule that tells CoreData what to do with related entities each time an entity is deleted. In this case, if we were to delete a Person we’d want their Address to be deleted. However, if we delete an Address (because the person sells his house), we don’t want to delete the person.

We’ll add that metadata to our data model:

  • Person

    • Attributes:
      • firstName (string)
      • lastName (string)
      • age (int)
    • Relationships:
      • address (Address) (Delete: cascade)
  • Address

    • Attributes:
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)
    • Relationships:
      • person (Person) (Delete: nullify)

We can use the deletion rule to infer ownership of one entity by another. The Person entity can be said to “own” its related Address entity because deleting the Person deletes the Address. The Address entity can be said to be “owned” by its related Person entity because deleting the Address won’t delete the Person. We still have cycles in the graph, but now we can use its deletion rules as direction indicators that tell a walk algorithm which paths it can take through the graph.

Here’s a graph walk algorithm that uses this new information:


- (void)walkGraph:(NSManagedObject *)managedObject {

    NSLog(@"%@", [[managedObject entity] name]);
    
    NSDictionary *relationships = [[managedObject entity] relationshipsByName];

    for (NSString *name in relationships) {

        NSRelationshipDescription *relationship = relationships[name];

        if (relationship.deleteRule != NSCascadeDeleteRule) continue;

        SEL selector = NSSelectorFromString(name);

        NSManagedObject *child = [managedObject performSelector:selector];

        [self walkGraph:child];
    }
}

We’re using a familiar set of metaprogramming tools: NSSelectorFromString(), performSelector: and relationshipsByName. Here’s what it will output if we feed it a fully-populated Person instance:

Person
Address

Note that an alternative would be to maintain a list of walked objects instead of using the deletion rule/ownership approach, but this could introduce its own set of problems.

At this point, we can add a couple of methods and produce a full diff between two Person objects:


- (NSDictionary *)diffNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {
    
    NSDictionary *attributeDiff = [self diffAttributesOfNewObject:newObject withOldObject:oldObject];

    NSDictionary *relationshipsDiff = [self diffRelationshipsOfNewObject:newObject withOldObject:oldObject];

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    if (attributeDiff.count > 0) {
        diff[@"attributes"] = attributeDiff;
    }

    if (relationshipsDiff.count > 0) {
        diff[@"relationships"] = relationshipsDiff;
    }

    return diff;
}

- (NSDictionary *)diffRelationshipsOfNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];
    
    NSDictionary *relationships = newObject == nil ? [[oldObject entity] relationshipsByName] : [[newObject entity] relationshipsByName];

    for (NSString *name in relationships) {

        NSRelationshipDescription *relationship = relationships[name];

        if (relationship.deleteRule != NSCascadeDeleteRule) continue;

        SEL selector = NSSelectorFromString(name);

        id newValue = nil;
        id oldValue = nil;

        if (newObject != nil && [newObject respondsToSelector:selector]) newValue = [newObject performSelector:selector];
        if (oldObject != nil && [oldObject respondsToSelector:selector]) oldValue = [oldObject performSelector:selector];

        NSDictionary *relationshipDiff = [self diffNewObject:newValue withOldObject:oldValue];

        if (relationshipDiff.count > 0) {
            diff[name] = relationshipDiff;
        }
    }

    return diff;
}

- (NSDictionary *)diffAttributesOfNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    NSArray *attributeNames = newObject == nil ? [[[oldObject entity] attributesByName] allKeys] : [[[newObject entity] attributesByName] allKeys];

    for (NSString *name in attributeNames) {

        SEL selector = NSSelectorFromString(name);

        id newValue = nil;
        id oldValue = nil;

        if (newObject != nil && [newObject respondsToSelector:selector]) newValue = [newObject performSelector:selector];
        if (oldObject != nil && [oldObject respondsToSelector:selector]) oldValue = [oldObject performSelector:selector];

        newValue = newValue ? newValue : [NSNull null];
        oldValue = oldValue ? oldValue : [NSNull null];

        if (![newValue isEqual:oldValue]) {
            diff[name] = @{ @"new": newValue, @"old": oldValue };
        }
    }

    return diff;
}

We’ve added two new methods. The new diffRelationshipsOfNewObject:withOldObject: method expands on our walkGraph: method by checking for nil, the ability to perform a given selector, and by producing a diff. The new diffNewObject:withOldObject: method collates the attribute and relationship diffs into a single dictionary. It is called recursively by the relationship diff method to ensure that the entire object graph is compared.

Although we can now successfully create a full diff of the graph described by our Person entity, we haven’t yet managed to cover all of the possibilities offered by CoreData. Suppose we modify our data model to accommodate people who own multiple houses?

  • Person

    • Attributes:
      • firstName (string)
      • lastName (string)
      • age (int)
    • Relationships:
      • address (Address) (Delete: cascade) (To many)
  • Address

    • Attributes:
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)
    • Relationships:
      • person (Person) (Delete: nullify) (To one)

Our relationship diff method no longer works. When we retrieve the value of the relationship we receive an NSSet pointer rather than an NSManagedObject pointer and we haven’t written any code to handle that. Even if we write code to iterate over the set and diff individual items, we run into some nasty problems:

  • If we have two addresses and change one, how do we match up the addresses in the two Person objects?
  • How can we tell if an address has been deleted?
  • How can we tell if an address has been added?

It’s clear that our Address entity needs to include an immutable identifier that is unique within the context of its set, but is shared between copies of a Person. NSManagedObjects include an objectID property, but this is unique within the entire dataset so can’t be used. We’ll need to add a new attribute to the Address entity:

  • Address
    • Attributes:
      • guid (string)
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)
    • Relationships:
      • person (Person) (Delete: nullify) (To one)

I’ve plumped for a GUID, but you could use a bog-standard string or a number.

Now we can identify Address entities by their GUID, but we’ve got another problem. By design, our diff method is entirely ignorant of the structure of our object model. It doesn’t know that it is trying to diff Address objects and therefore can’t know that it should identify them using their “guid” attribute. To get around this, we’ll use yet another meta-feature of CoreData: user info.

Everything in a CoreData model includes a “user info” dictionary. This is a collection of key/value pairs of user-created metadata that can be examined at run time. By “everything”, I mean precisely that: entities, attributes and relationships all contain a “user info” dictionary.

We need the guid property of the Address entity to be identifiable as a unique identifier at run time, so we can exploit the user info dictionary to do this. We’ll add a single key/value pair to the Address entity’s user info that contains “id” as the key and “guid” as the value. At run time, any time we encounter a one-to-many relationship, we’ll examine the members of the related set and locate their “id” user info key. The value will tell us which property contains that entity’s unique identifier, which we can then use to match up objects in the two sets.

Our Address entity now looks like this:

  • Address (User info: @{ @“id”: @“guid” })
    • Attributes:
      • guid (string)
      • street (string)
      • city (string)
      • state (string)
      • postalCode (string)
    • Relationships:
      • person (Person) (Delete: nullify) (To one)

The method for diffing two sets looks like this:


- (NSArray *)diffNewSet:(NSSet *)newSet withOldSet:(NSSet *)oldSet {
    
    NSMutableArray *changes = [NSMutableArray array];

    // Find all items that have been newly created or updated.
    for (NSManagedObject *newItem in newSet) {

        NSString *idAttributeName = newItem.entity.userInfo[@"id"];

        NSAssert(idAttributeName, @"Entities must have an id property set in their user info.");

        id newItemId = [newItem valueForKey:idAttributeName];

        NSManagedObject *oldItem = nil;

        for (NSManagedObject *setItem in oldSet) {
            id setItemId = [setItem valueForKey:idAttributeName];

            if ([setItemId isEqual:newItemId]) {
                oldItem = setItem;
                break;
            }
        }

        NSDictionary *diff = [self diffNewObject:newItem withOldObject:oldItem];

        if (diff) {
            [changes addObject:diff];
        }
    }

    // Find all items that have been deleted.
    for (NSManagedObject *oldItem in oldSet) {

        NSString *idAttributeName = oldItem.entity.userInfo[@"id"];

        NSAssert(idAttributeName, @"Entities must have an id property set in their user info.");

        id oldItemId = [oldItem valueForKey:idAttributeName];

        NSManagedObject *newItem = nil;

        for (NSManagedObject *setItem in newSet) {
            id setItemId = [setItem valueForKey:idAttributeName];

            if ([setItemId isEqual:oldItemId]) {
                newItem = setItem;
                break;
            }
        }

        if (!newItem) {
            NSDictionary *diff = [self diffNewObject:newItem withOldObject:oldItem];

            if (diff) {
                [changes addObject:diff];
            }
        }
    }

    return changes;
}

We’ll create a pair of example sets and see what we get out of the method. Here are our sets:

NewSet = (
    Address {
        guid = "123";
        street = "Somewhere";
        city = "Sometown";
        postalCode = "54532";
    },
    Address {
        guid = "567";
        street = "Elsewhere";
        city = "Another Town";
        postalCode = "222";
    }
};

OldSet = (
    Address {
        guid = "123";
        street = "Somewhere";
        city = "Townsville";
        postalCode = "54532";
    }
);

You should be able to see that we’ve added a new address and changed the existing address’ city attribute. This is the array produced by the method:

(
    {
        attributes: {
            city: {
                old: "Townsville",
                new: "Sometown"
            }
        }
    },
    {
        attributes: {
            guid: {
                old: "<null>",
                new: "567"
            },
            street: {
                old: "<null>",
                new: "Elsewhere"
            },
            city: {
                old: "<null>",
                new: "Another Town"
            },
            postalCode {
                old: "<null>",
                new: "222"
            }
        }
    }
)

Looks good. We can see that one of the array entries contains just the old and new values for “city” as we’d expect. The other entry contains values for all of the Address object’s attributes as we’ve compared it to nil. It didn’t exist previously. The only issue we have is that we can’t tell which Address object the first list entry represents because its guid attribute isn’t included. We can rectify this omission by modifying the diffNewObject:withOldObject: method:


- (NSDictionary *)diffNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {
    
    NSDictionary *attributeDiff = [self diffAttributesOfNewObject:newObject withOldObject:oldObject];

    NSDictionary *relationshipsDiff = [self diffRelationshipsOfNewObject:newObject withOldObject:oldObject];

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    if (attributeDiff.count > 0) {
        diff[@"attributes"] = attributeDiff;
    }

    if (relationshipsDiff.count > 0) {
        diff[@"relationships"] = relationshipsDiff;
    }

    if (diff.count > 0) {
        diff[@"entityName"] = newObject ? newObject.entity.name : oldObject.entity.name;

        NSString *idAttributeName = newObject ? newObject.entity.userInfo[@"id"] : oldObject.entity.userInfo[@"id"];

        if (idAttributeName) {
            id itemId = newObject ? [newObject valueForKey:idAttributeName] : [oldObject valueForKey:idAttributeName];

            if (itemId) {
                diff[idAttributeName] = itemId;
            }
        }
    }

    return diff;
}

We’re fetching the unique identifier attribute and adding its value to the diff. We’ve also added the entity’s name, which may help when dealing with the diff once it has been transmitted and received and needs to be parsed. The output from our previous pair of sets now looks like this:

(
    {
        attributes: {
            city: {
                old: "Townsville",
                new: "Sometown"
            }
        },
        entityName = "Address",
        guid: "123"
    },
    {
        attributes: {
            guid: {
                old: "<null>",
                new: "567"
            },
            street: {
                old: "<null>",
                new: "Elsewhere"
            },
            city: {
                old: "<null>",
                new: "Another Town"
            },
            postalCode {
                old: "<null>",
                new: "222"
            }
        },
        entityName = "Address",
        guid: "567"
    }
)

The final piece of the puzzle is to plug this set diffing method into our relationship diff method. Fortunately, the relationship description class includes an “isToMany” property that will allow us to use the correct diffing method for a given relationship. Here’s the final set of methods required for diffing a pair of managed objects:


- (NSDictionary *)diffNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {
    
    NSDictionary *attributeDiff = [self diffAttributesOfNewObject:newObject withOldObject:oldObject];

    NSDictionary *relationshipsDiff = [self diffRelationshipsOfNewObject:newObject withOldObject:oldObject];

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    if (attributeDiff.count > 0) {
        diff[@"attributes"] = attributeDiff;
    }

    if (relationshipsDiff.count > 0) {
        diff[@"relationships"] = relationshipsDiff;
    }

    if (diff.count > 0) {
        diff[@"entityName"] = newObject ? newObject.entity.name : oldObject.entity.name;

        NSString *idAttributeName = newObject ? newObject.entity.userInfo[@"id"] : oldObject.entity.userInfo[@"id"];

        if (idAttributeName) {
            id itemId = newObject ? [newObject valueForKey:idAttributeName] : [oldObject valueForKey:idAttributeName];

            if (itemId) {
                diff[idAttributeName] = itemId;
            }
        }
    }

    return diff;
}

- (NSDictionary *)diffRelationshipsOfNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];
    
    NSDictionary *relationships = newObject == nil ? [[oldObject entity] relationshipsByName] : [[newObject entity] relationshipsByName];

    for (NSString *name in relationships) {

        NSRelationshipDescription *relationship = relationships[name];

        if (relationship.deleteRule != NSCascadeDeleteRule) continue;

        SEL selector = NSSelectorFromString(name);

        id newValue = nil;
        id oldValue = nil;

        if (newObject != nil && [newObject respondsToSelector:selector]) newValue = [newObject performSelector:selector];
        if (oldObject != nil && [oldObject respondsToSelector:selector]) oldValue = [oldObject performSelector:selector];

        if (relationship.isToMany) {

            NSArray *changes = [self diffNewSet:newValue withOldSet:oldValue];

            if (changes.count > 0) {
                diff[name] = changes;
            }

        } else {

            NSDictionary *relationshipDiff = [self diffNewObject:newValue withOldObject:oldValue];

            if (relationshipDiff.count > 0) {
                diff[name] = relationshipDiff;
            }
        }
    }

    return diff;
}

- (NSDictionary *)diffAttributesOfNewObject:(NSManagedObject *)newObject withOldObject:(NSManagedObject *)oldObject {

    NSMutableDictionary *diff = [NSMutableDictionary dictionary];

    NSArray *attributeNames = newObject == nil ? [[[oldObject entity] attributesByName] allKeys] : [[[newObject entity] attributesByName] allKeys];

    for (NSString *name in attributeNames) {

        SEL selector = NSSelectorFromString(name);

        id newValue = nil;
        id oldValue = nil;

        if (newObject != nil && [newObject respondsToSelector:selector]) newValue = [newObject performSelector:selector];
        if (oldObject != nil && [oldObject respondsToSelector:selector]) oldValue = [oldObject performSelector:selector];

        newValue = newValue ? newValue : [NSNull null];
        oldValue = oldValue ? oldValue : [NSNull null];

        if (![newValue isEqual:oldValue]) {
            diff[name] = @{ @"new": newValue, @"old": oldValue };
        }
    }

    return diff;
}

- (NSArray *)diffNewSet:(NSSet *)newSet withOldSet:(NSSet *)oldSet {
    
    NSMutableArray *changes = [NSMutableArray array];

    // Find all items that have been newly created or updated.
    for (NSManagedObject *newItem in newSet) {

        NSString *idAttributeName = newItem.entity.userInfo[@"id"];

        NSAssert(idAttributeName, @"Entities must have an id property set in their user info.");

        id newItemId = [newItem valueForKey:idAttributeName];

        NSManagedObject *oldItem = nil;

        for (NSManagedObject *setItem in oldSet) {
            id setItemId = [setItem valueForKey:idAttributeName];

            if ([setItemId isEqual:newItemId]) {
                oldItem = setItem;
                break;
            }
        }

        NSDictionary *diff = [self diffNewObject:newItem withOldObject:oldItem];

        if (diff.count > 0) {
            [changes addObject:diff];
        }
    }

    // Find all items that have been deleted.
    for (NSManagedObject *oldItem in oldSet) {

        NSString *idAttributeName = oldItem.entity.userInfo[@"id"];

        NSAssert(idAttributeName, @"Entities must have an id property set in their user info.");

        id oldItemId = [oldItem valueForKey:idAttributeName];

        NSManagedObject *newItem = nil;

        for (NSManagedObject *setItem in newSet) {
            id setItemId = [setItem valueForKey:idAttributeName];

            if ([setItemId isEqual:oldItemId]) {
                newItem = setItem;
                break;
            }
        }

        if (!newItem) {
            NSDictionary *diff = [self diffNewObject:newItem withOldObject:oldItem];

            if (diff.count > 0) {
                [changes addObject:diff];
            }
        }
    }

    return changes;
}

That’s it: diff created! At this point, if you find that you don’t like the way old and new values are grouped together, or want to change any other aspect of the way the diff is produced, you should have enough information about how to use metaprogramming with NSManagedObjects to be able to change things around.

There are still two unsolved problems that I’ll tackle in another post:

  • What do you do if you want to include a related object in the diff if it has its deletion rule set to nullify?
  • What do you do if you want to exclude an attribute or relationship from the diff?

Once those are solved, there’s another big one: how do you apply the diff to an object? I’ll tackle that in another post, too.