This repository has been archived by the owner on Jan 17, 2023. It is now read-only.

_AFStateObserving Swizzling Implementation is Wrong #2702

Merged: 1 commit into master from 2702_alternate_solution on May 14, 2015

kcharwood
Contributor

The _AFStateObserving category on NSURLSessionTask was recently changed (2.5.3) to remove a dispatch_once block. I'm not sure what the intent of this change was, but one of the side effects is that the swizzling gets run multiple times (for every initialized subclass of NSURLSessionTask) and effectively "flip flops" the implementations of the original methods with the swizzled methods.

If this gets called an odd number of times, it happens to work. However, if it gets called an even number of times, the net result is that the swizzled methods are never run.

If I log the classes this method is called on, here's the result I see when I first use a data task:

2015-05-06 14:17:00.788 class=__NSCFLocalDataTask, taskClass=__NSCFLocalSessionTask
2015-05-06 14:17:00.793 class=__NSCFLocalSessionTask, taskClass=__NSCFLocalSessionTask
2015-05-06 14:17:00.795 class=NSURLSessionTask, taskClass=__NSCFLocalSessionTask
2015-05-06 14:17:00.805 class=__NSCFLocalUploadTask, taskClass=__NSCFLocalSessionTask
2015-05-06 14:17:00.809 class=NSURLSessionDataTask, taskClass=__NSCFLocalSessionTask

So, this works fine for a while. However, if I eventually also use an upload task, I see this also:

2015-05-06 14:19:51.903 class=NSURLSessionUploadTask, taskClass=__NSCFLocalSessionTask

This last call swizzles the methods a sixth time, resulting in resume/suspend being set back to their original implementations.
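
To make the flip-flop concrete, here is a minimal sketch that uses the Objective-C runtime directly. DemoTask, demo_resume, and DemoResumeAfterSwizzling are hypothetical names made up for this illustration, not AFNetworking code, and the sketch assumes the swizzle boils down to method_exchangeImplementations on the same pair of methods.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical stand-in for a session task; not an AFNetworking class.
@interface DemoTask : NSObject
- (NSString *)resume;
- (NSString *)demo_resume;
@end

@implementation DemoTask
- (NSString *)resume { return @"original"; }
- (NSString *)demo_resume { return @"swizzled"; }
@end

// Exchanges the two implementations `times` times and reports which body
// a plain -resume call reaches afterwards.
static NSString *DemoResumeAfterSwizzling(NSUInteger times) {
    Method original = class_getInstanceMethod([DemoTask class], @selector(resume));
    Method swizzled = class_getInstanceMethod([DemoTask class], @selector(demo_resume));
    for (NSUInteger i = 0; i < times; i++) {
        method_exchangeImplementations(original, swizzled);
    }
    NSString *result = [[DemoTask new] resume];
    // Undo an odd number of exchanges so every call starts from a clean state.
    if (times % 2 == 1) {
        method_exchangeImplementations(original, swizzled);
    }
    return result;
}
```

With five exchanges a call to -resume lands in the swizzled body; with six it lands back in the original one, which matches the behavior reported above once the sixth class is initialized.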

@kcharwood
Contributor

I believe this was related to #2638.

I'll try and dig in today or tomorrow.

@kcharwood kcharwood added the bug label May 7, 2015
@kcharwood
Contributor

It looks like @mattt actually removed the dispatch_once in commit ace91df

I'm guessing we still need a dispatch_once, just inside the conditional. Something like this:

+ (void)initialize {
    if ([NSURLSessionTask class]) {
        static dispatch_once_t onceToken;
        dispatch_once(&onceToken, ^{
            NSURLSessionDataTask *dataTask = [[NSURLSession sessionWithConfiguration:nil] dataTaskWithURL:nil];
            Class taskClass = [dataTask superclass];

            af_addMethod(taskClass, @selector(af_resume),  class_getInstanceMethod(self, @selector(af_resume)));
            af_addMethod(taskClass, @selector(af_suspend), class_getInstanceMethod(self, @selector(af_suspend)));
            af_swizzleSelector(taskClass, @selector(resume), @selector(af_resume));
            af_swizzleSelector(taskClass, @selector(suspend), @selector(af_suspend));

            [dataTask cancel];
        });
    }
}

Does this resolve the issue for you?

@kcharwood
Contributor

It actually may be more complicated than that... Let me do some more thinking.

@kcharwood
Contributor

After further review, I THINK the dispatch_once will solve your issue as well as the other linked issues here, but I need verification.

Could you take a look at this branch, or drop in this patch and verify things look good on your end?

@kcharwood kcharwood added this to the 2.5.4 milestone May 7, 2015
@cnoon
Member

cnoon commented May 7, 2015

Your patch is deadlocking for me @kcharwood in the AFNetworking iOS Example app.

Deadlock Case

If you use this modified patch, and run the example app, I'm assuming you will see the same behavior.

+ (void)initialize {
    NSLog(@"Attempting to initialize: %@", NSStringFromClass([self class]));

    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        NSLog(@"Initializing class: %@", NSStringFromClass([self class]));

        NSURLSessionDataTask *dataTask = [[NSURLSession sessionWithConfiguration:nil] dataTaskWithURL:nil];
        Class taskClass = [dataTask superclass];

        af_addMethod(taskClass, @selector(af_resume),  class_getInstanceMethod(self, @selector(af_resume)));
        af_addMethod(taskClass, @selector(af_suspend), class_getInstanceMethod(self, @selector(af_suspend)));
        af_swizzleSelector(taskClass, @selector(resume), @selector(af_resume));
        af_swizzleSelector(taskClass, @selector(suspend), @selector(af_suspend));

        [dataTask cancel];
    });
}

Produces the following output and hangs:

2015-05-07 09:24:10.692 AFNetworking iOS Example[50814:37472160] Attempting to initialize: NSURLSessionTask
2015-05-07 09:24:10.692 AFNetworking iOS Example[50814:37472160] Initializing class: NSURLSessionTask
2015-05-07 09:24:10.692 AFNetworking iOS Example[50814:37472160] Attempting to initialize: __NSCFLocalSessionTask

Static State Variable

One way to avoid the deadlock would be to store a second static variable to early out if necessary.

+ (void)initialize {
    NSLog(@"Attempting to initialize: %@", NSStringFromClass([self class]));

    static BOOL isInitialized = NO;

    if (isInitialized) {
        return;
    }

    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        isInitialized = YES;

        NSLog(@"Initializing class: %@", NSStringFromClass([self class]));

        NSURLSessionDataTask *dataTask = [[NSURLSession sessionWithConfiguration:nil] dataTaskWithURL:nil];
        Class taskClass = [dataTask superclass];

        af_addMethod(taskClass, @selector(af_resume),  class_getInstanceMethod(self, @selector(af_resume)));
        af_addMethod(taskClass, @selector(af_suspend), class_getInstanceMethod(self, @selector(af_suspend)));
        af_swizzleSelector(taskClass, @selector(resume), @selector(af_resume));
        af_swizzleSelector(taskClass, @selector(suspend), @selector(af_suspend));

        [dataTask cancel];
    });
}

Produces the following output and runs properly:

2015-05-07 09:26:12.725 AFNetworking iOS Example[50861:37477359] Attempting to initialize: NSURLSessionTask
2015-05-07 09:26:12.725 AFNetworking iOS Example[50861:37477359] Initializing class: NSURLSessionTask
2015-05-07 09:26:12.725 AFNetworking iOS Example[50861:37477359] Attempting to initialize: __NSCFLocalSessionTask
2015-05-07 09:26:12.725 AFNetworking iOS Example[50861:37477359] Attempting to initialize: __NSCFLocalDataTask
2015-05-07 09:26:12.728 AFNetworking iOS Example[50861:37477413] Attempting to initialize: __NSCFLocalUploadTask
2015-05-07 09:26:12.728 AFNetworking iOS Example[50861:37477413] Attempting to initialize: NSURLSessionDataTask

Another option would be to compare the class against NSURLSessionTask.

Class Check

+ (void)initialize {
    NSLog(@"Attempting to initialize: %@", NSStringFromClass([self class]));

    if ([NSStringFromClass([self class]) isEqualToString:NSStringFromClass([NSURLSessionTask class])]) {
        NSLog(@"Initializing class: %@", NSStringFromClass([self class]));

        NSURLSessionDataTask *dataTask = [[NSURLSession sessionWithConfiguration:nil] dataTaskWithURL:nil];
        Class taskClass = [dataTask superclass];

        af_addMethod(taskClass, @selector(af_resume),  class_getInstanceMethod(self, @selector(af_resume)));
        af_addMethod(taskClass, @selector(af_suspend), class_getInstanceMethod(self, @selector(af_suspend)));
        af_swizzleSelector(taskClass, @selector(resume), @selector(af_resume));
        af_swizzleSelector(taskClass, @selector(suspend), @selector(af_suspend));

        [dataTask cancel];
    }
}

Produces the following output and runs properly:

2015-05-07 09:27:16.067 AFNetworking iOS Example[50895:37480604] Attempting to initialize: NSURLSessionTask
2015-05-07 09:27:16.067 AFNetworking iOS Example[50895:37480604] Initializing class: NSURLSessionTask
2015-05-07 09:27:16.067 AFNetworking iOS Example[50895:37480604] Attempting to initialize: __NSCFLocalSessionTask
2015-05-07 09:27:16.067 AFNetworking iOS Example[50895:37480604] Attempting to initialize: __NSCFLocalDataTask
2015-05-07 09:27:16.068 AFNetworking iOS Example[50895:37480655] Attempting to initialize: __NSCFLocalUploadTask
2015-05-07 09:27:16.068 AFNetworking iOS Example[50895:37480655] Attempting to initialize: NSURLSessionDataTask

There's probably a better way to compare class equality, but that was the best solution I could come up with on short notice. See thread for more info.

Another option would be to switch from initialize to load, but that seems to have many other implications from past issues I've been digging through. They may have had other issues though due to the previous implementations of this workaround.

IMO, the Class Check approach is the best way to go.

Thoughts?

@kcharwood
Contributor

I think Class Check looks pretty good as well.

I put together a quick test to try and verify this in AFURLSessionManagerTests:

- (void)testSwizzlingIsWorkingAsExpected {
    [self expectationForNotification:@"com.alamofire.networking.task.suspend"
                              object:nil
                             handler:nil];
    NSURL *delayURL = [self.baseURL URLByAppendingPathComponent:@"delay/1"];
    NSURLSessionDataTask *task = [self.manager dataTaskWithRequest:[NSURLRequest requestWithURL:delayURL]
                                                 completionHandler:nil];
    [task resume];
    [task suspend];
    [self waitForExpectationsWithTimeout:2.0 handler:nil];
    [task cancel];


    [self expectationForNotification:@"com.alamofire.networking.task.suspend"
                              object:nil
                             handler:nil];

    NSURLSessionUploadTask *uploadTask = [self.manager uploadTaskWithRequest:[NSURLRequest requestWithURL:delayURL]
                                                                    fromData:nil
                                                                    progress:nil
                                                           completionHandler:nil];
    [uploadTask resume];
    [uploadTask suspend];
    [self waitForExpectationsWithTimeout:2.0 handler:nil];
    [uploadTask cancel];
}

When running this test in isolation, this test fails on 2.5.3. When running with the class check patch, it passes.

I'm not sure this is a great test to add to the suite, because it's still susceptible to race conditions from other tests. For example, if both of these classes had already been loaded from previous tests, and an odd number of session task classes had already been created, this test would pass on 2.5.3. Anyone have any better ideas on how to build a better test?

@kcharwood
Contributor

I updated the branch with @cnoon's patch for class name check.

@kcharwood
Contributor

I'm also wondering if the code can be reduced to

+ (void)initialize {
    if ([NSStringFromClass([self class]) isEqualToString:NSStringFromClass([NSURLSessionTask class])]) {

        af_addMethod([NSURLSessionTask class], @selector(af_resume),  class_getInstanceMethod(self, @selector(af_resume)));
        af_addMethod([NSURLSessionTask class], @selector(af_suspend), class_getInstanceMethod(self, @selector(af_suspend)));
        af_swizzleSelector([NSURLSessionTask class], @selector(resume), @selector(af_resume));
        af_swizzleSelector([NSURLSessionTask class], @selector(suspend), @selector(af_suspend));
    }
}

but I'm not super clear what the original need was to call "superclass" to get something that will work here. Still trying to parse through #2638

@kcharwood
Contributor

One more alternative a coworker suggested.

We could create a dummy class only used for swizzling. That way we don't have to make a protocol on NSURLSessionTask and swizzle in initialize, and we can swizzle from load in the dummy class only if NSURLSessionTask is available. I think that would address the original problems for the crashers, and would eliminate the multiple calls to initialize.

Something like this commit. I dropped a bunch of tests in there as well.

@cnoon
Member

cnoon commented May 7, 2015

Oh man I like it! I think it could safely be condensed down to the following:

@implementation _AFURLSessionTaskSwizzling

+ (void)load {
    Class urlSessionTaskClass = [NSURLSessionTask class];

    af_addMethod(urlSessionTaskClass, @selector(af_resume),  class_getInstanceMethod(urlSessionTaskClass, @selector(af_resume)));
    af_addMethod(urlSessionTaskClass, @selector(af_suspend), class_getInstanceMethod(urlSessionTaskClass, @selector(af_suspend)));
    af_swizzleSelector(urlSessionTaskClass, @selector(resume), @selector(af_resume));
    af_swizzleSelector(urlSessionTaskClass, @selector(suspend), @selector(af_suspend));
}

@end

The load method will never get run more than once. Additionally, there's no need to add a guard around the Class object. Last tweak I made was to rename the variable to urlSessionTaskClass.

@kcharwood
Contributor

Thanks @cnoon. Just pushed an update with that refactor, along with one other note. On OS X, Apple says don't use the class method to check for existence, but rather NSClassFromString, so I made that update as well.

In OS X (and in iOS projects that do not meet the set of conditions just listed), you cannot use the class method to determine if a weakly linked class is available. Instead, use the NSClassFromString function in code similar to the following:

Class cls = NSClassFromString (@"NSRegularExpression");
if (cls) {
    // Create an instance of the class and use it.
} else {
    // Alternate code path to follow when the
    // class is not available.
}

@cnoon
Member

cnoon commented May 8, 2015

Looks good @kcharwood! Thanks for being so thorough on this. Also, good to know on the class check 👍🏻

@kcharwood
Contributor

It looks like Travis just can't handle the XCTests with expectationsForNotifications. I've commented those out for now, since Travis is running Xcode 6.1.

The tests do work locally for me.

@implementation _AFURLSessionTaskSwizzling

+ (void)load {
Class urlSessionTaskClass = NSClassFromString(@"NSURLSessionTask");

I think this fix ignores the issue that 27964aa was ostensibly trying to fix. I'm not familiar with the history of this code, but presumably swizzling the superclass of NSURLSessionDataTask is important, and this change reverts that behavior.

Contributor Author

Thanks for the note @fouvrai. It's difficult to track the myriad of issues associated to this swizzling due to the varying implementations that have been in the past few releases. Let me take a look once more with that lens and make sure we are good.

Swizzling of the superclass was added in 27964aa ("Fix AFNetworkActivityIndicatorManager for iOS 7...dubiously") by @tangphillip. He explains that the reason is that the superclass is an undocumented class in iOS 7.

I'll point out that there are class dump headers of iOS including any undocumented classes here https://github.com/EthanArbuckle/IOS-7-Headers/blob/master/Frameworks/CFNetwork.framework/__NSCFLocalSessionTask.h

Contributor

@phoney has it absolutely right here. This is not going to work in iOS 7, because in the actual implementation of iOS 7, NSURLSessionDataTask's inheritance path doesn't include a class named NSURLSessionTask. Instead, its superclass is named __NSCFLocalSessionTask. Which is private, and probably not a good idea to reference directly.

Contributor

These swizzles were (are?) in an +initialize of a category on NSURLSessionDataTask to guarantee that the entire inheritance chain of NSURLSession classes had an opportunity to +initialize before we swizzle their methods.

Contributor Author

@tangphillip yes you are correct. The branch I have pushed here does not match my latest branch. You're right this approach won't work on 7.

I have something right now that works for 8 and everything on 7 except for Upload Task. Haven't had time to try a few of my other ideas yet.

@kcharwood
Contributor

@fouvrai I think you may be right, as I am seeing some strange behavior on iOS 7. How many different ways can I possibly skin this cat...

@kcharwood
Contributor

I'm starting to question if this has ever actually worked properly for iOS 7. I have an example that works properly for NSURLSessionDataTask, but trying to use UploadTask or DownloadTask causes a problem on iOS 7. It all works great on iOS 8.

Running various variations of the swizzling I've seen in the history of this file seems to fail in the same way. This one is nasty, mainly due to the implementation differences under the hood between iOS 8 and iOS 7 that Apple has hidden away from us.

@kcharwood
Contributor

I've built out a class diagram for NSURLSession for iOS 7 and iOS 8, based on what classes are being given to me at runtime.

iOS 7:
[Screenshot: NSURLSession class diagram on iOS 7]

iOS 8:
[Screenshot: NSURLSession class diagram on iOS 8]

Note the grey boxes, which represent the hierarchy if you do something like [[NSURLSessionDataTask class] superclass]. The green boxes represent the class hierarchy if you take an actual object returned by NSURLSessionManager, and go up the superclass tree. I have tests confirming the swizzling approach works for all green boxes.

Note the one red box in iOS 7 - The upload task object. Building out this diagram helped me identify that as the one failing test. If I disable the tests for UploadTask on iOS 7, everything else passes. If I run the UploadTask tests in iOS 7 in isolation or in the suite, it causes a deadlock inside of af_suspend.

I have not been able to fully diagnose why this is happening, but I'm using this ticket as a scratchpad to talk out my notes. I'm trying to determine if there is something different about the implementation of suspend in iOS 7 for the upload task that is causing a problem with the swizzle.

@larsacus
Contributor

larsacus commented May 9, 2015

I know I might be commenting on this without the full history or context as to what's going on with this particular bit of code, but when someone says 'runtime', my ears perk up.

So to summarize, it appears that the entire purpose of the swizzled methods is to:

  1. Add the af_resume/af_suspend methods to (what we think is) the root class for all NSURLSessionTask objects used by the user
  2. Call those newly added methods on the existing implementations of -[NSURLSessionTask resume] and -[NSURLSessionTask suspend] in order to post a notification

In addition, the reason that +initialize is being used as opposed to +load is due to not actually knowing what the true "root" class ("root" here being the closest superclass after NSObject) of the hierarchy tree is. This is due to the class hierarchy diagrams that Kevin drew up since +load is called in order of hierarchy, and this breaks on iOS 7.

Is there a reason we need to swizzle this into the class of NSURLSessionTask, so that the new functionality is to be used on all classes, even those that are created outside of AFNetworking? Could we simply add and swizzle in the same exact methods, but on a per-instance basis in the *TaskWithRequest methods in AFURLSessionManager, where everything is funneling into? We could then add/swizzle on the actual returned class (once) vs an unknown root class (which would run an unknown number of times in a runtime method like +initialize).

Bad? Depends on if this functionality must live on the furthest superclass and if all *SessionTask subclasses in the bundle need this functionality regardless of association with AFNetworking.

Another solution could be to actually use the runtime to find the furthest superclass that actually implements suspend/resume and add the af_* methods to just that one class. Then use the existence of those af_* methods we just added to know if we have already performed this action, or also wrap this action into a dispatch_once.
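
As a rough sketch of that last idea, the runtime can be asked which classes in a hierarchy define a selector themselves rather than inherit it. Everything named Demo* below, including the demo_suspend selector, is a hypothetical stand-in invented for this illustration; this is not the approach that was merged, and it has not been tried against the real NSURLSessionTask class cluster.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#include <stdlib.h>

// Hypothetical three-level hierarchy standing in for the task classes.
@interface DemoRoot : NSObject
- (void)demo_suspend;
@end
@interface DemoMiddle : DemoRoot
@end
@interface DemoLeaf : DemoMiddle
@end

@implementation DemoRoot
- (void)demo_suspend {}
@end
@implementation DemoMiddle
@end
@implementation DemoLeaf
- (void)demo_suspend {}   // override, so the leaf also defines the selector
@end

// YES if `cls` itself (not one of its superclasses) defines an instance method for `sel`.
static BOOL DemoClassDirectlyImplements(Class cls, SEL sel) {
    unsigned int count = 0;
    Method *methods = class_copyMethodList(cls, &count);
    BOOL found = NO;
    for (unsigned int i = 0; i < count; i++) {
        if (sel_isEqual(method_getName(methods[i]), sel)) {
            found = YES;
            break;
        }
    }
    free(methods);   // class_copyMethodList returns a malloc'd array (or NULL)
    return found;
}

// Walks up from `start` and returns the class closest to the root that defines `sel`, or Nil.
static Class DemoFurthestImplementer(Class start, SEL sel) {
    Class result = Nil;
    for (Class cls = start; cls != Nil; cls = class_getSuperclass(cls)) {
        if (DemoClassDirectlyImplements(cls, sel)) {
            result = cls;
        }
    }
    return result;
}
```

Starting from DemoLeaf, the walk passes the leaf's own override and settles on DemoRoot, which is the single class where the af_* methods would be added under this proposal.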

I'm also not positive what is causing the deadlocking referred to above. I'm assuming the onceToken, since it is static and only found in that one category, is actually the same token across multiple calls to +initialize. This would mean that +initialize is called concurrently?

@phoney

phoney commented May 9, 2015

Removing the dispatch_once is OK. The normal way that +initialize is protected so that the code is run only once is like this

+ (void)initialize
{
    if ([self class] == [NSURLSessionTask class]) {
        // code that should only run once goes here
    }
}
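
For anyone wondering why the guard matters: a subclass that does not implement +initialize inherits its superclass's implementation, so the same method body runs once for every class in the hierarchy. Here is a minimal sketch of that behavior, using hypothetical DemoBase/DemoChildA/DemoChildB classes and counters that are not part of AFNetworking.

```objc
#import <Foundation/Foundation.h>

static NSUInteger DemoUnguardedRuns = 0;
static NSUInteger DemoGuardedRuns = 0;

@interface DemoBase : NSObject
@end
@interface DemoChildA : DemoBase
@end
@interface DemoChildB : DemoBase
@end

@implementation DemoBase
+ (void)initialize {
    DemoUnguardedRuns++;              // runs for DemoBase and for each subclass that inherits this method
    if (self == [DemoBase class]) {
        DemoGuardedRuns++;            // runs only when DemoBase itself is being initialized
    }
}
@end

// Neither subclass overrides +initialize, so both inherit DemoBase's version.
@implementation DemoChildA
@end
@implementation DemoChildB
@end

static void DemoTouchAllClasses(void) {
    // The first message sent to a class triggers +initialize for it,
    // after its superclass has been initialized.
    [DemoChildA class];
    [DemoChildB class];
}
```

After DemoTouchAllClasses() the unguarded counter is 3 (base plus two subclasses) and the guarded one is 1. Unguarded swizzling in a category's +initialize on NSURLSessionTask would repeat in the same way for every task subclass.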

I think in the code committed by @mattt ace91df he had a typo where the == [NSURLSessionTask class] part was missing.

This entire issue is the cause of my related issue: https://github.com/AFNetworking/AFNetworking/issues/2660 AFNetworkActivityIndicatorManager doesn't work for background downloads.

If you change this code again I request that you make sure it works for background downloads.

I think the swizzling can be made to work for background downloads if you swizzle [self class] one time.

My real opinion is that we should throw away the swizzling code and go back to the KVO code. KVO observing of the task state is the right way to do this. I understand there were some crashes in some cases when removing the observers. I expect that those crashes could be fixed. Swizzling is an attempt to avoid fixing the KVO code. Everyone at Apple says don't swizzle. They're obviously right.

@tangphillip
Contributor

Those crashes are not fixed—they're a bug in KVO itself...and that bug is still at large, as of iOS 8.1. In our experience, the KVO crashes increase our per-session crash rate by ~0.2% (leading to hundreds of crashes a day), making them our most common crash by a factor of 10.

I agree that swizzling is a terrible, terrible idea. But ultimately, code quality is less important than user experience, which is why swizzling is unfortunately the right thing to do here.

@kcharwood
Contributor

I pushed up a rebase off master with just the url session unit tests I have for this issue if anyone wants to try their solution on it.

https://github.com/AFNetworking/AFNetworking/tree/updated_url_session_tests

@kcharwood
Contributor

Running rake locally on my machine shows all tests passing. Running on Travis shows failures. I can't catch a break... Looks like the failures are for 10.9.5. Trying to find a 10.9 machine now.

@phoney personally I'm not a huge fan of asking everyone to use custom resume/suspend methods, as that can be confusing and as you said, not backwards compatible. If we can "make it magically work", I'm more of a fan of that approach.

@cnoon
Member

cnoon commented May 12, 2015

+1 @kcharwood. I think we all thought through alternative possibilities while maintaining backwards compatibility. That's why swizzling is really the only option since KVO is crashing. All other proposed solutions would have serious public API implications or would break this feature for all our current users.

@kcharwood
Contributor

TRAVIS CI JUST PASSED MY TESTS. I'm going to spend quite a bit of time cleaning this branch up now, and adding appropriate documentation.

@kcharwood
Contributor

Ok. Branch cleaned up. More tests added. Comments added to explain my thoughts.

I would love as many eyes as possible on this one.

@phoney

phoney commented May 12, 2015

I tested your newest code in my app on iOS 8 and iOS 7 and it seems to work fine. The network activity indicator spins for background downloads and non-background requests.

NSURLSessionTaskState state;
SEL selector = @selector(state);
NSAssert([self respondsToSelector:selector], @"Does not respond to state");
NSInvocation *invocation = [NSInvocation invocationWithMethodSignature:

What's the rationale for switching to an NSInvocation here? What's wrong with NSURLSessionTaskState state = [self state];?

Contributor Author

These methods are in a dummy class, which doesn't have a self state. To prevent a compile warning, I went this route.

I guess it would be possible to stub out the state method in the dummy class and drop an assert in there.

Ah, yea I was only thinking about it in the context in which it'd actually be called.

It seems weird to me to compromise the implementation to prevent a compiler warning, when the resultant implementation wouldn't actually work on the class on which it's defined anyway. At that point, it feels like it'd be better to just locally disable the compiler warning, rather than simply "hiding" it.

An alternative option would be to derive the _AFURLSessionTaskSwizzling class from NSURLSessionTask, so that the methods these functions need to call on self would exist on the class.

Contributor Author

I decided just to stub the state method in the dummy class. Due to the complications around the class cluster of NSURLSessionTask, I'd prefer to keep the dummy class completely out of the inheritance chain.

Removing the NSInvocation does make it a bit cleaner, so I'm happy with the compromise.
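
A minimal sketch of the stub idea, with loud assumptions: DemoDonor, DemoRealTask, demo_reportState, and the two helper functions are hypothetical names, and -state returns a plain NSInteger here instead of NSURLSessionTaskState to keep the example self-contained. The donor class declares a stub -state purely so that [self state] compiles. Once the method is copied onto the real class with class_addMethod, self is an instance of the real class and the real -state is what gets dispatched.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical target class standing in for a concrete task class.
@interface DemoRealTask : NSObject
- (NSInteger)state;
@end
@implementation DemoRealTask
- (NSInteger)state { return 42; }
@end

// Hypothetical donor class; it stays outside the target's inheritance chain.
@interface DemoDonor : NSObject
- (NSInteger)state;
- (NSInteger)demo_reportState;
@end
@implementation DemoDonor
// Stub so that [self state] below compiles without a warning.
// Reaching it would mean the method was called on the donor itself.
- (NSInteger)state {
    NSAssert(NO, @"-state should never be called on the donor class");
    return -1;
}
- (NSInteger)demo_reportState {
    // After being added to DemoRealTask, self is a DemoRealTask instance,
    // so this message reaches DemoRealTask's -state, not the stub above.
    return [self state];
}
@end

// Copies the donor's method onto the target class; returns NO if it already exists there.
static BOOL DemoInstallDonorMethod(void) {
    Method donor = class_getInstanceMethod([DemoDonor class], @selector(demo_reportState));
    return class_addMethod([DemoRealTask class], @selector(demo_reportState),
                           method_getImplementation(donor), method_getTypeEncoding(donor));
}

static NSInteger DemoReportedState(void) {
    id task = [DemoRealTask new];
    // The cast only satisfies the compiler; dispatch uses the object's real class.
    return [(DemoDonor *)task demo_reportState];
}
```

Call DemoInstallDonorMethod() once before DemoReportedState(). The reported value then comes from DemoRealTask's -state, and the asserting stub is never hit.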

@cnoon
Member

cnoon commented May 14, 2015

These changes all look good to me @kcharwood. I went through it with a fine-tooth comb and it looks great! All the documentation you've added is really helpful, and the test coverage is really well done. Awesome job!

@cnoon
Member

cnoon commented May 14, 2015

Looks good!

kcharwood added a commit that referenced this pull request May 14, 2015
_AFStateObserving Swizzling Implementation is Wrong
@kcharwood kcharwood merged commit 7162efe into master May 14, 2015
@kcharwood kcharwood deleted the 2702_alternate_solution branch May 14, 2015 16:35
@kcharwood
Contributor

I'd like to thank everyone who jumped in on this issue and provided feedback/suggestions. Super nasty issue, and couldn't have gotten to this solution without everyone's involvement.

cheers 🍻

@julianL0veios

AFNetworking 3.1.0 on iOS 10.2 still gets the crash!
[Image attachment]
