Late Night Cocoa Episode 036
2nd December 2008

Scotty discusses Garbage Collection with Realmac developer Andre Pang


Late Night Cocoa

Late Night Cocoa was MDN's first-ever podcast, starting back in January 2007. Each episode consists of an interview by Scotty with an expert on a single subject. Late Night Cocoa ended after 41 episodes in March 2009, when it became a segment of the newly formed MDN Show.

Show Notes

Correction: Interior Pointers

In the "GC Gotchas" section of the podcast, there's a discussion about what the Garbage Collection Programming Guide calls interior pointers. You'll most commonly run across these when you're using NSData objects (though the problem's not limited to them). As an example,

- (void)myMethod
{
  NSData* myData = [someObject getMyData];
  // interior pointer to myData
  const void* buffer = [myData bytes];
  // do something with the 'buffer' local variable here...
}

Here, -myMethod receives a myData object, and then calls -bytes on that object to retrieve a raw pointer to a buffer of bytes. The problem is that, because the garbage collector runs concurrently in the background and the compiler optimises the code, myData may end up being collected while buffer is still in use: there are no more references to myData in the method after the call to -bytes. This is a nasty little problem, because it often only bites you in release builds, where optimisations are enabled. There are two workarounds.

One workaround is to simply put [myData self] at the end of the method, which sends myData a self message that does nothing. This ensures that there's another reference to myData later in the method, which keeps it alive, so the garbage collector won't collect it.

In the podcast, I incorrectly stated that the other workaround is to declare the void* buffer pointer as volatile: this makes no sense whatsoever. What I meant to say is that you should declare the myData pointer as volatile, e.g.

 NSData* volatile myData = [someObject getMyData];

There's a thread on cocoa-dev about this interior pointer problem started by Rick Hoge; in particular, see the messages by Alastair Houghton and Chris Suter about why volatile works.

Garbage Collection Programming Guide

You can learn pretty much everything in the episode by reading the Garbage Collection Programming Guide. It's undergone many revisions since it was first published, and is now quite comprehensive and well-written. If you first looked at it a long time ago, you probably want to look at it again. In particular, there's in-depth info there about Core Foundation, which is probably the trickiest day-to-day use of garbage collection.

CFAutorelease, NSMakeCollectable, CFMakeCollectable and -autorelease, oh my!

In the Core Foundation part of the episode, I talk about using NSMakeCollectable(), CFMakeCollectable() and -autorelease (at around 45 minutes). The standard idiom to make Core Foundation objects work in dual-mode code is [NSMakeCollectable(object) autorelease]. At 47 minutes, I discuss using a CFAutorelease() macro, which looks like this:

#define CFAutorelease(cf) ((__typeof(cf))[NSMakeCollectable(cf) autorelease])

You want the __typeof there to avoid compiler warnings about type mismatches if you're compiling at the higher warning levels, which is why this is a macro rather than a function.
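
To see the __typeof trick in isolation, here's a minimal plain-C sketch. PassThroughGeneric() and PASS_THROUGH_TYPED() are hypothetical names made up for this illustration (they're not part of Core Foundation or Cocoa); the function just stands in for any call that hands back a generic pointer, much as -autorelease hands back an id.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a call chain that returns a generic pointer,
// much as [NSMakeCollectable(cf) autorelease] returns an id.
static const void* PassThroughGeneric(const void* pointer)
{
    return pointer;
}

// The macro casts the generic result back to the static type of its
// argument, so the caller gets back exactly the type that was passed in.
// A function couldn't do this, since its return type is fixed.
#define PASS_THROUGH_TYPED(p) ((__typeof(p))PassThroughGeneric(p))

static int FirstElementViaTypedMacro(void)
{
    int values[] = { 7, 8, 9 };
    int* typed = values;

    // Without the cast in the macro, assigning a const void* to an int*
    // would draw a warning about discarding the const qualifier.
    int* same = PASS_THROUGH_TYPED(typed);
    assert(same == typed);

    return *same;
}
```

The argument isn't evaluated by __typeof itself, so the macro still only evaluates it once at runtime.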

The gcc cleanup attribute

In the "GC Secrets, Tips & Tricks" part of the episode, there's a brief discussion about NSAllocateCollectable() and why I don't recommend using it: I personally prefer to keep garbage collection strictly in Cocoa-land and keep C the way it is. As an alternative, you can use the GCC cleanup attribute to automatically free() things when a local variable goes out of scope, which looks like this:

// Simple wrapper macro to make using the cleanup attribute a
// little bit more pleasant
#define SCOPE_EXIT(function) __attribute__((cleanup(function)))

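#include <stdlib.h> // for free()

// Declared static inline so you can put it into a header file.
// Takes a void* rather than a void** so that the address of any pointer
// variable (e.g. a char**) can be passed without a type-mismatch warning.
static inline void ScopedFree(void* pPtr)
{
  free(*(void**)pPtr);
}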

#define AUTO_FREE SCOPE_EXIT(ScopedFree)

// Usage: char* buffer AUTO_FREE = malloc(2048);

As an example, you can use it like this:

char* buffer AUTO_FREE = malloc(2048);

gcc will then free() the buffer for you automatically when the current scope ends. (Note that this means you can also use { and } to introduce new scoping levels just for the auto-free behaviour.)

Unfortunately, after advocating this technique, I've learned that using the gcc cleanup attribute surrounds your code with a @try/@catch block so that proper stack unwinding can be done if an exception is thrown somewhere, which has performance implications. This is pretty cheap on the 64-bit runtime, but expensive on the normal 32-bit runtime and iPhone. You'll have to be the judge of whether that performance hit is OK for your code.
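
If you want to convince yourself of when the cleanup function actually runs, a small sketch like the following may help. It assumes gcc or clang; CountingScopedFree(), COUNTING_AUTO_FREE and the gScopedFreeCalls counter are hypothetical names that exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical counter, used only to observe when the cleanup runs.
static int gScopedFreeCalls = 0;

// Same idea as ScopedFree, but it also counts how often it has been called.
// It takes a void* so the address of any pointer variable can be passed.
static void CountingScopedFree(void* pPtr)
{
    free(*(void**)pPtr); // free(NULL) is harmless if malloc() failed
    gScopedFreeCalls++;
}

#define COUNTING_AUTO_FREE __attribute__((cleanup(CountingScopedFree)))

// Returns the number of cleanups that had run when the inner scope closed.
static int CleanupCallsAfterInnerScope(void)
{
    gScopedFreeCalls = 0;

    char* outer COUNTING_AUTO_FREE = malloc(16);
    if(outer) strcpy(outer, "outer");

    {
        char* inner COUNTING_AUTO_FREE = malloc(16);
        if(inner) strcpy(inner, "inner");
    } // inner is freed here, at the closing brace

    assert(gScopedFreeCalls == 1);

    // The return value is computed first; outer is freed afterwards,
    // as the function's scope is left.
    return gScopedFreeCalls;
}
```

Calling CleanupCallsAfterInnerScope() returns 1, and gScopedFreeCalls reads 2 once the function has returned, since the outer buffer is released on the way out.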

Debugging environment variables

From a Terminal, if you do strings /usr/lib/libauto.dylib | grep AUTO_, you'll find some interesting environment variables that you can set to print out some statistics and debugging information about what the garbage collector is doing. For the boolean variables, I'm guessing you set the environment variable value to YES. Here's the list that I've found so far on 10.5.5:

  • AUTO_ENABLE_MONITOR
  • AUTO_DISABLE_GENERATIONAL
  • AUTO_LOG_NOISY
  • AUTO_LOG_ALL
  • AUTO_LOG_COLLECTIONS
  • AUTO_LOG_REGIONS
  • AUTO_LOG_UNUSUAL
  • AUTO_LOG_WEAK
  • AUTO_COLLECTION_THRESHOLD
  • AUTO_COLLECTION_RATIO
  • AUTO_RECORD_REFCOUNT_STACKS
  • AUTO_DIRTY_ALL_DELETED
  • AUTO_USE_EXACT_SCANNING

Since those environment variables don't appear to be documented anywhere, don't rely on them in your production code, of course.

Garbage Collector Open-Sourced

Like many other runtime hackers, I was very happy to see that the source code for libauto was released recently. Bill Bumgarner wrote an announcement about it on his blog.

Dual-Mode Code Support in Mac OS X 10.4 (Tiger)

A few members of the Objective-C runtime and garbage collection teams at Apple have acknowledged to me in personal communication that it's actually possible to build a GC-supported framework or plugin and have it work in Mac OS X 10.4 (Tiger), obviously under manually managed memory only. The good news is that this is apparently an officially supported Apple configuration, but the bad news is that you have to stay away from weak pointers, and you can't use any of the new weak-pointer-aware collection classes (NSMapTable, NSHashTable and NSPointerArray). To work around the lack of weak-pointer-aware collection classes, you can use the traditional collection classes with +[NSValue valueWithNonretainedObject:], which does exactly what you think it does. Under GC, using that NSValue storage option even supports the zeroing-weak-pointer behaviour. You have to be doubly careful under manually managed memory though, since you don't get the zeroing-weak-pointer behaviour there, so you need to write code to manually nil out or remove the NSValue in that case.

However, you'll still run into problems if you use the __weak pointer qualifier, since the compiler will rewrite assignments and reads from that pointer to two mysterious function calls named objc_read_weak() and objc_assign_weak(), which don't exist in the 10.4 Objective-C runtime. You can get around that problem by dropping the following file (RMGarbageCollectionReadWeakAssignWeak.m) into your project:

						
						
//***************************************************************************

// Copyright (C) 2007 Realmac Software Ltd
//
// Permission is hereby granted, free of charge, to any person obtaining
// a copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to
// permit persons to whom the Software is furnished to do so, subject
// to the following conditions:
//
// The above copyright notice and this permission notice shall be
// included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
// EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
// IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
// ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
// CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
// WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

//***************************************************************************

/** \file RMGarbageCollectionReadWeakAssignWeak.m
 *
 * These two objc_read_weak() and objc_assign_weak() functions are used whenever
 * you use a __weak type qualifier on a variable: GCC will replace any reads
 * or assignments to __weak variables with a "interceptor call" to
 * the objc_read_weak() or objc_assign_weak() functions.
 * Thus, both these symbols need to be present in the global symbol
 * namespace to actually work.  If the Mac OS X SDK is set to 10.4,
 * these symbols are not present in the Objective-C runtime,
 * and you'll therefore (correctly) get a link error when compiling for 10.4.
 * So, one solution is to hand-code these functions to work.
 *
 * However, since those symbols need to be present in the global namespace,
 * providing hand-coded versions of these functions will actually override
 * the default libobjc.dylib versions on Leopard.
 * Leopard's libobjc version is much more complex because it calls into
 * Leopard's libauto.dylib to actually perform the read,
 * and furthermore, how they work depend on such factors as whether the
 * GC is actually enabled, whether it's in the middle of a collection,
 * whether the location to read is in the GC zone or not, etc.
 * Hand-coding it for Leopard is therefore not an option.
 *
 * So, with a little bit of dlopen() and dlsym() magic,
 * what we do is check at runtime whether objc_read_weak and
 * objc_assign_weak exist in /usr/lib/libobjc.dylib.
 * If it doesn't exist, we use our own naive version, because that's all we need.
 * If the symbol does exist (i.e. we're running 10.5+),
 * we call into libobjc's own objc_read_weak() and objc_assign_weak() functions
 * to do the work for us.
 */

//***************************************************************************

#include <dlfcn.h>
#include <objc/objc.h> // for id, BOOL, YES and NO

//***************************************************************************

typedef id(*objc_read_weak_type)(id*);
typedef id(*objc_assign_weak_type)(id, id*);

//***************************************************************************

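static inline void* GetLibObjcSymbol(const char* const symbolName)
{
    void* handle = dlopen("/usr/lib/libobjc.dylib", RTLD_LAZY|RTLD_LOCAL);
    if(handle == NULL) return NULL;

    void* symbol = dlsym(handle, symbolName);

    // libobjc is already loaded by any Objective-C process, so closing our
    // handle only drops a reference; the symbol address stays valid.
    dlclose(handle);

    return symbol;
}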

//***************************************************************************

id objc_read_weak(id *location)
{
    static BOOL inspectedLibObjc = NO;
    static objc_read_weak_type libobjc_objc_read_weak = NULL;

    if(!inspectedLibObjc)
    {
        libobjc_objc_read_weak = GetLibObjcSymbol("objc_read_weak");
        inspectedLibObjc = YES;
    }

    if(libobjc_objc_read_weak)
    {
        return libobjc_objc_read_weak(location);
    }
    else
    {
        return *location;
    }
}

id objc_assign_weak(id value, id* location)
{
    static BOOL inspectedLibObjc = NO;
    static objc_assign_weak_type libobjc_objc_assign_weak = NULL;

    if(!inspectedLibObjc)
    {
        libobjc_objc_assign_weak = GetLibObjcSymbol("objc_assign_weak");
        inspectedLibObjc = YES;
    }

    if(libobjc_objc_assign_weak)
    {
        return libobjc_objc_assign_weak(value, location);
    }
    else
    {
        *location = value;

        return *location;
    }
}

//***************************************************************************

If you do that, you should be able to compile dual-mode frameworks and plugins with weak pointer support and have them work on 10.4.
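
The lookup-with-fallback idea isn't specific to libobjc. Here's a minimal plain-C sketch of the same pattern, under the assumption that the C library's atoi() is visible to dlsym() in a normally (dynamically) linked process; ResolveParseInt(), FindLoadedSymbol() and NaiveParseInt() are hypothetical names, not anything from the Objective-C runtime.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef int (*parse_int_type)(const char*);

// Naive fallback, used when the symbol can't be found at runtime.
// It only handles a run of leading decimal digits.
static int NaiveParseInt(const char* s)
{
    int value = 0;
    while(*s >= '0' && *s <= '9') value = value * 10 + (*s++ - '0');
    return value;
}

// Looks up symbolName in the images already loaded into this process.
// Returns NULL if the symbol isn't there.
static void* FindLoadedSymbol(const char* symbolName)
{
    assert(symbolName != NULL);

    void* handle = dlopen(NULL, RTLD_LAZY); // handle for the main program
    if(handle == NULL) return NULL;

    void* symbol = dlsym(handle, symbolName);
    dlclose(handle);

    return symbol;
}

// Prefers the system's implementation if it exists, and otherwise falls
// back to the naive one, just like the objc_read_weak() wrapper does.
static parse_int_type ResolveParseInt(const char* symbolName)
{
    parse_int_type found = (parse_int_type)FindLoadedSymbol(symbolName);
    return found ? found : NaiveParseInt;
}
```

As in the file above, you'd normally cache the resolved pointer in a static variable rather than repeating the lookup on every call. On older Linux systems you may need to link with -ldl.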

Turning GC on/off for a Dual-Mode Application

You normally reserve the GC-supported compiler option only for frameworks and plugins (and perhaps input managers :), where you don't know whether the host application supports GC or not. However, it's possible to build an application as a dual-mode binary that works both under garbage collection and under manually managed memory, and choose whether to use GC at application launch. The secret to this is an OBJC_DISABLE_GC environment variable; if this is set to YES, your application will run without garbage collection even if it's compiled as GC-supported. (I don't know what happens if you try to set this environment variable for a GC-required app; I presume bad things will happen.)

This is a great (and somewhat amazing) feature for those of you who are transitioning an existing application to use GC: you can then directly compare versions of your applications that run side-by-side for metrics such as memory usage and performance. You may also find this useful if you're writing a new app and want to experiment with GC, but can't quite take the leap of faith and hope that everything will be OK. Throw the following code into your main.m file:

						
static inline NSUserDefaults* TheStandardUserDefaults()
{
    return [NSUserDefaults standardUserDefaults];
}

static inline NSGarbageCollector* TheGarbageCollector()
{
    return [NSGarbageCollector defaultCollector];
}

static void RegisterGarbageCollectionUserDefaults()
{
    NSDictionary* garbageCollectionUserDefaults =
        [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithBool:NO], @"EnableGarbageCollection", nil];

    [TheStandardUserDefaults() registerDefaults:garbageCollectionUserDefaults];
}

static void RestartWithCorrectGarbageCollectionSettingIfNecessary(const int argc,
                                                                 const char* argv[])
{
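    // Check this before creating the pool, so that the early return
    // doesn't leave an undrained autorelease pool behind.
    if(getenv("GarbageCollectionVerified")) return;

    NSAutoreleasePool* pool = [NSAutoreleasePool new];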

    RegisterGarbageCollectionUserDefaults();

    BOOL requireRestart = NO;
    if([TheStandardUserDefaults() boolForKey:@"EnableGarbageCollection"] == YES
        && TheGarbageCollector() == nil)
    {
        unsetenv("OBJC_DISABLE_GC");
        requireRestart = YES;
    }
    else if([TheStandardUserDefaults() boolForKey:@"EnableGarbageCollection"] == NO
            && TheGarbageCollector())
    {
        setenv("OBJC_DISABLE_GC", "YES", 1);
        requireRestart = YES;
    }

    if(requireRestart)
    {
        setenv("GarbageCollectionVerified", "YES", 1);

        const int execReturnValue = execvp(argv[0], (char**)argv);
        if(execReturnValue == -1)
        {
            perror("execvp() failed, continuing...");
        }
    }

    [pool drain];
}

// Usage: In your main(), call
// RestartWithCorrectGarbageCollectionSettingIfNecessary(argc, argv);

Then call RestartWithCorrectGarbageCollectionSettingIfNecessary(argc, argv) at the top of your own main() function.
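
The decision at the heart of that function can be pulled out into plain C, which makes it easier to reason about on its own. This is only a sketch: DecideRestart(), PrepareEnvironment() and the RestartAction enum are hypothetical names, and the caller is assumed to supply the user's preference and whether the collector is currently running (in the code above, those come from NSUserDefaults and NSGarbageCollector).

```c
#define _POSIX_C_SOURCE 200809L // for setenv()/unsetenv() in strict modes

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { kNoRestart, kRestartWithGC, kRestartWithoutGC } RestartAction;

// Pure decision: restart only if we haven't already done so and the
// running mode disagrees with the mode the user asked for.
static RestartAction DecideRestart(bool wantGC, bool gcRunning,
                                   bool alreadyVerified)
{
    if(alreadyVerified || wantGC == gcRunning) return kNoRestart;
    return wantGC ? kRestartWithGC : kRestartWithoutGC;
}

// Adjusts the environment for the chosen action. Returns true if the
// caller should now re-exec itself (e.g. with execvp()).
static bool PrepareEnvironment(RestartAction action)
{
    if(action == kNoRestart) return false;

    assert(action == kRestartWithGC || action == kRestartWithoutGC);

    if(action == kRestartWithGC)
        unsetenv("OBJC_DISABLE_GC");
    else
        setenv("OBJC_DISABLE_GC", "YES", 1);

    // Guard variable, so the relaunched process doesn't restart again.
    setenv("GarbageCollectionVerified", "YES", 1);
    return true;
}
```

The GarbageCollectionVerified guard is what stops an endless restart loop if, for whatever reason, the relaunched process still ends up in the "wrong" mode.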

If you're feeling really adventurous, you could probably make this work on 10.4 too by using objc_getClass("NSGarbageCollector") to dynamically see if the garbage collector class exists, rather than trying to access the garbage collector directly. This means that you could theoretically run your application on 10.5 with GC enabled, 10.5 with manual memory management, and 10.4 with manual memory management, just like you can with dual-mode frameworks and plugins. Apple's garbage collection and Objective-C runtime hacking team will probably feel like killing you, though.

Special Guest

Name: Andre Pang
Company : Algorithm
Blog: Andre's Blog
Twitter: @andrepang

 

Show Host

Name: Steve (Scotty) Scott
Company : mamooba
Blog: Scotty's blog
Twitter: @macdevnet

Steve "Scotty" Scott is the founder of The Mac Developer Network, co-organizer of NSConference and developer of TrackTime.

 

Show Statistics & Information

Running Time: 1 hr 16 mins
Download Size: 35MB
Download: Late Night Cocoa Episode 036