CF-263 - 10.2 “Jaguar”#

The following came from the ADC release notes page.

CFBundle additions#

CFBundle contains some additional API for dealing with specific localizations:

CFArrayRef CFBundleCopyLocalizationsForPreferences(CFArrayRef locArray, CFArrayRef prefArray);

Given an array of possible localizations, this function returns the one or more of them that CFBundle would use, without reference to the current application context, if the user’s preferred localizations were given by prefArray.

If prefArray is NULL, the current user’s actual preferred localizations are used. This is not the same as CFBundleCopyPreferredLocalizationsFromArray, because that function takes the current application context into account. CFBundleCopyLocalizationsForPreferences is the call that would be used for determining the localizations that another application would use.

Unbundled Applications#

CFBundle in the past has had some explicit support for unbundled CFM applications, allowing them to specify an info dictionary via a ‘plst’ resource. This has been extended to unbundled Mach-o applications, which can now specify an info dictionary by adding a special section named __info_plist in the __TEXT segment. The contents of this section should be byte-for-byte identical with what otherwise would be the contents of the Info.plist file. Such a section can be added, for example, by specifying “-sectcreate __TEXT __info_plist <path to Info.plist file>” as arguments to the linker.

CFBundle also now has some additional APIs that allow the retrieval of information from unbundled applications:

 CFDictionaryRef CFBundleCopyInfoDictionaryForURL(CFURLRef url);

If the URL points to a directory, this is equivalent to CFBundleCopyInfoDictionaryInDirectory. For a plain file URL representing an unbundled application, CFBundleCopyInfoDictionaryForURL will attempt to read an info dictionary either from the (__TEXT, __info_plist) section (for a Mach-o file) or from a ‘plst’ resource.

 CFArrayRef CFBundleCopyLocalizationsForURL(CFURLRef url);

If the URL points to a directory, this is equivalent to calling CFBundleCopyBundleLocalizations on the corresponding bundle. For a plain file URL representing an unbundled application, CFBundleCopyLocalizationsForURL will attempt to determine localizations using the CFBundleLocalizations and CFBundleDevelopmentRegion keys in the dictionary returned by CFBundleCopyInfoDictionaryForURL, or a ‘vers’ resource if those are not present.

We note again that if both a CFBundleDevelopmentRegion key and a ‘vers’ resource are present, they should specify the same localization; if they do not, the behavior of the application may be undefined or inconsistent in certain situations. We note also that the use of the CFBundleLocalizations key to specify the localizations provided by an unbundled application is not recommended in general; it should be reserved for special cases in which an application performs its own localization and for some reason cannot make use of the normal CFBundle localization mechanisms, namely bundling the application and supplying .lproj folders.

For the sake of completeness, here is a way to determine what localizations CFBundle would use for an arbitrary bundle someBundle within the context of an arbitrary application (bundled or unbundled) at the url appURL, if the user’s localization preferences were given by prefArray:

    CFArrayRef preferredLocalizations = NULL;
    CFArrayRef bundleLocalizations = CFBundleCopyBundleLocalizations(someBundle);
    if (NULL != bundleLocalizations) {
        CFArrayRef appLocalizations = CFBundleCopyLocalizationsForURL(appURL);
        CFMutableArrayRef tempArray = CFArrayCreateMutableCopy(NULL, 0, prefArray);
        if (NULL != appLocalizations) {
            CFArrayRef preferredAppLocalizations = CFBundleCopyLocalizationsForPreferences(appLocalizations, prefArray);
            if (NULL != preferredAppLocalizations) {
                CFIndex count = CFArrayGetCount(preferredAppLocalizations);
                while (count-- > 0) {
                    CFArrayInsertValueAtIndex(tempArray, 0, CFArrayGetValueAtIndex(preferredAppLocalizations, count));
                }
                CFRelease(preferredAppLocalizations);
            }
            CFRelease(appLocalizations);
        }
        preferredLocalizations = CFBundleCopyLocalizationsForPreferences(appLocalizations, tempArray);
        CFRelease(bundleLocalizations);
    }
    return preferredLocalizations;

Loading CFM bundles#

In Mac OS X 10.1.x, when a bundle containing a CFM executable was loaded by CFBundle or CFPlugIn, that executable would be prepared as a data-fork executable and its initialization routine would be passed a kDataForkCFragLocator. It is now possible to have the load process in this case prepare the executable as a bundle and pass the initialization routine a kCFBundleCFragLocator. However, for compatibility reasons, that will occur only if the bundle’s info dictionary contains the key CFBundleCFMLoadAsBundle with a value resolving to true (binary true, or a non-zero number, or the string “YES”).

Property Lists#

CFPropertyLists can now be saved in a binary format which is more compact and faster to load, especially for larger property lists. However, the binary format cannot be loaded on 10.1.x and earlier systems.

CFPropertyList.h contains new APIs to validate and deal with property list formats in a more general fashion:

Boolean CFPropertyListIsValid(
            CFPropertyListRef plist,
            CFPropertyListFormat format);

CFIndex CFPropertyListWriteToStream(
            CFPropertyListRef propertyList,
            CFWriteStreamRef stream,
            CFPropertyListFormat format,
            CFStringRef *errorString);

CFPropertyListRef CFPropertyListCreateFromStream(
            CFAllocatorRef allocator,
            CFReadStreamRef stream,
            CFIndex streamLength,
            CFOptionFlags mutabilityOption,
            CFPropertyListFormat *format,
            CFStringRef *errorString);

CFString#

When provided with the kCFCompareBackwards flag, CFStringCreateArrayWithFindResults() will now do the search backwards in the string and report the ranges starting from the last one. Note that the resulting ranges might be different than one generated with a forward search, as this function does not find overlapping instances of the target string.

CFString now has the following function to replace multiple occurrences of a string in one call:

CFIndex CFStringFindAndReplace(
                    CFMutableStringRef theString,
                    CFStringRef target,
                    CFStringRef replacement,
                    CFRange rangeToSearch,
                    CFOptionFlags compareOptions);

This call pays attention to kCFCompareCaseInsensitive, kCFCompareBackwards, kCFCompareNonliteral, and kCFCompareAnchored. kCFCompareBackwards can be used to do the replacement starting from the end, which could give a different result. kCFCompareAnchored assures only anchored but multiple instances are found (the instances must be consecutive at start or end). Returns number of replacements performed.

kCFCompareNumerically is now implemented when doing comparisons without the kCFCompareLocalized flag.

A bug was fixed which would cause CFStringFindWithOptions() with the kCFCompareNonliteral flag to sometimes return success on one-character long search strings that did not actually occur in the string. The returned range would be a zero-length range at the end of the string. NSString’s rangeOfString: methods use this flag by default; they would end up returning this range as well. (Clients which compared length to 0 would succeed, but those who compared location to NSNotFound would fail.)

CFString now looks for the Unicode BOM character at the head of UTF8 byte streams and removes it when converting to internal format. The UTF8 representation of the BOM character is hex EF BB BF.

A bug was fixed where CFStringGetBytes would fail to pay attention to the supplied range.location if the requested encoding was kCFStringEncodingUnicode. (Instead it would always return characters starting from 0).

The character database used by CoreFoundation String Service is updated to the Unicode version 3.2 specification. Both CFString and CFCharacterSet now fully support surrogate characters as defined in the specification.

CFStringFindWithOptions, CFStringCompareWithOptions, CFStringLowercase, CFStringUppercase, and CFStringCapitalize functions now uses the new updated character database based on the Unicode version 3.2 specification.

There are two new functions in CFString: CFStringGetRangeOfComposedCharactersAtIndex and CFStringFindCharacterFromSet.

CFString now provides CFStringNormalize function that lets applications normalize mutable strings as described in the Unicode version 3.2 specification standard annex #15 “Unicode Normalization Forms.”

Warning on CFString NoCopy Creation#

The immutable string creation functions functions CFStringCreateWithPascalStringNoCopy(), CFStringCreateWithCStringNoCopy(), and CFStringCreateWithCharactersNoCopy() try to use the provided buffer as the backing store. When calling these functions, you supply a second allocator (the “contentsDeallocator” argument). If you provide a deallocation function, then the assumption is that the CFString owns the buffer, and will call your deallocation function when it’s no longer needed (which could be at the time the NoCopy() is called, or much much later).

If you do not provide a deallocation function, you have not transferred ownership to the CFString, but the assumption is that the buffer is around as long as that CFString is around. This is a dangerous situation: If that CFString is retained or copied, its lifetime might be extended in ways you do not expect; this would create additional dependency on the lifetime of the supplied buffer. For some buffers (like a constant C-string) this might be OK; for others (like a stack buffer, or a mapped in file), it’s not.

So, be careful with your use of the NoCopy() functions where the supplied buffer is not owned by the CFString. Use such CFStrings temporarily, and don’t hand them out to APIs which might copy or retain them.

CFSTR()#

CFSTR() macro can now generate truly constant strings — that is, strings laid out at compile time rather than created on the fly. Strings generated this way work only in 10.2 and as far back as 10.1.2, so any binary which needs to run on earlier systems should not use this mechanism. By default, modules compiled with 10.2 deployment target (setenv MACOSX_DEPLOYMENT_TARGET 10.2) use the new constant string feature, while other modules use the previous mechanism, which of course is still active and available. Many system frameworks and applications are compiled for 10.2 deployment, while many third party applications are likely not to be.

Both kinds of constant CFSTRs are valid CFStrings and can be used with any APIs that work with CFStrings (and of course NSStrings). However, there are some differences due to implementation details between both kinds of strings — developers should not make any assumptions that might break when switching from one kind of constant string to the other. Three examples of behaviors not to depend on are the uniquing behavior (that is, assuming CFSTR(“Hello”) == CFSTR(“Hello”)), reference counting (that is, assuming CFGetRetainCount(CFSTR(“Hello”)) is a specific value), and the actual memory layout of the string (it’s possible that in the future the actual layout might be tweaked at runtime). Given that applications are likely to see both kinds of strings at runtime, it’s best to just treat CFSTRs just like any other CFStrings, and not make any assumptions with regards to such and other special behaviors.

High-bit characters (greater than 127, either specified with \ooo format, or explicitly as high bit characters in source code) should continue to be avoided in CFSTR (and 8-bit constant strings) as their encoding is ambiguous. Even if they seem to work in your testing, there are no guarantees they will continue to work properly in future updates, or when running under different languages.

Word of caution on property lists#

XML property lists very commonly use UTF-8 encoding (and this is indicated in the XML header). A common error is to save XML plist files using the wrong encoding, such as MacRoman. This usually is harmless if the file contains no high-bit characters, but high-bit characters such as copyright, trademark, registered trademark, long-dash, etc do cause the property list file to be rendered invalid. Due to more strict, Unicode-conformant UTF-8 parsing, such invalid files would normally not load under 10.2, and in general invalid UTF-8 byte streams can not be loaded.

In order to provide compatibility, the UTF-8 parser will make a special consideration for the MacRoman encoded copyright character (0xA9), replacing it with the Unicode replacement character. In addition, the XML property list parser, on encountering an invalid property list file, will issue a log and attempt to load it using MacRoman encoding. Note that this might not produce the correct results either, as the original file encoding might have been something else altogether.

CFNumber#

CFNumber hashing algorithm has been improved so that for floating point numbers it takes the whole number (both integral and floating point parts) into account, and the invariant (num1 == num2) does truly imply (hash(num1) == hash(num2)). This latter was not the case in certain corner cases, especially when comparing numbers of different types.

Due to a bug, CFNumberGetValue() get sometimes return true inappropriately in some rare cases such as:

    SInt32 originalValue = 128;
    CFNumberRef n = CFNumberCreate(NULL, kCFNumberSInt32Type, &originalValue);
    SInt8 resultingValue;
    Boolean success = CFNumberGetValue(n, kCFNumberSInt8Type, &resultingValue);

The value stored in the CFNumber is 128, which cannot be represented as an SInt8. However, this function does return true due to the way the value is stored in the CFNumber, and the return value looks like -128. We intend to fix this in the next release, for new applications (in order to maintain binary compatibility). Note that if the original call had provided the value 128 in a kCFNumberSInt64Type, the return would have been false, which is correct.

A clarification on CFNumberGetValue(): A false return from this function indicates that the stored value could not be returned exactly using the type that was requested. In many cases this is not a fatal error; for instance, with floating point values, if the stored value is a double (Float64) and the asked-for value is a float (Float32), false return might just indicate a little loss of precision. This kind of error can for instance happen if the CFNumber was saved and read back in as text, causing a little floating point error to creep in.

CFCharacterSet#

The predefined character sets instantiated via CFCharacterSetGetPredefined function are now based on the Unicode version 3.2 specification. The sets covers the entire character range from U0000 to U10FFFF defined by the specification. The functions that take CFRange (i.e. CFCharacterSetCreateWithCharactersInRange) now interpret it as a UTF-32 character range.

There are new API functions added to CFCharacterSet in order to support the extended character range. CFCharacterSetIsLongCharacterMember function takes a UTF-32 character with which the membership is queried. You can use UCIsSurrogateHighCharacter, UCIsSurrogateLowCharacter, and UCGetUnicodeScalarValueForSurrogatePair inline functions from Carbon Unicode Utilities to map a pair of UTF16 characters to a UTF-32 character.

The new API function, CFCharacterSetCreateInvertedSet, is added for efficient inverted set creation.

A new predefined character set type, kCFCharacterSetCapitalizedLetter, is added. That character set corresponds to the Unicode character category ‘Lt’ as defined in the specification. Now, the predefined character set, kCFCharacterSetUppercaseLetter, consists of both the uppercase (the Unicode character category ‘Lu’) and the titlecase (the Unicode character category ‘Lt’) letters.

Two new functions are added to help applications to access CFCharacterSet more efficiently.
CFCharacterSetIsSupersetOfSet examines if a character set is superset of another. You could easily achieve the same result with the following code, but this function tries not to allocate memory.

CFCharacterSetRef setA;
CFCharacterSetRef setB;
CFMutableCharacterSetRef copy;
Boolean result;

copy = CFCharacterSetCreateMutableCopy(NULL, setA);
CFCharacterSetIntersect(copy, setB);
result = CFEqual(copy, setB);
CFRelease(copy);

CFCharacterSetHasMemberInPlane helps applications that want to scan the character set. The previous version of CFCharacterSet only covered the range from U0000 to UFFFF so you only had to check 65536 times. With the new implementation, you need to check 1114112 (= 65536 * 17) times. Since most character set are expected to be scarce outside of the Basic Multilingual Plane (Plane 0), it is inefficient to query that many times. This function returns true if the character set has at least one member in a given plane.

CFSocket Additions#

CFSocket now has an additional callback type, kCFSocketWriteCallBack, that may be used in addition to any of the existing callback types. This provides a notification when the socket is writable. Note that a typical socket becomes writable very quickly, as soon as data has been sent out and there is space available in the kernel buffers, so write callbacks become significant primarily when large amounts of data are to be sent.

There are now CFSocket flags that may be set to control whether callbacks of a given type are automatically reenabled after they are triggered, and whether the underlying native socket will be closed when the CFSocket is invalidated. These flags may be set using CFSocketSetSocketFlags() and read using CFSocketGetSocketFlags(). By default read, accept, and data callbacks are automatically reenabled; write callbacks are not, and connect callbacks cannot be, since they are only sent once. Be careful about automatically reenabling read and write callbacks, since this implies that the callbacks will be sent repeatedly if the socket remains readable or writable respectively. Be sure to set these flags only for callbacks that your CFSocket actually possesses; the result of setting them for other callback types is undefined.

By default the underlying native socket will be closed when the CFSocket is invalidated, but it will not be if the flag kCFSocketCloseOnInvalidate is turned off. This can be useful in order to destroy a CFSocket but continue to use the underlying native socket. The CFSocket must still be invalidated when it will no longer be used. Do not in either case close the underlying native socket without invalidating the CFSocket.

Individual callbacks may now also be enabled and disabled manually, using CFSocketEnableCallBacks() and CFSocketDisableCallBacks(), whether they are automatically reenabled or not. If they are not automatically reenabled, then they will need to be manually reenabled in order for the callback to be received again. Even if they are automatically reenabled, there may be times when it will be useful to be able to manually disable them temporarily and then reenable them. Be sure to enable and disable only callbacks that your CFSocket actually possesses; the result of enabling and disabling other callback types is undefined.

CFURL#

CFURL has added support for literal IPv6 addresses, per RFC 2732 (http://www.ietf.org/rfc/rfc2732.txt). Essentially, the IPv6 address is placed between square brackets in place of the host. CFURLCopyHostName() will return the IPv6 address (without the square brackets) for such an URL.

CFStream#

CFStream has added the functions CFReadStreamSetProperty() and CFWriteStreamSetProperty() to provide a single API for customizing various kinds of streams. Unless documented otherwise, you must call SetProperty() on a stream before opening if for the customization to take affect. Properties are named CFStrings; their value are CFTypes. See the header file appropriate for the type of stream you are using to find the list of supported properties and their expected values.

File write streams have added the property kCFStreamPropertyAppendToFile; if set to kCFBooleanTrue, the write stream will append bytes written to it to the file opened, rather than simply replacing the file’s contents entirely.

Finally, CFStreamCreatePairWithPeerSocketSignature was added in order to allow the creation of read stream and write stream pair using a socket signature structure defined in CFSocket.h.