How to really slow down a Core Data fetch

Core Data is a bit of a mysterious thing. Sometimes, patterns that seem helpful in theory can be disastrous in practice. Consider, for example, the instinct not to save changes to disk, or to do so very infrequently, for fear of slowing down processing or blocking the main thread. This is something I’ve done, and seen others do, in projects using Core Data.

What’s easy to overlook is that unsaved changes in Core Data can make fetch requests slower, sometimes by orders of magnitude, which could undo all the benefits of deferred saves. Here’s a real-life example with some numbers.

The fetch in question retrieved all the Word entities in an object graph whose string attribute was equal to one of the strings in an array, stringsToFetch. There were 28,720 words in the graph, and the fetch matched 2,044 of them.
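For illustration, a fetch like this might look something like the sketch below. The original listing isn't available here, so the entity name Word, an attribute named string, the IN predicate, and the helper name FetchWordsMatchingStrings are all assumptions rather than the code that was actually measured.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not the original code): fetch every Word whose
// `string` attribute equals one of the values in stringsToFetch.
// The entity and attribute names are assumed from the description above.
static NSArray *FetchWordsMatchingStrings(NSManagedObjectContext *context,
                                          NSArray *stringsToFetch,
                                          NSError **error)
{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Word"];

    // %K substitutes a key path, %@ substitutes the array of candidate strings.
    request.predicate = [NSPredicate predicateWithFormat:@"%K IN %@",
                                                         @"string", stringsToFetch];

    // By default the results also reflect unsaved changes in the context.
    return [context executeFetchRequest:request error:error];
}
```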

In one test, the objects had been inserted in the context but save: had not yet been called. In the other, the inserted objects had been saved to disk. (The code was run on an iPad Air.)

Saved objects    Unsaved objects    Fetch time
0                28,720             4.55 secs
28,720           0                  0.07 secs

The performance of the fetch with unsaved changes was, in a word, hideous. This is obviously an extreme case (nearly 30,000 unsaved insertions), but the fetch time scaled roughly linearly with the number of unsaved objects. Even 3,000 or so unsaved objects slowed the fetch down to a still-needlessly-long 0.5 seconds.

The reason for the slowdown is clear if you run the same code using the Time Profiler. When the unsaved objects are present, about 70% of the processor time is spent on string comparisons that descend from the call to executeFetchRequest:, in which our predicate is being evaluated. So in essence there are two fetches: One is a super-fast SQL query, and the other is a ponderously slow series of in-memory string comparisons.

Keep in mind: You won’t uncover this problem by using the “Fetch Duration” data from the Core Data Fetches tool in Instruments. That’s because this tool seems to return the duration of the SQL query only: It doesn’t account for the time that was spent evaluating in-memory objects as well. You need to put a timer around the actual call to executeFetchRequest: to see the true processing time.
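A timer of that kind could be as simple as the following sketch. TimedFetch is a hypothetical helper, not anything provided by Core Data; it wraps the whole call and also logs how many unsaved changes were sitting in the context at the time, since that is the number that matters here.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: runs the request and logs the wall-clock time of the
// entire executeFetchRequest:error: call, so any time spent evaluating
// unsaved in-memory objects is included in the measurement.
static NSArray *TimedFetch(NSManagedObjectContext *context,
                           NSFetchRequest *request,
                           NSError **error)
{
    CFAbsoluteTime start = CFAbsoluteTimeGetCurrent();
    NSArray *results = [context executeFetchRequest:request error:error];
    CFAbsoluteTime elapsed = CFAbsoluteTimeGetCurrent() - start;

    // Log the pending change counts next to the timing for comparison.
    NSLog(@"Fetched %lu objects in %.3f secs (pending: %lu inserted, %lu updated, %lu deleted)",
          (unsigned long)results.count, elapsed,
          (unsigned long)context.insertedObjects.count,
          (unsigned long)context.updatedObjects.count,
          (unsigned long)context.deletedObjects.count);
    return results;
}
```

Running the same request once before and once after a save should make any difference show up in the log.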

Every scenario is different, but I strongly suspect that for some people who complain that “Core Data fetching is slow”, the problem isn’t a trip to disk, but the opposite: too many inserted, updated and/or deleted objects in memory.
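If that's the situation you're in, one possible remedy is to save before fetching whenever the pile of pending changes gets large. Here's a minimal sketch, assuming a hypothetical helper named SaveIfPendingChangesExceed and a threshold you'd have to choose by measuring your own app:

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: saves the context only when the number of unsaved
// inserted, updated and deleted objects is greater than `threshold`.
// Returns NO only if a save was attempted and failed.
static BOOL SaveIfPendingChangesExceed(NSManagedObjectContext *context,
                                       NSUInteger threshold,
                                       NSError **error)
{
    NSUInteger pending = context.insertedObjects.count
                       + context.updatedObjects.count
                       + context.deletedObjects.count;
    if (pending <= threshold) {
        return YES; // few enough unsaved objects; skip the save
    }
    return [context save:error];
}
```

Whether the cost of the extra saves is worth the faster fetches depends on your data, so it's worth timing both ways before settling on a threshold.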