:select and :include in ActiveRecord queries

Along with Bill Eisenhauer, I’ve been digging into what it would take to fix the problem I found with GeoKit, that it wouldn’t support :include queries properly. The explanation has seen us going deep into the internals of ActiveRecord to discover, as others have before, that the :select parameter (which GeoKit uses to build its distance column in the query) and the :include parameter don’t play nicely together.

The reason for that is that Rails employs special methods to build the select parameters when there’s a :include in the query (ie. when you’re joining with other tables and eager-loading them) and currently ignores :select along the way. GeoKit would ordinarily produce SQL along the lines of:

SELECT *,
	SQRT(
		POW(69.1*(42.962927-latitude), 2) +
		POW(42.0590342306803*(-85.637179-longitude), 2)
	) AS distance
FROM bus_stops
ORDER BY distance ASC LIMIT 1

but with a :include on a has_many relationship, we don’t get that distance column and so most of the queries will fail.

There are a couple of patches on the rails trac that seek to clean that up, but so far neither has been adopted. I’m sure the GeoKit developers would appreciate any input on workarounds.

Tags: , ,

6 comments

  1. One suggestion could be that you apply the patch yourself.

    You may have a visceral reaction to this, but luckily we are using ruby. You can create a new module. In this module, create modules for each class that was affected by this patch, like JoinBase and Associations. Once you have done that, you can monkey patch your way to victory, at least that’s the theory.

    http://pastie.caboo.se/48188 is a skeletonized example of what I’m thinking.

    The bad: You have to know exactly where to plug those pieces into. Notice how I didn’t take on the patch to ActiveRecord::Associations::ClassMethods, because at a quick glance, I’m sure that’s just patched up to ActiveRecord::Base.

    The good: You don’t harm rails. If the rails team does add the patch, you can remove the monkey patch and it’s life-as-always. Plus the patch came with a test, so you can always use that to make sure that it worked.

  2. Thanks. I’d wondered about doing that, but held off because I’d like to know why the core team aren’t applying the patch. If they have a better solution in mind, I’d rather not have a patch in place that interferes. But that wouldn’t be the end of the world and I could always just remove that module when it’s no longer needed….

  3. Looks like those Rails Trac bugs were closed. I would not hold my breath for any :include incompatibilities to be fixed any time soon. I’ve run into similar problems with other functionality beyond GeoKit.

    One of the Rails bugs did recommend this as a solution instead:

    http://activereload.net/2007/5/23/spend-less-time-in-the-database-and-more-time-outdoors

    It caches objects from the database, so I guess going from 1 db query to 100 due to lack of :include becomes less of a problem…

  4. I have recently made some improvements to GeoKit to enable sorting by distance together with an :include. So, for example, the following works:

    shops=Shop.find :all, :origin=>’94117′, :include=>:reviews, :conditions=>[‘reviews.user_id = ?’, user_id], :order=>’distance asc’

    How does it work, given the AR limitations noted above? When you use an :include, GeoKit will replace the “distance” column in the :order clause with the actual distance SQL.

    One caveat: while the result set will be returned properly sorted by distance, you can’t *display* the distances, because AR drops the calculated “distance” column from the results. You can add it back in post-query with the following:

    shops.sort_by_distance_from(origin) #origin is a GeoKit::LatLng or GeoKit::GeoLoc

    See the GeoKit README for more info.

    Cheers,

    Andre

  5. Thanks for this change! I wound up working around it by using bounds, but this is a bit more straightforward.