Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #9560 (reopened defect)

Opened 2 years ago

Last modified 2 years ago

[PATCH] Only Join Required Tables in Limited Eager Load Pre-Query

Reported by: dasil003 Assigned to: nzkoz
Priority: normal Milestone: 2.x
Component: ActiveRecord Version: edge
Severity: normal Keywords: patch limited eager loading
Cc: nzkoz, fcheung

Description

Per Pratik's suggestion, I'm creating a new ticket for the second patch from http://dev.rubyonrails.org/ticket/9497 in order to separate discussion about the two possible solutions.

This is a non-destructive patch to solve a potentially major performance issue as described in the original ticket. My preference is for the original patch to be applied, however if that is rejected by the community than this ticket should be applied.

Attachments

only_include_referenced_tables.diff (3.3 kB) - added by dasil003 on 03/02/08 22:47:58.
New version of the patch tested against edge rails r8971
0001-Added-logic-to-associations.rb-to-make-sure-select_f.patch (4.0 kB) - added by jdevine on 05/04/08 03:56:20.

Change History

09/14/07 23:34:02 changed by dasil003

  • keywords changed from limited eager loading to patch limited eager loading.
  • summary changed from Only Join Required Tables in Limited Eager Load Pre-Query to [PATCH] Only Join Required Tables in Limited Eager Load Pre-Query.

09/26/07 22:58:11 changed by nzkoz

  • cc set to nzkoz.

03/02/08 02:20:04 changed by nzkoz

  • owner changed from core to nzkoz.
  • status changed from new to assigned.

Looks good to me, however to nitpick a little:

The variable naming is a little off compared to what we normally do? full names rather than o_tables etc. and perhaps some comments outlining what's going on and why as it seems a little tangential.

03/02/08 22:47:58 changed by dasil003

  • attachment only_include_referenced_tables.diff added.

New version of the patch tested against edge rails r8971

03/02/08 22:51:18 changed by dasil003

Thanks for the attention Koz, you always seem to pay attention when no one else does. I've refactored the patch with clear names and a comment. Additionally there was a new dependency on the include_eager_order? and include_eager_conditions? method signature that was introduced since I first created the patch. I've fixed that as well.

mysql test suite is passing for me. my application tests pass as well.

My new patch is against trunk. I haven't bothered applying to stable since performance enhancements don't go in stable, right?

03/03/08 06:23:42 changed by nzkoz

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [8977]) Improve performance on :include/:conditions/:limit queries by selectively joining in the pre-query. Closes #9560 [dasil003]

03/13/08 23:18:04 changed by jdevine

  • status changed from closed to reopened.
  • resolution deleted.

First-time commenter here, so I'm sorry if this is the wrong place to post...

I'm finding that this patch is breaking my queries in the following situation:

I have a model Play
 has_many :contributions
model Contribution
 belongs_to :play
 belongs_to :artist

If I now select

Play.find(:all, 
          {:include=>{:contributions=>:artist},
           :order=>" artists.last_name asc, artists.first_name asc "})

the query will error out because contributions is not included in the join. The new code filters out all joins listed directly in the "order by" clause, but not the intermediary joins needed to access those tables. The "artists" join is included, but the "contributions" join is not.

In other words I get: missing FROM-clause entry in subquery for table "contributions" P263 Fparse_relation.c L2008 RwarnAutoRange: SELECT * FROM (SELECT DISTINCT ON ("plays".id) "plays".id, artists.last_name AS alias_0, artists.first_name AS alias_1, plays.written_date AS alias_2, plays.written_mask AS alias_3, plays.name AS alias_4 FROM "plays" LEFT OUTER JOIN "artists" ON "artists".id = "contributions".artist_id ) AS id_list ORDER BY id_list.alias_0 , id_list.alias_1 , id_list.alias_2 , id_list.alias_3 , id_list.alias_4 LIMIT 20 OFFSET 0

I know I should include a patch, but I haven't quite parsed all the code yet. If I come up with one I will.

03/14/08 01:23:39 changed by jdevine

The following changes should fix the problem outlined above:

          def join_for_table_name(table_name)
            @joins.select{|j|j.table_name == table_name}.first rescue nil
          end
          
          def joins_for_table_name(table_name)
            join = join_for_table_name(table_name)            
            result = nil            
            if join && join.is_a?(JoinAssociation)
              result = [join]
              if join.parent && join.parent.is_a?(JoinAssociation)
                result = joins_for_table_name(join.parent.aliased_table_name) + result
              end
            end
            result
          end

03/14/08 01:31:43 changed by jdevine

Sorry, hit submit when I meant to hit preview.

        def construct_finder_sql_for_association_limiting(options, join_dependency)

          ...

          all_tables             = tables_from_conditions + tables_from_order
+          distinct_join_associations = all_tables.uniq.map{|table| 
+            join_dependency.joins_for_table_name(table)
+          }.flatten.compact.uniq

          ...

          if is_distinct
+            sql << distinct_join_associations.collect(&:association_join).join
-            sql << join_dependency.join_associations.reject{ |ja| !all_tables.include?(ja.table_name) }.collect(&:association_join).join
            add_joins!(sql, options, scope)
          end

        ...

        class JoinDependency
          ....

+          def join_for_table_name(table_name)
+            @joins.select{|j|j.table_name == table_name}.first rescue nil
+          end
          
+          def joins_for_table_name(table_name)
+            join = join_for_table_name(table_name)            
+            result = nil            
+            if join && join.is_a?(JoinAssociation)
+              result = [join]
+              if join.parent && join.parent.is_a?(JoinAssociation)
+                result = joins_for_table_name(join.parent.aliased_table_name) +             
+                         result
+              end
+            end
+            result
+          end

04/26/08 09:42:04 changed by rubyruy

+1 for jdevine's patch.

Really important actually since this causes regressions for order + limit queries.

04/27/08 08:04:35 changed by nzkoz

Could you provide this patch as a git patch. There are some instructions here:

http://www.tpope.net/rails-git-best-practices

04/27/08 16:06:49 changed by jdevine

I held off on submitting as a patch because I'm not sure the code I posted addresses table aliasing correctly... I've been using it with no problem in development so far, but I haven't taken the time to think it through or test it with multiple associations referencing the same table, etc.

If there's someone very familiar with the table-aliasing code who could give it a glance, that'd be great.

04/28/08 23:42:19 changed by fcheung

  • cc changed from nzkoz to nzkoz, fcheung.

This certainly looks plausible and a move in the right direction (and you have my sympathy for having dived into JoinDependency & friends :-) ). I do share your concern about the murkiness of some of the table aliasing stuff, so it would be great to have some tests to go with this. If you really want to make sure the table aliasing stuff is doing the right thing, then perhaps some tests where the same table is joined more than once.

I wonder if perhaps joins_for_table_name(join.parent.aliased_table_name) should be joins_for_table_name(join.parent.table_name) ? joins_for_table_name iterates over @joins looking for one whose table_name matches the argument supplied , so if you give it aliased_table_name and aliased_table_name differs from table_name (ie the non trivial case) then I would expect joins_for_table_name to return nothing.

05/04/08 03:56:20 changed by jdevine

  • attachment 0001-Added-logic-to-associations.rb-to-make-sure-select_f.patch added.

05/04/08 04:00:35 changed by jdevine

  • type changed from enhancement to defect.

Added patch as suggested. Changed pasted code to handle table aliasing correctly & handled cases where the table name is quoted in the order or conditions clause.

I also came across another similar situation today when using includes with a select statement, but I'll open another ticket for that one. Patch is simple.

05/04/08 04:24:10 changed by nzkoz

Sorry to be a pain but could you make a lighthouse ticket for this? That'll let us mark it as a 2.1 blocker

05/04/08 18:39:23 changed by jdevine

Submitted as ticket #109 on lighthouse.