Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #5849 (new defect)

Opened 2 years ago

Last modified 1 year ago

[PATCH] find(:first) forgets the 'LIMIT 1' when eager loading is used, severely hurting MySQL performance in some cases

Reported by: matt@drownedinsound.com Assigned to: bitsweat
Priority: normal Milestone: 2.x
Component: ActiveRecord Version: edge
Severity: major Keywords: find first limit eager loading associations
Cc: bitsweat, jarkko, tomafro, carlivar

Description

Artist.find(:first)

results in this SQL:

SELECT * FROM artists LIMIT 1

whereas

Artist.find(:first, :include => :label)

results in

SELECT artists.`id` AS t0_r0, artists.`name` AS t0_r1, artists.`location_id` AS t0_r2, artists.`type` AS t0_r3, artists.`label_id` AS t0_r4, artists.`image_id` AS t0_r5, artists.`info` AS t0_r6, artists.`summary` AS t0_r7, artists.`ordername` AS t0_r8, labels.`id` AS t1_r0, labels.`name` AS t1_r1, labels.`type` AS t1_r2, labels.`link_id` AS t1_r3, labels.`image_id` AS t1_r4, labels.`location_id` AS t1_r5, labels.`info` AS t1_r6 FROM artists LEFT OUTER JOIN labels ON labels.id = artists.label_id

Note the lack of a LIMIT clause. This means that MySQL performs the join on the entire artists table, and returns every row of the table, taking quite some time for what should be a really quick query. With tables holding millions of rows this is quite a performance problem.

I'm guessing this should be an easy fix, as explicity adding in a :limit => 1 parameter to the find seems to do the job.

Attachments

patch-1-for-5849.diff (3.3 kB) - added by jarkko on 01/15/07 16:22:04.
Updated patch to test both belongs_to and has_one cases.

Change History

(follow-up: ↓ 2 ) 08/22/06 02:36:00 changed by bitsweat

  • cc set to bitsweat.
  • version set to 1.1.1.
  • milestone set to 1.2.

(in reply to: ↑ 1 ; follow-up: ↓ 3 ) 01/04/07 17:31:23 changed by evan

Doesn't the join require no limit, otherwise if you have multiple dependent records you will only get one of them?

(in reply to: ↑ 2 ) 01/04/07 17:43:02 changed by mwillson

Replying to evan:

Doesn't the join require no limit, otherwise if you have multiple dependent records you will only get one of them?

In the case of a belongs_to or a has_one, there'll be at most one associated row, so it should include the LIMIT 1 when it's find(:first). This was the case I described, although perhaps I didn't make it clear enough.

That said, in the case of eager loading of a has_many or has_and_belongs_to_many, I suggest it should ignore the eager loading when this is done with a find(:first), as the performance penalty of missing out the LIMIT and hence having to join the entire table instead of just the single row, would more than negate the small benefit from avoiding one extra query to find the associated objects of that one first row.

01/15/07 14:10:30 changed by jarkko

  • version changed from 1.1.1 to edge.
  • milestone changed from 1.2 to 1.x.

Just a heads-up: the bug still exists in edge and I'm working on it.

01/15/07 16:12:05 changed by jarkko

  • summary changed from find(:first) forgets the 'LIMIT 1' when eager loading is used, severely hurting MySQL performance in some cases to [PATCH] find(:first) forgets the 'LIMIT 1' when eager loading is used, severely hurting MySQL performance in some cases.

I attached a patch with tests that fixes this issue with :belongs_to and :has_one associations.

I agree that the whole :include option is highly questionable when using find(:first). Fixing it would be relatively easy, just changing

          options.update(:limit => 1) unless options[:include]

to

          options.update(:limit => 1)
          options.delete(:include)

in find_initial (I think). However, making that change breaks a lot of existing tests (read assumptions) so I would like to hear the core's opinion on that.

But like said, the current fix doesn't break anything and fixes the issue for the most obvious cases.

01/15/07 16:12:17 changed by jarkko

  • cc changed from bitsweat to bitsweat, jarkko.

01/15/07 16:14:15 changed by jarkko

Oh, forgot to say, I tested the current patch with MySQL, PostgreSQL and SQLite, but it should work also for Oracle and DB2 (testing against /LIMIT 1|rownum <= 1/). Haven't a chance to test it against them, though.

01/15/07 16:22:04 changed by jarkko

  • attachment patch-1-for-5849.diff added.

Updated patch to test both belongs_to and has_one cases.

02/09/07 15:11:48 changed by tomafro

  • cc changed from bitsweat, jarkko to bitsweat, jarkko, tomafro.

I think find_every is now clever enough to limit returned results no matter what associations are included. So find_initial could be reduced to:

def find_initial(options)
  options.update(:limit => 1)
  find_every(options).first
end

I've discussed this with jarkko and will investigate further.

06/13/07 21:49:56 changed by carlivar

  • cc changed from bitsweat, jarkko, tomafro to bitsweat, jarkko, tomafro, carlivar.

Note that a workaround in the meantime is to use:

find(:all, :limit => 1).first

06/13/07 23:21:58 changed by bitsweat

  • owner changed from David to bitsweat.