Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #8773 (new defect)

Opened 2 years ago

Last modified 2 years ago

[PATCH] incorrect records return on find with has_many association, :group and :limit on mysql 5.0.37

Reported by: Smilinguy Assigned to: core
Priority: normal Milestone: 2.x
Component: ActiveRecord Version: 1.2.3
Severity: normal Keywords: has_many group limit
Cc: bitsweat

Description

I'm trying to query for events with a unique title and reviews with ratings higher than 1. I would like to return the earliest event with each unique title and I only would like to return a limited number of results. I constructed the query below, but it appears to yield erroneous results and I think it may also be inefficient. This happens when the :include refers to a has_many relationship. It doesn't happen on has_one relationships.

I'm running Rails 1.2.3 with MySQL 5.0.37 and Ruby 1.8.6 on OS X 10.4.9

When I run this query I get the expected results

Event.find(:all, :include=>:reviews, :conditions=>['reviews.rating > ?', 1], :group=>'events.title', :order=>'events.created_at ASC')

It generates the expected SQL:

SELECT events.`id` AS t0_r0, events.`title` AS t0_r1, ...
FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id WHERE (reviews.rating > 1) GROUP BY events.title ORDER BY events.created_at ASC

However, when I add the :limit parameter, another query is run first that counts the number of records. This counting query ignores the :group clause and returns the wrong event ids (ie will give me duplicate titles)

Event.find(:all, :include=>:reviews, :group=>'events.title', :order=>'events.created_at ASC', :conditions=>[reviews.rating > ?', 3], :limit=>4)
SELECT DISTINCT events.id FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id 
WHERE (reviews.rating > 1) ORDER BY events.created_at ASC LIMIT 4

SELECT events.`id` AS t0_r0, events.`title` AS t0_r1, ...
FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id 
WHERE (reviews.rating > 1) AND events.id IN ('197', '564', '592', '617') 
GROUP BY events.title 
ORDER BY events.created_at ASC

I would expect it to generate only one query like

SELECT events.`id` AS t0_r0, events.`title` AS t0_r1, ...
FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id 
WHERE (reviews.rating > 1) AND events.id IN ('197', '564', '592', '617') 
GROUP BY events.title 
ORDER BY events.created_at ASC
LIMIT 4

Or, if two queries are required for some reason, I would expect the query below for the first and for the second a simple query on IDs without the WHERE and GROUP criteria again. I'm not sure whether the ORDER BY or the LEFT OUTER JOIN are required on the second query either.

SELECT DISTINCT events.title, events.id FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id 
WHERE (reviews.rating > 1) GROUP BY events.title ORDER BY events.created_at ASC LIMIT 4

SELECT events.`id` AS t0_r0, events.`title` AS t0_r1, ...
FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id 
WHERE events.id IN ('197', '564', '592', '617') 
ORDER BY events.created_at ASC 

Attachments

add_group_clause_to_construct_finder_sql_for_association_limiting.patch (5.1 kB) - added by kamal on 06/28/07 07:59:12.
patch with test case
db_definitions_for_new_tables.patch (9.3 kB) - added by kamal on 06/28/07 07:59:51.
db definitions to support new test tables
db_definitions_for_new_tables.2.patch (0.8 kB) - added by kamal on 06/29/07 01:27:59.
patch db_definitions/schema.rb instead for the new tables

Change History

06/27/07 16:07:30 changed by Smilinguy

  • version changed from edge to 1.2.3.

06/28/07 07:59:12 changed by kamal

  • attachment add_group_clause_to_construct_finder_sql_for_association_limiting.patch added.

patch with test case

06/28/07 07:59:51 changed by kamal

  • attachment db_definitions_for_new_tables.patch added.

db definitions to support new test tables

06/28/07 08:05:41 changed by kamal

  • cc set to bitsweat.

I was intrigued by this defect, but I couldn't wrap my head around using the existing fixtures to create a scenario that matches this. So, I did the dumb thing and went ahead and created Events and Reviews table, hence the additional db_definitions patch.

This patch adds a one liner to construct_finder_sql_for_association_limiting to add a group clause if it exists.

MySQL

Works.

sqlite3

Fails two test, but could be due to a sqlite3 issue. It looks like it doesn't take into account the ORDER clause when doing grouping and wrongly selects the record that 'wins'.

Other databases

Untested. I'm also wary of the db_definitions for the other databases as I did it by hand .. Please help to test and report here.

06/28/07 08:44:25 changed by kamal

  • summary changed from incorrect records return on find with has_many association, :group and :limit on mysql 5.0.37 to [PATCH] incorrect records return on find with has_many association, :group and :limit on mysql 5.0.37.

06/28/07 20:33:55 changed by bitsweat

You can add the new tables to db_definitions/schema.rb rather than each database's sql.

06/29/07 01:27:59 changed by kamal

  • attachment db_definitions_for_new_tables.2.patch added.

patch db_definitions/schema.rb instead for the new tables

06/29/07 02:38:32 changed by kamal

Hmm, it occured to me that the query

SELECT DISTINCT events.title, events.id
FROM events LEFT OUTER JOIN reviews ON reviews.event_id = events.id
WHERE (reviews.rating > 1)
GROUP BY events.title
ORDER BY events.created_at ASC
LIMIT 4

is not valid in Oracle. This is because events.id is not a grouped column or a function like COUNT. Somehow, sqlite3 and MySQL allows this, but with varying results.

I'm leaning that this is a special case that needs to revert to plain find_by_sql or that AR should raise some Exception. Thoughts?