Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #8740 (new defect)

Opened 1 year ago

Last modified 7 months ago

Intermediary model may not find records of its subclasses

Reported by: caio Assigned to: core
Priority: normal Milestone: 2.x
Component: ActiveRecord Version: edge
Severity: normal Keywords:
Cc: lifofifo, tarmo, mikong

Description

First some code. I use count for brevity but the bug affects any finder method.

The bug is in how type_condition works.

# necessary schema
create_table :roots do |t|
  t.column :title, :string
  t.column :type,  :string
end

# root.rb
class Root < ActiveRecord::Base; end

# intermediary.rb
class Intermediary < Root; end

# leaf.rb
class Leaf < Intermediary; end

# Start a new console in development env
script/console

# Create some records
Root.create         :title => 'root'
Intermediary.create :title => 'intermediary'
Leaf.create         :title => 'leaf'

# case 1
reload!; nil
Leaf.count         => 1
Intermediary.count => 2
Root.count         => 3
# all fine here.

# case 2
reload!; nil
Root.count         => 3
Intermediary.count => 1 <= WRONG
Leaf.count         => 1
Intermediary.count => 2 <= Right, now it knows about Leaf.

So here's what's going on:

Root.count always returns the correct value because Root counts regardless of type. For the root model, all records in the table are instances of it.

Leaf.count always returns the correct value because it only counts records of type Leaf, it doesn't care about other models in its hierarchy.

The problem is with the Intermediary class. It has to find/count both records with type Intermediary and Leaf. In case 1, Leaf will have already been loaded, so it'll be counted in Intermediary's subclasses. That'll allow Intermediary to build the correct type_condition, and it'll find the right records.

In case 2, however, Leaf has not been loaded and Intermediary has no way to know about it, so its type_condition will only match records of type Intermediary, not Leaf. After we load Leaf, Intermediary starts finding/counting correctly, because now Leaf is listed in Intermediary's subclass list.


I think this is buggy behavior because, with the same records in the table, the same method call will return different sets of records depending on which classes have been loaded in the program up to this point.


I talked about this on #rails-contrib with hasmanyjosh, and one idea was to have Root query the table for the values in the type column, and then try to load the respective models. I've been playing with this idea for a day now, and I don't think I can make it work satisfactorily. I'd also prefer not to couple AR with the Dependencies mechanism.

Another idea I had was to break things and aim for Rails 2.0. This is my preferred route for now. My idea is to store the entire model hierarchy in the type column. So, for our records in the example, we'd have the following values in the type column:

Root/
Root/Intermediary/
Root/Intermediary/Leaf/

This makes selecting the appropriate records much easier and independent of which classes are currently loaded. For Intermediary, for example, just query for type LIKE 'Root/Intermediary/%'


I realize this breaks a highly relied upon feature, in order to fix a relatively obscure bug. Most people only use one level of inheritance in their apps. I hope to provide a migration to help people get on with this change. I'd appreciate some feedback on the feasibility of checking such a change in.

Right now I'll begin work on a plugin that implements the type column modification, so it can be used by people on 1.2.x. Later on I'll turn it into a patch for the trunk.

Attachments

8740_r7126-test_case.diff (2.9 kB) - added by caio on 06/26/07 01:45:16.
unit test

Change History

(follow-up: ↓ 5 ) 06/25/07 20:41:23 changed by jeremymcanally

Is there any way you could wrap that into a unit test?

06/26/07 00:39:17 changed by lifofifo

  • cc set to lifofifo.

I think this is a very specialized case and in cases like this, you should just load models explicitly in environment.rb by just typing class name. And this will be an issue in development mode only.

06/26/07 01:24:50 changed by caio

lifofifo, I tried loading the models in the environment, the problem is that in development mode, all classes will be unloaded for each request, and if the first class that's called is the intermediary class, you'll have trouble anyway.

The problem does not only affect development mode. Even in production mode, classes get autoloaded, they just don't get unloaded. So if the intermediary class is used before something loads the Leaf, you'll still get incomplete results. True, though, in this case loading the classes in the environment would fix it.

06/26/07 01:27:18 changed by caio

Also, I don't think having more than one level of nesting in the models is that specialized.

And more importantly, I think the inconsistency you get when the leaf classes are not loaded is really bad in the principle of least surprise scale.

(in reply to: ↑ 1 ) 06/26/07 01:29:11 changed by caio

Replying to jeremymcanally:

I can make a test for case 2.

I'd have to fake class autoloading by doing manual requires, but the test would still get the point across. I have something in the works already.

Really testing unloading and reloading would require some sort of functional test.

06/26/07 01:45:16 changed by caio

  • attachment 8740_r7126-test_case.diff added.

unit test

06/26/07 01:50:18 changed by caio

By the way, I think I can patch this in a backwards-compatible way by adding some extra smarts to compute_type.

07/03/07 01:37:36 changed by tarmo

  • cc changed from lifofifo to lifofifo, tarmo.

Well, although you don't like the dependencies mechanism to me it seems the simplest solution. I've been using it with no sideffects. btw. #4730 is similiar to this ticket.

So what I use currently looks like this:

parent.rb:
class Parent < ActiveRecord::Base
end

require_association 'sub_sti'

sub_sti.rb:
class SubSti < Parent
end

require_association 'sub_sub_sti'

sub_sub_sti.rb
class SubSubSti < SubSti
end

Whenever one of these classes is reloaded it's parent (as before) and now also it's children get reloaded, recursively, this is the only way to know that any and all related classes are loaded at the same time.

The problem with using the Parent/Child/SubChild for type is that object hierarchy can and will change and fixing such changes may be a nightmare especially because the type does not get a lot of attention because it should be automatically handled by Rails.

Just loading all models that exist in the type column sounds extreme, it not only could be an expensive query on a large table (I can't think of any other thing in Rails that automatically does something that potentially looks at all the rows in a table) but it also gives different results for different database contents, so wether a syntax error in a model becomes visible is going to depend on if that model has a row in a table. Though if I had to choose, I'd choose this one.

07/11/07 09:09:07 changed by mikong

  • cc changed from lifofifo, tarmo to lifofifo, tarmo, mikong.

03/25/08 17:42:54 changed by Danimal

Can't you just simply include the non-loaded Model into the model that needs it? For example, in the original example, all you need to do is change the definition of Intermediary as follows:

class Intermediary < Root
  Leaf
end

and now case 2 looks like:

# case 2 - now fixed!
reload!; nil
Root.count         => 3
Intermediary.count => 2 <= RIGHT! (cause it knew about Leaf from the get-go)
Leaf.count         => 1

No enviroment.rb changes. No other code. I don't know if there is any other impact but it looks like it just preloads the Leaf class so the Leaf.count call becomes:

  SQL (0.000370)   SELECT count(*) AS count_all FROM `roots` WHERE ( (`roots`.`type` = 'Intermediary' OR `roots`.`type` = 'Leaf' ) ) 

instead of the non-preloaded:

  SQL (0.000322)   SELECT count(*) AS count_all FROM `roots` WHERE ( (`roots`.`type` = 'Intermediary' ) )