Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #3238 (new enhancement)

Opened 2 years ago

Last modified 4 months ago

[PATCH] Allow ActiveRecord to monitor which attributes are modified

Reported by: lsegal@holycraplions.com Assigned to: rails@bitsweat.net
Priority: normal Milestone: 2.x
Component: ActiveRecord Version: edge
Severity: normal Keywords: ActiveRecord modified attributes needy risky
Cc: xaviershay, chuyeow, bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo

Description

Motivation

ActiveRecord currently has no convenient way to see which values have been changed since the last record instantiation, reload, or save. This patch allows for the ability to monitor which attributes have changed.

Advantages

One major advantage to this incorporated into the code is increased efficiency in the update method. Before this patch, ActiveRecord will send an SQL query to update all fields, whether or not their values have changed. For small records this is okay, but for large records (tables with 8kb+ fulltext fields) this can be inefficient when only something like an froeign key id is modified. This patch will use the attribute_changed? method to only throw modified attributes and their values into the UPDATE SQL statement.

In addition to SQL optimization, users now have the ability to keep tabs on what data was modified from an existing record, which can be very helpful with form based input.

API

The following two public methods are implemented for use by the programmer:

# Returns true if attribute was modified since last read/reload/save
def attribute_changed?(attr_name) 

# Returns an array of modified attributes
def changed_attributes 

These methods can be incorporated into existing code for better management of attribute values.

Attachments

changed_attributes.patch (5.7 kB) - added by lsegal@holycraplions.com on 12/15/05 09:55:30.
changed_attributes2.patch (12.1 kB) - added by argv <lsegal@holycraplions.com> on 12/16/05 12:25:50.
This patch supercedes the original changed_attributes.patch
db_benchmark.rb (3.0 kB) - added by argv <lsegal@holycraplions.com> on 12/16/05 12:26:58.
Benchmarking script along with my results listed in the comments
activerecord_track_changes.diff (12.8 kB) - added by michaelboutros on 01/02/08 05:02:07.
activerecord_monitor_changes_fixed.diff (13.1 kB) - added by michaelboutros on 01/02/08 12:12:05.

Change History

12/15/05 09:55:30 changed by lsegal@holycraplions.com

  • attachment changed_attributes.patch added.

12/15/05 15:55:29 changed by anonymous

If trac supported bugzilla style voting, this would get mine.

12/15/05 16:19:52 changed by bitsweat

  • cc set to bitsweat.
  • milestone changed from 1.0 to 1.1.

See #1152, #1349, #1423.

Unfortunately, there is a pathological case:

foo = Foo.find(1)
foo.attribute_changed?(:name) == false
foo.name << 'abc'
foo.attribute_changed?(:name) == false
foo.name = 'def'
foo.attribute_changed?(:name) == true

12/15/05 21:07:28 changed by argv <lsegal@holycraplions.com>

Right,

However:

>> u = User.find_first
>> u.first_name
=> "Loren"
>> u.first_name << " Segal"
=> "Loren Segal"
>> u.first_name
=> "Loren"
>> u.first_name.gsub!('Loren','Steven')
=> "Steven"
>> u.first_name
=> "Loren"

So those functions don't stick changes in the attribute, if that's what you're getting at. Unless of couse I'm doing something wrong...

Thanks for the ticket references by the way. I will look through them more closely, but I really do like your original idea of the #{attr_name}_changed? method name, much cleaner. Maybe I will incorporate that.

12/15/05 22:20:49 changed by bitsweat

>> o = Order.find(1)
=> #<Order:0x2aaaad25a360 @attributes={"name"=>"foo", "shipped_at"=>nil, "id"=>"1", "pay_type"=>"cc", "address"=>"um", "email"=>"foobar@bitsweat.net"}>
>> o.name << 'bar'
=> "foobar"
>> o.name
=> "foobar"

12/16/05 12:24:20 changed by argv <lsegal@holycraplions.com>

Okay so I've done a lot of work on this thing in the last day. This patch has now become a sort of merging between the three tickets refered to me by bitsweat. Each one of those on their own had some great ideas, and I tried to bring them all together in the most coherent fashion.

Some considerable changes

  • You can now call #{attr_name}_changed? (ex: @user.first_name_changed?) in addition to the default attribute_changed?(attr_name) to see if the attribute has been changed. [Thanks to bitsweat for the idea]
  • There is now an option to track in place attribute modifications (on by default). Things like gsub''' and << will now show that the attribute is changed when the class variable "@@track_inplace_modifications" is set to true. Turning this off will further increase SQL performance [see db_benchmark.rb], but will not allow you to use in place modifications of attributes in your code. I'll bring this topic up later in the discussion. [Thanks to Ulysses]
  • I did benchmarks to further show performance increase. It seems that even with my implementation to track in place modification changes, there is actually an performance increase in code execution, not decrease. This is good-- faster code is always good. [Thanks to minam for the idea]

Discussion

I'd really like some feedback regarding the idea of the in place modifications thing. Ulysses' idea is that we should have the ability to disallow in place modifications entirely as long as the functionality is disabled by default.

I took a slightly different approach to this-- The developer still has access to in place modifications if he turns off the tracking for in place modification changes-- but it is his responsibility to know that he is likely to screw himself over if he uses anything like gsub! or serialized values. I think that's a fair deal, considering those changes are tracked by default.

On the other hand, turning this feature off can result in serious loss of functionality (serialized values), and I'm wondering if the developer should even be able to switch this feature off? Either way, there is a performance increase (see db_benchmark.rb for actual numbers), but it is a much larger increase when in place modifications are not being tracked. I'm basically wondering if you guys think this dangerous switch should be kept just for the developer who yurns for that extra speed, or if it should be scrapped?

One issue people might be concerned with is the extra memory usage coming along with the tracking of in place modifications. The implementation is no longer as lightweight as it was when I posted the original version yesterday, because it now keeps a copy of the data every time the attribute is read. This could be hurtful. I'm open to hearing people's thoughts on this subject. I personally don't see the extra memory usage as such a big deal, considering there is a performance gain to make up for it.

Finally, I put in the benchmarks to show performance gains. Given-- the benchmarks don't stress test large fields like fulltexts. I'm sure there is a point where performance will take a considerable hit with in place modifications on-- because of all the object cloning. However, in general case, this shows that the implementation can be good.

12/16/05 12:25:50 changed by argv <lsegal@holycraplions.com>

  • attachment changed_attributes2.patch added.

This patch supercedes the original changed_attributes.patch

12/16/05 12:26:58 changed by argv <lsegal@holycraplions.com>

  • attachment db_benchmark.rb added.

Benchmarking script along with my results listed in the comments

12/17/05 21:49:49 changed by epigenesis

This patch looks great! The benchmarks say it all:

Benchmarks:

original ActiveRecordpatched (with << and .gsub() tracking)patched (without tracking)
100 queries (without saves)0.403s0.598s0.390s
100 queries (with saves)5.091s4.496s2.421s

Quite a nice speed improvement for saves when only a couple fields are being updated. Also, this could be quite useful for user accounts when the passwords on them that get automatically encrypted before_save.

12/22/05 18:22:07 changed by david

  • owner changed from David to rails@bitsweat.net.

01/16/07 23:16:59 changed by chewi

  • cc changed from bitsweat to bitsweat, chewi@aura-online.co.uk.

I'm interested in this but I'm reluctant to use such a big patch, especially when it hasn't been updated in so long. It almost applies against 1.1.6, only 4 of the 16 hunks fail to apply. That doesn't necessarily mean it would work though and I suspect some more work would need to be done for 1.2. Any chance of it being merged?

I'm actually interested in it because it would solve a concurrency problem I have more transparently and more effeciently than lock_version.

03/13/07 03:31:40 changed by josh

  • keywords changed from ActiveRecord modified attributes to ActiveRecord modified attributes needy risky.

Needs alot of work to patch nicely into edge.

05/10/07 21:13:57 changed by demisone

  • cc changed from bitsweat, chewi@aura-online.co.uk to bitsweat, chewi@aura-online.co.uk, demisone.

07/12/07 00:22:38 changed by ennoia

  • cc changed from bitsweat, chewi@aura-online.co.uk, demisone to bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com.

08/07/07 01:01:37 changed by tarmo

  • cc changed from bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com to bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo.

09/26/07 12:54:40 changed by tneumann

I think this enhancement will not make it into the core, right? Nevertheless I find this functionality very useful for some of my models. Thus I was thinking about a different API, that does not break existing code:

topic = Topic.find(:first)
topic.track_attribute_changes # start attribute tracking
topic.title = 'new title'
topic.title_changed? # => true
topic.author_name_changed? # => false

# updates only the title column:
topic.save_changes # UPDATE ... SET title = 'new title' WHERE ...

# the old way:
topic = Topic.find(:first)
topic.title = 'new title'
topic.save! # makes an UPDATE with all the columns, even if not changed

Plain old save and save! methods work like before - they save ALL the attributes. This is the desired behavior if you want to make sure that you do not get any invalid records in your database when multiple users edit different attributes of the model.

Only if you call track_attribute_changes on the model then changes are tracked (even in-place editing of attributes via gsub!, <<, and so on).

If there is any interest in such a patch let me know. I can adjust my patch for edge rails - currently I patched only 1.2.3. I basically use argv's method (thanks a lot to you, argv!), but I added some more test cases and added the save_changes method.

There may be a way to add this functionality via a plugin, but there would be quite a lot of monkey patching needed and it may clash with other plugins ...

01/02/08 04:57:41 changed by michaelboutros

Well, I spent the better part of my day attacking this patch, and I'm glad to say I finally got it working. Rails has moved a lot of it's attribute-related methods into a seperate file (and class), so that's where most of the work was done.

It appears to work, but at (apparently) random times, it incorrectly tracks attributes. Can someone else take a look at it and see what's going on? I've never been very good at wrapping my head around cause -> effect scenarios.

01/02/08 05:02:07 changed by michaelboutros

  • attachment activerecord_track_changes.diff added.

01/02/08 05:02:37 changed by michaelboutros

Quick update, I moved the tests for this patch from base_test.rb to attribute_methods_test.rb.

01/02/08 06:20:45 changed by chuyeow

  • cc changed from bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo to chuyeow, bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo.
  • version changed from 1.0.0 to edge.
  • milestone changed from 1.2.7 to 2.x.

01/02/08 07:07:47 changed by michaelboutros

Scratch that, I don't think you should check this into edge just yet. I've found a fatal flaw, looking into why this happens right now:

[code]>> m = Mock.find(:first) => #<Mock id: 1, created_at: "2008-01-01 20:50:18", updated_at: "2008-01-01 20:51:29", name: "George", age: 10>

m.name << ' Boutros'

=> "George Boutros"

m.save!

=> true /code

Produces the SQL:

[code]UPDATE mocks SET updated_at = '2008-01-02 02:05:28' WHERE id = 1/code

Which is not what we want.

01/02/08 12:11:00 changed by michaelboutros

After even more debugging and trying things out, I think we've got it settled. However, please, test this out yourselves; not just with the unit tests, but with actual usage scenarios in script/console.

01/02/08 12:12:05 changed by michaelboutros

  • attachment activerecord_monitor_changes_fixed.diff added.

01/03/08 04:00:37 changed by michaelboutros

I believe I've gotten down all the problems, and even things like << and gsub should now be tracked. However, several other road blocks have popped up, not the least of which is ActiveRecord's Aggregations. How are we to track those as well?

I'm starting to think that this wasn't the best way to approach this problem... the only other viable alternative (that I can think of at the moment) is to keep another hash of the attributes that is frozen. Then, when the record is called to be saved, the two are compared and only changed values are updated. Any thoughts on that?

01/03/08 04:09:20 changed by xaviershay

  • cc changed from chuyeow, bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo to xaviershay, chuyeow, bitsweat, chewi@aura-online.co.uk, demisone, jerrett@bravenet.com, tarmo.