Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #2162 (closed enhancement: wontfix)

Opened 3 years ago

Last modified 2 years ago

[PATCH] Connection Pooling for ActiveRecord

Reported by: greg@lapcominc.com Assigned to: rails@bitsweat.net
Priority: normal Milestone: 1.x
Component: ActiveRecord Version: 0.14.2
Severity: normal Keywords: activerecord connection pooling verified
Cc: bitsweat, tom@popdog.net

Description (Last modified by bitsweat)

This patch provides the option to use connection pooling in ActiveRecord. We have a need to use ActiveRecord in an existing WEBrick environment, and also want ActiveRecord to work in a multithreaded environment such as dRuby.

I'm not sure if I've done this the most elegant fashion, so let me know how it could be improved.

To turn on pooling, ActiveRecord::Base.pool_connections is set to true. There are several other options that control the minimum and maximum size of the pool, whether to wait or immediately error out, and how often to recycle connections.

I also made a small change in activerecord/test/fixtures/db_definitions/mysql.sql to allow foreign keys to work on MySQL 4.0, which is what I happened to have installed. Should be compatible with 4.1 as well.

In addition to the diff, there are two new files: activerecord/test/pool_connections_test.rb activerecord/lib/active_record/connection_adapters/connection_pool.rb

Attachments

active_record_connection_pooling.diff (13.1 kB) - added by greg@lapcominc.com on 09/09/05 16:47:51.
diff
connection_pool.rb (2.7 kB) - added by greg@lapcominc.com on 09/09/05 16:49:02.
new file
pool_connections_test.rb (4.7 kB) - added by greg on 09/09/05 16:49:45.
active_record_connection_pooling2.diff (4.3 kB) - added by anonymous on 11/03/05 18:24:43.
new_files.tar (20.0 kB) - added by greg@lapcominc.com on 11/03/05 18:27:27.
new files needed to work with .diff
active_record_connection_pooling3.diff (16.6 kB) - added by greg@lapcominc.com on 11/07/05 17:01:10.
active_record_connection_pooling4.diff (18.1 kB) - added by greg@lapcominc.com on 11/07/05 19:14:22.
connection_pool_test.rb (14.8 kB) - added by tomafro on 11/21/05 16:30:37.
Tests and code, all in one nasty file
pooled_adapter.diff (14.0 kB) - added by tomafro on 11/28/05 17:44:20.
Pooled adapter integrated into rails

Change History

09/09/05 16:47:51 changed by greg@lapcominc.com

  • attachment active_record_connection_pooling.diff added.

diff

09/09/05 16:49:02 changed by greg@lapcominc.com

  • attachment connection_pool.rb added.

new file

09/09/05 16:49:45 changed by greg

  • attachment pool_connections_test.rb added.

09/09/05 17:15:51 changed by anonymous

  • summary changed from Connection Pooling for ActiveRecord to [PATCH] Connection Pooling for ActiveRecord.

09/09/05 17:31:55 changed by greg@lapcominc.com

Sorry I forgot to mention this in the main description, but the unit tests for mysql and sqlite3 pass after my changes. Only the mysql adapter has been updated to support pooling, but I can provide changes to other adapters (especially Oracle) if the patch is acceptable. Thanks,

Greg

09/11/05 05:50:29 changed by david

This looks like solid work, but I was wondering why the default one-connection-per-thread approach wasn't sufficient? Is your use case such that you would easily get more threads than you wanted connections open?

09/11/05 19:55:16 changed by greg@lapcominc.com

Hi David,

The main problem with the default one-connection-per-thread approach was that it is storing the connection in a thread local and keeping the connection open indefinitely. This would work fine in a server where threads were kept in a pool and reused, but in our non-Rails WEBrick environment (and druby, which is another environment in which we would like to use activerecord) a new thread is created for each incoming request. As a result, we get a new connection for each request, and those connections are never being closed, so we ended up with several hundred connections and almost took down our live site :) The normal FastCGI and single-threaded WEBrick dev environment don't have this issue, its only when trying to use ActiveRecord in a multithreaded server that this becomes an issue.

Greg

09/14/05 09:07:02 changed by greg@lapcominc.com

Hello,

Just wanted to check in and see if I should make changes to the other adaptors and provide a new diff file. I wasn't sure if the patch had been accepted or not. Thanks,

Greg

09/28/05 09:31:22 changed by david

  • keywords changed from activerecord connection pooling to activerecord connection pooling fd.

10/11/05 00:55:22 changed by minam

  • owner changed from David to anonymous.
  • status changed from new to assigned.

Greg, you're currently pooling at the level of the actual connection. What if the pool was made at the adapter level? In other words, do you think it would be feasible to take what you've done one level higher in the abstraction chain and create a connection-adapter-pool? That would allow the feature to be easily extensible to other DB's, I would think, without having to actually implement the feature for each DB.

Thoughts?

10/11/05 12:22:09 changed by greg@lapcominc.com

I was originally going to do this at the adapter level, i.e. have the adapter manage a pool of actual connections. But when I started looking at the code, it seems like there would be less changes to make by doing it at the connection level. I guess I went for the less intrusive approach because I didn't want to start making high level changes without knowing if there was interest in the feature or not. But I think having activerecord work in a multithreaded environment (and someday all of Rails) is a worthy goal, especially if Ruby 2.0 is going to have native threads. So if there is interest in adding connection pooling, I am willing to try to re-implement it in a cleaner fashion and would love to hear some design ideas from someone who knows the code better than myself!

Thanks,

Greg

10/12/05 20:14:20 changed by bitsweat

Greg, we'd love to see a PooledAdapter that can juggle any of the other connection adapters included in Rails 1.0, particularly because Zed Shaw's SCGI interface is multithreaded and, hopefully, will also make it into 1.0.

See lib/active_record/query_cache.rb for an example of wrapping another connection adapter.

10/12/05 22:46:09 changed by marcel

  • keywords changed from activerecord connection pooling fd to activerecord connection pooling.

Move this out of the final drive.

11/03/05 18:22:32 changed by greg@lapcominc.com

  • version changed from 0.13.1 to 0.14.2.

Okay, here's another attempt at this, hopefully much cleaner than before. Plus, it should work with any adapters defined in RAILS_CONNECTION_ADAPTERS when activerecord is required. I basically took the same approach as query_cache.rb and redefined the adapter xx_connection methods so that they return a ConnectionPool::Wrapper when ActiveRecord::Base.pool_connections is true. The wrapper proxies to the actual connection adapters from the pool, taking a connection from the pool, executing method on it, returning to pool. Works with transactions too, only ugly part is I had to copy and paste DatabaseStatements::transaction into Wrapper in order to keep the internal begin_db_transaction, etc. calls going through my Wrapper...would like a cleaner way if anyone has a suggestion.

11/03/05 18:24:43 changed by anonymous

  • attachment active_record_connection_pooling2.diff added.

11/03/05 18:27:27 changed by greg@lapcominc.com

  • attachment new_files.tar added.

new files needed to work with .diff

11/06/05 15:17:05 changed by Jordan <jordan@husney.com>

I have just begin to delve into Rails behavior in a multithreaded environment and have opened Ticket #2742. Pooling a bunch of database handles does seem like the right thing to do especially since you won't have the added latency of having to connect to the database for each and every thread created.

However, shouldn't Rails' simplistic approach of one connection per thread just work?

When a WEBrick client connnection closes or any other multi-thread app joins its client handler thread shouldn't the database connection be closed and reaped automatically? Why is it that any multithreaded application ends up creating one un-closed connection for each and every thread created?

11/07/05 00:48:06 changed by greg@lapcominc.com

The problem with WEBrick (and I suspect with your multithreaded server as well) is that it creates a new thread for each request. The Rails allow_concurrency flag, when set to true, causes ActiveRecord to store active connections in a thread local variable instead of a class variable, allowing a connection-per-thread. The problem is that these connections are never closed, and the threads are not reused by WEBrick, so unreleased connections keep stacking up over time. This is what I'm trying to address with the connection pool patch. Rails gets around this by making WEBrick single threaded for development mode.

11/07/05 17:01:10 changed by greg@lapcominc.com

  • attachment active_record_connection_pooling3.diff added.

11/07/05 19:14:22 changed by greg@lapcominc.com

  • attachment active_record_connection_pooling4.diff added.

11/17/05 17:57:30 changed by tomafro

I'm working on a new AdapterPool based on the code from greg@lapcominc.com but factored as a standalone adapter wrapping a normal adapter. It will have the following features when finished:

* Min/Max # of idle adapters/connections (not guaranteed all the time, but rebalanced in a dedicated thread every x seconds)

* Keeping connections alive by checking for staleness every x seconds (to fix #428)

* Removing broken connections from the pool and replacing them with fresh ones where possible (allowing recovery from a db reset)

Configuration will be in database.yml, like so:

production:
  adapter: pool
  min_idle: 5
  underlying_adapter:
    adapter: mysql
    username: sa
    ....etc....  

Is this sort of what jeremy meant by a PooledAdapter?

I'll provide my first draft of this tomorrow but any comments/suggestions anyone has in the meantime are very welcome.

11/21/05 16:29:49 changed by tomafro

OK, I'm attaching the first very sketchy version of my connection/adapter pool for initial comments. It still needs a lot of cleaning up, and integrating into active record proper (which is why I'm attaching a single file with the adapter code and tests). To run the code, place the file in activerecord/test, go to activerecord/test and run ruby -I "connections\native_sqlserver_odbc" connection_pool_test.rb replacing sqlserver_odbc with your own connection. I'll try and finish up a cleaned up, properly integrated patch by wednesday, but as I promised something a couple of days ago, thought I should show what I have in the meantime!

11/21/05 16:30:37 changed by tomafro

  • attachment connection_pool_test.rb added.

Tests and code, all in one nasty file

11/23/05 03:13:43 changed by bitsweat

  • cc set to bitsweat.
  • description changed.
  • milestone set to 1.1.

Awesome; I'll give it a shot. Thanks Tom! And, that is what I meant by PooledAdapter :)

11/23/05 04:55:31 changed by greg@lapcominc.com

Hi Tom-

Just took a look at your code and it looks pretty good so far. Out of curiosity, what was missing in my solution that prompted you to re-implement connection pooling in a different fashion? Both approaches seem valid, and frankly I like your approach better as it uses composition instead of adding even more methods to the already bloated ActiveRecord::Base class.

Thanks,

Greg

11/23/05 16:21:56 changed by tomafro

  • cc changed from bitsweat to bitsweat, tom@popdog.net.

Greg,

As you can probably tell, much of my code is derived directly from yours. The only objective issue I had with your approach is pretty much as you've said: I didn't want the code so closely coupled with ActiveRecord::Base. Also, implementing the pool as a full blown adapter will allow the use of different pools and pooling strategies, to suit different configurations and users. I rewrote the code test-first from scratch, to learn as much about your code as I could and to help get my head around the way ActiveRecord instantiates adapters!

After my first version (which was much closer to your code), I tried to work out which areas I could simplify and refactored some stuff away. Finally, I added the actual pooling strategy I needed myself (idle connections ready for action, constant refreshing of stale connections to get around #428). Like I mentioned above, I'm sure different pooling strategies are valid in different environments. Maybe a next step is to create an AbstractAdapterPool and let subclasses dictate the pooling strategy.

Hope that kind of explains - feel free to email me if it doesn't (or reply on this bug report, whichever)

Tom

11/28/05 17:43:21 changed by tomafro

Cleaned up the previous code, fixed a few bugs and integrated it into a proper patch. I've run the active record test suite through the pooled adapter, and it mostly passes. Of the tests that fail, those I've examined have been due to assumptions in the tests. I will check through all the tests and attach another patch with a new defined adapter (something like native_pooled_mysql) and the fixes required to get the tests to work. I'm also going to configure a test server with the pool and fire some performance tests at it. In the meantime, I offer the latest code for constructive criticism.

To try out the adapter against a real app, use:

testing:
  adapter: pooled
  underlying_config:
    adapter: mysql
    ....etc...

I will be testing against mysql and sqlserver, so tests against other databases would be very much appreciated.

11/28/05 17:44:20 changed by tomafro

  • attachment pooled_adapter.diff added.

Pooled adapter integrated into rails

03/04/06 02:47:42 changed by david

  • owner changed from anonymous to rails@bitsweat.net.
  • status changed from assigned to new.

09/03/06 22:22:12 changed by bitsweat

  • milestone changed from 1.2 to 1.x.

Has anyone taken this patch further? Consider a plugin to gain some exposure and maturity - particularly, test thread safety with mongrel.

(in reply to: ↑ description ) 01/04/07 11:20:50 changed by twodecode

I'm also very keen on this patch making into rails.

We even want to take it a bit further. With MySQL (and probably) other DB it is possible to optimise certain machines for reading or writing within a cluster. It would be nice to be able to specify what connections should be used for (reading / writing) and it this should be forced or preferred behaviour.

02/25/07 23:19:35 changed by josh

  • keywords changed from activerecord connection pooling to activerecord connection pooling verified.

02/26/07 15:08:39 changed by tomafro

  • status changed from new to closed.
  • resolution set to wontfix.

As I understand it, rails itself isn't really threadsafe, (http://www.mail-archive.com/mongrel-users@rubyforge.org/msg00243.html) so adding connection pooling at the application level doesn't make sense. At some point in the future this may change, but for now I'm resolving this issue as I don't think my patch should be applied (or made into a plugin). If anyone else wants to take the code and do anything with it, feel free.