
Ticket #9017 (reopened enhancement)

Opened 2 years ago

Last modified 6 months ago

[PATCH] Add LibXML support to Active Resource

Reported by: BCC Assigned to: core
Priority: normal Milestone: 2.x
Component: ActiveResource Version: edge
Severity: normal Keywords: LibXML, ActiveResource
Cc: chuyeow

Description

As I was trying to import 12 MB of XML, it quickly became clear that the SimpleXML parser used in Hash.from_xml was not going to cut it. As Active Resource will probably be used to handle large XML files, I created a patch so that libxml can be used to parse the XML. This cut the time to fetch the Active Resource from 240 seconds to 12 seconds. A patch is attached.

The net override (also included) makes downloading four times faster.

Attachments

active_resource_lib_xml.patch (2.0 kB) - added by BCC on 07/19/07 08:17:50.

Change History

07/19/07 08:17:50 changed by BCC

  • attachment active_resource_lib_xml.patch added.

07/19/07 10:53:44 changed by lifofifo

  • status changed from new to closed.
  • resolution set to incomplete.

Please add tests and reopen the ticket.

07/19/07 11:28:31 changed by BCC

  • status changed from closed to reopened.
  • resolution deleted.

No additional tests are needed, as it functions 100% the same as SimpleXML.

07/19/07 11:41:47 changed by BCC

Thinking about this... it might be necessary to copy the tests from Hash Conversion to the unit test for connection. Would that be a good idea?

07/19/07 11:43:32 changed by lifofifo

-1

lifo:~/Rails/rails/activeresource pratik$ rake test   
(in /Users/pratik/Rails/rails/activeresource)
/opt/local/bin/ruby -w -Ilib:test "/opt/local/lib/ruby/gems/1.8/gems/rake-0.7.3/lib/rake/rake_test_loader.rb" "test/authorization_test.rb" "test/base/custom_methods_test.rb" "test/base/equality_test.rb" "test/base/load_test.rb" "test/base_errors_test.rb" "test/base_test.rb" "test/connection_test.rb" 
./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require': no such file to load -- xml/libxml (MissingSourceFile)
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require'
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:342:in `new_constants_in'
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require'
        from ./test/../lib/active_resource/connection.rb:6
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require'
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require'
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:342:in `new_constants_in'
        from ./test/../lib/../../activesupport/lib/active_support/dependencies.rb:495:in `require'
         ... 10 levels...
        from /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.3/lib/rake/rake_test_loader.rb:5:in `load'
        from /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.3/lib/rake/rake_test_loader.rb:5
        from /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.3/lib/rake/rake_test_loader.rb:5:in `each'
        from /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.3/lib/rake/rake_test_loader.rb:5
rake aborted!
Command failed with status (1): [/opt/local/bin/ruby -w -Ilib:test "/opt/lo...]

(See full trace by running task with --trace)

07/19/07 11:45:18 changed by BCC

You might want to read your error messages: no such file to load -- xml/libxml

Testing this without ruby-libxml seems a bit useless, doesn't it?

07/19/07 11:47:38 changed by lifofifo

ActiveResource tests do not have a dependency on xml/libxml. All the tests pass when this patch is not applied.

Thanks.

07/19/07 11:53:29 changed by lifofifo

  • summary changed from Add LibXML support to Active Resource to [PATCH] Add LibXML support to Active Resource.

07/19/07 12:14:10 changed by BCC

Ah, I see. Hmm, there seems to be something strange going on in the trunk. When I fix the requirement in dependencies, I get: /activesupport/lib/active_support/core_ext/time/calculations.rb:230:in `-': uninitialized constant ActiveSupport::Duration. Will look into it.

07/20/07 15:07:47 changed by BCC

I was thinking of ActiveResource as a plugin, but as it is going to be an integral part of Rails, wouldn't it be more logical to just patch Hash.from_xml?

11/16/07 11:12:19 changed by chuyeow

  • cc set to chuyeow.

01/07/09 17:27:19 changed by zuk

What happened to this? Seems like a pretty vital enhancement. With the REXML parser, ActiveResource is somewhat useless in anything other than the most lightweight/trivial cases. It's just way too slow to cope with real-world data.

01/08/09 08:11:45 changed by BCC

I have been using it in production for over a year now. There are some more elegant patches out there that actually check if the hpricot/libxml/whatever gem is installed.
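
A minimal sketch of the "check if the gem is installed" idea, not taken from any of the patches mentioned: the XmlBackend module and its detect method are hypothetical names. It tries to require the library and falls back to REXML when the require fails, which would also avoid the "no such file to load -- xml/libxml" failure shown earlier in this ticket.

```ruby
# Hypothetical helper (not part of Rails or of the attached patch):
# pick an XML backend depending on whether the library can be loaded.
module XmlBackend
  # Returns :libxml when the given feature can be required,
  # otherwise :rexml (the pure-Ruby default).
  def self.detect(feature = 'xml/libxml')
    require feature
    :libxml
  rescue LoadError
    # The library is not installed; keep using the default parser.
    :rexml
  end
end

backend = XmlBackend.detect
```

The calling code would then branch on the returned symbol instead of requiring xml/libxml unconditionally.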

01/08/09 17:09:56 changed by zuk

I had to strip all newline characters (\n) from the input XML, otherwise the patched code chokes.

01/08/09 17:13:26 changed by zuk

... actually looks like this patch doesn't deal well with any XML indentation.

01/08/09 17:32:48 changed by zuk

Looks like setting the following LibXml option fixes the indentation/whitespace problem, although I can't confirm that this will fully replicate REXML parsing behaviour (treatment of text nodes with whitespace might be different):

XML.default_keep_blanks = false

The patch also needs to be modified to convert dashes in node names to underscores.
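
A rough sketch of the dash-to-underscore conversion, assuming the parser has already produced nested hashes and arrays with string keys. The underscore_keys helper and the sample data are hypothetical and are not part of the attached patch.

```ruby
# Hypothetical helper: recursively rename hash keys such as "first-name"
# to "first_name", leaving the values themselves untouched.
def underscore_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), result|
      result[key.to_s.tr('-', '_')] = underscore_keys(child)
    end
  when Array
    value.map { |item| underscore_keys(item) }
  else
    value
  end
end

# Made-up sample data for illustration only.
parsed = { 'person' => { 'first-name' => 'Ann', 'phone-numbers' => [{ 'area-code' => '555' }] } }
normalized = underscore_keys(parsed)
```

Doing the conversion on the resulting hash rather than on the raw XML keeps element text and attribute values that contain dashes intact.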

01/09/09 07:34:59 changed by BCC

I strip the newlines before feeding the data into LibXML. This actually seems faster. But this patch was intended for 2.0.1. Does it still work for 2.2?
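
A sketch of one way to do that stripping, as a variation that removes all whitespace between adjacent tags (newlines and indentation) rather than only "\n". The strip_inter_tag_whitespace name is hypothetical, and the regular expression is an assumption that also touches whitespace inside CDATA sections or mixed content, so it only suits data-style XML.

```ruby
# Hypothetical helper: drop whitespace-only runs between a closing '>'
# and the next '<', so the parser sees no blank text nodes.
def strip_inter_tag_whitespace(xml)
  xml.gsub(/>\s+</, '><')
end

# Made-up sample input for illustration only.
indented = "<people>\n  <person>\n    <name>Ann Lee</name>\n  </person>\n</people>\n"
compact  = strip_inter_tag_whitespace(indented)
```

Whitespace inside element text (such as the space in "Ann Lee") is preserved, because it is not directly between two tags.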