Ruby on Rails | Screencasts | Download | Documentation | Weblog | Community | Source

Ticket #11123 (closed enhancement: fixed)

Opened 5 months ago

Last modified 4 months ago

[PATCH] String#squish as a heredoc alternative.

Reported by: jordi Assigned to: core
Priority: normal Milestone: 2.0.3
Component: ActiveSupport Version: edge
Severity: normal Keywords: tiny verified
Cc:

Description

This tiny patch to ActiveSupport implements a method on String called squish, that returns the string, first removing all whitespace on both ends of the string, and then changing remaining consecutive whitespace groups into one space each.

It is very useful as an alternative to writing multi-line strings using the concatenation operator, addition, or the heredoc notation.

If the name is deemed too obscure, I'd suggest String#to_line.

Attachments

add_squish_method_to_string.diff (2.8 kB) - added by jordi on 02/15/08 05:29:01.
add_squish_method_to_string_2.diff (2.9 kB) - added by Henrik N on 02/15/08 15:56:59.
Some mods.
fast_banged_squish.diff (0.6 kB) - added by jordi on 03/08/08 16:03:15.
Faster version of String#squish!

Change History

02/15/08 05:29:01 changed by jordi

  • attachment add_squish_method_to_string.diff added.

02/15/08 05:41:46 changed by jordi

Almost forgot, an example of usage:

@description = %{ String#squish() allows me to write long strings
  inside my Ruby code, without having to worry about indentation,
  the page margin, spaces, tabs or newlines. "double" or 'single'
  quotes are both fine, and #{interpolation} works. }.squish

And another:

@elephants = Elephant.find_by_sql %{
  SELECT foo, bar FROM elephants
    INNER JOIN /* complicated joins ... */      
    WHERE      /* complicated conditions ... */
    /* etc, etc ... */
}.squish

02/15/08 06:30:46 changed by blj

I suppose it can be achieved by chaining string methods, but anyway wont hurt to have this as a method. Nice one. +1 for the idea

02/15/08 06:39:24 changed by jordi

  • keywords set to tiny.

02/15/08 14:03:15 changed by stouset

+1, useful as hell

02/15/08 14:07:18 changed by seebq

As someone who's written code like this:

totals = InnovationInterestScore.find_by_sql(<<-END_SQL.gsub(/\s+/, " ").strip)
    SELECT
      #{ifs} AS i, AVG(percentile) AS total
    FROM
      innovation_interest_scores 
    WHERE
      #{conditions ? conditions + ' AND' : ''} monthly_sum = true #{selected_weight ? ' AND ' + 'weight_id=' + selected_weight.id.to_s : ''}
    GROUP BY
      i
    ORDER BY
      i
    END_SQL
  end

This method is nice -- and does the exact same thing!

+1 for usefulness, tiny-ness, and good documentation

02/15/08 14:25:05 changed by brandon.beacher

+1 documented and tested

02/15/08 15:49:07 changed by Henrik N

  • keywords changed from tiny to tiny verified.

+1.

But any particular reason you do

def squish 
  dup.squish! 
end 

# Performs a destructive squish. See String.squish. 
def squish! 
  strip! 
  gsub!(/\s+/, ' ') 
  self 
end

instead of the shorter and simpler

def squish 
  strip.gsub(/\s+/, ' ')
end 

# Performs a destructive squish. See String.squish. 
def squish!
  replace(strip)
end

?

02/15/08 15:56:59 changed by Henrik N

  • attachment add_squish_method_to_string_2.diff added.

Some mods.

02/15/08 15:58:06 changed by Henrik N

The attached diff makes the change in my comment above, says "See String#squish" rather than "String.squish" and adds a test assertion to verify that String#squish isn't destructive.

02/15/08 16:00:37 changed by Henrik N

…and the patch also has replace(strip) (typo by me in the comment above) fixed to replace(squish). Sorry for the million comments.

(follow-up: ↓ 13 ) 02/15/08 19:13:22 changed by jordi

I didn't know about replace, actually. As long as both squish and squish! still return the same squished string, I don't see why I would care.

02/15/08 19:16:34 changed by stouset

As I understand it, the original one will only allocate space for a new string when doing String#squish, whereas the modded version does a new allocation for both that and the bang'd version.

The difference is minimal, but I'd rather go with two lines longer, equally as clear, and not allocating new objects.

02/15/08 19:24:40 changed by stouset

The modified one about is 33% slower than the original.

Benchmark.bm(10) do |b|

?> n = 1_000_000

s = " this\n\n\t is a string \t\t with some extra \n spaces\n" b.report("no alloc:") { n.times { s.dup.squish_one! } } b.report(" alloc:") { n.times { s.dup.squish_two! } } end

user system total real

no alloc: 2.890000 0.020000 2.910000 ( 2.906246)

alloc: 3.930000 0.020000 3.950000 ( 3.951751)

(in reply to: ↑ 10 ; follow-up: ↓ 17 ) 02/15/08 19:25:54 changed by Henrik N

Replying to jordi:

I didn't know about replace, actually. As long as both squish and squish! still return the same squished string, I don't see why I would care.

A minor matter for sure. Would love the method no matter the implementation; thanks a lot for contributing it.

stouset: Yes, came to the same realization after submitting the patch. Thanks for the benchmark :)

"String.squish" -> "String#squish" should of course be fixed, but nevermind the rest, then.

02/15/08 19:27:18 changed by stouset

Teach me not to preview first.

>> Benchmark.bm(10) do |b|
?>   n = 1_000_000
>>   s = "  this\n\n\t is a   string \t\t with  some     extra \n spaces\n"
>>   b.report("no alloc:") { n.times { s.dup.squish_one! } }
>>   b.report("   alloc:") { n.times { s.dup.squish_two! } }
>> end
                user     system      total        real
no alloc:   2.890000   0.020000   2.910000 (  2.906246)
   alloc:   3.930000   0.020000   3.950000 (  3.951751)

02/15/08 19:28:11 changed by stouset

I'm sloppy today. Here's the definition of those methods in String.

>> class String
>>   def squish_one!
>>     strip!
>>     gsub! %{\s+}, ' '
>>     self
>>   end
>>   def squish_two!
>>     replace(strip.gsub(%{\s+}, ' '))
>>   end
>> end

(follow-up: ↓ 18 ) 02/15/08 19:34:02 changed by Henrik N

stouset: Likely doesn't change which implementation is quicker, but gsub(%{\s+}, ' ') will replace a string, not a regexp (%r{…}).

(in reply to: ↑ 13 ) 02/15/08 19:35:40 changed by jordi

Replying to Henrik N:

Replying to jordi:

I didn't know about replace, actually. As long as both squish and squish! still return the same squished string, I don't see why I would care.

A minor matter for sure. Would love the method no matter the implementation; thanks a lot for contributing it. "String.squish" -> "String#squish" should of course be fixed, but nevermind the rest, then.

No, the extra test you added should stay too. The one that makes sure squish is not destructive, only squish!

(in reply to: ↑ 16 ) 02/15/08 19:45:30 changed by stouset

Replying to Henrik N:

stouset: Likely doesn't change which implementation is quicker, but gsub(%{\s+}, ' ') will replace a string, not a regexp (%r{…}).

Thanks. The first implementation is still faster, but it's 10.65s to 11.27s, now.

02/16/08 00:02:33 changed by nzkoz

  • status changed from new to closed.
  • resolution set to fixed.

(In [8878]) Add String#squish and String#squish! to remove consecutive chunks of whitespace. Closes #11123 [jordi, Henrik N]

02/21/08 20:59:23 changed by stouset

  • status changed from closed to reopened.
  • resolution deleted.

The wrong version of the patch seems to have been applied.

03/08/08 16:03:15 changed by jordi

  • attachment fast_banged_squish.diff added.

Faster version of String#squish!

03/08/08 16:06:21 changed by jordi

The attached fast_banged_squish.diff is the faster version that stouset is talking about. The patch applies on top of [8992].

03/13/08 02:25:16 changed by david

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [9015]) Applied the faster squish version (closes #11123) [jordi]