FasterCSV: preserving quoted strings

by Bil Kleb

Hi,

Google et al are failing me: How do I preserve quoted
CSV strings on output?

% cat > csv_quotes.rb << EOF
require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
   def test_preserve_quoted_strings
     csv_data = '"string",2,0.3'
     assert_equal( csv_data, csv_data.parse_csv*',' )
   end
end
EOF

% ruby -ws csv_quotes.rb
Loaded suite csv_quotes
Started
F
Finished in 0.005189 seconds.

   1) Failure:
test_preserve_quoted_strings(ConversionTest) [csv_quotes.rb:8]:
<"\"string\",2,0.3"> expected but was
<"string,2,0.3">.

1 tests, 1 assertions, 1 failures, 0 errors

Thanks,
--
Bil Kleb
http://fun3d.larc.nasa.gov
http://twitter.com/bil_kleb


Re: FasterCSV: preserving quoted strings

by James Gray-7

On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:

> Google et al are failing me: How do I preserve quoted
> CSV strings on output?

I'm not totally sure I understand the question, but your test made it
look like your data was one field you wanted to be able to read in.
If so, you'll need to write it out properly escaped first:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    field = '"string",2,0.3'
    csv   = [field].to_csv  # => "\"\"\"string\"\",2,0.3\"\n"
    assert_equal(field, csv.parse_csv.first)
  end
end

If I'm wrong and you meant for that to be three separate fields, then  
just bust them up to get the valid CSV:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    fields = '"string",2,0.3'.split(",")
    csv    = fields.to_csv  # => "\"\"\"string\"\"\",2,0.3\n"
    assert_equal(fields, csv.parse_csv)
  end
end

Hope that helps.

James Edward Gray II


Re: FasterCSV: preserving quoted strings

by Bil Kleb-2

On Feb 26, 12:11 pm, James Gray <ja...@...> wrote:
> On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:
>
> > Google et al are failing me: How do I preserve quoted
> > CSV strings on output?
>
> I'm not totally sure I understand the question, but your test made it  
> look like your data was one field you wanted to be able to read in.  

Sorry for the confusion; my simplified test isn't close enough
to my problem domain. I'll try again in long form:

I have a CSV file with headers and rows like

 "scheme","time_steps","dt"
 "1storder",2,0.5
 "4thorder",5,1.0

and I am using FasterCSV to read this CSV file to get
a hash of header=>value pairs for each row.

For each row worth of data, I create an output file of the form

&some_weird_name
  scheme = "1storder",
  time_steps = 2,
  dt = 0.5
/

What I'm currently getting is "1storder" without the quotation marks.
I need the data fields to retain their quotation marks like they
have in the original CSV file.

Regards,
--
Bil Kleb
http://fun3d.larc.nasa.gov
http://twitter.com/bil_kleb


Re: FasterCSV: preserving quoted strings

by James Gray-7

On Feb 26, 2009, at 12:44 PM, Bil Kleb wrote:

> I have a CSV file with headers and rows like
>
> "scheme","time_steps","dt"
> "1storder",2,0.5
> "4thorder",5,1.0
>
> and I am using FasterCSV to read this CSV file to get
> a hash of header=>value pairs for each row.
>
> For each row worth of data, I create an output file of the form
>
> &some_weird_name
>  scheme = "1storder",
>  time_steps = 2,
>  dt = 0.5
> /
>
> What I'm currently getting is "1storder" without the quotation marks.
> I need the data fields to retain their quotation marks like they
> have in the original CSV file.

Well, then you don't really want a CSV parser.

Quotes in CSV data are used to indicate field grouping.  In other  
words, they are metadata about the content and it doesn't make sense  
for a parser to return those to you.  It's like how an XML parser  
wouldn't give you the equals sign used to set a tag attribute.
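
A minimal sketch of that behavior, assuming Ruby 1.9 or later, where the standard csv library is the successor to FasterCSV (with the FasterCSV gem itself the calls should look much the same). The method names plain_fields and escaped_fields are hypothetical, chosen only for illustration. Enclosing quotes are dropped as syntax; a quote only comes back if it was escaped by doubling inside a quoted field:

```ruby
require 'csv'  # assumption: the standard library CSV, not the faster_csv gem

# Enclosing quotes are CSV syntax, so the parser drops them.
def plain_fields
  CSV.parse_line('"1storder",2,0.5')
end

# Doubled quotes inside a quoted field are data, so they survive parsing.
def escaped_fields
  CSV.parse_line('"""1storder""",2,0.5')
end

p plain_fields    # => ["1storder", "2", "0.5"]
p escaped_fields  # => ["\"1storder\"", "2", "0.5"]
```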

The way I see it you have two choices:

1.  Fix your data file so it's proper CSV (making the quotes a part of  
the field data).  For example, the first row would become:

"""scheme""","""time_steps""","""dt"""

A quote is doubled to escape it in CSV and another set is added to  
enclose each field, which is why they are tripled here.

2.  Decide that your data is not CSV and hand roll a parser to handle
it. If you are sure fields won't contain commas, that may be as simple
as:  fields = row.split(",").
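
The second choice could be sketched roughly as follows. This is only an illustration, assuming that no field ever contains a comma or a line break; the helpers split_row, unquote, and namelist are hypothetical names, and the sample rows and the block name are taken from the question. Because the lines are split by hand, the quotes stay attached to the values:

```ruby
# Hypothetical helper: split one raw line on commas, keeping any quotes.
# Assumes no field contains a comma or a line break.
def split_row(line)
  line.chomp.split(",")
end

# Hypothetical helper: strip the enclosing quotes from a header name.
def unquote(name)
  name.sub(/\A"/, "").sub(/"\z/, "")
end

# Hypothetical helper: format one data row as a block like the one in the question.
def namelist(name, headers, values)
  body = headers.zip(values).map { |h, v| "  #{h} = #{v}" }.join(",\n")
  "&#{name}\n#{body}\n/\n"
end

lines   = ['"scheme","time_steps","dt"', '"1storder",2,0.5', '"4thorder",5,1.0']
headers = split_row(lines.first).map { |h| unquote(h) }

# Print one block per data row; the quoted values keep their quotes.
lines.drop(1).each do |line|
  puts namelist("some_weird_name", headers, split_row(line))
end
```

The header names are unquoted explicitly here because the desired output shows them bare, while the data values are passed through untouched.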

Hope that helps.

James Edward Gray II


Re: FasterCSV: preserving quoted strings

by Bil Kleb-2

On Feb 26, 2:08 pm, James Gray <ja...@...> wrote:
>
> Well, then you don't really want a CSV parser.
>
> Quotes in CSV data are used to indicate field grouping.  In other  
> words, they are metadata about the content and it doesn't make sense  
> for a parser to return those to you.  It's like how an XML parser  
> wouldn't give you the equals sign used to set a tag attribute.

Ah, OK.  That clears things up.

Thanks,
--
Bil
http://fun3d.larc.nasa.gov
http://twitter.com/bil_kleb


Re: FasterCSV: preserving quoted strings

by Marcus Mitchell-2

Bil Kleb wrote:

> On Feb 26, 2:08 pm, James Gray <ja...@...> wrote:
>>
>> Well, then you don't really want a CSV parser.
>>
>> Quotes in CSV data are used to indicate field grouping.  In other
>> words, they are metadata about the content and it doesn't make sense
>> for a parser to return those to you.  It's like how an XML parser
>> wouldn't give you the equals sign used to set a tag attribute.
>
> Ah, OK.  That clears things up.
>
> Thanks,

Or pass :force_quotes => true when calling FasterCSV.open or FasterCSV.new.
--
Posted via http://www.ruby-forum.com/.


Re: FasterCSV: preserving quoted strings

by James Gray-7

On Jul 9, 2009, at 5:59 AM, Marcus Mitchell wrote:

> Bil Kleb wrote:
>> On Feb 26, 2:08 pm, James Gray <ja...@...> wrote:
>>>
>>> Well, then you don't really want a CSV parser.
>>>
>>> Quotes in CSV data are used to indicate field grouping.  In other
>>> words, they are metadata about the content and it doesn't make sense
>>> for a parser to return those to you.  It's like how an XML parser
>>> wouldn't give you the equals sign used to set a tag attribute.
>>
>> Ah, OK.  That clears things up.
>>
>> Thanks,
>
> Or pass :force_quotes => true when calling FasterCSV.open or FasterCSV.new.

That option causes FasterCSV to always quote fields on output.  Bil  
was asking if he could have the quotes left in his fields on input.
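
To illustrate the difference, here is a small sketch, assuming Ruby's standard csv library (the successor to FasterCSV), which accepts the same force_quotes option. The method name forced_line is hypothetical. The option quotes every field when writing, numbers included, and has no effect on what comes back when that line is parsed again:

```ruby
require 'csv'  # assumption: the standard library CSV, not the faster_csv gem

# With force_quotes, every field is quoted on output, even numeric ones.
def forced_line
  CSV.generate_line(["1storder", 2, 0.5], force_quotes: true)
end

p forced_line                  # => "\"1storder\",\"2\",\"0.5\"\n"

# Parsing that line again still drops the quotes.
p CSV.parse_line(forced_line)  # => ["1storder", "2", "0.5"]
```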

James Edward Gray II