Tuesday, November 21, 2017
  • Index
  •  » Code Exchange
  •  » A small library for writing rfc4180 CSV records, one at a time

#1 2015-04-09 08:12:13 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

A small library for writing rfc4180 CSV records, one at a time

Hello.

A minimal library for writing CSV files, that should be as compatible as possible with applications that supports CSV, by conforming to the RFC4180 specification.

The library requires Mavericks or later, because I use features of the open for access I believe are new.

The library is intended to write one record at a time, and is not optimized for "batch operation" of creating CSV. You should however be able to reuse the bits here, and create something that should suit your needs.

The above pretty much means that if you generate CSV with this library, it should enter smoothly into Excel[*] and FileMaker Pro. I also assume, that if it is easy to import it into Ms Excel, then it should be equally easy to import into Numbers.

A list consisiting of:

A "lazy" dog, 1.23,Morning


will be translated into:

" \"\"lazy\"" dog", "1,23","Morning"


Which is how it is supposed to be represented according to the rfc4180 [1]

However, if you for some reason want to specify something else as a value (list) separator, (a ";" for instance), then I see no problem in doing that. You'll just have to make sure that the application you intend to import the csv into can handle it.  I also see no problem in changing the record separator from linefeed to return, if that's most comfortable for you.

Important
=========
The RFC states that only carriage returns (AppleScript returns) can be used as a linebreak inside a quoted field. Microsoft Excel doesn't support this, whether the field is quoted or not, and indeed breaks the RFC "quoting" it otherwise can handle. FileMaker Pro accepts this as long as it indeed is a carriage return.) I have no idea about how  Numbers handle returns within comma separated values. I have to chosen to leave it to the library user's discretion to handle this problem outside the on makeCSVRecord(soonToBeCSV) call, that is, that you have prepared any items in the soonToBeCSV list, so that they at least don't contain newlines, and that they are converted to returns,

With regards to numbers, it is at your discretion how numbers are specified, should you choose to use formatted numbers with this library, that is adorned numbers with thousands separators and such. This library merely converts numbers to text.

There isn't much more to it than that, other than that returns (not linefeeds) are allowed inside quotes, but not inside quoted text, which would be inside "lazy" above.

[*]Excel can really only deal with utf-8, if it receives the csv as  .txt file, therefore you must import from the text file choice. In order for the quotation to be removed properly, you must, (after having set "," as the field separator) check off for "Treat consecutive delimiters as one" and be sure that quote is selected as the delimiter in the "delimiter-dropdown", as you select the text.

[1] https://tools.ietf.org/html/rfc4180#section-1

Other references:
-----------------

http://en.m.wikipedia.org/wiki/Comma-separated_values

http://en.m.wikipedia.org/wiki/Decimal_ … _computers


Applescript:


use AppleScript version "2.3"
use scripting additions

(*
Copyright 2015 © McUsr: You may not post this work as your own somwhere else.
No warranties about anything, what so ever.
The library requires Mavericks or later, because I use features of the open for access I believe are new.
*)


property recordSeparator : linefeed
-- I believe that return is the original one, but I
-- have found no harm in using newlines
property listSeparator : ","
-- Although this is a strictly comma, I believe it
-- doesn't hurt to change it to ";" for instance,
-- if this helps out your workflow.

(*
on run
   driver()
end run
on driver()
   -- try the example, and see what it looks like in TextEdit
   set L1 to {"02:00", "03:00", "1,1"}
   set L2 to {"a \"lazy\" dog", 1.23, true}
   set L3 to {(current date), "a \"nice\" day", "04:00"}
   set res to makeCSVRecord(L3)
   -- probably not working all places
   set success to appendToFile(res, "~/Desktop/CSVTest.txt")
   if not success then error "problems writing to file"
end driver
*)


to makeCSVField(someData)
   if class of someData is text then
       -- check for quotes, those needs to
       -- be replaced by two quotes.
       set {tids, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "\""}
       set bits to text items of someData
       set AppleScript's text item delimiters to "\"\""
       set someData to bits as text
       set AppleScript's text item delimiters to tids
   else
       -- bruteforce the data into text
       set someData to someData as text
   end if
   return "\"" & someData & "\""
end makeCSVField


on makeCSVRecord(soonToBeCSV)
   set CSVRecord to ""
   set rLen to ((length of soonToBeCSV) - 1)
   repeat with i from 1 to rLen
       set CSVRecord to CSVRecord & makeCSVField(item i of soonToBeCSV) & listSeparator
   end repeat
   set CSVRecord to CSVRecord & makeCSVField(item (rLen + 1) of soonToBeCSV) & recordSeparator
   return CSVRecord
end makeCSVRecord

to appendToFile(theData, theFile)
   if theFile starts with "~/" then
       set theFile to POSIX path of (path to home folder as text) & text 3 thru -1 of theFile
   end if
   -- Stolen from Chris Stone, as I have stolen the omission of file specifier
   -- or alias from the open for access command.
   (*For adding text to the end of a file*)
   --returns boolean success (true=success)
   --Assumes: theFile is a file path string and file exists.    
   try
       -- open the file, or create it if it doesn't exist
       set fRef to open for access theFile with write permission
       -- Get the file length
       set flen to (get eof fRef)
       
       -- Append contents to file
       write theData to fRef starting at (flen + 1) as «class utf8»
       -- Close it
       close access fRef
       return true
   on error
       try
           close access fRef
       end try
       return false
   end try
end appendToFile

Edit
Fixed a condition I overlooked, A CVS record consisting of just one field, I edited makeCSVRecord to consider the possibility.

Edit+
I had forgotten to remove a wrongful paragraph about lineendings in TextEdit, which can't be set there, that paragraph is now removed.

Edit++
Added a section (important) about when you can have returns encoded inside quoted text-fields, and when not.

Last edited by McUsrII (2015-04-11 03:25:05 pm)


Filed under: excel, csv, FileMaker Pro, RFC4180

Offline

 

#2 2015-04-11 03:28:03 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

Re: A small library for writing rfc4180 CSV records, one at a time

Added a section in the documentation, describing that FileMaker Pro needs returns if line-breaks are to be used inside a csv-value, and that Excel doesn't handle it. And that it is the library users responsibility to see to that any use of linebreaks will work with the program he is to import the cvs into. (The removal, or conversion to, must happen before the encoding of csv takes place.

Offline

 
  • Index
  •  » Code Exchange
  •  » A small library for writing rfc4180 CSV records, one at a time

Board footer

Powered by FluxBB

RSS (new topics) RSS (active topics)