Wednesday, January 17, 2018

#1 2015-12-11 01:33:46 am

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

How to Read CSV File and Parse into AppleScript List

I recently needed this functionality, so I did a fairly extensive Internet search, but could not find exactly what I needed.  I found lots and lots of solutions, lots of bits and pieces, and a few very complex solutions.

So, based on everything I read, I have built my own script.  Certainly the idea is not new, and many have contributed long before me.  I apologize in advance if I happen to be using parts of code that you have posted somewhere on the Internet.

For the benefit of others like me, I'm posting my compiled solution.  It has no error checking (which it needs) and is a very SIMPLE approach that does NOT try to adhere to the RFC 4180 de facto standard.

For a more robust, extensive process, see:
CSV-to-list converter by Nigel Garvey, 2010-03-13
http://macscripter.net/viewtopic.php?pid=125444#p125444

For AppleScript newbies, it puts together the CSV file read, and the parsing of the fields/columns.
It meets my needs, but I'm sure it will not meet everyone's needs.

If you see any issues, or have suggestions for improvement, please post here.

Applescript:


(*
====================================================
   [CSV] How to Read CSV File into AppleScript List
====================================================

PURPOSE:
   • Provide a simple process to read a CSV file, and parse each line/row into an AS list
       for further use as needed by the programmer.
   • This is a tool/example for the AppleScript programmer, not the end-user of a script
   • See limitations below
   
DATE:    Fri, Dec 11, 2015            VER: 1.0
AUTHOR: JMichaelTX (on MacScripter.net and StackOverflow.com forums)

LIMITATIONS:
   • I don't believe the parsing of the CSV data here conforms to the RFC 4180 de facto standard
   • It doesn't allow for clear separation of strings from numbers
   • If you need a more full-featured, compliant CSV parser, then see the below ref.

REF:    For a more robust, extensive process, see:
           CSV-to-list converter by Nigel Garvey, 2010-03-13
           http://macscripter.net/viewtopic.php?pid=125444#p125444
           
SAMPLE INPUT DATA (from CSV file):
   Parent,Tag,Num Notes            <=== first line/row is column titles
   !SYMBOLS,SYM.ES,TBD
   ZIP_List,ZIP.77077,TBD
   FINANCE,FIN.Call,TBD
   Evernote,EN.UI,TBD
   HISTORY,HiS.NatlSec,TBD
   . . .

SAMPLE OUTPUT DATA (log)
   (*Number of Rows: 102*)

   (*!SYMBOLS, SYM.ES, TBD*)
   (*ZIP_List, ZIP.77077, TBD*)
   (*FINANCE, FIN.Call, TBD*)
   (*Evernote, EN.UI, TBD*)
   (*HISTORY, HiS.NatlSec, TBD*)
   . . .
====================================================
*)
--- GET THE CSV FILE YOU WANT TO READ ---
set pathInputFile to (choose file with prompt "Select the CSV file" of type "csv")

--- READ THE FILE CONTENTS ---
set strFileContents to read pathInputFile

--- BREAK THE FILE INTO PARAGRAPHS (i.e., ROWS or LINES) ---
--        (AS Paragraphs are separated by LF or CR)

set parFileContents to (paragraphs of strFileContents)
set numRows to count of parFileContents
log "Number of Rows: " & numRows
--log parFileContents

--- PROCESS EACH PARAGRAPH (AKA LINE or ROW) OF INPUT FILE ---

--–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
repeat with iPar from 2 to number of items in parFileContents
   --––––––––––––––––––––––––––––––––––––––––––––––––––––––––
   --        Skip first row since it has column titles, data starts in 2nd row
   
   set lstRow to item iPar of parFileContents
   if lstRow = "" then exit repeat -- EXIT Loop if Row is empty, like the last line
   
   set lstFieldsinRow to parseCSV(lstRow as text)
   
   --- THE FOLLOWING LINES ARE SPECIFIC TO YOUR FILE/DATA ---
   
   set strParTag to item 1 of lstFieldsinRow -- COL 1 of CSV file
   set strTag to item 2 of lstFieldsinRow -- COL 2 of CSV file
   set numNotes to item 3 of lstFieldsinRow -- COL 3 of CSV file
   
   log lstFieldsinRow
   --log "[" & (iPar - 1) & "]: " & strTag
   --–––––––––––––––––––––––––––––––––––––––––––––––––––
   
   --––––––––––––––––––    
end repeat -- with iPar
--–––––––––––––––––––––––––––––––––––––––––––––––––––––––––

--=============== END OF MAIN SCRIPT ==============

on parseCSV(pstrRowText)
   set {od, my text item delimiters} to {my text item delimiters, ","}
   set parsedText to text items of pstrRowText
   set my text item delimiters to od
   return parsedText
end parseCSV
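
As the LIMITATIONS note says, splitting on every comma breaks a row as soon as a field is wrapped in double quotes and contains a comma. A minimal sketch of a quote-aware row splitter in plain AppleScript follows. The handler name parseQuotedCSVRow is hypothetical (it is not part of the script above), and it still works on one line at a time, so quoted fields that span several lines are out of scope.

```applescript
-- Hypothetical helper, not part of the original script:
-- split one CSV row, keeping commas that sit inside double-quoted fields.
on parseQuotedCSVRow(rowText)
   set fieldList to {}
   set currentField to ""
   set inQuotes to false
   set charCount to count of characters of rowText
   set i to 1
   repeat while i <= charCount
      set c to character i of rowText
      if inQuotes then
         if c is quote then
            -- A doubled quote inside a quoted field stands for one literal quote
            if i < charCount and character (i + 1) of rowText is quote then
               set currentField to currentField & quote
               set i to i + 1
            else
               set inQuotes to false -- closing quote of the field
            end if
         else
            set currentField to currentField & c
         end if
      else if c is quote then
         set inQuotes to true -- opening quote of a field
      else if c is "," then
         set end of fieldList to currentField
         set currentField to ""
      else
         set currentField to currentField & c
      end if
      set i to i + 1
   end repeat
   set end of fieldList to currentField -- last field has no trailing comma
   return fieldList
end parseQuotedCSVRow

-- Intended to give three fields, the second keeping its comma
parseQuotedCSVRow("FINANCE,\"Call, urgent\",TBD")
```

Walking the row character by character is slower than text item delimiters, so for large files without quoted fields the original parseCSV handler is probably the better fit.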


iMac-27 Late 2015 Retina 5K Screen (& others)
macOS 10.11.6 (El Capitan)


Filed under: parse, csv, import, Read file

Offline

 

#2 2015-12-11 02:43:42 am

alastor933
Member
From:: Utrecht, NL
Registered: 2008-09-12
Posts: 544

Re: How to Read CSV File and Parse into AppleScript List

This routine will convert csv data to a list of lists: each sublist contains one line of the input file, each sublist item is a field from the input.
I've used it for years to deal with the transactions files from my bank.
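
A minimal sketch of the kind of routine described here (a list of lists, one sublist per input line), written in plain AppleScript. It is not the poster's own code; the handler name csvToListOfLists is hypothetical, and it assumes simple comma-separated data with no quoted fields.

```applescript
-- Hypothetical handler: a sketch, not alastor933's routine.
on csvToListOfLists(csvText)
   set rowList to {}
   set {od, my text item delimiters} to {my text item delimiters, ","}
   repeat with aLine in paragraphs of csvText
      set lineText to contents of aLine
      -- Skip blank lines, such as the one after a trailing newline
      if lineText is not "" then set end of rowList to text items of lineText
   end repeat
   set my text item delimiters to od
   return rowList
end csvToListOfLists

csvToListOfLists("Parent,Tag,Num Notes" & linefeed & "FINANCE,FIN.Call,TBD" & linefeed)
```

Each sublist can then be addressed as, for example, item 2 of item 1 of the result.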

Offline

 

#3 2015-12-13 01:35:20 pm

ccstone
Member
Registered: 2009-02-07
Posts: 409

Re: How to Read CSV File and Parse into AppleScript List

I think Shane has posted an ASObjC method for handling CSV data that intelligently handles commas within quoted field values.

But for simple field separation into a list of lists I'd use the Satimage.osax.  It boils down to a couple of lines of code.

Applescript:


set cvsData to text 2 thru -1 of "
Parent,Tag,Num Notes
!SYMBOLS,SYM.ES,TBD
ZIP_List,ZIP.77077,TBD
FINANCE,FIN.Call,TBD
Evernote,EN.UI,TBD
HISTORY,HiS.NatlSec,TBD
"


# Grab each line containing non-whitespace characters into a list:
set cvsDataList to find text "^.*\\S.*" in cvsData with regexp, all occurrences and string result

# Split each list item into fields creating a list of lists:
set cvsDataList to find text "[^,]+" in cvsDataList with regexp, all occurrences and string result
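
One thing to watch with a "[^,]+" pattern: it only matches runs of one or more non-comma characters, so an empty field (the middle one in a row like FINANCE,,TBD) produces no match and the later columns shift left. A small sketch in plain AppleScript, with no osax assumed, showing that text item delimiters keep the empty field in place:

```applescript
-- Split on commas with text item delimiters; empty fields are kept as ""
set {od, AppleScript's text item delimiters} to {AppleScript's text item delimiters, ","}
set fieldList to text items of "FINANCE,,TBD"
set AppleScript's text item delimiters to od

count of fieldList -- three items, the second being an empty string
```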

--
Chris
_________________________________________________________
{ MacBookPro6,1 · 2.66 GHz Intel Core i7 · 8GB RAM · OSX 10.11.2 }
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Last edited by ccstone (2015-12-13 01:36:13 pm)

Offline

 

#4 2015-12-14 12:23:34 am

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

Re: How to Read CSV File and Parse into AppleScript List

ccstone wrote:

But for simple field separation into a list of lists I'd use the Satimage.osax.  It boils down to a couple of lines of code.


Thanks Chris.  As always I appreciate your excellent solutions.

For the benefit of all of us (especially me), would you mind commenting on the advantages of your solution vs the one I posted?

Always looking to learn.

Best Regards,
JMichaelTX


iMac-27 Late 2015 Retina 5K Screen (& others)
macOS 10.11.6 (El Capitan)

Offline

 

#5 2016-01-11 11:09:02 am

DJ Bazzie Wazzie
Member
From:: the Netherlands
Registered: 2004-10-20
Posts: 2728

Re: How to Read CSV File and Parse into AppleScript List

Since we're sharing osaxen solutions here: I finished and released the AppleScript Toolbox 2.0 osax today. I held off replying until then, so I could show how that osax can also be used to parse CSV data easily, for both quoted and unquoted fields.

When you have simple fields as in ccstone's example with no quoted fields:

Applescript:

set csvData to "Parent,Tag,Num Notes
!SYMBOLS,SYM.ES,TBD
ZIP_List,ZIP.77077,TBD
FINANCE,FIN.Call,TBD
Evernote,EN.UI,TBD
HISTORY,HiS.NatlSec,TBD"



set csvDataList to AST find regex "(?:" & linefeed & "?)([^" & linefeed & "]*)" in string csvData regex group 2
set csvDataList to AST find regex "(?:,?)([^,]*)" in string csvDataList regex group 2

When you have mixed quoted and unquoted fields:

Applescript:

set csvData to "Parent,Tag,Num Notes
!SYMBOLS,SYM.ES,TBD,\"Just an example
with multiline comments, field separators
and \"\"escpaped\"\" characters\"
ZIP_List,ZIP.77077,TBD
FINANCE,FIN.Call,TBD
Evernote,EN.UI,TBD
HISTORY,HiS.NatlSec,TBD"


set csvDataList to AST find regex "([" & linefeed & "])?((?:[^" & linefeed & "\"]*(?:(?:\"([^\"]*|\"\")*\"))*)*)" in string csvData regex group 3
set csvDataList to AST find regex "(,)?((?:\"((?:[^\"]|\"\")*)\")|([^,]*))" in string csvDataList regex group 3

Note: the field itself is not unquoted.

Last edited by DJ Bazzie Wazzie (2016-01-11 11:09:17 am)

Offline

 

#6 2018-01-07 09:05:30 pm

dlaurentny
Member
Registered: 2014-07-12
Posts: 24

Re: How to Read CSV File and Parse into AppleScript List

Thanks much!!  I'm creating CSS/HTML timelines and use a script to read a CSV file and then generate the HTML code.  I used your code almost verbatim.

Applescript:


-- Thank you AUTHOR: JMichaelTX 2015 (on MacScripter.net and StackOverflow.com forums) for sample scripts used here --

set strOutput to "" as string

-- Choose and read contents of CSV file

set pathInputFile to (choose file with prompt "Select the CSV file" of type "csv")
set strFileContents to read pathInputFile

set parFileContents to (paragraphs of strFileContents)
set numRows to count of parFileContents
log "Number of Rows: " & numRows

-- Parse CSV Row into 'Columns'

repeat with iPar from 2 to number of items in parFileContents -- set to 1 if no header row
   set lstRow to item iPar of parFileContents
   if lstRow = "" then exit repeat -- EXIT Loop if Row is empty, like the last line
   
   set lstFieldsinRow to parseCSV(lstRow as text)
   
   set strYear to item 1 of lstFieldsinRow -- COL 1 of CSV file
   set strTitle to item 2 of lstFieldsinRow -- COL 2 of CSV file
   set strDesc to item 3 of lstFieldsinRow -- COL 3 of CSV file
   
   log lstFieldsinRow
   --log "[" & (iPar - 1) & "]: " & strTitle
   
   
   -- Write content of Row to strOutput
   -- Use [\"] for Quotes inside output string
   
   set strOutput to strOutput & "<div class=\"column\">" & return & "<div class=\"title\">" & return & "<h1> " & strYear & "</h1>" & return & "<h2> " & strTitle & "</h2>" & return & "</div>" & return & "<div class=\"description\">" & return & "<p>" & strDesc & "</p>" & return & "</div>" & return & "</div>" & return & return
   
   
   log strOutput
   
   -- Loop to EOF or Exit
end repeat -- with iPar


-- Set Output File
set outputFile to ((path to desktop as text) & "TimeLineOutput_html.txt")

-- Write Body to File
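try
   set fileReference to open for access file outputFile with write permission
   set eof of fileReference to 0 -- clear old contents so a shorter write leaves no leftover text
   write strOutput to fileReference
   close access fileReference
on error
   -- Make sure the file is not left open if the write fails
   try
       close access file outputFile
   end try
end try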


-- Functions

on parseCSV(pstrRowText)
   set {od, my text item delimiters} to {my text item delimiters, ","}
   set parsedText to text items of pstrRowText
   set my text item delimiters to od
   return parsedText
end parseCSV



-- On errors --
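
Because the field values are pasted straight into HTML, a title or description containing &, < or > could break the generated markup. A minimal sketch of an escaping step, assuming two hypothetical handlers (replaceText and escapeHTML, neither of which is in the script above) built on the same text item delimiters technique as parseCSV:

```applescript
-- Hypothetical helper: replace every occurrence of findText in sourceText
on replaceText(findText, replaceWith, sourceText)
   set {od, my text item delimiters} to {my text item delimiters, findText}
   set textParts to text items of sourceText
   set my text item delimiters to replaceWith
   set resultText to textParts as text
   set my text item delimiters to od
   return resultText
end replaceText

-- Hypothetical helper: escape the three characters that matter inside HTML text
on escapeHTML(rawText)
   -- The ampersand goes first so the entities added below are not escaped again
   set escapedText to replaceText("&", "&amp;", rawText)
   set escapedText to replaceText("<", "&lt;", escapedText)
   set escapedText to replaceText(">", "&gt;", escapedText)
   return escapedText
end escapeHTML

escapeHTML("R&D <beta>")
```

In the loop, each field could be passed through escapeHTML before it is appended to strOutput, for example escapeHTML(strDesc).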



Filed under: HTML, TXT, csv, import

Offline

 
