You are not logged in.
I'm excited about using sqlite3 to solve some problems. I want to make a droplet that will take a CSV file and create a database.
I've used these handlers that were kindly uploaded and where I need some advice.
http://macscripter.net/viewtopic.php?id=27928
Here's a representative string of a row I want to insert into a table:
{"caiv2,2010BBDGRC,\"President, Board of Directors\""}
I'd like to end up with ('caiv2','2010BBDGRC',\"President, Board of Directors\") to insert.
You can see that I need help with how to handle the escaped field value.
But before that, I need to get the values ready to work. The handlers seem to work great except for the values. I can say that I'm able to create the db, tables and columns, but I'm breaking on the values. Reading the comments, this ought to work but it's breaking as below:
Applescript:
set a to {"caiv2,2010BBDGRC,\"President, Board of Directors\""}
set b to my tidStuff(quote, a) --> {"caiv2,2010BBDGRC,\"President, Board of Directors\""} Doesn't help me.
set c to my returnCommaSepQuotedString(a, quote) --> "\"caiv2,2010BBDGRC,\"\"President, Board of Directors\"\"\"" --as you can see it doesn't break up into single quotes
set d to my returnCommaSepQuotedStringEntities(a) --> "\"\"\"caiv2,2010BBDGRC,\"\"\"\"President, Board of Directors\"\"\"\"\"\"\"" --This was just to try it.
return c
-----
-- TURN STRING INTO LIST
on tidStuff(paramHere, textHere)
set OLDtid to AppleScript's text item delimiters
set AppleScript's text item delimiters to paramHere
set theItems to text items of textHere
set AppleScript's text item delimiters to OLDtid
return theItems
end tidStuff
--for colums
on returnCommaSepQuotedStringEntities(the_array) -- use for column names (eg in create_db), table names, not values
set quoteChar to quote -- AppleScript's reserved word for a double quote: "
return returnCommaSepQuotedString(the_array, quoteChar)
end returnCommaSepQuotedStringEntities
on returnCommaSepQuotedStringValues(the_array) -- use for values in rows (eg in sql_insert), not for column names
set quoteChar to "'" -- single quote
returnCommaSepQuotedString(the_array, quoteChar)
end returnCommaSepQuotedStringValues
on returnCommaSepQuotedString(the_array, quoteChar)
set oldDelimiters to AppleScript's text item delimiters
-- Find any text values, quote them and escape any quote characters with quotes:
repeat with item_ref in the_array
set item_value to contents of item_ref
if class of item_value is text then
if item_value contains quoteChar then
set AppleScript's text item delimiters to quoteChar
set parsedList to text items in item_value
set AppleScript's text item delimiters to (quoteChar & quoteChar)
set item_value to parsedList as text
end if
set item_value to (quoteChar & item_value & quoteChar) as text
set contents of item_ref to item_value
end if
end repeat
-- Join all the values together as a string, separated by commas:
set AppleScript's text item delimiters to ", "
set return_string to the_array as text
set AppleScript's text item delimiters to oldDelimiters
return return_string
end returnCommaSepQuotedString
The handler's see the one string as one field so it's breaking down.
How do I use it, or something like to break down the one text string into values, while keeping the escaped value because I really need that comma as part of the customer's job title.
What am I overlooking? I've beaten myself up trying to see past this, thanx, sam
Offline
You may find something useful here
Offline
Thanks, Craig, but this was helpful, but not for this problem.
If I have a list of CSV as in the above, how would I extract out the actual values without using comma as a text item delimiter?
I cannot use a comma because it would rip this value:
\"President, Board of Directors\"
How should I parse out:
{"caiv2,2010BBDGRC,\"President, Board of Directors\""}
to get:
{'caiv2','2010BBDGRC',\"President, Board of Directors\"}
thanx, sam
Offline
Get all of the quoted elements first and then get the comma-sep elements then put them back together.
Offline
If you're not to fussy about retaining commas in the third element:
Applescript:
set tStart to {"caiv2,2010BBDGRC,\"President, Board of Directors\""}
-- note that this is a list of one item; a string.
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to ","
set tParts to text items of (tStart as text)
set AppleScript's text item delimiters to tid
-- assuming that items 1 and 2 never have commas
-- and that you don't mind losing the comma in the
-- title:
tell tParts
set CSV_1 to item 1 --> "caiv2"
set CSV_2 to item 2 --> "2010BBDGRC"
set CSV_3 to items 3 thru -1 as text --> "\"President Board of Directors\""
end tell
To get fancier than that, you'd have to know more about that third part.
Offline
I wrote a CSV-to-list converter a few years ago. I don't know if it's what you want.
Applescript:
(* Assumes that the CSV text adheres to the convention:
Records are delimited by LFs or CRLFs (but CRs are also allowed here).
The last record in the text may or may not be followed by an LF or CRLF (or CR).
Fields in the same record are separated by commas (unless specified differently by parameter).
The last field in a record must not be followed by a comma.
Trailing or leading spaces in unquoted fields are not ignored (unless so specified by parameter).
Fields containing quoted text are quoted in their entirety, any space outside them being ignored.
Fields enclosed in double-quotes are to be taken verbatim, except for any included double-quote pairs, which are to be translated as double-quote characters.
No other variations are currently supported. *)
on csvToList(csvText, implementation)
-- The 'implementation' parameter must be a record. Leave it empty ({}) for the default assumptions: ie. comma separator, leading and trailing spaces in unquoted fields not to be trimmed. Otherwise it can have a 'separator' property with a text value (eg. {separator:tab}) and/or a 'trimming' property with a boolean value ({trimming:true}).
set {separator:separator, trimming:trimming} to (implementation & {separator:",", trimming:false})
script o -- Lists for fast access.
property qdti : getTextItems(csvText, "\"")
property currentRecord : {}
property possibleFields : missing value
property recordList : {}
end script
-- o's qdti is a list of the CSV's text items, as delimited by double-quotes.
-- Assuming the convention mentioned above, the number of items is always odd.
-- Even-numbered items (if any) are quoted field values and don't need parsing.
-- Odd-numbered items are everything else. Empty strings in odd-numbered slots
-- (except at the beginning and end) indicate escaped quotes in quoted fields.
set astid to AppleScript's text item delimiters
set qdtiCount to (count o's qdti)
set quoteInProgress to false
considering case
repeat with i from 1 to qdtiCount by 2 -- Parse odd-numbered items only.
set thisBit to item i of o's qdti
if ((count thisBit) > 0) or (i is qdtiCount) then
-- This is either a non-empty string or the last item in the list, so it doesn't
-- represent a quoted quote. Check if we've just been dealing with any.
if (quoteInProgress) then
-- All the parts of a quoted field containing quoted quotes have now been
-- passed over. Coerce them together using a quote delimiter.
set AppleScript's text item delimiters to "\""
set thisField to (items a thru (i - 1) of o's qdti) as string
-- Replace the reconstituted quoted quotes with literal quotes.
set AppleScript's text item delimiters to "\"\""
set thisField to thisField's text items
set AppleScript's text item delimiters to "\""
-- Store the field in the "current record" list and cancel the "quote in progress" flag.
set end of o's currentRecord to thisField as string
set quoteInProgress to false
else if (i > 1) then
-- The preceding, even-numbered item is a complete quoted field. Store it.
set end of o's currentRecord to item (i - 1) of o's qdti
end if
-- Now parse this item's field-separator-delimited text items, which are either non-quoted fields or stumps from the removal of quoted fields. Any that contain line breaks must be further split to end one record and start another. These could include multiple single-field records without field separators.
set o's possibleFields to getTextItems(thisBit, separator)
set possibleFieldCount to (count o's possibleFields)
repeat with j from 1 to possibleFieldCount
set thisField to item j of o's possibleFields
if ((count thisField each paragraph) > 1) then
-- This "field" contains one or more line endings. Split it at those points.
set theseFields to thisField's paragraphs
-- With each of these end-of-record fields except the last, complete the field list for the current record and initialise another. Omit the first "field" if it's just the stub from a preceding quoted field.
repeat with k from 1 to (count theseFields) - 1
set thisField to item k of theseFields
if ((k > 1) or (j > 1) or (i is 1) or ((count trim(thisField, true)) > 0)) then set end of o's currentRecord to trim(thisField, trimming)
set end of o's recordList to o's currentRecord
set o's currentRecord to {}
end repeat
-- With the last end-of-record "field", just complete the current field list if the field's not the stub from a following quoted field.
set thisField to end of theseFields
if ((j < possibleFieldCount) or ((count thisField) > 0)) then set end of o's currentRecord to trim(thisField, trimming)
else
-- This is a "field" not containing a line break. Insert it into the current field list if it's not just a stub from a preceding or following quoted field.
if (((j > 1) and ((j < possibleFieldCount) or (i is qdtiCount))) or ((j is 1) and (i is 1)) or ((count trim(thisField, true)) > 0)) then set end of o's currentRecord to trim(thisField, trimming)
end if
end repeat
-- Otherwise, this item IS an empty text representing a quoted quote.
else if (quoteInProgress) then
-- It's another quote in a field already identified as having one. Do nothing for now.
else if (i > 1) then
-- It's the first quoted quote in a quoted field. Note the index of the
-- preceding even-numbered item (the first part of the field) and flag "quote in
-- progress" so that the repeat idles past the remaining part(s) of the field.
set a to i - 1
set quoteInProgress to true
end if
end repeat
end considering
-- At the end of the repeat, store any remaining "current record".
if (o's currentRecord is not {}) then set end of o's recordList to o's currentRecord
set AppleScript's text item delimiters to astid
return o's recordList
end csvToList
-- Get the possibly more than 4000 text items from a text.
on getTextItems(txt, delim)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to delim
set tiCount to (count txt's text items)
set textItems to {}
repeat with i from 1 to tiCount by 4000
set j to i + 3999
if (j > tiCount) then set j to tiCount
set textItems to textItems & text items i thru j of txt
end repeat
set AppleScript's text item delimiters to astid
return textItems
end getTextItems
-- Trim any leading or trailing spaces from a string.
on trim(txt, trimming)
if (trimming) then
repeat with i from 1 to (count txt) - 1
if (txt begins with space) then
set txt to text 2 thru -1 of txt
else
exit repeat
end if
end repeat
repeat with i from 1 to (count txt) - 1
if (txt ends with space) then
set txt to text 1 thru -2 of txt
else
exit repeat
end if
end repeat
if (txt is space) then set txt to ""
end if
return txt
end trim
set csvText to "caiv2,2010BBDGRC,\"President, Board of Directors\"" & linefeed & "Another line, for demo purposes"
csvToList(csvText, {})
--> {{"caiv2", "2010BBDGRC", "President, Board of Directors"}, {"Another line", "for demo purposes"}}
Edit: It's been pointed out to me that the original script omitted empty fields. That's now corrected and the trim() hander's a trifle faster than it was. The script also now conforms to RFC 4180, but has options to trim leading and trailing spaces from unquoted fields — csvToList(csvText, {trimming:true}) — and/or to recognise a different field separator — eg. csvToList(csvText, {separator:tab, trimming:true}). An empty record gets the defaults: unquoted fields not trimmed; comma separator. Returns are now tolerated as record separators.
Last edited by Nigel Garvey (2012-03-29 11:51:52 am)
Offline
Yes, yes, thank you for seeing what I was trying to communicate. This is an excellent gift to the community. I've learned a big deal going through it.
Offline