In need of help with tokenizing

If anybody would help me out with this, that would be totally awesome…

I am trying to analyze a small block of text in a plain text file (about 2500 bytes), and remove any unnecessary returns. I then need the modified text to be written into a new text file.

I figured that the best way to do this was with ACME’s tokenize command. I was thinking that if I could get the text to tokenize by returns, I could then remove any blank returns. I only need one return between the lines, not multiple.

I can make the text into return-delimited tokens, but I cannot remove/delete the excess returns. I’m sure it can be done, I just don’t know how.

The source text kinda looks like this:
"blah blah blah blah blah blah blah blah
blah blah blah blah blah blah blah blah

blah blah blah blah blah blah blah blah
blah blah blah blah blah blah blah blah

blah blah blah blah blah blah blah blah

blah blah blah blah blah blah blah blah"

Seriously, if anyone could help, I would be very grateful!

You could use two returns as the delimiter, break the string into a list, change the delimiter to one return, and finally convert the list back to a string.
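Roughly, those steps might look like the sketch below in vanilla AppleScript. The sample text and the variable names (sourceText, theChunks, savedDelimiters) are made up for illustration; in practice sourceText would hold the contents of your file. One pass only collapses pairs of returns, so the sketch repeats until no doubled returns are left.

```applescript
-- Hypothetical sample text standing in for the contents of the file.
set sourceText to "line one" & return & "line two" & return & return & return & "line three"

-- Remember the current delimiters so they can be restored afterwards.
set savedDelimiters to AppleScript's text item delimiters

-- Each pass splits on two returns and rejoins with one,
-- so keep going until no doubled returns remain.
repeat while sourceText contains (return & return)
	set AppleScript's text item delimiters to return & return
	set theChunks to text items of sourceText
	set AppleScript's text item delimiters to return
	set sourceText to theChunks as string
end repeat

set AppleScript's text item delimiters to savedDelimiters

sourceText -- the text with single returns between lines
```

Restoring the delimiters at the end matters, because the setting otherwise stays in effect for the rest of the script.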

Hope this helps.

Using vanilla AppleScript:

Jon

Cool, look at the pretty colors! I downloaded it, but it doesn’t work with Script Debugger 3. Do you know if there is another version out there for third-party editors?

Here is a short but sweet, and completely non-vanilla, solution to the original post using Akua List Suite (I have v106):

set someFile to read "Mac HD:Desktop Folder:somefile" as list using delimiter return --get the file contents as a return-delimited list
set newList to collect items of someFile that match "" with negation and just contents --now get only the items of that list that do not match the blank lines

That will return every item of the original return-delimited list that does not match “”. If the blank lines contain an invisible character, you will have to match that character in place of “”.

I couldn’t get the vanilla solution to work, but I must confess I didn’t play with it much.

Best,

Here is a slightly modified version of the script to test for different line endings (Mac, Unix, etc.):

I don’t have Script Debugger, so I can’t rewrite the Markup Code script for it, sorry.

Jon

Hey, Mytzlscript, thanks for the help! I found this to be the simpler of the two scripts posted here… I seem to be having another slight issue, though (I just don’t quite understand how delimiters work yet, sorry).

I would like to replace the empty return fields with a tab character, if possible. I thought I could figure it out after getting some help with tokenizing/delimiting, but like I said, I still need to learn a little more about delimiting first.

Any ideas?

Thanks a bunch again, and thanks in advance for any help with this issue!

Just use my script above and change the line

to

Jon

If you want to take a list like:

and turn it into a list like:

Then this should work for you. I omitted the collect items of command; it isn’t needed in this case.

set someFile to read "Macintosh HD:Desktop Folder:spam copy" as list using delimiter return --get the file contents as a return-delimited list
set compiledList to "" --start with an empty string that we can fill with our new tab-delimited text
repeat with thisLine in someFile --repeat with every item in the list of items read
	set thisLine to thisLine as string --coerce to string: the loop variable is a reference to the list item, so without this the test below does not catch the empty lines
	if thisLine does not equal "" then --if it is not an empty line
		set compiledList to compiledList & thisLine & tab --then add it and a tab to the end of our compiled text
	end if
end repeat

You could also try reading the text file as text instead of as a list and using ACME to replace the carriage returns with tabs.
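If you would rather not depend on an addition for that, here is a rough vanilla sketch of the same idea: treat the text as paragraphs, drop the empty ones, and join what is left with tabs. The sample text and the names (sourceText, keptLines, tabbedText) are invented for the example, and this is only one way to do it.

```applescript
-- Hypothetical sample text standing in for the file contents read as text.
set sourceText to "first line" & return & return & "second line" & return & "third line"

-- Keep only the non-empty paragraphs.
set keptLines to {}
repeat with thisLine in paragraphs of sourceText
	-- "contents of" dereferences the loop variable before comparing or storing it.
	if contents of thisLine is not "" then set end of keptLines to contents of thisLine
end repeat

-- Join the kept lines with tabs, then restore the previous delimiters.
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab
set tabbedText to keptLines as string
set AppleScript's text item delimiters to savedDelimiters

tabbedText -- the non-empty lines separated by single tabs
```

Joining with a delimiter also avoids the trailing tab you get when a tab is appended after every line inside the loop.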

Hope this helps.