How would you go about doing this? I can use BBEdit; in fact, I would prefer that.
I tried to “normalize line endings” but I just get an error on that.
thanks
mm
Try something like this:
-- Example
choose file without invisibles
set test to read result
set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to return -- ASCII character 13
set test to "" & paragraphs of test
set AppleScript's text item delimiters to ASTID
test
Bruce, thanks for that. Unfortunately, it is going to take way too long: I have an 18 MB text file to process, so I need something a little more robust. BBEdit does a great job at this kind of stuff. I am already doing some string substitutions with it, and they are light years faster than the alternative AppleScript solution, so I’m hoping to find a way to do this using BBEdit.
Again, thank you.
mm
I disagree that an AppleScript approach will take much longer than having BBEdit do this. The idea here is to read the text into a script variable (so it’s held in memory) and then act on that.
script T
property myText : missing value
end script
choose file without invisibles
set T's myText to read result
set CR to ASCII character 13
set CRLF to CR & (ASCII character 10)
set myText to findAndReplace(CRLF, CR, T's myText)
-- Nigel Garvey's find/replace handler
-- returns class of original.
on findAndReplace(toFind, replaceWith, theText)
set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to toFind
set textItems to theText's text items
set AppleScript's text item delimiters to replaceWith
tell textItems to set editedText to beginning & replaceWith & rest
set AppleScript's text item delimiters to ASTID
return editedText
end findAndReplace
I got an out of memory error trying to run that. I have 2 GB of RAM on my machine, so I’m not sure how that works…
mm
OK, I figured out how to do it in BBEdit. As long as you have your default line breaks set up as “Mac (CR)”, you can do this:
set filePath to choose file
tell application "BBEdit"
open filePath with LF translation
-- do stuff
end tell
At least I think this is working; I’m not sure how to tell.
mm
Turn on “Show Invisibles” in the Edit>Text Options sheet.
correction
tell application "BBEdit"
tell text document 1
set line breaks to Mac
end tell
end tell
Reading and writing on the fly with file reference numbers is pretty quick.
set f to choose file
set lf to (ASCII character 10)
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
repeat
set t to read ref1 before lf
write t to ref2
end repeat
on error
close access ref1
close access ref2
end try
This avoids out of memory errors.
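For comparison, here is the same record-at-a-time idea as a minimal Ruby sketch. It is not part of the script above: the name `strip_linefeeds` is made up for illustration, in-memory streams stand in for the two files, and `delete_suffix` assumes Ruby 2.5 or later.

```ruby
require "stringio"

# Copy `input` to `output`, dropping the LF that ends each record.
# A DOS line "abc\r\n" is written as "abc\r"; nothing else is touched.
# (Hypothetical helper, shown only to illustrate the technique.)
def strip_linefeeds(input, output)
  input.each_line("\n") do |record|
    # delete_suffix removes only the LF; String#chomp would also eat the CR.
    output.write(record.delete_suffix("\n"))
  end
end

# In-memory streams stand in for the source and destination files.
source = StringIO.new("one\r\ntwo\r\nthree")
target = StringIO.new
strip_linefeeds(source, target)
p target.string
```

Only one record is held in memory at a time, which is why this style does not run out of memory on large files.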
gl,
Kel,
That works great even on the 18 MB text file it took two seconds if that…
I’m unclear about how it knows what it is repeating with, and I’m not sure how I would incorporate a find and replace at the same time. It seems AppleScript may do that faster if it is done this way.
mm
Hi,
Come to think of it, if I’m remembering right, reading in blocks was way faster, like in 10 MB blocks. Nigel Garvey posted some quick scripts that avoid out of memory errors. I have to do some review.
gl,
Something like this is way faster, I think. It reads in 1 KB blocks instead of paragraphs. You could still get stack overflow errors if the paragraphs are very small.
set f to choose file
set lf to (ASCII character 10)
set dos_ending to return & lf
set block_size to 1024 -- 1KB
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
repeat
set t to read ref1 for block_size
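repeat while t ends with return -- a CRLF pair may straddle the block boundary
try
set next_char to read ref1 for 1
on error
exit repeat -- end of file
end try
if next_char is lf then exit repeat -- drop the LF that belongs to the trailing CR
set t to t & next_char -- any other byte is kept
end repeat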
set t to ReplaceText(t, dos_ending, return)
write t to ref2
end repeat
on error err_msg
close access ref1
close access ref2
end try
--
-- searches text t for string s and replaces with string r
on ReplaceText(t, s, r)
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {s}
set temp_list to text items of t
set AppleScript's text item delimiters to {r}
set temp_text to temp_list as string
set AppleScript's text item delimiters to utid
return temp_text
end ReplaceText
In the other script I posted, it reads a dos paragraph before the linefeed and writes that to the new file. When the byte marker is at the end of file, it errors when read and closes the file. There is no replacing needed because it’s not reading the linefeeds. This script, on the other hand, does a search and replace of the blocks read. If it’s still too slow, you might be able to speed it up by using a reference to the list in the ReplaceText subroutine.
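The block-reading approach can also be sketched in Ruby, mainly to show the one tricky part: a CRLF pair that straddles two blocks. This is an illustration only, not a translation of the script above; the name `crlf_to_cr_in_blocks` and the in-memory streams are assumptions.

```ruby
require "stringio"

CR = "\r"
LF = "\n"

# Convert CRLF to CR while reading `input` in fixed-size blocks.
# (Hypothetical helper, shown only to illustrate the technique.)
def crlf_to_cr_in_blocks(input, output, block_size = 1024)
  while (block = input.read(block_size))
    # If the block ends with CR, its LF may be the first byte of the
    # next block, so look ahead one byte at a time until that is settled.
    while block.end_with?(CR)
      nxt = input.read(1)
      break if nxt.nil? || nxt == LF # end of input, or the LF to drop
      block << nxt                   # any other byte is kept
    end
    output.write(block.gsub(CR + LF, CR))
  end
end

# A tiny block size forces the CRLF pairs onto block boundaries.
source = StringIO.new("ab\r\ncd\r\nef")
target = StringIO.new
crlf_to_cr_in_blocks(source, target, 3)
p target.string
```

Checking the look-ahead byte before discarding it means a lone CR at a block boundary does not cost the character that follows it.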
gl,
Kel, this is good stuff, thank you!
I adapted the script to what I am working on. It returns a time of 29 seconds, which is much better than previous attempts.
set f to choose file
set FRPrefs to choose file
set FRPrefs to read FRPrefs as text
set StartTime to current date
set LF to (ASCII character 10)
set dos_ending to return & LF
set block_size to 1024 -- 1KB
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
repeat
set T to read ref1 for block_size
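repeat while T ends with return -- a CRLF pair may straddle the block boundary
try
set next_char to read ref1 for 1
on error
exit repeat -- end of file
end try
if next_char is LF then exit repeat -- drop the LF that belongs to the trailing CR
set T to T & next_char -- any other byte is kept
end repeat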
set T to replacetext(T, dos_ending, return)
repeat with i from 1 to count of paragraphs in FRPrefs
set apara to paragraph i of FRPrefs
set apara to my makeList(apara, "|")
set fText to item 1 of apara
set rText to item 2 of apara
set T to replacetext(T, fText, rText)
end repeat
write T to ref2
end repeat
on error err_msg
close access ref1
close access ref2
end try
set EndTime to current date
set SubRoutineTimeDuration to (EndTime - StartTime)
display dialog SubRoutineTimeDuration
--
-- searches text t for string s and replaces with string r
on replacetext(T, s, r)
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {s}
set temp_list to text items of T
set AppleScript's text item delimiters to {r}
set temp_text to temp_list as string
set AppleScript's text item delimiters to utid
return temp_text
end replacetext
on makeList(astring, asep)
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to asep
set thelist to {}
repeat with i from 1 to count of text items in astring
copy text item i of astring to end of thelist
end repeat
set AppleScript's text item delimiters to tid
return thelist
end makeList
Now here is the BBEdit version; it returns in 18 seconds.
-- This is a simple translation droplet using BBEdit's replace feature.
on open fileList
set StartTime to current date
set FilesToProcess to {}
-- extract the prefs file from the group of files
tell application "Finder"
repeat with aFile in fileList
if name of aFile = "FRPrefs.txt" then
set FRPrefs to aFile
else
copy aFile to end of FilesToProcess
end if
end repeat
end tell
repeat with aFile in FilesToProcess
set filePath to (aFile as Unicode text)
-- use text item delimiters to parse path
set AppleScript's text item delimiters to ":"
-- get path to the folder file is in
set folderPath to (text 1 thru text item -2 of filePath) & ":"
-- get file's name
set fileName to text item -1 of filePath
set AppleScript's text item delimiters to "."
-- remove extension from name
try -- next line will error if there is no extension
set fileNameStub to (text 1 thru text item -2 of fileName)
set theExtension to "." & text item -1 of fileName
on error -- there is no extension
set fileNameStub to fileName -- without this, fileNameStub is undefined below
set theExtension to ""
end try
set AppleScript's text item delimiters to ""
set translationFilePath to folderPath & fileNameStub & "_converted" & theExtension
-- read the prefs into their own variable so FRPrefs still refers to the file on the next pass
set prefsText to read FRPrefs as text
--open the file
tell application "BBEdit"
activate
open alias filePath
tell text document 1
set line breaks to Mac
end tell
tell text window 1
--replacing code
repeat with i from 1 to count of paragraphs in prefsText
set apara to paragraph i of prefsText
set apara to my makeList(apara, "|")
set fText to item 1 of apara
set rText to item 2 of apara
replace fText using rText searching in text 1 of text document 1 options {search mode:literal, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
end repeat
--write translated text to a new file
save to file translationFilePath
close saving no
end tell
end tell
end repeat
set EndTime to current date
set SubRoutineTimeDuration to (EndTime - StartTime)
display dialog SubRoutineTimeDuration
end open
on makeList(astring, asep)
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to asep
set thelist to {}
repeat with i from 1 to count of text items in astring
copy text item i of astring to end of thelist
end repeat
set AppleScript's text item delimiters to tid
return thelist
end makeList
I am running both of these on the same 18 MB file.
Thanks for the help.
mm
I don’t have any files that big… How fast does this run for you?
choose file without invisibles
set sourceFile to POSIX path of result
choose file name -- Oops; Should be `file name` instead of `file`
set outputFile to POSIX path of result
set timer to current date
do shell script "/usr/bin/ruby -e 'print $stdin.read.gsub(\"\\r\\n\", \"\\r\")' < " & quoted form of sourceFile & " > " & quoted form of outputFile
set timer to (current date) - timer
Bruce,
What am I supposed to select as the output file?
mm
You’re supposed to destroy an existing file, of course! :rolleyes: Actually, that was just a mistake; I edited the script above.
OK, so the results are in, just doing the CRLF to CR:
Ruby scores a time of 0
BBEdit scores a time of 3
while AppleScript comes in last with 4
Now, there are more find and replaces that I am doing. Notice that in both the AS and BBEdit scripts I posted earlier I am supplying the input file(s) as well as a prefs.txt file. This is a list of pipe-separated find and replace pairs, so that the user can do all sorts of find and replace.
So with Ruby I am worried about how to escape special characters. I haven’t really got to know Ruby, and I am reluctant to use something I know nothing about. Is it also a standard install? I know I had the Ruby extension installed on my computer so I could make Ruby AppleScripts, but I think the basic install was already there.
mm
I believe ruby is part of a standard install as of Mac OS X 10.3.
Hey Bruce,
What about just using sed? Or even tr:
tr '\r' '\n' < Input.txt > output.txt
From the terminal it is as fast as my computer can write to disk. There may be a slight performance hit in wrapping it in a do shell script.
Andy
Browser: Safari 412
Operating System: Mac OS X (10.4)
Edit:
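Actually, the OP seems to want CRLF converted to CR. sed splits its input at each LF and never sees the LF in the pattern, so deleting the LFs with tr is simpler:
tr -d '\n' < Input.txt > output.txt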
A slightly different script where you can manually add more replacements:
choose file without invisibles
set sourceFile to POSIX path of result
choose file name
set outputFile to POSIX path of result
--set timer to current date
do shell script "/usr/bin/ruby -e '
text = $stdin.read
text.gsub!(\"\\r\\n\", \"\\r\")
text.gsub!(\"find this\", \"replace with\")
text.gsub!(\"find that\", \"replace with\")
print text
' < " & quoted form of sourceFile & " > " & quoted form of outputFile
--set timer to (current date) - timer
I believe the only characters to watch out for would be quotes (you’d have to protect double quotes from ruby and single quotes from the shell).
If you can give me a small example of your prefs.txt file, I could try making a script that reads it.
tugboat, that should also work.
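On the escaping worry: if the find/replace pairs are read from the prefs file inside Ruby, rather than pasted into the command line, they never pass through the shell or Ruby's own string quoting. Here is a rough sketch of that idea, assuming one `find|replace` pair per line as described earlier; the names `parse_pairs` and `apply_pairs` are made up for illustration and this has not been run against a real prefs file.

```ruby
# Parse prefs text holding one "find|replace" pair per line.
# Lines may end in CR, LF, or CRLF; blank lines are skipped.
# (Hypothetical helper, shown only to illustrate the technique.)
def parse_pairs(prefs_text)
  prefs_text.split(/\r\n|\r|\n/).reject(&:empty?).map do |line|
    find, replace = line.split("|", 2)
    [find, replace.to_s] # a line without "|" replaces with nothing
  end
end

# Apply each pair in order as a literal, case-sensitive replacement.
def apply_pairs(text, pairs)
  pairs.reduce(text) do |result, (find, replace)|
    # A String pattern is matched literally, so "." or "*" are not
    # special. The block form inserts the replacement as-is, so
    # backslashes in it are not read as back-references.
    result.gsub(find) { replace }
  end
end

pairs = parse_pairs("a.b|x\\1\r\"q\"|'s'")
puts apply_pairs("a.b axb \"q\"", pairs)
```

Note that matching here is case-sensitive, unlike the BBEdit script above, which searches with `case sensitive:false`.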