Speed issues with repeat loop to clear duplicates from a small text file

Hello,

I’m trying to parse a file of about 1300 lines, yet the following code takes about 5 minutes.
What would be a more efficient way to do it?

set prefsContents to (read myPrefsFile)
close access myPrefsFile

set presentNonDuplicates to {}
set AppleScript's text item delimiters to tab

repeat with i in paragraphs of prefsContents
	if text item 1 of i is in presentNonDuplicates then
		-- do nothing - don't need to add it
	else
		set end of presentNonDuplicates to text item 1 of i
	end if
end repeat

Welcome. :)

I don’t know if it would be much faster, but you can try something like this:

set prefsContents to paragraphs of (read myPrefsFile)

set presentNonDuplicates to {}

set prevTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab

repeat with i in prefsContents
	tell (text item 1 of i) to if it is not in presentNonDuplicates then
		set end of presentNonDuplicates to it
	end if
end repeat

set AppleScript's text item delimiters to prevTIDs

presentNonDuplicates --> View result in editor

There’s also the embedded script approach:

script O
	property pc : {}
	property pnd : {}
end script

set O's pc to paragraphs of (read myPrefsFile)

set prevTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab

repeat with i in O's pc
	tell (text item 1 of i) to if it is not in O's pnd then set end of O's pnd to it
end repeat

set AppleScript's text item delimiters to prevTIDs

O's pnd --> View result in editor

Alternatively, using Ruby through do shell script:

set myPrefsFile to choose file without invisibles -- Example

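-- Collect the first tab-delimited field of each line, keeping first occurrences in file order.
-- gets(nil) reads the whole file; each_line also works on Ruby versions where String#each no longer exists.
do shell script "/usr/bin/ruby -e 'uniques = []; gets(nil).each_line { |line| item = line.split(\"\\t\")[0]; uniques << item unless uniques.include?(item) }; puts uniques' " & quoted form of POSIX path of myPrefsFile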
set presentNonDuplicates to paragraphs of result
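
For comparison, the same idea (keep the first occurrence of each first-column value, in file order) can probably be sketched with awk instead of Ruby. This is only a sketch: the function name first_column_unique and the sample data are made up for illustration, and it assumes the file uses Unix (LF) line endings.

```shell
# Hypothetical helper: print each distinct first tab-separated field of a file,
# in order of first appearance.
first_column_unique() {
	# -F '\t' splits fields on tabs; seen[$1]++ is 0 (false) only the first
	# time a given key is encountered, so the negation selects first occurrences.
	awk -F '\t' '!seen[$1]++ { print $1 }' "$1"
}

# Example with a throwaway sample file (contents are made up).
sample=$(mktemp)
printf 'beta\t1\nalpha\t2\nbeta\t3\n' > "$sample"
first_column_unique "$sample"   # prints "beta" then "alpha"
rm -f "$sample"
```

From AppleScript, the awk command could be handed to do shell script with the path appended via quoted form of POSIX path of myPrefsFile, in the same way as the Ruby one-liner above.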

There’s always the embedded script approach:

script O
	property prefsContents : {}
	property presentNonDuplicates : {}
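end script

-- Load the file's lines into the script object's property before looping.
set O's prefsContents to paragraphs of (read myPrefsFile)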

set prevTIDs to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab

repeat with i in O's prefsContents
	tell (text item 1 of i) to if it is not in O's presentNonDuplicates then
		set end of O's presentNonDuplicates to it
	end if
end repeat

set AppleScript's text item delimiters to prevTIDs
O's presentNonDuplicates --> View result in editor

If you can live with having the results sorted, here is another do shell script variation:

set myPrefsFile to choose file without invisibles
do shell script "/usr/bin/sort -u -t $'\\t' -k 1,1 " & quoted form of POSIX path of myPrefsFile & " | /usr/bin/cut -f 1 "
set presentNonDuplicates to paragraphs of result
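
To make the "sorted" caveat concrete, here is a small sketch of the same sort/cut pipeline run on made-up data. The function name first_column_sorted_unique and the sample contents are hypothetical. LC_ALL=C and the printf-built tab are my own additions so the sketch behaves the same under plain sh; they are not part of the command above.

```shell
# Hypothetical helper: sorted, de-duplicated first tab-separated field of a file.
first_column_sorted_unique() {
	tab=$(printf '\t')   # literal tab, avoiding the bash-only $'\t' quoting
	# LC_ALL=C is an added assumption so the ordering is plain byte order.
	# -u with -k 1,1 keeps one line per distinct first field; cut then drops the rest.
	LC_ALL=C sort -u -t "$tab" -k 1,1 "$1" | cut -f 1
}

# Example with a throwaway sample file (contents are made up).
sample=$(mktemp)
printf 'beta\t1\nalpha\t2\nbeta\t3\n' > "$sample"
first_column_sorted_unique "$sample"   # prints "alpha" then "beta", not file order
rm -f "$sample"
```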

Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari 3.0.4 (523.12)
Operating System: Mac OS X (10.4)