Monday, June 18, 2018

#1 2018-06-06 04:04:04 am

Registered: 2016-03-21
Posts: 2

keep newest file of two almost same files

Hi all,

I'm new to this, but with some googling and experimenting I've come quite far. This last bit I can't really figure out.

So finally I have a folder with loads of files named something like this:

Where the number is the page number.
If there are two files with the same page number (001 and 132 in this case), I need to delete the older of the two.

I know I need to check whether there are two files with the same page number and then compare their modification dates, but I can't figure out how to do that.

Could somebody point me in the right direction?
Thank you very much in advance.



#2 2018-06-06 11:12:08 am

From: BFE, Massachusetts
Registered: 2013-01-13
Posts: 258

Re: keep newest file of two almost same files

Welcome to MacScripter.

This is a quick job, and I only tested it on one test folder with the specific names you gave, so please test it some more for possible bugs before you go off and delete a ton of critical files.


use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

tell application "Finder"
   set bookPagesFolder to choose folder
   set bookFiles to every item of bookPagesFolder
   set bookFileNames to the name of every item of bookPagesFolder
   -- Reduce each name to its first two "-"-separated parts;
   -- files whose truncated names match are treated as copies of the same page
   set truncatedFileNames to {}
   set {delimitHolder, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "-"}
   repeat with aFileName in bookFileNames
       set truncatedName to text item 1 of aFileName & "-" & text item 2 of aFileName
       set truncatedFileNames to truncatedFileNames & truncatedName
   end repeat
   set AppleScript's text item delimiters to delimitHolder
   set fileCount to the count of bookFiles
   set deleteItems to {} -- positions in bookFiles of the files to delete
   set checkedNames to {} -- truncated names that have already been handled
   -- The last file needs no pass of its own: any copy of it was found on an earlier pass
   repeat with i from 1 to fileCount - 1
       set currentFile to item i of bookFiles
       set currentTruncatedName to item i of truncatedFileNames
       if currentTruncatedName is not in checkedNames then
           set checkedNames to checkedNames & currentTruncatedName
           -- Positions of every file that shares this truncated name
           set fileOccurances to my list_positions(truncatedFileNames, currentTruncatedName, true)
           if (count of fileOccurances) ≠ 1 then -- you should always get one hit for the item itself
               set modDates to {}
               repeat with j in fileOccurances
                   set modDates to modDates & the modification date of item j of bookFiles
               end repeat
               -- simple_sort sorts ascending, so the last item is the most recent date
               set latestDate to item -1 of my simple_sort(modDates)
               -- Mark every copy whose date differs from the most recent one
               repeat with k in fileOccurances
                   if the modification date of item k of bookFiles is not latestDate then set deleteItems to deleteItems & k
               end repeat
           end if
       end if
   end repeat
   -- Finder's delete command moves the items to the Trash
   repeat with l in deleteItems
       delete item l of bookFiles
   end repeat
end tell

on simple_sort(my_list)
   set the index_list to {}
   set the sorted_list to {}
   repeat (the number of items in my_list) times
       set the low_item to ""
       repeat with i from 1 to (number of items in my_list)
           if i is not in the index_list then
               set this_item to item i of my_list
               if the low_item is "" then
                   set the low_item to this_item
                   set the low_item_index to i
               else if this_item comes before the low_item then
                   set the low_item to this_item
                   set the low_item_index to i
               end if
           end if
       end repeat
       set the end of sorted_list to the low_item
       set the end of the index_list to the low_item_index
   end repeat
   return the sorted_list
end simple_sort

on list_positions(thisList, thisItem, listAll)
   set the offsetList to {}
   repeat with i from 1 to the count of thisList
       if item i of thisList is thisItem then
           set the end of the offsetList to i
           if listAll is false then return item 1 of offsetList
       end if
   end repeat
   if listAll is false and offsetList is {} then return 0
   return the offsetList
end list_positions

Note that the way I wrote this, if the modification dates of files change while the script is running, it could delete all the copies of a file. It deletes every copy whose modification date is not the most recent one, so if it finds the most recent date and the date on that file then changes before the comparison, no copy matches and all of them get deleted. It seems unlikely you'll be saving the files it's operating on during that microsecond, but I thought I'd mention it.
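One way to sidestep that, as a rough sketch rather than a tested drop-in: decide which copy to keep by its position in the list of dates already collected, instead of reading each file's modification date a second time. The two handlers below, index_of_newest and positions_to_delete, are hypothetical names that are not part of the script above. They work on plain lists, so they never ask the Finder for anything. Note one difference in behavior: when two copies have exactly the same date, this keeps only the first one, whereas the script above keeps both.

```applescript
-- Hypothetical helper, not part of the original script.
-- Returns the position of the newest date in dateList, or 0 for an empty list.
-- If several dates tie for newest, the first of them wins.
on index_of_newest(dateList)
	if dateList is {} then return 0
	set newestIndex to 1
	repeat with i from 2 to (count of dateList)
		if item i of dateList > item newestIndex of dateList then set newestIndex to i
	end repeat
	return newestIndex
end index_of_newest

-- Hypothetical helper, not part of the original script.
-- occurrences: positions of one page's copies in the file list.
-- modDates: their modification dates, in the same order.
-- Returns every position except the one holding the newest date.
on positions_to_delete(occurrences, modDates)
	set keepIndex to index_of_newest(modDates)
	set doomed to {}
	repeat with n from 1 to (count of occurrences)
		if n is not keepIndex then set end of doomed to item n of occurrences
	end repeat
	return doomed
end positions_to_delete
```

Assuming it is used inside the Finder tell block, the lines that compute latestDate and the repeat loop over k that follows could be swapped for a single line such as: set deleteItems to deleteItems & my positions_to_delete(fileOccurances, modDates). Because exactly one position per page is always left out of the result, one copy survives even if a file's date changes mid-run.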

Last edited by t.spoon (2018-06-06 11:20:08 am)




#3 2018-06-07 05:42:49 am

Registered: 2016-03-21
Posts: 2

Re: keep newest file of two almost same files

Thank you, t.spoon, for the welcome and very much for your effort.

I quickly tested it and it works!
I'll now integrate it into the bigger script and see if the complete thing does what it needs to do :-)

Thanks!


