Tuesday, November 19, 2019

#1 2019-10-16 07:51:30 am

alear
Member
Registered: 2016-10-02
Posts: 30

Script to count files by extension

Hello,

I am trying to make a script that counts all the files by extension on an HDD or in a folder and saves the result to a CSV or text file.

I have several external HDDs with many files inside folders and subfolders.
I want to count how many files there are of each extension.
I do not know which file extensions are on the HDDs; some could hold mp4, doc, xls, others mkv, flac, jpg, txt, and many more.


I found the following command, which counts all file extensions recursively and saves the result as a text file:

find /Users/mac/Documents/zCARPA -type f | sed -n 's/..*\.//p' | sort | uniq -c > /Users/mac/Desktop/zCARPA.txt


result:

      2 DS_Store
     31 mkv
     89 mp3
     17 flac
      2 mp4
     30 pdf
     71 png
     23 txt


I tried something like this in my script, but it doesn't work:

Applescript:


set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
set theDestinationList to (choose folder with prompt "Select where to save the CSV file and name:")

tell application "Terminal"
find theSourceFolder -type f | sed -n 's/..*\.//p' | sort | uniq -c > theDestinationList/*.csv
end tell

Also, is there any way to exclude the hidden files
and get the total count of all the files?


Thank you in advance for your kind help.

Model: mac mini 2011
AppleScript: 2.6.1
Browser: Safari 537.36
Operating System: macOS 10.9

Last edited by alear (2019-10-16 07:57:12 am)


Filed under: count, files, extension

Offline

 

#2 2019-10-16 09:39:43 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

Command-line tools are run in AppleScript with the do shell script command, and, FWIW, I've included a script below that does what you want. However, running this on an entire hard drive (especially if it's a boot drive) may not work, and, for that, basic AppleScript probably is not a good solution.

Applescript:

set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
set theSourceFolder to quoted form of POSIX path of theSourceFolder

set textFile to choose file name with prompt "Choose file name" default name "zCARPA"
set textFile to quoted form of (POSIX path of textFile & ".txt")

do shell script "find " & theSourceFolder & " -type f | sed -n " & quoted form of "s/..*\\.//p" & " | sort | uniq -c > " & textFile

BTW, I do not know how to modify the above command line to omit hidden files, and I'll add something later to report the total number of files. Also, the Find command sees apps (and some other files) as folders, which may cause you to see some odd results.

Last edited by peavine (2019-10-16 03:13:28 pm)


2018 Mac mini - macOS Mojave

Offline

 

#3 2019-10-16 10:31:00 am

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

peavine,

Thank you very much.

Your script certainly does what I am looking for.

Is there a way to name the text file?


The hidden files are a problem; the resulting txt report is not clean.

Offline

 

#4 2019-10-16 10:49:36 am

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

The following will prompt for a file name--do not include the file extension. I don't know any simple way to omit the hidden files but I will give that some thought.

Applescript:


set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
set theSourceFolder to quoted form of POSIX path of theSourceFolder

set textFile to choose file name with prompt "Choose file name" default name "zCARPA"
set textFile to quoted form of (POSIX path of textFile & ".txt")

do shell script "find " & theSourceFolder & " -type f | sed -n " & quoted form of "s/..*\\.//p" & " | sort | uniq -c > " & textFile

Last edited by peavine (2019-10-16 03:18:19 pm)


2018 Mac mini - macOS Mojave

Offline

 

#5 2019-10-16 11:18:33 am

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

Very much obliged for your kind help.

thank you.

Offline

 

#6 2019-10-16 11:44:30 am

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

.
I found the command line to omit the hidden files at the link below, and it works:
https://www.linuxquestions.org/question … rs-208169/


Applescript:

find /Users/mac/Documents/zCARPA \( ! -regex '.*/\..*' \) -type f | sed -n 's/..*\.//p' | sort | uniq -c > /Users/mac/Desktop/zCARPA.txt
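A minimal sketch of how the exclusion above behaves, assuming a scratch folder built with mktemp (all file and folder names here are made up for illustration). The -regex test rejects any path in which a component starts with a dot, so both hidden files and the contents of hidden folders drop out. The awk step is an addition that only strips the padding uniq -c puts in front of each count.

```shell
#!/bin/sh
# Build a scratch tree (hypothetical names) with visible and hidden items.
demo=$(mktemp -d)
mkdir -p "$demo/music" "$demo/.cache"
touch "$demo/music/a.mp3" "$demo/music/b.mp3" "$demo/notes.txt"
touch "$demo/.DS_Store" "$demo/.cache/tmp.dat"

# Run find relative to the scratch folder so that only names inside it
# are tested. Any path containing "/." is rejected by the -regex test.
visible_counts=$(cd "$demo" &&
  find . \( ! -regex '.*/\..*' \) -type f |
  sed -n 's/..*\.//p' | sort | uniq -c | awk '{print $1, $2}')

echo "$visible_counts"
rm -rf "$demo"
```

Only the two mp3 files and the txt file are reported; .DS_Store and the file inside .cache are skipped.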

Model: mac mini 2011
AppleScript: 2.6.1
Browser: Safari 537.36
Operating System: macOS 10.9

Offline

 

#7 2019-10-16 11:55:38 am

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

.
Also, I found the command line for total files. But I do not know how to implement it in the code.

Applescript:

find /Users/mac/Documents/zCARPA \( ! -regex '.*/\..*' \) -type f | wc -l > /Users/mac/Desktop/zCARPA.txt
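One possible way to get both the per-extension counts and a total into the same file, sketched under some assumptions: the extension list is written to a temporary file first (the names ext_list and report are hypothetical), and the total counts only files that have an extension, which can differ from the wc -l figure above when extension-less files are present.

```shell
#!/bin/sh
# Scratch tree with made-up names, including one hidden file.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
touch "$demo/src/a.pdf" "$demo/src/b.pdf" "$demo/src/sub/c.txt" "$demo/src/.hidden.pdf"
ext_list="$demo/extensions.tmp"
report="$demo/report.txt"

# Walk the tree once and keep one extension per line.
(cd "$demo/src" &&
  find . \( ! -regex '.*/\..*' \) -type f | sed -n 's/..*\.//p') > "$ext_list"

# Group both commands so a single redirection writes the whole report:
# the per-extension counts first, then the number of lines as the total.
{
  sort "$ext_list" | uniq -c
  awk 'END { print NR, "TOTAL" }' "$ext_list"
} > "$report"

# Normalised copy of the report (padding removed) for display.
report_text=$(awk '{print $1, $2}' "$report")
echo "$report_text"
rm -rf "$demo"
```

In an AppleScript, the same grouped command could be passed to do shell script as one string, with the source and destination paths supplied through quoted form of.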

Offline

 

#8 2019-10-16 01:26:14 pm

KniazidisR
Member
Registered: 2019-03-03
Posts: 712

Re: Script to count files by extension

External HDD disks?

This problem can be solved successfully with plain AppleScript using recursive search and System Events.

But this algorithm is time-consuming, therefore it is better to use AppleScriptObjC here, for maximum speed and reliability of the results. In addition, AppleScriptObjC has a simple method for filtering unnecessary content, such as hidden files.

I will try to publish a similar script, but now I need to go to work.

Last edited by KniazidisR (2019-10-16 01:27:42 pm)


Model: MacBook Pro
macOS Mojave -- version 10.14.4
Safari -- version 12.1
Firefox -- version 70.0

Offline

 

#9 2019-10-16 01:36:04 pm

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

Yes, I will use the script for external HDDs.

Thank you very much for your help.

Offline

 

#10 2019-10-16 04:21:59 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

alear wrote:

I found the command line to omit the hidden files from the below link and it works:

Applescript:

find /Users/mac/Documents/zCARPA \( ! -regex '.*/\..*' \) -type f | sed -n 's/..*\.//p' | sort | uniq -c > /Users/mac/Desktop/zCARPA.txt


I incorporated the above command line in my script and it does appear to work. Getting the quoting correct was a real bear. Anyways, my script remains a kludge and KniazidisR's AppleScriptObjC  version will be the one to use.

Applescript:

set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
set theSourceFolder to quoted form of POSIX path of theSourceFolder

set textFile to choose file name with prompt "Choose file name" default name "zCARPA"
set textFile to quoted form of (POSIX path of textFile & ".txt")

do shell script "find " & theSourceFolder & " \\( ! -regex " & quoted form of ".*/\\..*" & " \\) -type f | sed -n " & quoted form of "s/..*\\.//p" & " | sort | uniq -c > " & textFile

Last edited by peavine (2019-10-16 04:27:31 pm)


2018 Mac mini - macOS Mojave

Offline

 

#11 2019-10-16 04:54:31 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

This is an AppleScriptObjC solution. It assumes every item with an extension is a file, which may include some folders, depending on how you name things. But it skips invisible items, as well as items within package files. And it sorts the list by count.

Applescript:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
-- get all files
set fileManager to current application's NSFileManager's defaultManager()
set theOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants) + (get current application's NSDirectoryEnumerationSkipsHiddenFiles)
set theFiles to (fileManager's enumeratorAtURL:theSourceFolder includingPropertiesForKeys:{} options:theOptions errorHandler:(missing value))'s allObjects()
-- get just extensions
set theExtensions to theFiles's valueForKey:"pathExtension"
-- remove empty/missing extensions
set theFilter to current application's NSPredicate's predicateWithFormat:"SELF != ''"
set theExtensions to theExtensions's filteredArrayUsingPredicate:theFilter
-- make counted set, which does counting for us
set theSet to current application's NSCountedSet's setWithArray:theExtensions
-- build array of records so we can sort
set theResults to current application's NSMutableArray's array()
repeat with aValue in theSet's allObjects()
   (theResults's addObject:{theCount:theSet's countForObject:aValue, theValue:((theSet's countForObject:aValue) as text) & "," & (aValue as text)})
end repeat
-- get total count
set theCount to theResults's valueForKeyPath:"@sum.theCount"
-- sort list by count
set sortDesc to current application's NSSortDescriptor's sortDescriptorWithKey:"theCount" ascending:false
theResults's sortUsingDescriptors:{sortDesc}
-- create text and save
set theText to (((theResults's valueForKey:"theValue")'s componentsJoinedByString:linefeed) as text) & linefeed & (theCount as integer as text) & ",Total"
-- write file to desktop
set deskPath to path to desktop as text
set fileRef to (open for access ((deskPath & "Results.csv") as «class furl») with write permission)
set eof fileRef to 0
write theText to fileRef as «class utf8»
close access fileRef


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 

#12 2019-10-16 04:57:16 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

peavine wrote:

I incorporated the above command line in my script and it does appear to work.



I tried a couple of your versions but they seem to be having trouble with subfolders --  it's like they're treating the whole subpath as an extension.


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 

#13 2019-10-16 05:23:58 pm

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

.
Thank you very much for the AppleScriptObjC solution.

I am using Mavericks 10.9.5
Version 2.6.1 (152.1)
AppleScript 2.3.2

mac mini 2011

Is there any way to make it run on my system?
.

Last edited by alear (2019-10-16 05:27:09 pm)

Offline

 

#14 2019-10-16 05:45:10 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

Shane Stanley wrote:
peavine wrote:

I incorporated the above command line in my script and it does appear to work.



I tried a couple of your versions but they seem to be having trouble with subfolders --  it's like they're treating the whole subpath as an extension.



Shane. I don't know why that is because it works fine for me even with a large folder with many subfolders. I ran my script on such a folder and got:

   1 numbers
412 pdf
   1 tax2018
   2 txt

Your script returned:

412,pdf
2,txt
1,tax2018
1,numbers

Anyways, my script breaks with packages and is kludgey, so the OP should clearly use your script. I wish I had the skills to write stuff like that.

Last edited by peavine (2019-10-16 05:46:46 pm)


2018 Mac mini - macOS Mojave

Offline

 

#15 2019-10-16 05:47:30 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5105

Re: Script to count files by extension

Shane Stanley wrote:
peavine wrote:

I incorporated the above command line in my script and it does appear to work.



I tried a couple of your versions but they seem to be having trouble with subfolders --  it's like they're treating the whole subpath as an extension.


This version only returns extensions and omits anything from paths containing "/.":

Applescript:

set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
set theSourceFolder to quoted form of POSIX path of theSourceFolder

set textFile to choose file name with prompt "Choose file name" default name "zCARPA.txt"
set textFile to quoted form of (POSIX path of textFile)

do shell script "find " & theSourceFolder & " -type f | sed -En " & quoted form of "/.*\\/\\..*/ !s|.+\\.([^./]+$)|\\1|p" & " | sort | uniq -c > " & textFile


NG

Offline

 

#16 2019-10-16 05:55:48 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

peavine wrote:

Shane. I don't know why that is because it works fine for me even with a large folder with many subfolders.



I had another look, and it appears the part-paths were appearing only for items in packages. That might be coincidence, but they're also files without extensions.
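The part-paths can be reproduced without touching the disk. This is a small sketch using two made-up paths: the original sed expression deletes everything up to the last dot in the whole line, so when the file itself has no extension the last dot belongs to an enclosing folder such as a package. The stricter expression is the substitution from Nigel's version, shown here on its own.

```shell
#!/bin/sh
# Hypothetical paths: an extension-less file and a normal file in a package.
no_ext='Demo.app/Contents/MacOS/Demo'
with_ext='Demo.app/Contents/Info.plist'

# Original expression: greedy match up to the last dot anywhere in the line.
loose=$(printf '%s\n' "$no_ext" | sed -n 's/..*\.//p')

# Stricter expression: the text after the last dot may not contain "/" or ".",
# so a dot in a folder name no longer counts as the start of an extension.
strict=$(printf '%s\n' "$no_ext" | sed -En 's|.+\.([^./]+)$|\1|p')
strict_ext=$(printf '%s\n' "$with_ext" | sed -En 's|.+\.([^./]+)$|\1|p')

echo "loose:      [$loose]"
echo "strict:     [$strict]"
echo "strict_ext: [$strict_ext]"
```

The loose form yields app/Contents/MacOS/Demo as an "extension", while the strict form prints nothing for that path and plist for the second one.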


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 

#17 2019-10-16 06:06:19 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

alear wrote:

.I am using Mavericks 10.9.5
[...]
Is there any way to make it run on my system?



There is, but it's a bit tricky. You need to save it as a script library in ~/Library/Script Libraries, and call it from there. You would also have to convert theSourceFolder to an NSURL. So the code would be like:

Applescript:

use AppleScript version "2.3" -- macOS 10.9
use framework "Foundation"
use scripting additions

on reportOnFolder(theSourceFolder)
   set theSourceFolder to current application's |NSURL|'s fileURLWithPath:(POSIX path of theSourceFolder)
   -- get all files
   set fileManager to current application's NSFileManager's defaultManager()

[...as above in here]

   close access fileRef
end reportOnFolder

You would need to save it as a .scptd bundle, and edit the bundle's Info.plist file to include:

Applescript:

   <key>OSAAppleScriptObjCEnabled</key>
   <true/>

Then your main script would call it something like:

Applescript:

use AppleScript version "2.3"
use scripting additions
use theLib: script "Name of your lib file" -- change to suit

set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
tell theLib to reportOnFolder(theSourceFolder)


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 

#18 2019-10-16 06:13:45 pm

alear
Member
Registered: 2016-10-02
Posts: 30

Re: Script to count files by extension

.
gentlemen,

thank you very much for all your solutions and kind help.

very much obliged to you.
.

Offline

 

#19 2019-10-16 07:06:48 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

Nigel Garvey wrote:

This version only returns extensions and omits anything from paths containing "/.":


Nigel. Your version works great. Just as a test, I ran the script on a huge folder on an external drive, and it's surprising how fast it is (under a second).


2018 Mac mini - macOS Mojave

Offline

 

#20 2019-10-16 08:21:28 pm

Marc Anthony
Member
From:: Dallas, TX
Registered: 2006-04-27
Posts: 907

Re: Script to count files by extension

Nigel Garvey wrote:


This version only returns extensions and omits anything from paths containing "/.":

Applescript:


do shell script "find " & theSourceFolder & " -type f | sed -En " & quoted form of "/.*\\/\\..*/ !s|.+\\.([^./]+$)|\\1|p" & " | sort | uniq -c > " & textFile




For kicks, here's a variation that pares down the sed (and won't descend into packages):

Applescript:

do shell script "find -E " & theSourceFolder & " -iname '*.*' -prune ! -regex '.*/[.].*' | sed 's/.*[.]//g' | sort | uniq -c > " & textFile

Last edited by Marc Anthony (2019-10-16 08:28:28 pm)

Offline

 

#21 2019-10-16 09:54:12 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

Marc Anthony wrote:

For kicks, here's a variation that pares down the sed (and won't descend into packages):



FWIW, on my test folder (NAS drive, no packages) it comes up with lower numbers than Nigel's script and my ASObjC version for several extensions. It also (like mine) includes folders.


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 

#22 2019-10-17 03:55:05 am

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5105

Re: Script to count files by extension

For a shell script solution, I'd go for Marc's rather than mine. It doesn't descend into packages and it does actually include them! On the other hand, it doesn't descend into folders with dots in their names either, presumably believing them to be packages.
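The pruning behaviour can be seen in a small sketch, assuming a scratch folder with made-up names and only the -name and -prune parts of Marc's command (with -name in place of -iname). Anything whose name contains a dot is printed and, if it is a folder, not entered, which is why a folder such as v1.2 is reported as an item and its contents are never reached.

```shell
#!/bin/sh
# Scratch tree: one plain folder and one folder with a dot in its name.
demo=$(mktemp -d)
mkdir -p "$demo/top/plain" "$demo/top/v1.2"
touch "$demo/top/plain/a.txt" "$demo/top/v1.2/b.txt"

# Items with a dot in their name are printed; folders among them are pruned.
pruned=$(cd "$demo" && find top -name '*.*' -prune | LC_ALL=C sort)

# The "extensions" that the rest of the pipeline would then count.
pruned_exts=$(printf '%s\n' "$pruned" | sed 's/.*[.]//' | LC_ALL=C sort)

echo "$pruned"
echo "$pruned_exts"
rm -rf "$demo"
```

Here b.txt is missed and the folder v1.2 contributes a spurious extension "2".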

"-iname" could perhaps be "-name", since there's no need for case insensitivity when looking for a dot. And "! -regex '.*/[.].*' " might be simplified to "! -path '*/.*' " or possibly "! -name '.*' ", either of which would also make the -E option unnecessary. But I doubt any of these make much difference in practice. The "s" command in the "sed" string doesn't need the "g" because the regex used there initially includes everything in the line and then winds back looking for a dot, so the match is always to the last dot in the line anyway.

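Two of the points above can be checked in a short sketch, assuming a made-up file name and a scratch folder. First, the g flag makes no difference because the greedy match already ends at the last dot. Second, -path and -name are not interchangeable in general: -name tests only the item's own name, so a visible file inside a hidden folder still passes unless the folder has been pruned, as it is in Marc's command.

```shell
#!/bin/sh
# 1) sed with and without "g" on a name containing several dots.
name='backup.2019.tar.gz'
with_g=$(printf '%s\n' "$name" | sed 's/.*[.]//g')
without_g=$(printf '%s\n' "$name" | sed 's/.*[.]//')
echo "with g: $with_g, without g: $without_g"

# 2) -path versus -name when a hidden folder holds a visible file.
demo=$(mktemp -d)
mkdir -p "$demo/top/.hidden"
touch "$demo/top/.hidden/inner.txt" "$demo/top/seen.txt"

# -path looks at the whole path, so anything under .hidden is rejected.
by_path=$(cd "$demo" && find top -type f ! -path '*/.*' | LC_ALL=C sort)
# -name looks only at the last component, so inner.txt is accepted.
by_name=$(cd "$demo" && find top -type f ! -name '.*' | LC_ALL=C sort)

echo "$by_path"
echo "$by_name"
rm -rf "$demo"
```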

NG

Offline

 

#23 2019-10-17 04:17:26 pm

Nigel Garvey
Moderator
From:: Warwickshire, England
Registered: 2002-11-20
Posts: 5105

Re: Script to count files by extension

peavine wrote:

1) Spaces at the beginning of lines in the text file need to be removed.


The spaces are in fact an indent, in which the counts are right-aligned. But it has a fixed width.

I don't think any of the shell scripts above are totally bullet-proof when we don't know what's in the folder(s). Mine overlooks packages, Marc's overlooks items in folders which have dots in their names. Shane's ASObjC script is the best option in these respects, but includes extensions from folders which have dots in their names. (alear does specifically mention files.) While a CSV file is one of the options mentioned, we don't know what separator's considered the default on alear's system.
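If a CSV is wanted from one of the shell versions, the fixed-width indent from uniq -c can be dropped and a total appended in a single awk step. This is only a sketch: the comma separator is an assumption (as noted, the default on alear's system is unknown), the sample extensions are made up, and extensions containing spaces would not survive the field splitting.

```shell
#!/bin/sh
# Stand-in for the extension list that find and sed would produce.
counts=$(printf '%s\n' pdf txt pdf mkv pdf | sort | uniq -c)

# awk splits on whitespace, which discards the leading indent; it prints
# "count,extension" per line and keeps a running sum for the last line.
csv=$(printf '%s\n' "$counts" |
  awk '{ total += $1; print $1 "," $2 } END { print total ",Total" }')

echo "$csv"
```

The layout matches the count,extension lines and the final Total line written by Shane's ASObjC script, although the rows stay in the alphabetical order produced by sort rather than being ordered by count.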

Here's a version of Shane's script which produces a text file similar in appearance to those of the shell scripts. Folder extensions aren't counted but file and package extensions are. Package contents are ignored, but folder contents aren't. The results are sorted by extension and are preceded on their lines by their counts, which are dynamically indented to the minimum extent required. The total is displayed at the bottom. It would also be possible to include a header indicating the folder to which the results pertain, but I haven't bothered here. Hopefully the script's compatible with Mavericks ….

Applescript:

use AppleScript version "2.3.1" -- macOS 10.9 (Mavericks) or later
use framework "Foundation"
use scripting additions

-- For testing:
--set theSourceFolder to (choose folder with prompt "Select an HDD or folder:")
--reportOnFolder(theSourceFolder)

on reportOnFolder(theSourceFolder)
   set theDestinationFile to (choose file name with prompt "Choose file name" default name "zCARPA.txt")
   set destinationURL to current application's class "NSURL"'s fileURLWithPath:(POSIX path of theDestinationFile)
   
   -- get all files
   set theSourceFolder to current application's |NSURL|'s fileURLWithPath:(POSIX path of theSourceFolder)
   set fileManager to current application's NSFileManager's defaultManager()
   set URLKeys to current application's class "NSArray"'s arrayWithArray:({current application's NSURLIsRegularFileKey, current application's NSURLIsPackageKey})
   set theOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants) + (get current application's NSDirectoryEnumerationSkipsHiddenFiles)
   set theFiles to (fileManager's enumeratorAtURL:theSourceFolder includingPropertiesForKeys:(URLKeys) options:theOptions errorHandler:(missing value))'s allObjects()
   -- remove items with no extensions
   set theFilter to current application's NSPredicate's predicateWithFormat:"pathExtension != ''"
   set theFiles to theFiles's filteredArrayUsingPredicate:theFilter
   -- Build a counted set containing the extensions of those items which aren't folders.
   set theSet to current application's NSCountedSet's new()
   repeat with thisItem in theFiles
       if (((thisItem's resourceValuesForKeys:(URLKeys) |error|:(missing value)) as record as list) contains true) then (theSet's addObject:(thisItem's pathExtension()))
   end repeat
   -- build array of records so we can sort
   set theResults to current application's NSMutableArray's array()
   set theSum to 0
   tell (space & space) to tell (it & it) to set eightSpaces to (it & it) -- MacScripter /displays/ the literal string as a single space.
   repeat with aValue in theSet's allObjects()
       set theCount to (theSet's countForObject:(aValue)) as integer
       set theSum to theSum + theCount
       -- The spaces at the beginning of theEntry are padding for an indent, whose size will be adjusted later.
       (theResults's addObject:{theValue:aValue, theEntry:(eightSpaces & theCount) & (space & aValue)})
   end repeat
   -- sort on the dictionaries 'theValue' values.
   set sortDesc to current application's NSSortDescriptor's sortDescriptorWithKey:"theValue" ascending:true
   theResults's sortUsingDescriptors:{sortDesc}
   -- create the text with an entry for the total count at the end.
   set theSum to theSum as text
   theResults's addObject:({theEntry:linefeed & eightSpaces & theSum & " TOTAL"})
   set theText to (theResults's valueForKey:"theEntry")'s componentsJoinedByString:(linefeed)
   -- Adjust the width of the indent to the number of characters in the total.
   set theText to theText's stringByReplacingOccurrencesOfString:("(?m)^ +(?=[ \\d]{" & (count theSum) & "} )") withString:("") options:(current application's NSRegularExpressionSearch) range:({0, theText's |length|()})
   -- Write the text to the specified text file as UTF-8.
   theText's writeToURL:(destinationURL) atomically:(true) encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end reportOnFolder

Edit: Bug pointed out below by Shane fixed. String of eight spaces explicitly set in a variable to avoid confusion when viewed on MacScripter.
Edit 2: Eight-space string set more efficiently for fun and to take up less room on the page.

Last edited by Nigel Garvey (2019-10-19 02:35:08 am)


NG

Offline

 

#24 2019-10-17 05:24:04 pm

peavine
Member
From:: Prescott, Arizona
Registered: 2018-09-04
Posts: 219

Re: Script to count files by extension

Nigel Garvey wrote:
peavine wrote:

1) Spaces at the beginning of lines in the text file need to be removed.


The spaces are in fact an indent, in which the counts are right-aligned. But it has a fixed width.


Thanks Nigel. I guess I should have figured that out.


2018 Mac mini - macOS Mojave

Offline

 

#25 2019-10-17 05:47:08 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 6034

Re: Script to count files by extension

Nigel,

I think the line:

Applescript:

(theResults's addObject:{theValue:aValue, theEntry:(" " & theSum) & " " & aValue})

should be:

Applescript:

   (theResults's addObject:{theValue:aValue, theEntry:(" " & theCount) & " " & aValue})

I'm also not seeing the alignment happening.

Last edited by Shane Stanley (2019-10-17 05:59:20 pm)


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/
latenightsw.com

Offline

 
