I have a use case for Finder tags that I want to exploit.
1. Read a text/PDF file
2. Find hashtags in the text of the file
   - A hashtag is a continuous (no space) string that begins with a “#” and ends with a space or line break/new line
   - There may be multiple hashtags in a file, or even in a line/row of text.
3. Write each hashtag as a separate Finder tag to the text/PDF file
   - The Homebrew terminal command “tag” can be used.
I haven’t had any luck finding a ready-made solution for numbers 1 and 2, above, but they seem so obvious that I have to believe this has already been solved multiple times.
Does anyone have a solution?
Thank you,
John
Here is one method for accomplishing what you specify. Note that this routine uses only a space, linefeed, or return to mark the end of a hashtag, just as you specified.
This method relies heavily on Shane Stanley's tag routines. Thank you, Shane.
use scripting additions
use framework "Foundation"

-- Characters that end a hashtag: space, return (CR), and linefeed (LF)
property endChars : space & return & linefeed

set readFile to POSIX path of (choose file of type {"txt", false})
set theText to read readFile as text
set hashTagList to my getHashTags(theText)
my setTags:hashTagList forPath:readFile

on getHashTags(fileText)
    set hashTagsList to {}
    repeat until fileText does not contain "#"
        set poundLoc to offset of "#" in fileText
        set thisHashTag to ""
        -- Collect characters from the "#" up to the next end character
        repeat with i from poundLoc to (count of fileText)
            set thisChar to item i of fileText
            if endChars does not contain thisChar then
                set thisHashTag to thisHashTag & thisChar
            else
                copy thisHashTag to the end of the hashTagsList
                -- Drop everything up to the end of this hashtag and search again
                set fileText to characters (poundLoc + (count of thisHashTag)) thru -1 of fileText as text
                exit repeat
            end if
        end repeat
    end repeat
    return hashTagsList
end getHashTags
-- The following routines are by Shane Stanley
on returnTagsFor:posixPath -- get the tags
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    set {theResult, theTags} to aURL's getResourceValue:(reference) forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
    if theTags = missing value then return {} -- because when there are none, it returns missing value
    return theTags as list
end returnTagsFor:

on setTags:tagList forPath:posixPath -- set the tags, replacing any existing
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    aURL's setResourceValue:tagList forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
end setTags:forPath:

on addTags:tagList forPath:posixPath -- add to existing tags
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    -- get existing tags
    set {theResult, theTags} to aURL's getResourceValue:(reference) forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
    if theTags ≠ missing value then -- add new tags
        set tagList to (theTags as list) & tagList
        set tagList to (current application's NSOrderedSet's orderedSetWithArray:tagList)'s allObjects() -- delete any duplicates
    end if
    aURL's setResourceValue:tagList forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
end addTags:forPath:
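If you would rather go through the Homebrew “tag” command mentioned in the question instead of the Foundation routines above, something along these lines might work. This is only a sketch: the handler names buildTagCommand and addTagsWithCLI are made up for illustration, and the location of the tag binary is an assumption (it is often /usr/local/bin/tag or /opt/homebrew/bin/tag; check with “which tag” in Terminal). It relies on tag's --add option, which takes a comma-separated list of tags followed by the file path.

```applescript
use scripting additions

-- Hypothetical helper: build the shell command that adds tagList to posixPath.
-- tagBinary is the full path to the Homebrew "tag" tool (an assumption; verify it).
on buildTagCommand(tagList, posixPath, tagBinary)
    -- Join the tags with commas, which is how "tag" expects multiple tags
    set savedDelims to AppleScript's text item delimiters
    set AppleScript's text item delimiters to ","
    set joinedTags to tagList as text
    set AppleScript's text item delimiters to savedDelims
    -- quoted form protects "#", spaces, and other shell-sensitive characters
    return (quoted form of tagBinary) & " --add " & (quoted form of joinedTags) & " " & (quoted form of posixPath)
end buildTagCommand

-- Hypothetical helper: run the command built above.
-- do shell script uses a minimal PATH, hence the full path to the binary.
on addTagsWithCLI(tagList, posixPath, tagBinary)
    do shell script buildTagCommand(tagList, posixPath, tagBinary)
end addTagsWithCLI
```

In the script above, the call to setTags:forPath: could then be swapped for addTagsWithCLI(hashTagList, readFile, "/usr/local/bin/tag"). Note that --add keeps any tags already on the file, whereas setTags:forPath: replaces them.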
Model: Mac Pro (Mid 2010)
AppleScript: 2.7
Browser: Firefox 79.0
Operating System: macOS 10.14
Wow! This is really amazing. I was having a bit of a rough week, but this complete solution, simply handed to me, makes me feel happy and grateful.
Thank you!
I did find that property endChars will only respect the first value and will loop indefinitely if there are actually other end characters like linefeed or return in the file. So, if you reorder the endChars values and put “return” as the first check, it will respect only returns and will loop if there is only a space but no return after the hashtag.
I’m going to try to figure it out in the morning… time for bed, now.
Thanks again.
Here are the contents of the text file that was used to test the routine.
Hash Tag Test Document
#HashTag1 this is the first hash tag.
#HashTag2 this is the second hash tag.
The following hash tag is inside and at the end of a paragraph: #HashTag3
The next hash tag #HashTag4 is in the middle of a paragraph.
Thank you for the test data. Based on that, I was able to discover that there was one additional end condition that I should be checking for: the end of the text, with no space, return, or linefeed after the hashtag.
If the last line in the text ends with a hashtag, and there is no space, return, or linefeed after it, the script never ends.
Since I wasn’t sure how to add a check for end of text/file to the endChars variable, I simply added a return to the theText variable. Not elegant, but it works.
Thanks again… here is your script with my hack:
use scripting additions
use framework "Foundation"

-- Characters that end a hashtag: space, return (CR), and linefeed (LF)
property endChars : space & return & linefeed

set readFile to POSIX path of (choose file of type {"txt", false})
set theText to read readFile as text
-- The line below hacked in by johncatalano
set theText to theText & return
-- The line above hacked in by johncatalano
set hashTagList to my getHashTags(theText)
my setTags:hashTagList forPath:readFile

on getHashTags(fileText)
    set hashTagsList to {}
    repeat until fileText does not contain "#"
        set poundLoc to offset of "#" in fileText
        set thisHashTag to ""
        -- Collect characters from the "#" up to the next end character
        repeat with i from poundLoc to (count of fileText)
            set thisChar to item i of fileText
            if endChars does not contain thisChar then
                set thisHashTag to thisHashTag & thisChar
            else
                copy thisHashTag to the end of the hashTagsList
                -- Drop everything up to the end of this hashtag and search again
                set fileText to characters (poundLoc + (count of thisHashTag)) thru -1 of fileText as text
                exit repeat
            end if
        end repeat
    end repeat
    return hashTagsList
end getHashTags
-- The following routines are by Shane Stanley
on returnTagsFor:posixPath -- get the tags
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    set {theResult, theTags} to aURL's getResourceValue:(reference) forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
    if theTags = missing value then return {} -- because when there are none, it returns missing value
    return theTags as list
end returnTagsFor:

on setTags:tagList forPath:posixPath -- set the tags, replacing any existing
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    aURL's setResourceValue:tagList forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
end setTags:forPath:

on addTags:tagList forPath:posixPath -- add to existing tags
    set aURL to current application's |NSURL|'s fileURLWithPath:posixPath -- make URL
    -- get existing tags
    set {theResult, theTags} to aURL's getResourceValue:(reference) forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
    if theTags ≠ missing value then -- add new tags
        set tagList to (theTags as list) & tagList
        set tagList to (current application's NSOrderedSet's orderedSetWithArray:tagList)'s allObjects() -- delete any duplicates
    end if
    aURL's setResourceValue:tagList forKey:(current application's NSURLTagNamesKey) |error|:(missing value)
end addTags:forPath:
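To accommodate the situation where a hashtag is at the very end of the text, the inner loop also has to store the tag when it reaches the last character. Changing only the test in the if line is not enough, because the "then" branch just appends the character and never copies the tag to the list. Instead, directly after the line
set thisHashTag to thisHashTag & thisChar
you could add
if i = (count of fileText) then
copy thisHashTag to the end of the hashTagsList
set fileText to ""
end if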
This does not account for cases where the hashtag is immediately followed by one of the characters “.)]}?!” or a similar terminator. You know your data, so if you need any of these, add them to endChars.
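As an alternative to scanning character by character, a regular expression can cover both the end-of-text case and the trailing-punctuation case in one go. The following is a minimal sketch, not a drop-in replacement tested against your files: the handler name getHashTagsMatching is made up for illustration, and the two patterns suggested below are assumptions about what should count as a hashtag. It uses NSRegularExpression from the Foundation framework.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"

-- Hypothetical handler: return every substring of fileText that matches thePattern.
on getHashTagsMatching(fileText, thePattern)
    set theString to current application's NSString's stringWithString:fileText
    set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
    if theRegex is missing value then return {} -- the pattern did not compile
    -- Find all matches over the whole string
    set theMatches to (theRegex's matchesInString:theString options:0 range:{0, theString's |length|()}) as list
    set hashTagsList to {}
    repeat with aMatch in theMatches
        -- Convert each matched range back to AppleScript text
        set end of hashTagsList to (theString's substringWithRange:(aMatch's range())) as text
    end repeat
    return hashTagsList
end getHashTagsMatching
```

With the pattern "#\\S+" (a “#” followed by any run of non-whitespace characters) this follows the original definition, so "(see #HashTag2)" yields "#HashTag2)". With "#\\w+" (a “#” followed by letters, digits, or underscores) the tag stops at any punctuation, so the same text yields "#HashTag2". Both patterns pick up a hashtag that is the very last thing in the file, so no extra return needs to be appended. The result can be passed to setTags:forPath: exactly like the list returned by getHashTags.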