Extract Hyperlinks from RTF and save as readable plaintext

I have a folder containing only TextEdit RTF files.

Is there a way in AppleScript to choose the folder where they are located, export the hyperlinks as readable text, and save each file into a folder of my choice?

So that “website” and “email” become “http://www.site.com” and “address@site.com”.

When I convert the files to plain text, I lose the hyperlinks.

The files contain short text, which I’d like to keep, along with these links.

Here is an example:

This Site (with Hyperlink)
Submission Date
December 12, 2018
Notification Date
October 1, 2019
Event Date
October 25, 2019
Tracking Number
SIFF4964

	Email (with Hyperlink)
	
	Website (with Hyperlink)

I hope to get a plain text file like this:

Site.com
Submission Date
December 12, 2018
Notification Date
October 1, 2019
Event Date
October 25, 2019
Tracking Number
SIFF4964
address@site.com
http://www.site.com/

Thanks a lot

Yes. I’d just read the RTF into an AppleScript variable using AppleScript’s “read” command, which will give you the plain-text markup behind the RTF.

For example, I made this RTF:

With the “Hyperlink” and “email” being hyperlinks. Saved as RTF.

Then I ran this AppleScript:


set rawRTFtext to read alias [file path]

and now the rawRTFtext variable contains:

From there, you can use vanilla AppleScript, or regex via Terminal or ASObjC or whatever, to separate out the link addresses via the preceding delimiter:

and the terminating delimiter

Then format as needed and save as a .txt file wherever you like.
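
As a rough illustration of the vanilla AppleScript route, here is a minimal sketch using text item delimiters. It assumes the raw markup stores each link as HYPERLINK "address" (the usual RTF field syntax); the sample string and the linkTargets handler are hypothetical, so check them against what your own rawRTFtext actually contains.

```applescript
-- Hypothetical sample of raw RTF markup (assumption: links appear as HYPERLINK "address")
set rawRTFtext to "{\\field{\\*\\fldinst{HYPERLINK \"http://www.site.com/\"}}{\\fldrslt Website}} {\\field{\\*\\fldinst{HYPERLINK \"mailto:address@site.com\"}}{\\fldrslt Email}}"

-- linkTargets is a hypothetical helper: returns the text between HYPERLINK " and the next "
on linkTargets(rtfSource)
	set savedDelimiters to AppleScript's text item delimiters
	-- split on the assumed preceding delimiter; the first piece is the text before any link
	set AppleScript's text item delimiters to "HYPERLINK \""
	set chunks to rest of (text items of rtfSource)
	-- the terminating delimiter is the closing quote
	set AppleScript's text item delimiters to "\""
	set targets to {}
	repeat with aChunk in chunks
		set end of targets to text item 1 of (contents of aChunk)
	end repeat
	set AppleScript's text item delimiters to savedDelimiters
	return targets
end linkTargets

linkTargets(rawRTFtext)
-- result: {"http://www.site.com/", "mailto:address@site.com"}
```

Email links keep their mailto: prefix with this approach, so that would still need stripping before saving.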

Post and script deleted as seeing Shane’s script below made me realise I’d misread the query.

Here’s an approach that deals with the text as RTF:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

set thePath to "/Users/shane/Desktop/Test.rtf"
-- read file into attributed string
set theURL to current application's NSURL's fileURLWithPath:thePath
set {attString, theError} to current application's NSAttributedString's alloc()'s initWithURL:theURL options:(missing value) documentAttributes:(missing value) |error|:(reference)
-- get length so we can start from the end
set start to (attString's |length|()) - 1
-- make plain string copy to work on
set theString to attString's |string|()'s mutableCopy()
repeat
	-- find link
	set {aURL, theRange} to attString's attribute:(current application's NSLinkAttributeName) atIndex:start effectiveRange:(reference)
	if aURL is not missing value then
		-- get linked text
		set linkText to theString's substringWithRange:theRange
		if aURL's |scheme|()'s isEqualToString:"mailto" then -- email address
			set newLink to aURL's resourceSpecifier()
		else if linkText's containsString:"This Site" then -- resource specifier, remove //
			set newLink to aURL's resourceSpecifier()'s substringFromIndex:2
		else -- full URL
			set newLink to aURL's absoluteString()
		end if
		-- replace link
		theString's replaceCharactersInRange:theRange withString:newLink
	end if
	set start to (location of theRange) - 2
	if start < 0 then exit repeat
end repeat
return theString as text
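
To see what the three branches above produce, here is a small sketch that builds two NSURLs by hand and inspects them. The addresses are the placeholder ones from the question, and the variable names are only for illustration.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- placeholder addresses taken from the question
set mailURL to current application's NSURL's URLWithString:"mailto:address@site.com"
set webURL to current application's NSURL's URLWithString:"http://www.site.com/"

set mailScheme to mailURL's |scheme|() as text -- "mailto"
set mailAddress to mailURL's resourceSpecifier() as text -- "address@site.com"
set webSpecifier to webURL's resourceSpecifier() as text -- "//www.site.com/"
set webBare to (webURL's resourceSpecifier()'s substringFromIndex:2) as text -- "www.site.com/"
set webFull to webURL's absoluteString() as text -- "http://www.site.com/"
return {mailScheme, mailAddress, webSpecifier, webBare, webFull}
```

The resource specifier is everything after the scheme's colon, which is why the web link needs the leading // removed and the email link does not.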

Thanks, Shane.
Your script works perfectly.
However, which steps do I need to add so that the script scans every file in my folder (I have 450 files there)

/Users/danwan/Desktop/RTFFiles

and saves the results in a folder of my choice, keeping each file’s original name?

/Users/danwan/Desktop/RTFFilesConverted

Kind regards

Danwan

Browser: Safari 537.36
Operating System: macOS 10.12

What code have you written already?

Thanks for your time and reply.

I haven’t added anything, as I am not skilled enough to do the following:

1 Ask the script to select the folder with my 400+ files
2 Create a new file with the same name as the one the script converts to plain text
3 Export the result to this new file
4 Save the new file in a folder of my choice

Regards

I think perhaps you misunderstand the role of MacScripter. It’s a help facility, not a free script writing service. This should help you get started:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

set theFolder to choose folder -- choose the folder containing the .rtf files
tell application id "com.apple.finder" -- Finder
	set theFiles to every file of theFolder as alias list
end tell
repeat with aFile in theFiles
	if (aFile as text) ends with ".rtf" then
		set theURL to (current application's NSURL's fileURLWithPath:(POSIX path of aFile))
		set {attString, theError} to (current application's NSAttributedString's alloc()'s initWithURL:theURL options:(missing value) documentAttributes:(missing value) |error|:(reference))
		-- get length so we can start from the end
		set start to (attString's |length|()) - 1
		-- make plain string copy to work on
		set theString to attString's |string|()'s mutableCopy()
		repeat
			-- find link
			set {aURL, theRange} to (attString's attribute:(current application's NSLinkAttributeName) atIndex:start effectiveRange:(reference))
			if aURL is not missing value then
				-- get linked text
				set linkText to (theString's substringWithRange:theRange)
				if (aURL's |scheme|()'s isEqualToString:"mailto") then -- email address
					set newLink to aURL's resourceSpecifier()
				else if (linkText's containsString:"This Site") then -- resource specifier, remove //
					set newLink to (aURL's resourceSpecifier()'s substringFromIndex:2)
				else -- full URL
					set newLink to aURL's absoluteString()
				end if
				-- replace link
				(theString's replaceCharactersInRange:theRange withString:newLink)
			end if
			set start to (location of theRange) - 2
			if start < 0 then exit repeat
		end repeat
		set newFile to (theURL's URLByDeletingPathExtension()'s URLByAppendingPathExtension:"text")
		(theString's writeToURL:newFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	end if
end repeat

I am sorry for misunderstanding how to use MacScripter.
It won’t happen again.
Regards, and thanks a lot for your time.

Shane, you probably mean

(theString's writeToURL:newFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))

Thanks, Stefan – I’ve edited the original accordingly.

Thanks to Shane and StefanK for their time and help.
This is the final, fully working script,
fixed as StefanK suggested, plus one other small typo.
It could be very useful to other users who, like me, are not very skilled in Objective-C and AppleScript and need to convert hyperlinks in RTF files to plain text while keeping the rest of the text as it is.


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

set theFolder to choose folder -- choose the folder containing the .rtf files
tell application id "com.apple.finder" -- Finder
	set theFiles to every file of theFolder as alias list
end tell
repeat with aFile in theFiles
	if (aFile as text) ends with ".rtf" then
		set theURL to (current application's NSURL's fileURLWithPath:(POSIX path of aFile))
		set {attString, theError} to (current application's NSAttributedString's alloc()'s initWithURL:theURL options:(missing value) documentAttributes:(missing value) |error|:(reference))
		-- get length so we can start from the end
		set start to (attString's |length|()) - 1
		-- make plain string copy to work on
		set theString to attString's |string|()'s mutableCopy()
		repeat
			-- find link
			set {aURL, theRange} to (attString's attribute:(current application's NSLinkAttributeName) atIndex:start effectiveRange:(reference))
			if aURL is not missing value then
				-- get linked text
				set linkText to (theString's substringWithRange:theRange)
				if (aURL's |scheme|()'s isEqualToString:"mailto") then -- email address
					set newLink to aURL's resourceSpecifier()
				else if (linkText's containsString:"This Site") then -- resource specifier, remove //
					set newLink to (aURL's resourceSpecifier()'s substringFromIndex:2)
				else -- full URL
					set newLink to aURL's absoluteString()
				end if
				-- replace link
				(theString's replaceCharactersInRange:theRange withString:newLink)
			end if
			set start to (location of theRange) - 2
			if start < 0 then exit repeat
		end repeat
		set newFile to (theURL's URLByDeletingPathExtension()'s URLByAppendingPathExtension:"txt")
		(theString's writeToURL:newFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value))
	end if
end repeat
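
The script above writes each .txt file next to its source .rtf. For the original request of saving into a separate folder while keeping each file’s name, one possible sketch follows. The outputURLFor handler, the destFolder variable, and the Example.rtf file name are hypothetical and not part of the script above.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- outputURLFor is a hypothetical helper: same base name, .txt extension, inside the destination folder
on outputURLFor(sourcePath, destFolderPath)
	set sourceURL to current application's NSURL's fileURLWithPath:sourcePath
	-- file name without its .rtf extension
	set baseName to sourceURL's URLByDeletingPathExtension()'s lastPathComponent()
	set destFolderURL to current application's NSURL's fileURLWithPath:destFolderPath isDirectory:true
	return (destFolderURL's URLByAppendingPathComponent:baseName)'s URLByAppendingPathExtension:"txt"
end outputURLFor

-- ask once, before the loop, where the converted files should go
set destFolder to choose folder with prompt "Choose the folder for the converted files"
-- Example.rtf is a made-up file name used only to show the result
set newFile to outputURLFor("/Users/danwan/Desktop/RTFFiles/Example.rtf", POSIX path of destFolder)
return newFile's |path|() as text
```

Inside the loop, the line that sets newFile could then become: set newFile to outputURLFor(POSIX path of aFile, POSIX path of destFolder)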
