Manipulate text with AppleScript and perhaps regex

I have this as a result of a script:

/path/to/document.html (some-text) — Brackets

I need only: /path/to/document.html

Be aware that the path of the document could be any URL, and the text inside the brackets could be anything.

I know a bit of regex and I tried here: https://regexr.com/4mabg
I know almost nothing about the terminal and all the solutions I found on Google seem related to that and I just do not understand. Is there any easy to learn solution?

Generally text item delimiters are used to do what you want. For example:

set theItem to "/path/to/document.html (some-text) — Brackets"

set ATID to AppleScript's text item delimiters -- save existing TID
set AppleScript's text item delimiters to {" ("} -- set TID to " ("
set thePath to text item 1 of theItem
set AppleScript's text item delimiters to ATID -- set TID to saved TID

thePath

The following contains introductory information on text item delimiters:

https://macscripter.net/viewtopic.php?id=24422

An alternative, but microscopically slower, approach uses the offset command:

set theItem to "/path/to/document.html (some-text) — Brackets"

set thePath to text 1 thru ((offset of " (" in theItem) - 1) of theItem
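
Both snippets above split at the first " (" in the string. Since the question says the path could be anything, here is a minimal sketch that splits at the last " (" instead, so a path that itself contains " (" survives intact. The handler name pathFromTitle and the sample path are hypothetical, chosen only for illustration:

```applescript
-- Hypothetical helper: drop the trailing " (some-text) — App" part, splitting at the LAST " ("
on pathFromTitle(theTitle)
	set savedTIDs to AppleScript's text item delimiters -- save existing TIDs
	set AppleScript's text item delimiters to {" ("}
	set theParts to text items of theTitle
	if (count of theParts) < 2 then
		-- no " (" found: return the input unchanged
		set thePath to theTitle
	else
		-- rejoin everything before the last " (" using the same delimiter
		set thePath to (items 1 thru -2 of theParts) as text
	end if
	set AppleScript's text item delimiters to savedTIDs -- restore saved TIDs
	return thePath
end pathFromTitle

pathFromTitle("/path/to/my file (v2).html (some-text) — Brackets")
```

With the sample input, the first " (" belongs to the file name, so only the final bracketed group and everything after it is removed.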

Thank you very much. It works very well and the link you provide explains it very well also. Only one question: do I need this line?
set ATID to AppleScript's text item delimiters

This works well:
set theItem to "/path/to/document.html (some-text) — Brackets"
set AppleScript's text item delimiters to {" ("} -- set new TIDs
set thePath to text item 1 of theItem
set AppleScript's text item delimiters to theItem

The direct answer to your question is no. However, it is a generally-accepted practice to save/reset TIDs as done in my script in order to avoid breaking something later on. If you do decide to delete that line, you should also delete or modify the last line of my script.

The last line of the above code sets the TID to:

“/path/to/document.html (some-text) — Brackets”

I can’t think of any reason to do this. Instead, IMO, you should 1) delete that line, 2) modify it to set the TID to some value that makes sense within the context of your script, or 3) modify it to set the TID to the default value of “”.
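
As a sketch of option 3, this is the shortened script with the last line changed to reset the TIDs to their default value, which AppleScript represents as {""}:

```applescript
set theItem to "/path/to/document.html (some-text) — Brackets"
set AppleScript's text item delimiters to {" ("} -- set new TIDs
set thePath to text item 1 of theItem
set AppleScript's text item delimiters to {""} -- reset TIDs to the default value

thePath
```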

Hi, NMG.
Yes, you always need the following two lines of code, to preserve your Mac's default AppleScript behaviour for text manipulation:


set ATID to AppleScript's text item delimiters -- save existing (default) TID 
-- Any text manipulations with text item delimiters
set AppleScript's text item delimiters to ATID -- set TID to saved TID (returning to default TID)

The need to preserve the global text item delimiters property is a holdover belief, akin to a superstition. Prior to Snow Leopard there may have been a rationale for preserving TIDs; however, there have been functional changes that make retaining them nonessential. What has always mattered is setting them explicitly.

I think this is a regular-expression question. Shane's routine would be a suitable answer.


-- Created 2019-04-18 by Shane Stanley
-- From his book "Everyday AppleScriptObjC"
-- https://macosxautomation.com/applescript/apps/everyday_book.html
use AppleScript version "2.5" --macOS 10.11 or later
use scripting additions
use framework "Foundation"

property NSString : a reference to current application's NSString
property NSRegularExpression : a reference to current application's NSRegularExpression

set aURLText to "/Users/narcis/Dropbox/feines/mig-basic/ordinador/solucions/prova.html (mig) — Brackets"
set regExpStr to "(.*).\\(.*\\) —.Brackets"

set sheetsID to my findPattern:regExpStr inString:aURLText capturing:1



-- find Grep matches, specifying capture number (0 = all) --by Shane Stanley
on findPattern:thePattern inString:theString capturing:n
	set theNSString to NSString's stringWithString:theString
	set theNSRegularExpression to NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set findsNSArray to theNSRegularExpression's matchesInString:theNSString options:0 range:{location:0, |length|:theNSString's |length|()}
	set theFinds to findsNSArray as list -- so we can loop through
	set theResult to {} -- we will add to this
	
	repeat with i from 1 to count of items of theFinds
		set anNSTextCheckingResult to (item i of theFinds)
		if (anNSTextCheckingResult's numberOfRanges()) as integer < (n + 1) then
			set end of theResult to missing value
		else
			set theRange to (anNSTextCheckingResult's rangeAtIndex:n)
			set end of theResult to (theNSString's substringWithRange:theRange) as string
		end if
	end repeat
	
	return theResult
end findPattern:inString:capturing:

Model: MacBook Pro 2012
AppleScript: 2.7
Browser: Safari 13.0.1
Operating System: macOS 10.14

With ObjC:
You can get the NSRange of an NSRegularExpression match in an NSString.
Then use stringByReplacingCharactersInRange:withString:

Pass the found range as the range,
and use "" as the replacement.
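
A minimal sketch of a closely related approach: instead of finding the range first and replacing it in a second step, NSString's stringByReplacingOccurrencesOfString:withString:options:range: with the NSRegularExpressionSearch option does both in one call. The regex pattern and the variable names here are my own assumptions, not taken from the posts above:

```applescript
use AppleScript version "2.4" -- OS X 10.10 or later
use scripting additions
use framework "Foundation"

set theTitle to "/path/to/document.html (some-text) — Brackets"
set theNSString to current application's NSString's stringWithString:theTitle

-- search the whole string
set fullRange to {location:0, |length|:theNSString's |length|()}

-- assumed pattern: a space, the first bracketed group, and everything after it
set thePath to (theNSString's stringByReplacingOccurrencesOfString:" \\([^)]*\\).*$" withString:"" options:(current application's NSRegularExpressionSearch) range:fullRange) as text

thePath
```

If the pattern does not match, the string is returned unchanged.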

There was no change to how text item delimiters behave in Snow Leopard – what changed was the way Script Editor behaves. Before then, Script Editor used a single AppleScript component instance for all scripts, which meant they all shared delimiters. That was changed so that each script uses its own component instance, and in fact creates a new one each time you compile. And each component instance has its own text item delimiters property.

However, if you intend to run your scripts from another host (say, a script panel in InDesign, some other app's scripts menu, or FastScripts), it's most likely that the host will run them all within the same component instance, so they will share the global text item delimiters property.

It’s a philosophy rather than a superstition. A handler which uses the TIDs should leave them in the state they were when it received them, which may not necessarily be the default. This is particularly true with handlers in libraries or collaborative projects, where the philosophy and assumptions of the person writing the calling code aren’t known. It’s the same courtesy principle as preserving the clipboard contents or closing files you’ve opened for access. A process which calls a handler should be able to assume that the values it set itself, and didn’t specifically call the handler to change, are still the same afterwards.

On the other hand, as with all things in life, one should proceed “courteously but defensively” and not assume anything when using other people’s code. So setting the TIDs explicitly before each use is also a good idea.

Not tidying up after yourself is fine if you only write simple scripts for your own use, write all the code yourself, and (as Shane’s pointed out) run the results in their own AppleScript instances. There’s also something to be said for not going into too much detail when posting example code to scripting fora. But generally speaking, “courteously but defensively” is a good way to go.
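
One way to read “courteously but defensively” in code, as a sketch only: a find-and-replace handler that sets the TIDs explicitly before each use and puts the caller's TIDs back afterwards, even if something goes wrong. The handler name replaceText and its parameter names are hypothetical:

```applescript
-- Hypothetical handler: replace every occurrence of findText in sourceText
on replaceText(findText, replacementText, sourceText)
	set savedTIDs to AppleScript's text item delimiters -- courteous: remember the caller's TIDs
	try
		set AppleScript's text item delimiters to {findText} -- defensive: set explicitly before splitting
		set theParts to text items of sourceText
		set AppleScript's text item delimiters to {replacementText} -- set explicitly before joining
		set newText to theParts as text
	on error errMsg number errNum
		set AppleScript's text item delimiters to savedTIDs -- restore even on failure
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to savedTIDs -- leave the TIDs as we found them
	return newText
end replaceText

replaceText(" (some-text) — Brackets", "", "/path/to/document.html (some-text) — Brackets")
```

The caller never has to know that the handler touched the delimiters, and the handler never depends on what the caller left them set to.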

I’m in the philosophical camp that it’s precisely because the conditions aren’t evident that preserving them potentiates a problem and is inadvisable. Let’s say you live in a mixed sex household and wake in the dark of night for a call of nature. Do you assume what the last person may have done or do you set the lid in the desired use position? Similarly, it’s up to the coder to explicitly enforce conditions rather than assume the environment.