extract all numbers between parentheses in a string

robertfern · July 10, 2020, 1:59pm

Here is a slightly modified version that uses “middle text item” to get the correct value when a line contains nested parentheses, as the first item does.

I get “1773” for the first item.
I also changed the line “(count (get text items of anitem))” to “(number of text items of anItem)”

set theString to "Test (10 (1773)) -- returns missing value
Test (aa) -- returned missing value
Test 8 (1600)
Gò Vấp (10062)
Phú Nhuận (898)
Tân Bình (925)
Bình Chánh (78)
Bình Tân (500)
Bình Thạnh (400)
Test 9 (471)
Test 1 (615)
Test 12 (463)
Test 3 (39)
Tân Phú (522)
Thủ Đức (423)
Test 7 (351)
Cần Giờ (9)
Hóc Môn (98)
Test 2 (127)
Test 4 (8)
Test 5 (12)
Test 6 (228)
Test 11 (111)
Củ Chi (112)"

set theString to paragraphs of theString

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"(", ")"}

set numberList to {}

repeat with anItem in theString
	if (number of text items of anItem) mod 2 = 1 then -- odd number of items
		try
			set end of numberList to (middle text item of anItem) as integer
		on error
			set end of numberList to missing value
		end try
	else
		set end of numberList to missing value
	end if
end repeat

set AppleScript's text item delimiters to astid

numberList

Yvan_Koenig · July 10, 2020, 2:09pm

Good catch,
I missed that Shane’s version returns the entire paragraph when it doesn’t contain a value matching the original requirements.

Is it useful to write that I’m not pleased by such behavior?

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) vendredi 10 juillet 2020 16:07:47

peavine · July 12, 2020, 12:56am

Thanks Yvan.

peavine · July 12, 2020, 1:03am

When writing my script, I tried “(count (text items of anItem))” without success, but I got it to work with the following change:

if (count (text items of (contents of anItem))) mod 2 = 1 then

Yvan’s use of “get” and “number of” seems a good alternative to the above.
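For anyone wondering why “contents of” matters here: in a “repeat with … in” loop the loop variable is a reference to the list item rather than the text itself. A minimal sketch of the dereferencing step, using a made-up two-line list (theLines, lineText and itemCounts are names chosen only for this illustration):

```applescript
-- Illustration only: theLines is sample data, not the OP's list.
set theLines to {"Test 8 (1600)", "Test (aa"}

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"(", ")"}

set itemCounts to {}
repeat with anItem in theLines
	-- anItem is a reference such as "item 1 of theLines";
	-- "contents of" fetches the actual text it points to.
	set lineText to contents of anItem
	set end of itemCounts to (count (text items of lineText))
end repeat

set AppleScript's text item delimiters to astid

itemCounts -- an odd count suggests a balanced pair of parentheses
```

Binding the dereferenced text to its own variable first keeps the later “count” expression simple to read and debug.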

wayland · July 12, 2020, 6:48am

Thanks guys. This was very helpful.

For a non-coder, the TEXTBETWEEN function in Numbers also works for this purpose.

CK11 · July 12, 2020, 12:48pm

The OP’s question was solved in the first two replies by Yvan, although Peavine and Shane each offered a refinement of their own. After that, I struggle to see where all of this was leading–or led to–and why there was such a degree of importance placed upon one possible, but very specific mutation of the original data. I admire the passion, and I think more people should have curiosity and a desire to experiment. But don’t become obsessional, which I think happens more easily when there’s no clear objective, and no established rules around which to understand the nature of the problem you’re wishing to devise and to solve.

One thing I was taught–which surprisingly turns out to apply to many areas in life outside of computing–was to begin by first asking myself why I wish to solve the particular problem in front of me. If I couldn’t articulate a reason, or justify claims I used to support my reasons, then either there’s no value in solving the problem, or I wasn’t looking at the problem in the most helpful way.

It has been a while since I last stopped by this site, so I should say it was nice to see this thread with people genuinely excited about scripting and problem-solving. Keep doing it, but also know why you’re doing it.

Anyway, all of that has no bearing on my offering below. It’s just a slightly different method to tackle the original problem that I noticed hadn’t been put forward:

set T to "...<snip>..."
set L to T's paragraphs

repeat with x in L
	set x's contents to x's last word ¬
		as {number, «class utf8»}
end repeat

return L's numbers

--OR, if one prefers:
set my text item delimiters to linefeed

# return L's numbers as text
return paragraphs of (L's numbers as text)

The obvious failure case is where the numerical string forms part of a word with other non-numerical word characters, e.g. “foo (bar446)”. Currently, a line like this would be discarded. There’s no indication that this is a likely or even possible occurrence, and although I already know how to solve for it, if there’s no requirement to do so, KISS wins out.
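For readers curious about the “foo (bar446)” case, one possible approach (not necessarily the one CK11 has in mind) is to keep only the digit characters of the word before coercing it. A rough sketch, where digitsOf is a hypothetical handler name invented for this example:

```applescript
-- Hypothetical helper (not from the thread): keep only the digits of a string.
on digitsOf(someText)
	set digitText to ""
	repeat with aChar in (characters of someText)
		set c to contents of aChar -- dereference the loop variable
		if c is in "0123456789" then set digitText to digitText & c
	end repeat
	return digitText
end digitsOf

-- Example call on the problematic word from the failure case.
digitsOf("bar446")
```

Note that this would also merge separated digits (“a1b2” would give “12”), so it only suits data where that cannot happen.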

Yvan_Koenig · July 12, 2020, 2:43pm

Hello CK

Just as a matter of curiosity, are you sure that ‘last word’ would return the wanted value in every language?
I ask because I’m not sure that parentheses are treated as word delimiters worldwide.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) dimanche 12 juillet 2020 16:42:53

CK11 · July 19, 2020, 4:38pm

No, but I didn’t imply any internationalisation. Generally speaking, it is never a priority for me. My only incentive to localise anything would be a specific request from the OP. But that isn’t to say I’m not interested in hearing how mileage varies when tested within a different locale. I think you highlight a very interesting point.