I have been messing with this for hours and cannot figure it out. I am trying to find text in a string that is different every time and assign that text to a variable.
The text will always begin after some variation of “Message-ID: <” and end with “>”, and the text I want is always between that start and that end. There is normally a significant amount of text before and after the string I am looking for.
Examples:
“more text Message-ID:<1>more text”
I want to return 1 in this case
“more text
Message-ID: <1234567> more text”
I want to return 1234567 in this case
“more text Message-ID:
<1asioduasdoipasu877> more text”
I want to return 1asioduasdoipasu877 in this case
There are some instances where “Message-ID: <” or some variation of it is found more than once in the text file, but so far the one I need has always been the first instance.
I have been messing with AppleScript’s text item delimiters, but I am just guessing and have not guessed right. Any help would be greatly appreciated.
set theText to "
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj@sdfksjfd in this case
"
set searched to item 2 of my decoupe(theText, "Message-ID")
set searched to item 2 of my decoupe(searched, {"<", ">"})
#=====
on decoupe(t, d)
	-- Split text t on delimiter(s) d, then restore the previous delimiters.
	local oTIDs, l
	set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
	set l to text items of t
	set AppleScript's text item delimiters to oTIDs
	return l
end decoupe
#=====
Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Friday 3 April 2020 10:17:45
set aText to "“more text Message-ID:<1>more text”
I want to return 1 in this case
“more text
Message-ID: <1234567> more text”
I want to return 1234567 in this case
“more text Message-ID:
<1asioduasdoipasu877> more text”
I want to return 1asioduasdoipasu877 in this case
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj@sdfksjfd in this case"
set theSubTexts to {}
set countText to count aText
set mOffset to 1
repeat
	-- Compare the 11 characters starting at mOffset with the marker.
	set a to text mOffset thru (mOffset + 10) of aText
	if a is "Message-ID:" then
		-- Scan forward to the next "<".
		repeat with i from mOffset + 11 to countText
			if character i of aText is "<" then exit repeat
		end repeat
		-- Scan forward from there to the next ">".
		repeat with j from i + 1 to countText
			if character j of aText is ">" then exit repeat
		end repeat
		set end of theSubTexts to text (i + 1) thru (j - 1) of aText
		set mOffset to j + 1
	else
		set mOffset to mOffset + 1
	end if
	if mOffset > countText - 13 then exit repeat
end repeat
return theSubTexts
--> RESULT: {"1", "1234567", "1asioduasdoipasu877", "asdkjdfkjdfkj@sdfksjfd"}
If you need to extract only the distinct occurrences, you can use:
set aText to "“more text Message-ID:<1>more text”
I want to return 1 in this case
“more text
Message-ID: <1234567> more text”
I want to return 1234567 in this case
“more text Message-ID:
<1asioduasdoipasu877> more text”
I want to return 1asioduasdoipasu877 in this case
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj@sdfksjfd in this case
“more text Message-ID:
<1asioduasdoipasu877> more text”
I want to return 1asioduasdoipasu877 in this case
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj@sdfksjfd in this case"
set inList to rest of my decoupe(aText, "Message-ID:")
set allStrings to {}
repeat with aString in inList
set aString to (item 2 of my decoupe(aString, {"<", ">"})) as text
if aString is not in allStrings then set end of allStrings to aString
end repeat
allStrings
#=====
on decoupe(t, d)
local oTIDs, l
set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
set l to text items of t
set AppleScript's text item delimiters to oTIDs
return l
end decoupe
#=====
Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Friday 3 April 2020 17:30:46
Yes, your code is better than mine. I’ll take it to the library as a good example of parsing.
Your code works very well if you want only the distinct occurrences. My script returns all the occurrences, so I will also leave it in case duplicates are needed. Users should understand the difference between the two approaches so that they can choose what they need.
I would call my script “Parse the text by keywords”, and yours would say “Parse the text by keywords & remove the duplicates”. In most cases, users will need the option to remove duplicates.
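To make the difference between the two approaches concrete, here is a minimal sketch that does both from one handler. The names collectMessageIDs, splitText and the removeDuplicates flag are hypothetical and are not taken from either script in this thread. It assumes that only the first “<” after each “Message-ID:” matters, and it skips any occurrence that has no “>” before the next “<”.

```applescript
-- Hypothetical handler: collects the text between "<" and ">" after each "Message-ID:".
-- Pass true for removeDuplicates to keep only the first copy of each value.
on collectMessageIDs(sourceText, removeDuplicates)
	set foundIDs to {}
	set chunks to rest of (my splitText(sourceText, "Message-ID:"))
	repeat with chunkRef in chunks
		-- Everything after the first "<" of this chunk, if there is one.
		set afterOpen to rest of (my splitText(contents of chunkRef, "<"))
		if afterOpen is not {} then
			set pieces to my splitText(item 1 of afterOpen, ">")
			-- More than one piece means a ">" was actually present.
			if (count of pieces) > 1 then
				set candidate to item 1 of pieces
				if (not removeDuplicates) or (candidate is not in foundIDs) then
					set end of foundIDs to candidate
				end if
			end if
		end if
	end repeat
	return foundIDs
end collectMessageIDs

-- Same idea as decoupe: split on one delimiter and restore the previous delimiters.
on splitText(sourceText, delimiterText)
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimiterText
	set parts to text items of sourceText
	set AppleScript's text item delimiters to oldTIDs
	return parts
end splitText

set sample to "Message-ID:<1> x Message-ID: <2> y Message-ID:<1>"
collectMessageIDs(sample, false) --> {"1", "2", "1"}
collectMessageIDs(sample, true) --> {"1", "2"}
```

With the flag set to false the handler behaves like “Parse the text by keywords”; with true it behaves like “Parse the text by keywords & remove the duplicates”.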
Thank you both! I am trying to read the contents of a file and got it to work by adding line 4 (set aText to "\" " & aText & "\" "), but it fails partway through the script. I am a hack and do not know why some things work and others don’t, so I am sure it is something simple, but I do not know what it is.
set theFile to choose file with prompt "Please select a text file to read:"
set theFile to theFile as string
set aText to read file theFile
set aText to "\" " & aText & "\" "
--set aText to "“more text Message-ID:<1>more text”
--I want to return 1 in this case
--
--“more text
--Message-ID: <1234567> more text”
--I want to return 1234567 in this case
--
--“more text Message-ID:
-- <1asioduasdoipasu877> more text”
--I want to return 1asioduasdoipasu877 in this case
--
--“more text Message-ID:
--
--<asdkjdfkjdfkj@sdfksjfd>more text”
--I want to return asdkjdfkjdfkj@sdfksjfd in this case
--
--“more text Message-ID:
-- <1asioduasdoipasu877> more text”
--I want to return 1asioduasdoipasu877 in this case
--
--“more text Message-ID:
--
--<asdkjdfkjdfkj@sdfksjfd>more text”
--I want to return asdkjdfkjdfkj@sdfksjfd in this case"
set inList to rest of my decoupe(aText, "Message-ID:")
set allStrings to {}
repeat with aString in inList
set aString to (item 2 of my decoupe(aString, {"<", ">"})) as text
if aString is not in allStrings then set end of allStrings to aString
end repeat
allStrings
#=====
on decoupe(t, d)
local oTIDs, l
set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
set l to text items of t
set AppleScript's text item delimiters to oTIDs
return l
end decoupe
#=====
The second one:
X-MS-Exchange-Organization-Network-Message-Id:
0e0751b6-dc27-4a9a-75c7-08d7d7748d7c
X-MS-Exchange-Organization-SCL: -1
isn’t. The characters “<” and “>” are missing.
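A likely reason this header is caught at all: by default AppleScript compares text without regard to case, and since AppleScript 2.0 text item delimiters follow the same rule, so “Message-Id:” is treated as a match for “Message-ID:”. Below is a minimal sketch of a case-sensitive split, assuming AppleScript 2.0 or later. The handler name splitCaseSensitive is hypothetical and the sample text is made up for illustration.

```applescript
-- Hypothetical handler: like decoupe, but the delimiter must match case exactly.
on splitCaseSensitive(sourceText, delimiterText)
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimiterText
	considering case
		set parts to text items of sourceText
	end considering
	set AppleScript's text item delimiters to oldTIDs
	return parts
end splitCaseSensitive

-- Made-up sample: one header spelled "Message-Id:" and one spelled "Message-ID:".
set sample to "Network-Message-Id: 0e0751b6" & linefeed & "Message-ID: <1234567>"
-- Only the exact-case header should split the text, leaving two pieces.
count of splitCaseSensitive(sample, "Message-ID:")
```

This only filters out headers that differ in case. A header spelled exactly “Message-ID:” but without “<” and “>” would still need the extra checks.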
If this structure is the common one, you must replace my original script with:
set theFile to choose file with prompt "Please select a text file to read:"
-- set theFile to theFile as string
set aText to read theFile
set aText to "\" " & aText & "\" "
if aText does not contain "Message-ID:" then error "This file doesn't contain the string “Message-ID:”!"
--set aText to "“more text Message-ID:<1>more text”
--I want to return 1 in this case
--
--“more text
--Message-ID: <1234567> more text”
--I want to return 1234567 in this case
--
--“more text Message-ID:
-- <1asioduasdoipasu877> more text”
--I want to return 1asioduasdoipasu877 in this case
--
--“more text Message-ID:
--
--<asdkjdfkjdfkj@sdfksjfd>more text”
--I want to return asdkjdfkjdfkj@sdfksjfd in this case
--
--“more text Message-ID:
-- <1asioduasdoipasu877> more text”
--I want to return 1asioduasdoipasu877 in this case
--
--“more text Message-ID:
--
--<asdkjdfkjdfkj@sdfksjfd>more text”
--I want to return asdkjdfkjdfkj@sdfksjfd in this case"
set inList to rest of my decoupe(aText, {"Message-ID:<", "Message-ID:" & linefeed & tab & "<", "Message-ID:" & return & tab & "<"})
set allStrings to {}
repeat with aString in inList
set aString to (item 1 of my decoupe(aString, {">"})) as text
if aString is not in allStrings then set end of allStrings to aString
end repeat
allStrings
#=====
on decoupe(t, d)
local oTIDs, l
set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
set l to text items of t
set AppleScript's text item delimiters to oTIDs
return l
end decoupe
#=====
But I have no guarantee that this handles every variation of the delimiter: “Message-ID:”, then something, then “<”.
That is why I used the original code, but I did not anticipate that your documents might not match your description.
Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Friday 3 April 2020 20:00:46
I only want it to return a result if this criterion is met: “Message-ID:”, ‘maybe some text, maybe not’, followed by “<”, ‘needed text’, followed by “>”.
This is the only case in which I need it to return a result, regardless of any other occurrences that may come close to matching the above criterion. I apologize if I was not clear initially. I was just trying to show that there could be multiple variations of the text / characters before and after the needed text. Either way, thank you very much, greatly appreciated.
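Read literally, that criterion (“Message-ID:”, then anything or nothing, then “<”, the needed text, then “>”) can be sketched with the offset command for the first occurrence only. The handler name firstMessageID is hypothetical. It returns missing value when any of the three parts is absent, and it does not check what lies between the marker and the “<”.

```applescript
-- Hypothetical handler: text between the first "<" and the next ">"
-- that follow the first "Message-ID:"; missing value if the pattern is incomplete.
on firstMessageID(sourceText)
	set markerPos to offset of "Message-ID:" in sourceText
	if markerPos is 0 then return missing value
	-- "Message-ID:" is 11 characters long; stop if nothing follows it.
	if markerPos + 11 > (length of sourceText) then return missing value
	set tailText to text (markerPos + 11) thru -1 of sourceText
	
	set openPos to offset of "<" in tailText
	if openPos is 0 or openPos is (length of tailText) then return missing value
	set restText to text (openPos + 1) thru -1 of tailText
	
	set closePos to offset of ">" in restText
	if closePos is 0 then return missing value
	if closePos is 1 then return "" -- "<>" with nothing inside
	return text 1 thru (closePos - 1) of restText
end firstMessageID

firstMessageID("more text Message-ID:" & linefeed & "<1asioduasdoipasu877> more text")
--> "1asioduasdoipasu877"
```

Because only the first marker is examined, a later “Message-ID:” in the same file is ignored, which fits the observation that the needed value has so far always been the first instance.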
set theText to "
“more text Message-ID:<1>more text”
I want to return 1 in this case
“more text
Message-ID: <1234567> more text”
I want to return 1234567 in this case
“more text Message-ID:
<1asioduasdoipasu877> more text”
I want to return 1asioduasdoipasu877 in this case
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj@sdfksjfd in this case
There are some instances where “Message-ID: <“ or some variation of it is found more than once in the text file but so far it has always been the first instance.
“more text Message-ID:
1asioduasdoipasu877@> more text” <asdkjdfkjπdfkj@sdfksjfd>more text”
I want to return 1asioduasdoipasu877 in this case
“more text Message-ID:
<asdkjdfkjdfkj@sdfksjfd>more text”
I want to return asdkjdfkjdfkj~sdfksjfd in this case
There are some instances where “Message-ID: <“ or some variation of it is found more than once in the text file but so far it has always been the first instance."
set inList to rest of my decoupe(theText, "Message-ID:")
set allStrings to {}
repeat with aString in inList
	if aString contains "<" then
		set subList to rest of my decoupe(aString, "<")
		repeat with bString in subList
			if bString contains ">" then
				set cString to item 1 of my decoupe(bString, ">") as string
				if cString is not in allStrings then set end of allStrings to cString
			end if
		end repeat
		--if aString is not in allStrings then set end of allStrings to aString
	end if
end repeat
allStrings
#=====
on decoupe(t, d)
local oTIDs, l
set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
set l to text items of t
set AppleScript's text item delimiters to oTIDs
return l
end decoupe
#=====
Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Saturday 4 April 2020 11:04:38