Beginner's query on deleting HTML markup

I’m new to AppleScript, and to coding, and am seeking a way to delete all content between two angle brackets, including the brackets themselves (< and >), from various Word documents.

I’ve searched the forums, paged through several manuals, and am currently staring into an abyss of incapacity. Any advice to a beginner would be appreciated.

Hi mordred. Welcome to MacScripter.

I don’t know how things are in Word documents, but there’s usually more to converting HTML to plain text than just stripping out the tags. However, to answer your specific query, one possible solution would be this:

-- Say the HTML text is in a variable called htmlText.

-- Break up the text using tag boundary characters as delimiters.
-- (The use of multiple delimiters requires Mac OS X 10.6 or later.)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"<", ">"}
set theBits to htmlText's text items

-- Even-numbered text items in the resulting list are the contents of tags. Replace them with something other than text.
repeat with i from 2 to (count theBits) by 2
	set item i of theBits to missing value
end repeat

-- Coerce the text items remaining in the list back to a single text.
set AppleScript's text item delimiters to ""
set taglessText to theBits's text as text
set AppleScript's text item delimiters to astid

taglessText --> The desired result.
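
If the same stripping is needed in more than one place, the steps above could be wrapped in a handler. This is only a minimal sketch: the handler name stripTags is made up for illustration, and it assumes every "<" in the text opens a tag and every ">" closes one.

```applescript
-- stripTags is a hypothetical handler name, not part of AppleScript or Word.
on stripTags(htmlText)
	-- Remember the current delimiters so they can be restored afterwards.
	set astid to AppleScript's text item delimiters
	-- Split the text at every tag boundary character.
	set AppleScript's text item delimiters to {"<", ">"}
	set theBits to htmlText's text items
	-- Even-numbered items are tag contents; replace them with a non-text value.
	repeat with i from 2 to (count theBits) by 2
		set item i of theBits to missing value
	end repeat
	-- Join only the items that are still text.
	set AppleScript's text item delimiters to ""
	set taglessText to theBits's text as text
	set AppleScript's text item delimiters to astid
	return taglessText
end stripTags

stripTags("<p>Hello, <b>world</b>!</p>") --> "Hello, world!"
```

Text containing a stray "<" or ">" that isn't part of a tag would throw the even/odd counting out, so this is best kept for markup known to be well formed.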

Thanks for your generous help, Mr. Garvey. I’ve compiled your guide into the following script, which is currently hanging at the bolded bit in the sixth line: error “The variable theBits is not defined.” number -2753 from “theBits”. Is my mistake in the way I’ve composed the fifth line (“selection’s text items”)?

All thanks once again for your help.

tell application "Microsoft Word"
	tell selection
		set astid to AppleScript's text item delimiters
		set AppleScript's text item delimiters to {"<", ">"}
		set theBits to selection's text items
		repeat with i from 2 to (count theBits) by 2
			set item i of theBits to missing value
		end repeat
		set AppleScript's text item delimiters to ""
		set taglessText to theBits's text as text
		set AppleScript's text item delimiters to astid
	end tell
end tell

Hi,

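In Word, the syntax might be as follows. The main change is that the selection's content is fetched as text first, and the text items are then taken from that text:
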
set tid to text item delimiters
set text item delimiters to {"<", ">"}
tell application "Microsoft Word" to set theBits to text items of (get content of selection)
repeat with i from 2 to (count theBits) by 2
	set item i of theBits to missing value
end repeat
set text item delimiters to ""
set taglessText to theBits's text as text
set text item delimiters to tid
tell application "Microsoft Word" to set content of selection to taglessText

Thank you for this, StefanK!