Somewhat fuzzy description of 'offset'

What does that last sentence mean? Looking for either a single char or a string consistently returns the first match.
Sample code:

considering case
	set blok to "The result of matching part of a character cluster is undefined."
	offset of " p" in blok
end considering

I’d like to see a variant which returns an “undefined result”, please.

I think it’s referring to the fact that diacritical characters like “é” may be represented internally either directly as themselves or as clusters consisting of the unadorned characters and modifying codes. In the latter case, ‘offset’ would treat the cluster as one character and wouldn’t be able to pick out any of the constituent members.

A ghost from the days of multiple text classes, then?

No. It’s to do with the different “Normalization Forms” allowed in Unicode. I don’t know a lot about them, but there’s plenty about them on the Web. Basically, the ASLG (AppleScript Language Guide) extract you posted is saying “Normalization doesn’t affect the results from ‘offset’.”
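
For anyone who wants to poke at this, here is a minimal sketch of a variant that builds the same word in both forms and then searches for part of a cluster. The word “café” and the variable names are my own choices for illustration, not anything from the ASLG, and I am deliberately not predicting the results: the last search is exactly the case the guide describes as undefined, so what comes back may differ between systems.

```applescript
-- Build the same visible text, "café", in two Unicode normalization forms.
-- string id turns a list of code points into text.
set composed to string id {99, 97, 102, 233} -- é as the single code point U+00E9
set decomposed to string id {99, 97, 102, 101, 769} -- e (U+0065) plus combining acute accent (U+0301)

-- Matching a whole character cluster: look for "é" in each form.
set wholeInComposed to offset of (string id {233}) in composed
set wholeInDecomposed to offset of (string id {233}) in decomposed

-- Matching part of a cluster: look for the bare "e" inside the decomposed "é".
-- This is the case described as undefined, so no particular result is promised.
set partOfCluster to offset of "e" in decomposed

-- Return the code points of both strings alongside the three offsets for comparison.
return {id of composed, id of decomposed, wholeInComposed, wholeInDecomposed, partOfCluster}
```

If the two whole-cluster offsets agree, that would fit the reading that normalization doesn’t change what ‘offset’ finds. The last value is the one to be suspicious of.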

It is funny to speculate a little :smiley:

A character can consist of several glyphs (like é, which consists of two, both of which are characters in their own right). On another level, we deal with ligatures like the florin ƒ, which is also a character in its own right but may also be put together with other ligatures to form characters.

I read the whole thing as a very difficult way of saying that the smallest character unit AppleScript deals with is a displayed character. :slight_smile: