Hi everyone,
I wrote the following function to replace all non-alphanumeric characters with a space, but when I started using it on somewhat bigger text files (around 16K) it became obvious that it is extremely slow.
It just looks at each character and appends it to a new string if it is allowed; otherwise it replaces it with a space.
So, am I doing something completely wrong? It's a straightforward operation, but processing time seems to grow far faster than the length of the text: a 2K text takes about 15 seconds, and with a 16K text I stopped waiting after 20 minutes (not exaggerating). When I do this in PHP, for instance, it is done in under 1 second.
-- replaces non-alphanumeric characters in source_text with a space and collapses duplicate spaces
-- note: returns an empty string when the input contains no alphanumeric characters
on fn_normalize(source_text)
	
	-- check input
	try
		set source_text to source_text as string
	on error
		return false
	end try
	
	if length of source_text < 1 then return ""
	
	
	-- replace every character that is not plain alphanumeric with a space
	set allowed_characters to "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"
	
	-- build result text with allowed characters
	set result_string to ""
	set last_insert_is_space to true
	
	-- step through input characters
	repeat with single_character in the characters of source_text
		
		if ((offset of single_character in allowed_characters) > 0) then
			set result_string to result_string & single_character
			set last_insert_is_space to false
		else
			
			if last_insert_is_space is false then
				set result_string to result_string & space
				set last_insert_is_space to true
			end if
			
		end if
		
	end repeat
	
	
	-- trim end of string
	-- the loop never adds a leading space, so an input with no allowed characters leaves result_string empty;
	-- "ends with" is safe on an empty string, whereas "character -1" would raise an error there
	if result_string ends with space then
		set result_string to text 1 thru -2 of result_string
	end if
	
	
	return result_string
	
	
end fn_normalize
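
For reference, this is roughly how I call it and what I expect back. It is only a minimal sketch: it assumes the fn_normalize handler above is in the same script, and the sample string and the variable name sample_text are made up for illustration.

```applescript
-- assumes the fn_normalize handler above is defined in this same script
-- sample_text is a hypothetical input, not taken from my real files
set sample_text to "Hello,   world!"

-- punctuation and runs of spaces collapse to a single space,
-- and the trailing space left by "!" is trimmed
fn_normalize(sample_text)
--> "Hello world"
```

Small inputs like this come back right away; the slowdown only shows up on texts of a few thousand characters or more.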

