Saturday, November 18, 2017

#1 2006-09-07 02:41:06 pm

Bruce Phillips
Administrator
Registered: 2004-07-16
Posts: 2649

Trim [Remove Spaces]

trim:

Applescript:

-- theseCharacters : A list of characters to trim
-- someText : The text to be trimmed
--
on trim(theseCharacters, someText)
   -- Lazy default (AppleScript doesn't support default values)
   if theseCharacters is true then set theseCharacters to ¬
       {" ", tab, ASCII character 10, return, ASCII character 0}
   
   repeat until first character of someText is not in theseCharacters
       set someText to text 2 thru -1 of someText
   end repeat
   
   repeat until last character of someText is not in theseCharacters
       set someText to text 1 thru -2 of someText
   end repeat
   
   return someText
end trim

-- Example
trim(" Hello, World!    ", true)

A slightly simpler version that only removes spaces:

Applescript:

on trim(someText)
   repeat until someText does not start with " "
       set someText to text 2 thru -1 of someText
   end repeat
   
   repeat until someText does not end with " "
       set someText to text 1 thru -2 of someText
   end repeat
   
   return someText
end trim
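
For what it's worth, both handlers above assume someText contains at least one character that is not trimmed. With a text made up only of spaces, text 2 thru -1 is eventually asked of a one-character text and raises an error. Below is a minimal sketch of a guarded space-only variant. The name trimSpaces is hypothetical (not from the post), and returning "" for all-space input is an assumption about the desired behaviour.

```applescript
-- Hypothetical guarded variant of the space-only handler (illustrative name)
on trimSpaces(someText)
	-- Leading spaces: bail out before text 2 thru -1 is asked of a 1-character text
	repeat while someText starts with " "
		if (count of someText) is 1 then return ""
		set someText to text 2 thru -1 of someText
	end repeat
	
	-- Trailing spaces: the first character is now a non-space (or the text is empty),
	-- so this loop always stops while at least that character remains
	repeat while someText ends with " "
		set someText to text 1 thru -2 of someText
	end repeat
	
	return someText
end trimSpaces

-- Example
trimSpaces("   ") -- all spaces, no error
trimSpaces("  Hello, World!  ")
```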


Filed under: trim, strip


#2 2015-08-10 02:21:37 am

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

Re: Trim [Remove Spaces]

@Bruce Phillips:  Thanks for sharing your code for a trim function.

I have optimized it somewhat, hoping to:
1. Improve performance for large strings by updating the source string only after calculating the area to be trimmed.
2. Use more descriptive variable names.
3. Add an option to trim the left side, the right side, or both sides of the string.

But the core code is the same.

Applescript:


--

set strTest to "\tsome text here and here\t "
my trimThis(strTest, true, "full")

on trimThis(pstrSourceText, pstrCharToTrim, pstrTrimDirection)
   
   -- pstrCharToTrim     : A list of characters to trim, or true to use default
   -- pstrSourceText : The text to be trimmed
   -- pstrTrimDirection : Direction of Trim ("right","left", "full")
   
   set strTrimedText to pstrSourceText
   
   ---    USE DEFAULT IF true IS PASSED ---
   -- Lazy default (AppleScript doesn't support default values)
   
   if pstrCharToTrim is true then
       set pstrCharToTrim to {" ", tab, ASCII character 10, return, ASCII character 0}
   end if
   
   --- TRIM LEFT SIDE OF STRING ---
   
   if (pstrTrimDirection = "full") or (pstrTrimDirection = "left") then
       set iLoc to 1
       repeat until character iLoc of strTrimedText is not in pstrCharToTrim
           set iLoc to iLoc + 1
       end repeat
       
       set strTrimedText to text iLoc thru -1 of strTrimedText
   end if
   
   --- TRIM RIGHT SIDE OF STRING ---
   
   
   if (pstrTrimDirection = "full") or (pstrTrimDirection = "right") then
       set iLoc to count of strTrimedText
       repeat until character iLoc of strTrimedText is not in pstrCharToTrim
           set iLoc to iLoc - 1
       end repeat
       
       set strTrimedText to text 1 thru iLoc of strTrimedText
       
   end if
   
   return strTrimedText
   
end trimThis


iMac-27 Late 2015 Retina 5K Screen (& others)
macOS 10.11.6 (El Capitan)


#3 2015-08-10 05:39:30 am

DJ Bazzie Wazzie
Member
From:: the Netherlands
Registered: 2004-10-20
Posts: 2724

Re: Trim [Remove Spaces]

Thanks for the update, Michael, but when bringing old posts back to life you should consider the changes to AppleScript through the years as well. Improvements over Michael's script are:

- Since AppleScript 2.0, scripts should use (character) id instead of the ASCII character command.
- AppleScript 2.0 is Unicode, so the default now trims all Unicode space characters instead of being limited to MacRoman whitespace.
- BugFix: The handler returns "" if the string contains only whitespace.
- BugFix: The handler returns "" if the string is empty.
- Optimization: The string is trimmed only once, not twice (text n thru n ...).
- Direction is indicated with the justification constants left and right; with missing value (or any other value) a full trim is applied.
- pstrCharToTrim now takes a more obvious value than a boolean. Pass missing value (read: undefined) to use the default whitespace. The defaults are also used when the given object is not a list, instead of returning an error.

Applescript:

set strTest to "\tHello World! "
trimThis(strTest, missing value, left) -- result: "Hello World! "
trimThis(strTest, missing value, right) -- result: "\tHello World!"
trimThis(strTest, {tab, return, linefeed}, missing value) -- result: "Hello World! "

on trimThis(pstrSourceText, pstrCharToTrim, pstrTrimDirection)
   -- pstrCharToTrim     : A list of characters to trim, or missing value to use the default
   -- pstrSourceText : The text to be trimmed
   -- pstrTrimDirection : Direction of Trim left, right or any value for full
   
   set strTrimedText to pstrSourceText
   
   -- If undefined, use the default whitespace
   if pstrCharToTrim is missing value or class of pstrCharToTrim is not list then
       -- trim tab, newline, return and all the unicode characters from the 'separator space' category
       -- [url]http://www.fileformat.info/info/unicode/category/Zs/list.htm[/url]
       set pstrCharToTrim to {tab, linefeed, return, space, character id 160, character id 5760, character id 8192, character id 8193, character id 8194, character id 8195, character id 8196, character id 8197, character id 8198, character id 8199, character id 8200, character id 8201, character id 8202, character id 8239, character id 8287, character id 12288}
   end if
   
   set lLoc to 1
   set rLoc to count of strTrimedText
   
   --- From left to right, get location of first non-whitespace character
   if pstrTrimDirection is not right then
       repeat until lLoc = (rLoc + 1) or character lLoc of strTrimedText is not in pstrCharToTrim
           set lLoc to lLoc + 1
       end repeat
   end if
   
   -- From right to left, get location of first non-whitespace character
   if pstrTrimDirection is not left then
       repeat until rLoc = 0 or character rLoc of strTrimedText is not in pstrCharToTrim
           set rLoc to rLoc - 1
       end repeat
   end if
   
   -- lLoc = rLoc means exactly one character is left, and it must be kept
   if lLoc > rLoc then
       return ""
   else
       return text lLoc thru rLoc of strTrimedText
   end if
end trimThis

Last edited by DJ Bazzie Wazzie (2015-08-10 05:44:23 am)


#4 2015-08-10 08:04:52 am

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 5175

Re: Trim [Remove Spaces]

DJ Bazzie Wazzie wrote:

you should consider changes of AppleScript through the years as well.


And why not? ;)

Applescript:

use AppleScript version "2.4"
use framework "Foundation"

set strTest to "\tHello World! "
trimThis(strTest, missing value, left) -- result: "Hello World! "
trimThis(strTest, missing value, right) -- result: "\tHello World!"
trimThis(strTest, tab & return & linefeed, missing value) -- result: "Hello World! "

on trimThis(pstrSourceText, pstrCharToTrim, pstrTrimDirection)
   -- pstrCharToTrim : A string of characters to trim, or missing value (or true) to use the default
   -- pstrSourceText : The text to be trimmed
   -- pstrTrimDirection : Direction of Trim left, right or any value for full
   if pstrCharToTrim = missing value or pstrCharToTrim = true then
       set setToTrim to current application's NSCharacterSet's whitespaceAndNewlineCharacterSet()
   else
       set setToTrim to current application's NSCharacterSet's characterSetWithCharactersInString:pstrCharToTrim
   end if

   set anNSString to current application's NSString's stringWithString:pstrSourceText
   if pstrTrimDirection = left then
       set theRange to anNSString's rangeOfCharacterFromSet:(setToTrim's invertedSet())
       if |length| of theRange = 0 then return ""
       set anNSString to anNSString's substringFromIndex:(theRange's location)
   else if pstrTrimDirection = right then
       set theRange to anNSString's rangeOfCharacterFromSet:(setToTrim's invertedSet()) options:(current application's NSBackwardsSearch)
       if |length| of theRange = 0 then return ""
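       -- substringToIndex: excludes the character at the given index, so step past the match
       set anNSString to anNSString's substringToIndex:((theRange's location) + (|length| of theRange))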
   else
       set anNSString to anNSString's stringByTrimmingCharactersInSet:setToTrim
   end if
   return anNSString as text
end trimThis

Considerably slower on the sample string, but gets more competitive with longer strings.
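
If only a full trim (both sides) with the default character set is needed, the final else branch above can stand on its own. Here is a minimal sketch, assuming AppleScript 2.4 or later. trimBoth is a hypothetical name used for illustration.

```applescript
use AppleScript version "2.4"
use framework "Foundation"

-- Hypothetical helper: trims whitespace and newlines from both ends of someText
on trimBoth(someText)
	set theString to current application's NSString's stringWithString:someText
	set theSet to current application's NSCharacterSet's whitespaceAndNewlineCharacterSet()
	-- stringByTrimmingCharactersInSet: copes with empty and all-whitespace input on its own
	return (theString's stringByTrimmingCharactersInSet:theSet) as text
end trimBoth

trimBoth(tab & "Hello World! " & linefeed) -- result: "Hello World!"
```

No index arithmetic is involved here, so the empty and whitespace-only cases need no special handling.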


Shane Stanley <sstanley@myriad-com.com.au>
www.macosxautomation.com/applescript/apps/


#5 2015-08-10 09:59:22 am

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

Re: Trim [Remove Spaces]

DJ Bazzie Wazzie wrote:

Thanks for the update, Michael, but when bringing old posts back to life you should consider the changes to AppleScript through the years as well. Improvements over Michael's script are:

- Since AppleScript 2.0, scripts should use (character) id instead of the ASCII character command.
- AppleScript 2.0 is Unicode, so the default now trims all Unicode space characters instead of being limited to MacRoman whitespace.
- BugFix: The handler returns "" if the string contains only whitespace.
- BugFix: The handler returns "" if the string is empty.
- Optimization: The string is trimmed only once, not twice (text n thru n ...).
- Direction is indicated with the justification constants left and right; with missing value (or any other value) a full trim is applied.


@DJ:  Thanks for the backhanded compliment.  :(

When providing what you think is an improved version of someone else's script, it is not necessary to be critical if nothing wrong was done.  Whether or not your version is better will be left up to the reader.

I see nothing wrong with using the ASCII character command.  It works and is more obvious than the character ID approach.
I also think it is more obvious to use a value of "full" rather than missing value for parameter pstrTrimDirection.  There's no reason not to allow for both.

Your "optimization" of trimming once instead of twice is a very minor optimization.  But it is good.

Thanks for sharing your version.  It is good the thread is alive again.  :cool:



#6 2015-08-10 11:20:01 am

DJ Bazzie Wazzie
Member
From:: the Netherlands
Registered: 2004-10-20
Posts: 2724

Re: Trim [Remove Spaces]

JMichaelTX wrote:

@DJ:  Thanks for the backhanded compliment.  :(


That was not my intention, and I didn't mean it that way.

JMichaelTX wrote:

I see nothing wrong with using the ASCII character command.


Me neither, but I prefer to follow Apple's guidelines and the AppleScript Language Guide, which clearly says not to use the ASCII number and ASCII character commands as of AppleScript 2.0.

JMichaelTX wrote:

Whether or not your version is better will be left up to the reader.


By that reasoning Bruce's version can still be the best, depending on the mind that's reading it ;)

I still consider removing the possible errors and making it AppleScript 2.0 compatible an improvement that is more than a preference. The improvement is not in performance but in reliability and support. Replacing "right", "left" and "full" with constants was to make it cleaner (a personal preference, to make the if statement faster and cleaner inside the routine), and using missing value rather than true follows the common gentlemen's agreement between programmers. Using true is technically not wrong, but it's uncommon and may lead to confusion.


#7 2015-08-10 11:50:38 am

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

Re: Trim [Remove Spaces]

DJ Bazzie Wazzie wrote:
JMichaelTX wrote:

I see nothing wrong with using the ASCII character command.


Me neither, but I prefer to follow Apple's guidelines and the AppleScript Language Guide, which clearly says not to use the ASCII number and ASCII character commands as of AppleScript 2.0.


OK, I did not realize the ASCII commands had been deprecated.

From the Apple Guidelines:

id
    Access:    read-only
    Class:    integer or list of integer
    A value (or list of values) representing the Unicode code point (or code points) for the character (or characters) in the text object. (A Unicode code point is a unique number that represents a character and allows it to be represented in an abstract way, independent of how it is rendered. A character in a text object may be composed of one or more code points.)
    This property, added in AppleScript 2.0, can also be used as an address, which allows mapping between Unicode code point values and the characters at those code points. For example, id of "A" returns 65, and character id 65 returns "A".
    The id of text longer than one code point is a list of integers, and vice versa: for example, id of "hello" returns {104, 101, 108, 108, 111}, and string id {104, 101, 108, 108, 111} returns "hello". (Because of a bug, text id ... does not work; you must use one of string, Unicode text, or character.)
    These uses of the id property obsolete the older ASCII character and ASCII number commands, since, unlike those, they cover the full Unicode character range and will return the same results regardless of the user's language preferences.
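
A short sketch of the id forms described in the quoted text. The first four results are the ones the quote itself gives. The variable name trimList is hypothetical and only shows where character id would replace ASCII character in a trim list.

```applescript
-- Results shown are those given in the quoted documentation
id of "A" -- 65
character id 65 -- "A"
id of "hello" -- {104, 101, 108, 108, 111}
string id {104, 101, 108, 108, 111} -- "hello"

-- In a trim list, ASCII character 10 becomes character id 10 (the linefeed character).
-- trimList is an illustrative name, not taken from the scripts in this thread.
set trimList to {space, tab, character id 10, return}
```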


Maybe even better is to use the constants:

[Image: AS-WhiteSpace-Constants.gif]
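
A short sketch of using the built-in text constants in place of code points. space, tab, return and linefeed are standard AppleScript constants. Whether the image listed exactly these four is an assumption, and both variable names are hypothetical.

```applescript
-- Each constant is ordinary text, so it can go straight into a trim list
set whitespaceList to {space, tab, return, linefeed}

-- The same four characters written by code point, for comparison
set byCodePoint to {character id 32, character id 9, character id 13, character id 10}

whitespaceList = byCodePoint -- true
```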



#8 2015-08-10 09:02:51 pm

Shane Stanley
Member
From:: Australia
Registered: 2002-12-07
Posts: 5175

Re: Trim [Remove Spaces]

JMichaelTX wrote:

I see nothing wrong with using the ASCII character command.  It works


As you've found, it's deprecated. And yes, it does work -- but only for true ASCII characters -- that is, for numbers 0 to 127. Above that, it's unreliable.

But there's another reason to prefer the id approach: it's much faster. The difference is insignificant in this example because you're only calling it once, but it's worth keeping in mind. Using ASCII character involves sending an Apple event, where using id does not. For example:

Applescript:

set theStart to current date
repeat 3000 times
   repeat with i from 0 to 127
       ASCII character i
   end repeat
end repeat
return (current date) - theStart

And this:

Applescript:

set theStart to current date
repeat 3000 times
   repeat with i from 0 to 127
       string id i
   end repeat
end repeat
return (current date) - theStart

Using a constant like linefeed is quicker still, but only by the narrowest of margins.



#9 2015-08-10 10:36:49 pm

JMichaelTX
Member
From:: Houston, TX (The Woodlands)
Registered: 2014-07-12
Posts: 139

Re: Trim [Remove Spaces]

Shane Stanley wrote:
JMichaelTX wrote:

I see nothing wrong with using the ASCII character command.  It works


As you've found, it's deprecated. And yes, it does work -- but only for true ASCII characters -- that is, for numbers 0 to 127. Above that, it's unreliable.


You're beating a dead horse.  ;)

I noted it was deprecated, and even quoted the Apple Guidelines.
I was already in agreement to not use the ASCII commands.

JMichaelTX wrote:

OK, I did not realize the ASCII commands had been deprecated.

From the Apple Guidelines:
. . .

