First off: AppleScript is terribly slow when it comes to processing strings. To process longer text blobs, use a shell script.
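As a rough illustration of handing the work to the shell, the sketch below pulls characters 2 to 4 out of a string with `cut`. The sample text and the choice of `cut` are assumptions made for this example and are not part of the scripts below. `cut -c` may count bytes rather than characters on some systems, so treat it as ASCII-only.

```applescript
-- Sketch only: let the shell do the slicing instead of AppleScript.
set s_text to "abcdefghijklmnopqrstuvwxyz"
-- quoted form of protects the text from being interpreted by the shell
set s_part to do shell script "printf '%s' " & quoted form of s_text & " | cut -c 2-4"
-- s_part now holds "bcd"
```

For short strings the overhead of starting a shell probably outweighs the gain, which is why the handlers below stay in plain AppleScript.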
I found myself writing a method to extract portions of a string.
substrid(text_string, first_character_position, last_character_position)
My goals for it were:
- must always return a string:
  - even if parameters point out of bounds → “”
  - if an empty string is supplied → “”
  - if a parameter is zero → depends on the other parameter, but always a string
- must reverse the resulting string when last_character_position comes before first_character_position
- should an AppleScript error occur (never say never), the resulting string contains the error message
Then something drove me to write a second method that behaves almost exactly like PHP’s substr() function. The first difference is that a length value must always be passed (AppleScript’s limitation, or mine).
The other difference: PHP’s substr() function returns boolean false in some cases, for example when the start position lies beyond the end of the string. My AppleScript ‘clone’ doesn’t; it returns an empty string instead.
In my opinion that is the better return value in an AppleScript context, where you can’t write:
if returned_string then …
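To make that point concrete, here is a minimal sketch of the explicit comparison AppleScript needs instead. The variable names are hypothetical and the empty string stands in for a call that extracted nothing.

```applescript
-- Hypothetical result of an extraction that found nothing
set returned_string to ""
-- A string cannot be coerced to a boolean, so compare it explicitly
if returned_string is "" then
    set s_status to "nothing extracted"
else
    set s_status to "extracted: " & returned_string
end if
```

With an empty string as the 'nothing' value, one comparison covers every case and the caller never has to check the type of the result first.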
This is the base script:
substrid(text_string, first_character_position, last_character_position)
to substrid(s_text, i_start, i_stop)
    (* core: string from pos to pos of string *)
    --log "substrid 1: " & s_text & ", " & i_start & ", " & i_stop
    try
        if 0 = i_stop then
            if i_start ≤ 0 then
                return ""
            else
                set i_stop to 1
            end if
        end if
        if 0 = i_start then set i_start to 1
        set i_text_length to length of s_text
        if 0 = i_text_length then return ""
        --log " " & (i_text_length) & ", " & (i_text_length * -1)
        -- both out of bounds
        if (i_start < i_text_length * -1 and i_stop < i_text_length * -1) or (i_start > i_text_length and i_stop > i_text_length) then return ""
        if i_start > i_text_length then set i_start to i_text_length --return ""
        if i_start < i_text_length * -1 then set i_start to 1 --i_text_length * -1 --return ""--
        if i_stop > i_text_length then set i_stop to i_text_length
        if i_stop < i_text_length * -1 then set i_stop to 1 --i_text_length * -1
        set sReturn to characters i_start thru i_stop of s_text as text
        --log "substrid 2: " & s_text & ", " & i_start & ", " & i_stop & " => " & sReturn
        if (i_stop < 0) and (i_start > 0) then
            set i_start_true_pos to i_start
            set i_stop_true_pos to i_text_length + i_stop
        else if (i_stop > 0) and (i_start < 0) then
            set i_start_true_pos to i_text_length + i_start
            set i_stop_true_pos to i_stop
            (*else if (i_stop > 0) and (i_start > 0) then*)
        else
            set i_start_true_pos to 0
            set i_stop_true_pos to 0
            set i_start_true_pos to i_start
            set i_stop_true_pos to i_stop
        end if
        -- reverse?
        if i_stop_true_pos < i_start_true_pos then
            -- some alternative code lines that all result in the same
            --return reverse of sReturn's characters as text
            -- or 'super abrev'
            --return sReturn's characters's reverse as text
            -- or more 'old school'
            return the reverse of the characters of the sReturn as text
        else
            return sReturn
        end if
    on error msg number n
        return msg & " (" & n & ") original string:[" & s_text & "]:"
    end try
end substrid
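A side note on the line `characters i_start thru i_stop of s_text as text`: coercing a list of characters to text puts the current text item delimiters between the items, so the result is only clean while the delimiters are at their default. As a minimal sketch of an alternative (not what substrid() uses), the `text … thru …` form returns a string directly and accepts negative positions as well:

```applescript
set s_text to "abcdefghijklmnopqrstuvwxyz"
-- positive positions count from the front, starting at 1
set s_mid to text 2 thru 4 of s_text -- "bcd"
-- negative positions count from the back, -1 being the last character
set s_tail to text -3 thru -1 of s_text -- "xyz"
```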
This is the PHP substr() ‘clone’; it uses the substrid() method above:
to PHP_substr(s_text, i_start, i_length)
    (* variation: like php substr() index -> start with 0 not 1 and instead of returning false, it returns an empty string; also i_length is not optional -> use -1 to get remainder of s_text *)
    try
        --log "PHP_substr a: " & s_text & ", " & i_start & ", " & i_length
        -- first those that clearly leave an empty string
        -- zero length
        if 0 = i_length then return ""
        -- both negative and start is bigger or same as length
        if 0 > i_start and 0 > i_length and i_start ≥ i_length then return ""
        set i_text_length to length of s_text
        -- zero text length
        if 0 = i_text_length then return ""
        -- zero start, negative length and length points beyond start
        if 0 = i_start and 0 > i_length and i_length ≤ i_text_length * -1 then return ""
        -- both positive and start points beyond end
        if 0 < i_start and 0 < i_length and i_start ≥ i_text_length then return ""
        -- zero start and negative length
        --if 0 = i_start and 0 < i_length then set i_start to 1
        -- determine real positions and use substrid to return string
        if 0 < i_start then
            -- positive start
            if 0 < i_length then
                -- positive length
                set i_stop_tmp to i_start + i_length
                set i_start_tmp to i_start + 1
                if 1 > i_start_tmp then set i_start_tmp to 1
            else
                -- negative length
                set i_stop_tmp to i_text_length + i_length
                set i_start_tmp to i_start + 1
                if i_start_tmp > i_stop_tmp then return "" --set i_stop_tmp to 1
            end if
        else if 0 = i_start then
            if 0 < i_length then
                -- positive length
                set i_start_tmp to 1
                set i_stop_tmp to i_start_tmp + i_length - 1
            else
                -- negative length
                set i_stop_tmp to i_text_length + i_length
                if 1 > i_stop_tmp then set i_stop_tmp to 1
                set i_start_tmp to i_start --+ 1
            end if
        else
            -- negative start -> start from back
            if i_start < i_text_length * -1 then return ""
            if 0 < i_length then
                set i_start_tmp to i_text_length + i_start + 1
                set i_stop_tmp to i_start_tmp + i_length - 1
                --log i_start_tmp
                if 1 > i_start_tmp then set i_start_tmp to 1
            else
                set i_stop_tmp to i_text_length + i_length
                if 1 > i_stop_tmp then set i_stop_tmp to 1
                set i_start_tmp to i_start --+ 1
            end if
        end if
        -- make sure we don't get a reversed string back
        if i_start_tmp < i_stop_tmp then
            set i_start to i_start_tmp
            set i_stop to i_stop_tmp
        else
            set i_start to i_stop_tmp
            set i_stop to i_start_tmp
        end if
        return substrid(s_text, i_start, i_stop)
    on error msg number n
        return msg & " (" & n & ") original string:[" & s_text & "]:"
    end try
end PHP_substr
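Most of PHP_substr() deals with negative and out-of-range values. The core index translation is easier to see in isolation, so here is a stripped-down sketch that only handles a start of zero or more and a positive length. The handler name simple_substr is hypothetical and not part of the original script.

```applescript
-- Hypothetical helper, for illustration only.
-- Assumes i_start is zero or positive and i_length is positive.
to simple_substr(s_text, i_start, i_length)
    set i_text_length to length of s_text
    -- PHP counts positions from 0, AppleScript from 1
    set i_first to i_start + 1
    if i_first > i_text_length then return ""
    -- the last position is inclusive in AppleScript, clamp it to the end
    set i_last to i_start + i_length
    if i_last > i_text_length then set i_last to i_text_length
    return text i_first thru i_last of s_text
end simple_substr

simple_substr("abcdef", 1, 3) -- "bcd", like PHP's substr('abcdef', 1, 3)
```

Everything beyond this in PHP_substr() exists to reproduce PHP's handling of negative start and length values.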
And here’s the code I used to test both methods (it requires TextWrangler.app to show the results):
(* testing methods *)
(*
I tested PHP_substr() method with following parameters
Then I checked that against the output of the attached
PHP script (last in this file) with TextWrangler's Search -> Compare Two Front Documents
menu
I didn't check with special characters
*)
set s to "abcdefghijklmnopqrstuvwxyz"
--performTest("0123456", 10)
--performTest(s, 30)
--performTest("", 5)
to performTest(s, i_spectrum)
    set sLog to ""
    repeat with i from -i_spectrum to i_spectrum
        repeat with j from -i_spectrum to i_spectrum
            log "Ai_start: " & i & " Ai_length: " & j
            set sLog to sLog & "Ai_start: " & i & " Ai_length: " & j & return & PHP_substr(s, i, j) & return
        end repeat
        repeat with j from i_spectrum to -i_spectrum by -1
            log "Bi_start: " & i & " Bi_length: " & j
            set sLog to sLog & "Bi_start: " & i & " Bi_length: " & j & return & PHP_substr(s, i, j) & return
        end repeat
    end repeat
    repeat with i from i_spectrum to -i_spectrum by -1
        repeat with j from -i_spectrum to i_spectrum
            log "Ci_start: " & i & " Ci_length: " & j
            set sLog to sLog & "Ci_start: " & i & " Ci_length: " & j & return & PHP_substr(s, i, j) & return
        end repeat
        repeat with j from i_spectrum to -i_spectrum by -1
            log "Di_start: " & i & " Di_length: " & j
            set sLog to sLog & "Di_start: " & i & " Di_length: " & j & return & PHP_substr(s, i, j) & return
        end repeat
    end repeat
    tell application "TextWrangler"
        make new window
        tell window 1
            set selection to sLog
        end tell
    end tell
end performTest
--performTestSubstrID("0123456", 10)
--performTestSubstrID(s, 30)
--performTestSubstrID("", 5)
to performTestSubstrID(s, i_spectrum)
    set sLog to ""
    repeat with i from -i_spectrum to i_spectrum
        repeat with j from -i_spectrum to i_spectrum
            log "Ai_start: " & i & " Ai_length: " & j
            set sLog to sLog & "Ai_start: " & i & " Ai_length: " & j & return & substrid(s, i, j) & return
        end repeat
        repeat with j from i_spectrum to -i_spectrum by -1
            log "Bi_start: " & i & " Bi_length: " & j
            set sLog to sLog & "Bi_start: " & i & " Bi_length: " & j & return & substrid(s, i, j) & return
        end repeat
    end repeat
    repeat with i from i_spectrum to -i_spectrum by -1
        repeat with j from -i_spectrum to i_spectrum
            log "Ci_start: " & i & " Ci_length: " & j
            set sLog to sLog & "Ci_start: " & i & " Ci_length: " & j & return & substrid(s, i, j) & return
        end repeat
        repeat with j from i_spectrum to -i_spectrum by -1
            log "Di_start: " & i & " Di_length: " & j
            set sLog to sLog & "Di_start: " & i & " Di_length: " & j & return & substrid(s, i, j) & return
        end repeat
    end repeat
    tell application "TextWrangler"
        make new window
        tell window 1
            set selection to sLog
        end tell
    end tell
end performTestSubstrID
(*
#!/usr/bin/php
<?php
/* * * *
* * substr output testing
* *
* * I wanted a PHP substr() equivalent in AppleScript.
* * To make sure my AS-script was behaving as exactly
* * as possible, I wrote this script to output the same
* * data structure.
* *
* * the output of the AppleScript performTest() method
* * was compared against the output of this PHP performTest() function
* *
* * there is one difference: php substr() returns false in some cases
* * the AS PHP_substr() method simply returns an empty string in those cases
* * I have no intention of changing that, as I feel it is better this
* * way and trivial in usage
* *
* * version 20090521 (CC) Luke JZ aka SwissalpS
* * * */
$s = 'abcdefghijklmnopqrstuvwxyz';
//performTest('0123456', 10);
//performTest($s, 30);
//performTest('', 5);
function performTest($s, $iSpectrum) {
for ($i = -$iSpectrum; $i < $iSpectrum + 1; $i++) {
for ($j = -$iSpectrum; $j < ($iSpectrum + 1); $j++) {
echo 'Ai_start: ' . $i
. ' Ai_length: ' . $j . '
' . substr($s, $i, $j) . '
';
}
for ($j = $iSpectrum; $j > -($iSpectrum + 1); $j--){
echo 'Bi_start: ' . $i
. ' Bi_length: ' . $j . '
' . substr($s, $i, $j) . '
';
}
}
for ($i = $iSpectrum; $i > -($iSpectrum + 1); $i--){
for ($j = -($iSpectrum); $j < ($iSpectrum + 1); $j++){
echo 'Ci_start: ' . $i
. ' Ci_length: ' . $j . '
' . substr($s, $i, $j) . '
';
}
for ($j = $iSpectrum; $j > -($iSpectrum + 1); $j--){
echo 'Di_start: ' . $i
. ' Di_length: ' . $j . '
' . substr($s, $i, $j) . '
';
}
}
}
/*
$res = substr($s, 22,22);
echo $res;
echo '
';
echo gettype($res);
/* * * *\ substrTest.php 20090521 (CC) Luke JZ aka SwissalpS /* * * */
?>
*)
EDIT: minor edit in substrid() to fix backwards returns
NOTE: this code has not been tested beyond what is described above, and only on my current setup: OS X 10.5.x; AS 2.0.1
Needless to say: use at your own risk…etc.