In another topic I asked you to run a piece of code, which you wouldn’t try. You should, and try opening a folder with a special character. When you’re not using Unicode characters it won’t even open; when you use my example, which supports Unicode characters, the file will open without problems.
Actually I may have worded it wrong. What I meant (without going and finding it in the standard) is that extended characters are not permitted in URLs. That is what I got out of the w3.org standard for URLs.
Now, we both know that when you have folders or files on your disk that you want to use in hyperlinks, this scheme is totally impractical, and kind of fascistic!
Still, I do wonder which RFC acknowledges extended ASCII characters in URLs.
Opening with the open command was never any problem. The problem is really that Safari has declared the file URL scheme, so it handles it internally. That is how I have understood it, so no matter the characters, it won’t open the folder. I have tried with nice characters, characters below 127, and nothing really happened.
Now, to really put it to the test, I’ll encode a file path with your handler, make an anchor tag out of it, and see if that opens. I know the result up front, but just to reach a conclusion!
As for your code, if you go back and read, you’ll see that I indeed tried it.
Opening a URL that needed no encoding actually worked, so I guess the encoding I used has messed it up!
Not only those, Shane, but the SIMBLs too. ASObjC Runner is a nice exception, in that it dies after a minute of idleness.
Whether those osaxen do something or not, there are still pages to be administered, and they eat cycles and battery. That is my opinion, and I think I will stick with it!
Thanks for your code, consider it snagged, that is really the easy way out of it!
Your routine performed equally badly on the folder I had problems with, Bazzie Wazzie. Looking at the properties of that folder, it was shared! When folders have the normal rights, I guess everything will work all right.
The folder I had problems with had the same rights for groups and everybody. I think that is the obstacle, as I see Safari do something, but not enough!
In future I’ll use something other than the old routines from the guidebook!
set chars to "#$%##&/???????!#$%!!!!!!%&**@@**@@@@^^^"
set str to ""
repeat with i from 1 to (get random number from 5 to 9)
set str to str & some character of chars
end repeat
tell application "SystemUIServer"
activate
display dialog str
-- your code goes here...
end tell
You can stop wondering, because when the URL is encoded there are only 7-bit ASCII characters in it. When we decode it, it’s a UTF-8 string again. It’s not only inside Mac OS X: my own web servers, Google, Bing and MacScripter all use UTF-8 encoded URLs, for instance.
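To make the round trip concrete, here is a minimal sketch in Python (not something posted in this thread; the sample string is made up). It assumes the standard library's `urllib.parse`, which percent-encodes the UTF-8 bytes of each character.

```python
from urllib.parse import quote, unquote

# Made-up sample: "\u00c0" is A-grave as a single code point.
original = "\u00c0s pqrs.html"

# quote() turns the text into UTF-8 bytes and writes every byte that is not
# an unreserved character (or "/") as %XX.
encoded = quote(original)

# unquote() reverses the process and decodes the bytes as UTF-8 again.
decoded = unquote(encoded)

print(encoded)            # -> %C3%80s%20pqrs.html
print(encoded.isascii())  # -> True
print(decoded == original)  # -> True
```

The encoded form contains nothing above 7-bit ASCII, and decoding gives back the original text.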
Here’s a vanilla encoding handler which seems to work OK:
on URIEncode from str given |encoding reserveds|:encodingReserveds
set unreservedChars to "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
set reservedChars to ":/?#[]@!$&'()*+,;=%"
set chars to str's characters
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"«data rdat", "»"}
considering case
repeat with i from 1 to (count chars)
set thisChar to item i of chars
if not ((thisChar is in unreservedChars) or ((not encodingReserveds) and (thisChar is in reservedChars))) then
-- This character needs to be encoded.
-- Write it to a temporary file as UTF-8 and read it back as data.
set fref to (open for access file ((path to temporary items as text) & "utf8.txt") with write permission)
try
set eof fref to 0
write thisChar as «class utf8» to fref
set d to (read fref from 1 as data)
on error errMsg
display dialog errMsg buttons {"OK"} default button 1
end try
close access fref
-- Use the deliberate-error hack to get text containing a representation of the data object and extract the hex digits from that.
try
d as text
on error errMsg
set hex to text item 2 of errMsg
end try
-- Make up a list of texts consisting of hex-digit pairs prefaced with "%".
set percentCodes to {}
repeat with j from 1 to (count hex) by 2
set end of percentCodes to "%" & text j thru (j + 1) of hex
end repeat
-- Replace the original character with the list of "%" codes.
set item i of chars to percentCodes
end if
end repeat
end considering
-- Coerce the character list back to text.
set AppleScript's text item delimiters to ""
set encodedURL to chars as text
set AppleScript's text item delimiters to astid
return encodedURL
end URIEncode
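-- Rubbish URL for testing. (The test string originally had the Apple-logo character U+F8FF after "pqrs"; that is the %EF%A3%BF in the result.)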
URIEncode from "http://www.アsp.net/Às pqrs.html" without |encoding reserveds|
--> "http://www.%E3%82%A2sp.net/%C3%80s%20pqrs%EF%A3%BF.html"
-- The use of '... with |encoding reserveds|' should be obvious if you need it.
This should give the correct encoding according to RFC 3986.
I looked into writing the correct encoding routines in AppleScript yesterday. It just seems too hard to do; there are too many bytes floating around. Not worth the effort!
But surely this must count as a vanilla solution?
tell application "Finder"
set a to make new file at folder (path to desktop folder as text) with properties {name:"アsp.net/Às pqrs.html"}
get URL of a
log result
--> %E3%82%A2sp.net:A%CC%80s%20pqrs%EF%A3%BF.html (just the interesting part, without the path to the desktop)
end tell
Edit
Finder’s encoding does in fact deviate from all of the routines found through Google that I cared to investigate! (The reason is that Finder seems to encode the HFS path of the file.)
I have now tested several routines for encoding and decoding URLs, and my conclusion is that this is a very flaky business indeed! It was hard to find two sets that delivered the same results! Having heard a lot about RFC 3986, I guess that Bazzie Wazzie’s PHP routines fit the bill, and I will seek no further.
-- DJ Bazzie Wazzie's encode/decode handlers
on rawURLEncode(str)
return do shell script "/bin/echo -n " & quoted form of str & " | php -r ' echo rawurlencode(fgets(STDIN)); '"
end rawURLEncode
on rawURLDecode(str)
return do shell script "/bin/echo -n " & quoted form of str & " | php -r ' echo rawurldecode(fgets(STDIN)); '"
end rawURLDecode
Well, I only read section 2. My test results seem to conform to that, whereas the A-grave in McUsr’s Finder URL does not. Presumably the Finder’s encoding follows its (or the filing system’s) own private system for accented characters.
Edit: Satimage OSAX’s ‘escapeURL’ command gives the same result as my script with my URL. However, its ‘unescapeURL’ command successfully turns McUsr’s Finder result into a readable form and ‘escapeURL’ turns that result back into the Finder original! It’s obviously too pointlessly complex and boring a subject for a vanilla solution.
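One likely explanation for the difference, offered as an assumption rather than something tested in this thread: HFS+ stores file names in a decomposed Unicode form, so the A-grave becomes a plain "A" followed by a combining grave accent (U+0300) before it is percent-encoded. A small Python sketch using standard NFD/NFC normalization as a stand-in for whatever the filing system does exactly:

```python
import unicodedata
from urllib.parse import quote, unquote

composed = "\u00c0s"                                 # A-grave as one code point
decomposed = unicodedata.normalize("NFD", composed)  # "A" + U+0300 + "s"

# The same visible text gives two different encodings.
print(quote(composed))    # -> %C3%80s    (as in the vanilla handler's result)
print(quote(decomposed))  # -> A%CC%80s   (as in the Finder URL)

# Decoding the Finder-style form and recomposing gives the original back.
recomposed = unicodedata.normalize("NFC", unquote("A%CC%80s"))
print(recomposed == composed)  # -> True
```

If that assumption holds, both encodings are valid percent-encodings of UTF-8; they just start from different normalizations of the same name.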
Oh well, so, we do gamble on that Safari conforms to the RFC, don’t we?
Can of worms. I’ll stick with Bazzie Wazzie’s routines, and if that scheme ever breaks, then I’ll just file a bug with Safari!
I am not in the mood for more testing of this at the moment.
The refusal to open folders with spaces in their names through file URLs, escaped or unescaped, has set me back a little… :o
Here is the new version of my library. As you can see, there are superfluous parameters, kept for backwards compatibility, that you may remove should you choose to use it.
I want to add that when it comes to encoding items on my own disk, I’ll use the URL of the item from Finder. I’ll use this for links that I open locally on my machine, not on a server, until it breaks in Safari, which I doubt it will, as the file we were testing on opened with the open command. I know that doesn’t give any guarantee, though, as the open command can open folders with spaces in their names, but the file at least opened flawlessly in Safari!
-- URL LIB
script URLLib
on isAvalidHtmlFileUrl(theUrl)
local ok, astid
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to ":"
if not text item 1 of theUrl is "file" then
set AppleScript's text item delimiters to astid
return false
end if
set AppleScript's text item delimiters to "."
if not text item -1 of theUrl is "html" then
set AppleScript's text item delimiters to astid
return false
end if
set AppleScript's text item delimiters to astid
return true
end isAvalidHtmlFileUrl
on decodefurl(anUrlFromABrowser)
-- 27/08/12 Tested!
-- converts escaped chars back to normal
-- removes file:// and localhost.
-- localhost, if present, is at the very start.
local tmpUrl
set tmpUrl to my rawURLDecode(anUrlFromABrowser)
set tmpUrl to my privateHandlers's str_replace({substringToReplace:"file://", replacementString:"", OriginalString:tmpUrl})
if (offset of "localhost" in tmpUrl) is 1 then set tmpUrl to text 10 thru -1 of tmpUrl
return tmpUrl
end decodefurl
-- DJ Bazzie Wazzie http://macscripter.net/edit.php?id=154949
on rawURLEncode(str)
return do shell script "/bin/echo -n " & quoted form of str & " | php -r ' echo rawurlencode(fgets(STDIN)); '"
end rawURLEncode
on rawURLDecode(str)
return do shell script "/bin/echo -n " & quoted form of str & " | php -r ' echo rawurldecode(fgets(STDIN)); '"
end rawURLDecode
on filepath_to_URL(this_file, encode_URL_A, encode_URL_B)
set this_file to this_file as text
set AppleScript's text item delimiters to ":"
set the path_segments to every text item of this_file
repeat with i from 1 to the count of the path_segments
set this_segment to item i of the path_segments
set item i of the path_segments to my rawURLEncode(this_segment)
end repeat
set AppleScript's text item delimiters to "/"
set this_file to the path_segments as string
set AppleScript's text item delimiters to ""
return this_file
end filepath_to_URL
--> set b to getIP from "https://127.0.0.1/path/to/file/"
-->"127.0.0.1"
to getIP from anUrl
local a, b
set a to offset of "//" in anUrl
set b to offset of "/" in (text (a + 2) thru -1 of anUrl)
set ipAddr to text (a + 2) thru (a + b) of anUrl
return ipAddr
end getIP
script privateHandlers
on str_replace(R) -- Returns modified string
-- R {substringToReplace: _ent, replacementString: _rent,OriginalString: _str}
local _tids
set _tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to R's substringToReplace
set _res to text items of R's OriginalString as list
set AppleScript's text item delimiters to R's replacementString
set _res to items of _res as string
set AppleScript's text item delimiters to _tids
return _res
end str_replace
end script
end script
Well, it’s maybe nitpicking, but what I mean is that a URL consists of a scheme, authority, path, query and fragment. Queries and paths, for instance, should be encoded differently: I use PHP’s urlencode function to encode the query part of a URL and rawurlencode to encode the path component. A scheme is encoded differently as well. So, like any other URL controller, you should split the URL string into its scheme, authority, path, query and fragment components and encode each separately according to its own rules.
‘Hello World’ should be ‘Hello+World’ in a query component and ‘Hello%20World’ in a path component.
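The same distinction can be seen in Python's standard library, shown here only as an illustration (the thread's own handlers use PHP): `quote` behaves like rawurlencode for a path segment, and `quote_plus` behaves like urlencode for a query value.

```python
from urllib.parse import quote, quote_plus

text = "Hello World"

as_path = quote(text)        # path-style: space becomes %20
as_query = quote_plus(text)  # form/query-style: space becomes +

print(as_path)   # -> Hello%20World
print(as_query)  # -> Hello+World
```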
This topic started as URL encoding of string data, but now we’re trying to create a complete solution, so that’s why I’m getting picky.
The Dark Lord of Warwickshire
Helps them all out of their quagmire
In space that is quite economical.
The best I've seen
Solutions fast and clean
He helps them out when need is dire
I guess it holds for encoding the path component of URLs, then, and that is really all I care about!
I read the W3.org standard during the weekend, and I am not touching any RFC at the moment.
Yes, I have been wading through it today as well, to find the clause about regular characters, but maybe it was just a misconception on my part.
I find PHP superior to Perl, in that not so many people tinker with it. I am far more pessimistic when it comes to Perl, having had some encoding experiences with it.
It’s not that hard to encode a non-encoded URL, but you have to choose the proper language to do it in. Apple keeps telling Objective-C developers who use NSAppleScript objects not to use URLs because there is no support, and I think that applies to AppleScripters as well. Python and Ruby are good programming languages with support for URLs and will therefore save you writing hundreds of lines of code in AS yourself. To encode an HTTP URL, for instance, I choose PHP. Since 10.6 this code can’t run (because of the http_build_url function), as Mac OS X (including Server) no longer includes PHP’s pecl extension by default.
set theURL to "http://www.mywebserver.com/path/to/text script.php?string=größe maße&page=2#overview"
do shell script "/bin/echo -n " & quoted form of theURL & " | php -r '$c=parse_url(trim(fgets(STDIN)));
parse_str($c[\"query\"], $c[\"query\"]);
$c[\"query\"] = http_build_query($c[\"query\"]);
$c[\"path\"] = implode(\"/\", array_map(\"rawurlencode\", explode(\"/\", $c[\"path\"])));
echo http_build_url($c);'"
For people who don’t have pecl installed I have the following code (made quickly):
set theURL to "http://djbw:password@www.mywebserver.com/path/to/text script.php?string=größe maße&page=2#overview"
--first parse the url with help from PHP
set rawUrlComponents to do shell script "/bin/echo -n " & quoted form of theURL & " | php -r '$c=parse_url(trim(fgets(STDIN)));
parse_str($c[\"query\"], $c[\"query\"]);
$c[\"query\"] = http_build_query($c[\"query\"]);
$c[\"path\"] = implode(\"/\", array_map(\"rawurlencode\", explode(\"/\", $c[\"path\"])));
foreach($c as $key => $value){printf(\"%s=%s\\n\", $key, $value);}'"
--now build the URL string again according to section 5.3 of RFC 3986
set urlString to {"", "", "", "", "", "", "", "", "", "", "", "", ""}
repeat with cmp in paragraphs of rawUrlComponents
if cmp begins with "scheme=" then
set item 1 of urlString to text 8 thru -1 of cmp
set item 2 of urlString to ":"
else if cmp begins with "host=" then
set item 8 of urlString to text 6 thru -1 of cmp
set item 3 of urlString to "//"
else if cmp begins with "user=" then
set item 4 of urlString to text 6 thru -1 of cmp
set item 7 of urlString to "@"
else if cmp begins with "pass=" then
set item 6 of urlString to text 6 thru -1 of cmp
set item 5 of urlString to ":"
else if cmp begins with "path=" then
set item 9 of urlString to text 6 thru -1 of cmp
else if cmp begins with "query=" then
set item 11 of urlString to text 7 thru -1 of cmp
set item 10 of urlString to "?"
else if cmp begins with "fragment=" then
set item -1 of urlString to text 10 thru -1 of cmp
set item -2 of urlString to "#"
end if
end repeat
return urlString as string
Also, I don’t encode the authority, because normally you won’t accept a username, password or hostname with special characters; there is too much software worldwide that doesn’t support that. For example, IE’s built-in FTP client is buggy when reserved characters are used in the authority part.
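For comparison, the split-encode-rebuild approach can be sketched with Python's standard library alone. This is an illustration, not a port of the PHP above: `encode_http_url` is a hypothetical name, it assumes the input URL is not yet encoded (an existing "%" would be encoded again), and it leaves the authority untouched, as described.

```python
from urllib.parse import urlsplit, urlunsplit, quote, parse_qsl, urlencode

def encode_http_url(url):
    """Encode each component of a not-yet-encoded HTTP URL by its own rules."""
    parts = urlsplit(url)
    # Path: percent-encode each segment but keep the "/" separators.
    path = quote(parts.path, safe="/")
    # Query: split into key/value pairs, then re-encode form-style (space -> +).
    query = urlencode(parse_qsl(parts.query, keep_blank_values=True))
    # Fragment: plain percent-encoding.
    fragment = quote(parts.fragment)
    # Scheme and authority (netloc) are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))

the_url = ("http://djbw:password@www.mywebserver.com/path/to/"
           "text script.php?string=gr\u00f6\u00dfe ma\u00dfe&page=2#overview")
print(encode_http_url(the_url))
```

With the test URL from the post, the path comes out with `%20` for the space and the query with `+`, matching the path/query distinction made earlier.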
While you are at it (and PHP seems like the ideal language for doing this): how would you decode a POST/GET from an HTML form with AppleScript, if AppleScript were to be the recipient?
And how would you encode an AppleScript, if it were to be put on a page with the applescript:// protocol?
The pleasure’s mine; it’s something I have been wrestling with through the years. I needed all this information to write a proper and universal SOAP and XML-RPC server, and therefore I had to dig into HTTP and all the related documentation that’s needed to write a proper server. All the effort resulted in a general XML-RPC/SOAP/JSON/JSON-RPC server that can be used from every programming language I’ve worked with so far, including AppleScript.
First of all, POST and GET (GET = URL query), but also PUT and DELETE (for CRUD and REST), are HTTP matters and aren’t really URL-related. POST and GET are almost the same, except that the query is in the URL when using GET and in the HTTP body (not the HTML body) when using POST.
But you’re totally right, because when it comes to general URL encoding, the query is encoded differently between protocols. In the example code in my previous post (encoding an HTTP URL) I used a different way of encoding the query from the one you need for AppleScript. For AppleScript the URL follows RFC 3986 completely, while HTTP doesn’t.
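On the GET/POST point, a rough sketch of the decoding side in Python (an illustration only; the URL and form fields are made up, and the POST body is assumed to be the usual application/x-www-form-urlencoded form submission). The same parser handles both, because only the location of the query string differs.

```python
from urllib.parse import urlsplit, parse_qs

# GET: the query travels in the URL (hypothetical example URL).
get_url = "http://www.mywebserver.com/script.php?string=gr%C3%B6%C3%9Fe+ma%C3%9Fe&page=2"
# POST: the same query string travels in the HTTP body instead.
post_body = "string=gr%C3%B6%C3%9Fe+ma%C3%9Fe&page=2"

from_get = parse_qs(urlsplit(get_url).query)
from_post = parse_qs(post_body)

# Each value is a list, because a form field may occur more than once.
print(from_get["string"][0])
print(from_get == from_post)  # -> True
```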
An applescript can be encoded like this:
set theScript to "tell application \"Finder\"
display dialog \"Hello, i'm created with an URL.\"
end tell"
set encodedScript to rawurlencode(theScript)
open location "applescript://com.apple.scripteditor?action=new&script=" & encodedScript
on rawurlencode(str)
return do shell script "/bin/echo -n " & quoted form of str & " | php -r ' while($line = fgets(STDIN)){echo rawurlencode($line);} '"
end rawurlencode