Convert to lower case | error message

First, let me say that I have no trouble using third-party tools, and that I started out by just showing a quick and dirty alternative, to which DJ Bazzie Wazzie had a more elegant solution.

Now, the problem with Python, at least in the past, was broken installations, so that is the first and foremost reason I wouldn’t use it. The second thing about Python is that it takes more typing and is therefore more error-prone. So, frankly, I think the tr command is a better alternative here than Python. :slight_smile:

The problem with tr is that we must define the list of extended characters to translate in the command itself.
When I receive documents, I don’t know which extended characters are embedded, so the tr scheme is perfectly useless to me. I will not build a list of every possible extended character.
I know that it’s not a problem with texts written in English, but English does not rule the world.
So for a non-English user, only a scheme that handles the Unicode encoding(s) is relevant.
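
To make the difference concrete, here is a minimal Python sketch. It is not one of the handlers posted in this thread; the function names and the sample table are made up for illustration. A tr-style table converts only the characters it lists, while a Unicode-aware call handles whatever arrives:

```python
# Hypothetical illustration: table-driven vs. Unicode-aware lowercasing.

# A tr-style table: only the characters listed here get converted.
# (The table contents are an assumption, chosen for the example.)
TABLE = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ",
    "abcdefghijklmnopqrstuvwxyzäöü",
)

def lower_with_table(text):
    """Lowercase using the fixed table; unlisted characters pass through."""
    return text.translate(TABLE)

def lower_unicode(text):
    """Lowercase using Python's built-in Unicode case mapping."""
    return text.lower()

if __name__ == "__main__":
    for word in ("GRÖSSE", "ÉTÉ", "ΑΘΗΝΑ"):
        print(word, lower_with_table(word), lower_unicode(word))
```

With this table the German word converts correctly, but the French "É" and the Greek letters pass through unchanged, because nobody listed them.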

I have never used PHP. In fact, I try to stay away from Unix tools unless no other scheme is available.

If someone is kind enough to post the PHP convert-case incantations, I will be glad to try them.

For me there is no problem with Python’s convert-case commands.
A tested handler is available in a library file, from which I extract it when needed.
I posted it here some days ago.

KOENIG Yvan (VALLAURIS, France) Sunday, 1 September 2013 16:01:00

In PHP you could use something like:

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"

Python is about 2 times faster than PHP.

I’m not trying to say that I’m right, just explaining why I’m using tr, not why someone else should. In my language we only have a few more characters than English, so the list is very small. I understand why someone else would use third-party software or Python. The reason I don’t use third-party software is that I need to write scripts for hundreds of Macs running all kinds of different OS versions. All my scripts need to run on an out-of-the-box Mac.

My problem is that I receive files sent by users from numerous countries, so I must have tools able to handle all of them.

When I write a script for an asker, I use the Python handler.
When I write for my own use, I use the ASObjC Runner handler, because this application is always opened during the boot process on my Mac.

Thanks for the PHP code.

KOENIG Yvan (VALLAURIS, France) Sunday, 1 September 2013 16:45:21

Searching on the Net I found the other instructions to convert case with PHP.
So I wrote :

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
quoted form of result
"php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo mb_ucfirst(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo mb_ucwords(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result

The conversion to lower works well but the two others fail.

tell current application
do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
→ "GRÖßE"
do shell script "php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<'GRÖßE'"
→ "größe"
do shell script "php -r 'echo mb_ucfirst(fgets(STDIN), \"UTF-8\");' <<<'größe'"
→ error "
Fatal error: Call to undefined function mb_ucfirst() in Command line code on line 1" number 255

If I don’t prefix the command name with mb_ I get :

tell current application
do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
→ "GRÖßE"
do shell script "php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<'GRÖßE'"
→ "größe"
do shell script "php -r 'echo ucfirst(fgets(STDIN), \"UTF-8\");' <<<'größe'"
→ "
Warning: ucfirst() expects exactly 1 parameter, 2 given in Command line code on line 1"
do shell script "php -r 'echo ucwords(fgets(STDIN), \"UTF-8\");' <<<'
Warning: ucfirst() expects exactly 1 parameter, 2 given in Command line code on line 1'"
→ "
Warning: ucwords() expects exactly 1 parameter, 2 given in Command line code on line 1"
end tell

Puzzling.

KOENIG Yvan (VALLAURIS, France) Sunday, 1 September 2013 17:17:16

Perhaps your list is small, but that’s exactly the kind of logic that so many programmers used to avoid offering proper Unicode support for a long time. I’m used to that sort of thing justified by English-speakers, but it troubles me more to hear it from others who have seen the problems it can cause.

You seem adamant about this, so I guess you never have to script non-Apple applications. Nonetheless, it seems a bit odd that in your quest to support all Macs, you’re happy to use language-specific hacks that, by definition, won’t run properly on lots of out-of-the-box Macs.

That’s the whole point of my post: you don’t have to convince me that you are right. For me (and my clients), the tr solution can be perfect when I’m programming in AppleScript. For anyone who needs a global solution and is not restricted to plain AppleScript, there are many solutions out there. If it’s any consolation, in C I use the standard Unicode functions. :cool:

To me what is good is not starting too many processes. So, that’s always the objective.

For my part, I like to learn, so at this point my goal is to find out what I am doing wrong when trying to convert only the first character of a sentence, or the first character of every word, using PHP.

Also, it would be nice if one day Unix became Unicode-aware.

KOENIG Yvan (VALLAURIS, France) Monday, 2 September 2013 15:39:48

Aren’t you looking for this?

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
quoted form of result
"php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo ucfirst(fgets(STDIN));' <<<" & result
do shell script result
quoted form of result
"php -r 'echo ucwords(fgets(STDIN));' <<<" & result
do shell script result

If you want multibyte support, I suggest using the multibyte substring command (mb_substr) to get the first character, uppercasing that one-character Unicode string, and concatenating it with the rest of the original string. Unfortunately, PHP doesn’t have a built-in solution today. It’s a bit cumbersome in PHP, but it is the only method to uppercase words that begin with extended characters, like the Dutch word for 1 (één).
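
The same recipe (take the first character, uppercase it, concatenate the rest) can be sketched in Python, where strings are indexed by Unicode code point rather than by byte. This is only an illustration of the technique, not the PHP code itself; the names ucfirst and ucwords are borrowed from PHP for readability and are defined here, not built into Python:

```python
def ucfirst(text):
    """Uppercase only the first character; leave the rest untouched."""
    # Slicing instead of indexing keeps this safe for an empty string.
    return text[:1].upper() + text[1:]

def ucwords(text):
    """Apply ucfirst to every space-separated word."""
    return " ".join(ucfirst(word) for word in text.split(" "))

if __name__ == "__main__":
    print(ucfirst("één"))        # first letter becomes É
    print(ucwords("één größe"))  # each word gets an uppercase first letter
```

Splitting on a single space is a simplifying assumption; real text with tabs or other separators would need a more careful word definition.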

Thanks.

Now I understand what was wrong in my attempts and I know that PHP isn’t the tool which I will use.

KOENIG Yvan (VALLAURIS, France) Tuesday, 3 September 2013 11:05:02

Here’s one I wrote a while back. Still gets the magic done.

--Upper / Lower 1.0.3
--Panah Neshati

on uppercase(a)
	set b to ASCII number of a
	set b to b - 32
	set b to ASCII character b
	return b
end uppercase

on lowercase(a)
	set b to ASCII number of a
	set b to b + 32
	set b to ASCII character b
	return b
end lowercase

lowercase("A")

uppercase("a")

Sorry to hear yours is broken, that sucks. Hope mine works for you. By the way, it uses no Python or PHP.

Panah

Model: MacBook Pro
AppleScript: 2.2.4
Browser: Firefox 23.0
Operating System: Mac OS X (10.8)

Yours doesn’t support multibyte characters, which is what this topic is about. You should also use id instead of ASCII number and ASCII character, since AppleScript is no longer MacRoman-encoded (as of AppleScript 2.0). PHP, Python, Satimage and ASObjC Runner support case conversion of all Unicode characters. The tr command supports Unicode characters too, but unlike the others you have to define the conversion table manually; on the other hand, it is a lot faster than the other shell commands.
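
To show why a fixed offset of 32 is fragile, here is a small Python sketch. Python’s ord and chr work on Unicode code points, roughly as AppleScript’s id does; the function names are hypothetical and this is not a translation of any handler above, only an illustration of the idea:

```python
def upper_unchecked(ch):
    """Offset-only approach: subtract 32 with no range check."""
    return chr(ord(ch) - 32)

def upper_checked(ch):
    """Use the offset only for a-z; otherwise defer to Unicode case mapping."""
    if "a" <= ch <= "z":
        return chr(ord(ch) - 32)
    return ch.upper()

if __name__ == "__main__":
    print(upper_unchecked("a"))  # fine: "A"
    print(upper_unchecked("A"))  # already uppercase, becomes "!"
    print(upper_checked("A"))    # stays "A"
    print(upper_checked("ö"))    # handled by the Unicode mapping: "Ö"
```

The unchecked version is only correct when the input is a single lowercase ASCII letter; anything else silently produces the wrong character.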