Pages: Mark (or delete) paragraphs that have a certain language?

Is it possible to mark (or delete) paragraphs that have a certain language (e.g. German or English), as automatically detected by Pages?

(The purpose is to count all the text that is not in English in a document containing both German and English.)

Each paragraph is written entirely in one language.

Hi,

This marks English paragraphs in blue. A paragraph counts as English when every one of its characters is in the list below; to mark German paragraphs instead, use a German alphabet:


set englishAlphabet to {" ", "!", "\"", "#", "$", "%", "&", "'", "(", ")", "*", "+", ",", "-", ".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";", "<", "=", ">", "?", "@", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "[", "\\", "]", "^", "_", "`", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "{", "|", "}", "~", ""}

tell application "Pages" to tell document 1
	repeat with textItem in (get text items)
		tell textItem
			set theParagraphs to paragraphs of object text
			repeat with i from 1 to count theParagraphs
				set theParagraph to item i of theParagraphs
				set isEnglish to true
				repeat with char in (get characters of theParagraph)
					if not (contents of char is in englishAlphabet) then
						set isEnglish to false
						exit repeat
					end if
				end repeat
				-- MARK English text with BLUE color:
				if isEnglish then set color of paragraph i of object text to {0, 0, 65535}
			end repeat
		end tell
	end repeat
end tell

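To delete the paragraphs directly, use this instead of setting the color. When deleting, run the loop backwards (from count theParagraphs to 1 by -1), otherwise each deletion shifts the indices of the paragraphs that are still to be checked: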


if isEnglish then delete paragraph i of object text
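
A minimal sketch of the delete variant with the backwards loop, in case it helps. It is not the script above verbatim: the handler name isPlainASCII is made up for this example, and it swaps the literal alphabet list for a code point range check (printable ASCII 32 to 126, plus tab and line breaks, on the assumption that a paragraph string may carry its line ending).

```applescript
-- Hypothetical helper: true when every character of theText is printable ASCII
-- (code points 32 to 126), also allowing tab (9), linefeed (10) and return (13).
on isPlainASCII(theText)
	repeat with codePointRef in ((id of theText) as list)
		set codePoint to contents of codePointRef
		if codePoint > 126 then return false
		if codePoint < 32 and codePoint is not in {9, 10, 13} then return false
	end repeat
	return true
end isPlainASCII

tell application "Pages" to tell document 1
	repeat with textItem in (get text items)
		tell textItem
			set theParagraphs to paragraphs of object text
			-- Walk from the last paragraph to the first so that a deletion
			-- never shifts a paragraph that is still to be checked.
			repeat with i from (count theParagraphs) to 1 by -1
				if my isPlainASCII(item i of theParagraphs) then delete paragraph i of object text
			end repeat
		end tell
	end repeat
end tell
```

Like the alphabet list, this only looks at characters, so a German paragraph without umlauts or ß would also be treated as English.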

This should delete all the English paragraphs:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

tell application id "com.apple.iWork.Pages" -- Pages
	tell document 1
		set theParagraphs to paragraphs of body text
		repeat with i from (count of theParagraphs) to 1 by -1
			if length of item i of theParagraphs > 1 then
				set theLanguage to (my guessLanguageOf:(item i of theParagraphs))
				-- will be "en", "de", etc
				if theLanguage = "en" then
					delete paragraph i of body text
				end if
			end if
		end repeat
	end tell
end tell

on guessLanguageOf:theString
	set theTagger to current application's NSLinguisticTagger's alloc()'s initWithTagSchemes:{current application's NSLinguisticTagSchemeLanguage} options:0
	theTagger's setString:theString
	set languageID to theTagger's tagAtIndex:0 |scheme|:(current application's NSLinguisticTagSchemeLanguage) tokenRange:(missing value) sentenceRange:(missing value)
	return languageID as text
end guessLanguageOf:
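
Since the stated goal is to count the non-English text rather than remove it, the same detection could be used without changing the document. The following is only a sketch under that assumption: countNonEnglishWords is a hypothetical helper name, "words" are whatever AppleScript's own word splitting returns, and guessLanguageOf: is copied from the script above.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- Read the paragraphs of the front Pages document as a list of strings.
tell application id "com.apple.iWork.Pages"
	set theParagraphs to paragraphs of body text of document 1
end tell

display dialog "Words outside English paragraphs: " & (my countNonEnglishWords(theParagraphs))

-- Hypothetical helper: sums the words of every paragraph not detected as English.
on countNonEnglishWords(paragraphList)
	set wordTotal to 0
	repeat with aParagraph in paragraphList
		set paragraphText to contents of aParagraph
		-- Skip empty paragraphs, as the script above does.
		if length of paragraphText > 1 then
			if (my guessLanguageOf:paragraphText) is not "en" then
				set wordTotal to wordTotal + (count words of paragraphText)
			end if
		end if
	end repeat
	return wordTotal
end countNonEnglishWords

-- Same handler as in the script above.
on guessLanguageOf:theString
	set theTagger to current application's NSLinguisticTagger's alloc()'s initWithTagSchemes:{current application's NSLinguisticTagSchemeLanguage} options:0
	theTagger's setString:theString
	set languageID to theTagger's tagAtIndex:0 |scheme|:(current application's NSLinguisticTagSchemeLanguage) tokenRange:(missing value) sentenceRange:(missing value)
	return languageID as text
end guessLanguageOf:
```

Note that any paragraph the tagger cannot identify is counted as non-English here, so very short paragraphs may need a manual check.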

This AppleScript solution uses the trans command from translate-shell, which can be installed from Terminal.app using Homebrew or MacPorts.

I use Homebrew, so this is the command I used to install the translate-shell formula:
[format]brew install translate-shell[/format]

Once the translate-shell formula is installed, the trans command offers many options. Paste either of these at a macOS Terminal prompt to learn about its features:
[format]man trans[/format] OR [format]trans -h[/format]

This page explains how to run Translate Shell from an AppleScript:
https://github.com/soimort/translate-shell/wiki/AppleScript

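tell application "Pages" to tell document 1
	set theParagraphs to paragraphs of body text
	-- Walk backwards so that deleting a paragraph does not shift the ones still to be checked.
	repeat with thisParagraph from (count of theParagraphs) to 1 by -1
		-- Ask trans to identify the language and keep only the value of the "Name" line.
		-- /usr/local/bin is the Homebrew prefix on Intel Macs; on Apple silicon it is /opt/homebrew/bin, so adjust both paths if needed.
		tell current application to set detectedLanguage to (do shell script "export PATH=\"/usr/local/bin:$PATH\";/usr/local/bin/trans -id -no-ansi :en " & quoted form of item thisParagraph of theParagraphs & " | grep Name | cut -c 23- | tr -d ' '")
		if detectedLanguage = "English" then
			delete paragraph thisParagraph of body text
		end if
	end repeat
end tell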
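
To check that trans is installed and reachable before running this against a document, the same pipeline can be tried on a literal string in Script Editor. This is just a sketch: the sample sentence is made up, the command is copied unchanged from the script above, and it needs a network connection because trans queries an online service.

```applescript
-- Sample sentence; replace it with any text to try the detection.
set sampleText to "Guten Morgen, wie geht es Ihnen heute?"

-- Same pipeline as in the script above, applied to the sample instead of a Pages paragraph.
-- On Apple silicon the Homebrew prefix is /opt/homebrew/bin, so adjust both paths if needed.
set detectedLanguage to do shell script "export PATH=\"/usr/local/bin:$PATH\";/usr/local/bin/trans -id -no-ansi :en " & quoted form of sampleText & " | grep Name | cut -c 23- | tr -d ' '"

display dialog "Detected language: " & detectedLanguage
```

If the dialog shows an empty result, the pipeline found no "Name" line, which usually means trans could not reach the service or printed its output in a different layout.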