MS Word: Selecting formatted text from a range

I’m splitting a document of documents by selecting text between bookmarks.

My code is working fine to select and grab plain text. But, I’d also like to have the option of retaining the formatting of the selected text.

I went through the Word dictionary to see the properties of the Selection object, and found what looks like the right option, formatted text, which I use in this code:

		set myRange to ¬
			(create range active document start (start of bookmark of bookmark beginRng of active document) ¬
				end (end of bookmark of bookmark endRng of active document))
		select myRange
		set docText to formatted text of text object of selection

This runs without errors, but…I’m unable to insert the contents of docText into a new document.
Looking in Script Debugger, it seems that docText becomes a very complex object when using this option, and I don’t know how to use it.

I guess the easy way out is to use System Events cut & paste to accomplish what I want. But, if anyone knows the secret to doing this within Word, please let me know.

Thanks

You should include the part of your script that doesn’t work as well.

Here is how you could take some formatted text and put it into a new document. Basically, you create a text range in the destination document, the same way you created one in the source document, and then set its formatted text to that taken from the source document.

tell application "Microsoft Word"
	set xd to document "onth.docx"
	tell xd
		-- specify text range in source document
		set srcRge to create range start (start of bookmark of bookmark "bm1") end (end of bookmark of bookmark "bm2")
		set fmtCont to formatted text of content of srcRge
	end tell
	
	set dn to create new document
	activate object dn -- the new document is now the active document
	
	tell dn
		-- specify text range in destination document
		set destRge to create range dn start 0 end 0
		set formatted text of destRge to fmtCont
	end tell
	
end tell

A couple of notes:

  • active document refers to whichever document is considered active by the script. When you are performing actions that involve multiple documents, it can be helpful to explicitly specify which document you are referencing. In the above script, you could probably put everything inside a tell active document block but you have to be clear as to which document is actually active.
  • In the above snippet, I did set a new active document, but this is just shown as an example in case you want to edit the code. It’s not required.
  • Word has the ability to work directly with text so you don’t need to rely on the selection for ranges of text that you’ve already defined.
  • By the way, when you select and copy text to the clipboard in Word, you are putting both the unformatted and formatted text onto the clipboard, the same as when you copy using Command-C.
  • Part of what makes anything related to the text ranges or formatted text complex is that these objects/properties are made up of complex bits. So when you look at the properties, instead of seeing a simple text string, you only see a reference to a blob of data. What’s in those blobs is typically just the formatted text (or something like that) which can’t be displayed the same way as plain text.
  • Finally, remember that bookmarks can encompass a range of text, not just a beginning or end of a range.

Was trying to treat it like plain text, using

insert text bodyTxt at end of text object

Thank you, thank you, thank you. It would have taken a ton of trial-and-error to figure this out.

Unfortunately, I’m having trouble setting a range at the end of my existing document text content (which simply consists of an underlined title).

I tried creating the range, and setting the formatted text to the range like this:

		tell activeDoc
			set destRge to create range activeDoc start (end of content) end (end of content)
			set formatted text of destRge to bodyTxt
		end tell

But the formatted text is inserted BEFORE my existing text.

Any ideas?

That is a quirky issue. I made several attempts and they all had the same result that you’re seeing.

I did find a couple of approaches that you can take though. They assume that you have typed ‘return’ or ‘enter’ after the title line. The first part of the script remains the same. I gave the document a name but you can use other methods to refer to the correct document.

	set dn to document "follow.docx"
	activate object dn -- the new document is now the active document
	tell dn
		set tob to text object of dn 
		set ctob to content of tob -- the text of the document
		set ltob to (length of paragraph 1 of ctob) + 1 -- beginning of paragraph 2

		set destRge to create range dn start ltob end ltob
		set formatted text of destRge to fmtCont
	end tell

You can also work directly with the paragraph’s text object, like so†:

	set dn to document "follow.docx"
	activate object dn -- the new document is now the active document
	tell dn	
		insert paragraph at end of text object of dn
		set destRge to text object of last paragraph of dn
		set alignment of paragraph format of destRge to align paragraph left
		set formatted text of destRge to fmtCont
		insert paragraph at end of destRge
	end tell

† Above updated to incorporate paragraph additions; speculation on bookmarks removed based on the comment below.

In case you don’t have a return/linefeed at the end of your title line, then you can add one. There are probably many ways to do so but here is one. Generally, you should have an empty paragraph at the end of your document so add two paragraphs. Updated to include insert text and insert paragraph methods.

make new paragraph at end of text object of dn
make new paragraph at end of text object of dn

    –or–

insert text linefeed at end of text object of dn
set alignment of paragraph format of last paragraph of dn to align paragraph left

    –or–

insert paragraph at end of text object of dn
set alignment of paragraph format of last paragraph of dn to align paragraph left

Another quirk of Word (and I think this one goes back to the dawn of time): the formatting at the end of the document sometimes affects new paragraphs. If your title line is centred and is the only line, then you may find that the last paragraph of any added text becomes centred, even if it was not centred in the source. To deal with this in the script, add a line that sets the alignment just before you set the formatted text.

set destRge to create range dn start ltob end ltob
set alignment of paragraph format of destRge to align paragraph left
set formatted text of destRge to fmtCont

    –or–

set destRge to text object of paragraph 2 of dn
set alignment of paragraph format of destRge to align paragraph left
set formatted text of destRge to fmtCont

You can also do this to insert at the end of a document:

tell active document --or whatever
	set destRge to get story range story type main text story
	set destRge to collapse range destRge direction collapse end
	set formatted text of destRge to fmtCont
end tell
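
For reference, here is a minimal end-to-end sketch that combines the extraction step from earlier in the thread with the collapse-to-end insertion above. It only uses commands already shown in this thread. The document name "source.docx" and the bookmark names "bm1" and "bm2" are placeholders, so treat it as a sketch under those assumptions rather than a tested script.

```applescript
tell application "Microsoft Word"
	-- Placeholder names: substitute your own document and bookmarks
	set srcDoc to document "source.docx"
	tell srcDoc
		-- Range running from the start of the first bookmark to the end of the second
		set srcRge to create range start (start of bookmark of bookmark "bm1") end (end of bookmark of bookmark "bm2")
		-- Same form as used earlier in the thread to grab the formatted text
		set fmtCont to formatted text of content of srcRge
	end tell
	
	set destDoc to create new document
	tell destDoc
		-- Take the whole main text story, then collapse it to its end point
		set destRge to get story range story type main text story
		set destRge to collapse range destRge direction collapse end
		-- Assign the formatted text to the collapsed (empty) range at the end
		set formatted text of destRge to fmtCont
	end tell
end tell
```

Because destRge is collapsed to the end of the main text story, the new content should land after whatever the destination already contains (nothing, in the case of a new document).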

Both Mockman’s and roosterboy’s solutions worked: formatted text was correctly placed AFTER existing text.

But, both had two side-effects:

  1. The original formatting of the existing text got corrupted by the inserted formatted text. What’s weird is that the formatting is so corrupted that I can’t even fix it manually. For example, I tried re-centering the title text, and it centered a quarter of the way into the document, as if the margin guides had been moved.

  2. the footers in the document template got blown away for the length of the inserted formatted text

🙄

So, I’m now thinking I’d be better off not using a template, but creating a raw doc to insert formatted text into. And then try to add whatever components I need to complete the doc.

Maybe this has more to do with the type of formatted text in the documents I have to work with than with any particular issue in Word or AppleScript. I don’t know.

Forgot to mention that I had tried this before you suggested it and had the same result of formatted text inserting BEFORE existing. So bookmarks are a no-go.

Well, I’m glad that you at least have some methods to work with. FWIW, I think that the @roosterboy method, using collapse range, is probably the best.

Just remember that in Word, the formatting seems to hinge on the end of the paragraph. This means that anything added after that point will often take on some of the earlier formatting. This also applies to the end of the document, so try to have an empty paragraph at the end of everything before you begin. You can always delete it once you’ve completed the document if it gets in the way.

If you can sanitize a copy of the template, you could then probably zip it and upload it here. I’d be willing to take a look at the formatting to see if there is a way to deal with it.

Here you go.

AppleSourceWordTemplate.zip (15.9 KB)

I re-did my script to insert formatted text into new raw doc.

Then I add a header for title, and footer for page #.

Seems to be working ok, except for alignment of the page # in footer (which is a question for another thread).

But, since this particular script is just to create proof copies, it’s not a big deal. The real docs will be made using plain text via the template (which so far is working just fine).

Thanks for everyone’s help here.

To answer my own question…

Was going to ask how to select plain text from a range. Looked thru the Word 2004 AS Reference, tried a few options, and found this to work:

set unfmtCont to content of myRange

Not sure if it’s specific enough, but the other options I tried returned nothing.

So, I don’t get quite the same results that you do but the formatting does change, especially the font. This is the issue around the end of paragraph/document. Whatever other ‘rules’ Word has, it seems that the most powerful one is that the end of document/paragraph ‘character’ sets some of the formatting.

I think that the most effective way to deal with this would be by modifying your template. Add an empty paragraph that’s got the basic formatting that you want for most paragraphs. Then when you add something at the end of the document, it will default to that. And actually, to take the simplest approach, I’d begin with a blank file and add a few empty paragraphs before returning to the top to format the heading. Then save this as your template.

Also, you can add some formatting code in your script as well, in case you did want to make specific changes.

Oh, and I think that using content of myRange is the right approach.
