Automating HTML page creation question

Hi all, I am a relative AS novice, so excuse me if I blunder a bit. I am trying to use AppleScript to take one large HTML page and split it into multiple HTML pages.

Here are the contents of the initial, larger HTML page:


<!--start 232-->

<p>copy for recipe goes here.</p>

<!--end 232-->
<!--start 121-->

<p>copy for recipe goes here.</p>

<!--end 121-->
<!--start 444-->

What I want to do is use AppleScript to select the contents of each recipe (each one starts and ends with a comment), and then create, save, and close a new document containing the selected content.

Here is the AS I have so far.


tell application "BBEdit"
	activate
	-- find ??? Can I use grep here?
	-- Here is the chunk of grep I want to use:
	-- <!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->
	extended copy selection
	
end tell

Thanks for your help.

Reid

BBEdit 7.0.2
Standard Install of Jaguar (no scripting additions)

Assuming that your grep is correct, here’s the BBEdit code that works for me on a similar task. Maybe it will work for you.

tell application "BBEdit"
	find "\r<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]--> \r" searching in text 1 of window yourTargetWindow ¬
		options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:true, extend selection:false} ¬
		with selecting match
end tell

You’ll need to replace ‘yourTargetWindow’ with a reference to the BBEdit window that you want the script to act on. BBEdit is also recordable, so if you run into problems while debugging, it might help to record the action and compare the recorded script with yours.

Cool! I am using cut instead of copy. This way I won’t have to loop through the document.

My next question: how can I grab the number from the HTML to use as the file name when I save each document? I am assuming that I can’t use anything captured by my grep pattern as a variable?


tell application "BBEdit"
	activate
	find "<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match
	cut selection
	make new text window
	paste
	save text window 1 to file "Macintosh HD:Users:reid:Desktop:232.htm"
	set current glossary set to "HTML Glossary.html"
end tell

Do the start and end comments end up in the new document?

If so, it shouldn’t be too difficult to do another search in the new page to pull the number out.
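As a rough sketch of pulling the number out without a second grep search, plain AppleScript text item delimiters may be enough. The handler name recipeNumber is made up for illustration, and it assumes the text contains a comment of the form "<!--start NNN-->":

```applescript
-- Hypothetical helper: returns the text between "<!--start " and the next "-->".
on recipeNumber(sourceText)
	set savedDelims to AppleScript's text item delimiters
	try
		-- Everything after the first "<!--start "
		set AppleScript's text item delimiters to "<!--start "
		set afterStart to text item 2 of sourceText
		-- Everything before the closing "-->"
		set AppleScript's text item delimiters to "-->"
		set numberText to text item 1 of afterStart
	on error errMsg number errNum
		-- Restore the delimiters before passing the error along
		set AppleScript's text item delimiters to savedDelims
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to savedDelims
	return numberText
end recipeNumber

recipeNumber("<!--start 232-->") -- "232"
```

Since it only looks at the first start comment, it should also work when handed the whole cut recipe text rather than just the comment.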

Yes, the comments do end up in the new document. But, I need to name the file before then.

Can I grab the variable and assign it from within the BBEdit tell?

In grep, you store a reference by capturing with ( ) and then using \1. Can’t this be done from within AS?

Rob,

I think I am getting closer. The last statement, which uses the variable pageNum to save the new document, isn’t working. Is it a syntax problem?


tell application "BBEdit"
	activate
	
	find "[0-9][0-9][0-9]" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match
	set pageNum to text window 1
	
	find "\r<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match
	cut selection
	make new text window
	paste
	
	
	save text window 1 to file "Macintosh HD:Users:reid:Desktop:" & pageNum & ".htm"
	set current glossary set to "HTML Glossary.html"
end tell

Does this work? It does for me.

set ptd to path to desktop
set fileNum to ""
set nums to ("1234567890")

tell application "BBEdit 6.5"
	activate
	set foundText to found text of (find "<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match)
	delete selection
	make new text window with properties {selection:foundText}
	set foundNum to found text of (find "<!--start [0-9][0-9][0-9]-->" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match)
	
	repeat with thischar in foundNum
		if thischar is in nums then
			set fileNum to fileNum & contents of thischar as text
		end if
	end repeat
	
	save window 1 to file ((ptd as text) & fileNum & ".htm")
	set current glossary set to "HTML Glossary.html"
end tell

Note: the script doesn’t deal with the possibility that a file with the same name may already exist in the target location.
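One possible way to guard against that, as a minimal sketch: try to coerce the path to an alias, which only succeeds when the item already exists. The handler name fileExists and the example file name are made up for illustration:

```applescript
-- Hypothetical helper: true if something already exists at the given colon-delimited path.
on fileExists(hfsPath)
	try
		-- The coercion to alias fails with an error when nothing exists at the path
		hfsPath as alias
		return true
	on error
		return false
	end try
end fileExists

set targetPath to (path to desktop as text) & "232.htm"
if fileExists(targetPath) then
	-- choose another name, ask the user, or skip the save here
end if
```

What to do when the file exists (rename, overwrite, or skip) is left open, since it depends on how the pages are going to be used.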

Regular expression back-references and patterned results are useful for this. Using the Satimage osax:

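on saveFile(filePath, txt)
	-- Open the file (creating it if needed), empty it, write the text, and always close it
	set fileRef to open for access filePath with write permission
	try
		set eof of fileRef to 0
		write txt to fileRef
	on error errMsg number errNum
		close access fileRef
		error errMsg number errNum
	end try
	close access fileRef
end saveFile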

-------

set outputFolder to alias "Macintosh HD:Users:has:Test:recipes:"
set htmlStr to "<!--start 232--> 

<p>copy for recipe goes here.</p> 

<!--end 232--> 
<!--start 121--> 

<p>copy for recipe goes here.</p> 

<!--end 121-->
<!--start 444-->"

set pageList to find text "<!--start ([0-9]+)-->(.*)<!--end \1-->" in htmlStr using "\1\r\2" regexpflag {"EXTENDED"} with regexp, string result and all occurrences
repeat with aPage in pageList
	set fileName to (first paragraph of aPage) & ".html"
	set fileContent to text (paragraph 2) thru -1 of aPage
	set filePath to ((outputFolder as Unicode text) & fileName) as file specification
	saveFile(filePath, fileContent)
end repeat

Many thanks for your help, Rob and HAAS.

I will definitely have to check out the osax you listed, HAAS.

Here is the code that I ended up using:


repeat 4 times -- I am going to change this count manually; no biggie
	
	set ptd to path to desktop
	set fileNum to ""
	set nums to ("1234567890")
	
	
	
	tell application "BBEdit"
		activate
		set foundText to found text of (find "<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->\r" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match)
		delete selection
		make new text window with properties {selection:foundText}
		set foundNum to found text of (find "<!--start [0-9][0-9][0-9]-->" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match)
		
		repeat with thischar in foundNum
			if thischar is in nums then
				set fileNum to fileNum & contents of thischar as text
			end if
		end repeat
		
		save window 1 to file ((ptd as text) & fileNum & ".htm")
		close window 1
		set current glossary set to "HTML Glossary.html"
	end tell
	
end repeat
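If the hard-coded repeat count becomes a nuisance, one option might be to loop until the search stops matching. This is only a sketch: it assumes the record returned by BBEdit's find command carries a boolean "found" property next to "found text", which is worth checking in the BBEdit dictionary for your version. The variable name searchResult is made up for illustration.

```applescript
tell application "BBEdit"
	activate
	repeat
		-- Same search as in the script above, but keep the whole result record
		set searchResult to find "<!--start [0-9][0-9][0-9]-->([^!]*)<!--end [0-9][0-9][0-9]-->\r" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, reverse:false, case sensitive:false, match words:false, extend selection:false} with selecting match
		-- Assumption: the record has a boolean "found" property
		if not (found of searchResult) then exit repeat
		set foundText to found text of searchResult
		delete selection
		-- ...create, name, save, and close the new window as in the script above...
	end repeat
end tell
```

Because each matched recipe is deleted from the source window before the next pass, the search eventually fails and the loop ends without a manual count.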