Tidy Hexdump of File Contents

CalvinFold · March 20, 2007, 6:37pm

You can see more on this script, and some variants, in this thread:

http://bbs.applescript.net/viewtopic.php?id=20526

Basically it takes a hexdump of each file dropped onto it and outputs the human-readable column into a new TextEdit document for viewing. I was using it to try to find differences between Illustrator Native and Acrobat PDF files at the “code” level (long story, explained in the above thread), but given how much work James did to help me on it, I figured others might benefit from it.

I commented it to death for my own notes as well as anyone else trying to figure it out.

--
-- Get Hexdump Info v3
-- by Kevin Quosig, 3/19/07
--
-- Used to drag-n-drop files to examine their contents/headers.
--
-- Most code segments courtesy of James Nierodzik of MacScripter
-- http://bbs.applescript.net/profile.php?id=8727
--


--
-- UTILITY HANDLER
--

-- Search and Replace routine using AppleScript Text Item Delimiters "trick"
--
on searchNreplace(parse_me, find_me, replace_with_me)
	
	--save incoming TID state, set new TIDs
	set {ATID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, find_me}
	
	--split the string into items at each occurrence of find_me (the delimiter itself is dropped)
	set being_parsed to text items of parse_me
	
	--switch the TIDs again (replace string)
	set AppleScript's text item delimiters to {replace_with_me}
	
	--coerce it back to a string with new delimiters
	set parse_me to being_parsed as string
	
	--restore incoming TID state
	set AppleScript's text item delimiters to ATID
	
	--return results
	return parse_me
	
end searchNreplace


--
-- MAIN HANDLER
--
on open fileList
	
	-- parse through files dropped onto droplet
	repeat with i from 1 to number of items in fileList
		
		set AppleScript's text item delimiters to {""} --reset delimiters
		set this_item to item i of fileList as string --pick item to work with
		set this_item_posix to quoted form of POSIX path of this_item --need POSIX path for shell scripts
		set doc_name to name of (info for alias this_item) --used for renaming the TextEdit window
		
		-- Get hex_dump and format it (the -C parameter produces the "hex plus piped human-readable" layout),
		-- then pipe to awk, which works through the hexdump output one line at a time.
		-- awk's field separator (FS) is set to the pipe character (|); the double quotes are escaped (\") for AppleScript.
		-- This makes awk see two "items" (the stuff before the first pipe, and the stuff after it).
		-- {print$2} tells awk to return the second item only (the human-readable part of the hexdump)
		set hex_dump to (do shell script "hexdump -C " & this_item_posix & " | awk  'BEGIN{FS=\"|\"}{print$2}'")
		
		--remove carriage returns so output is one giant paragraph
		--(do shell script converts the shell's line feeds to carriage returns by default;
		--one paragraph allows for TextEdit searching for strings and manual scanning)
		set hex_dump to searchNreplace(hex_dump, return, "")
		
		--write to TextEdit window and rename window to file name to keep things straight
		tell application "TextEdit"
			make new document
			set text of front document to hex_dump
			set name of front window to doc_name
		end tell
	end repeat
end open
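
To see what the awk step does without running the droplet, here is a minimal shell sketch. The two sample lines are hand-written in the layout that hexdump -C prints (they are assumptions for illustration, not output captured from a real file), and the variable names are made up for this example.

```shell
# A sample line in the layout `hexdump -C` prints: offset, hex bytes, then |text|.
sample='00000000  25 50 44 46 2d 31 2e 34  0a 25 e2 e3 cf d3 0a 31  |%PDF-1.4.%.....1|'

# Split on the pipe character and keep the second field (the readable column).
readable=$(printf '%s\n' "$sample" | awk 'BEGIN{FS="|"}{print $2}')
printf '%s\n' "$readable"

# Caveat: a literal "|" inside the data also splits the line, so $2 stops early.
piped='00000000  61 7c 62 0a                                       |a|b.|'
truncated=$(printf '%s\n' "$piped" | awk 'BEGIN{FS="|"}{print $2}')
printf '%s\n' "$truncated"
```

The second case suggests that a file containing pipe characters will show a shortened readable column on the affected lines, which is worth keeping in mind when comparing two dumps.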