The AppleScript forum has a thread about the sdiff shell utility, and I thought I’d write something similar with ASObjC just for practice. My first effort does a simple string comparison and uses the logic shown in the following table. The words left and right in this table refer to corresponding lines in files one and two.
| Circumstance | Example |
| :--- | :--- |
| left and right are the same | aa = aa |
| left and right are both blank | = |
| left is blank or does not exist | < aa |
| right is blank or does not exist | aa > |
| left and right are different | aa \| bb |

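
For readers who want to check the table without running the AppleScript, the rules can be sketched in a few lines of Python. This is only an illustration of the table above, not a translation of the script; the names `classify` and `format_row` are made up for this sketch.

```python
def classify(left: str, right: str) -> str:
    """Return the comparison marker for one pair of lines, per the table above."""
    if left == right:
        return "="   # identical lines, including two blank lines
    if not left:
        return "<"   # left is blank or does not exist
    if not right:
        return ">"   # right is blank or does not exist
    return "|"       # both present but different


def format_row(left: str, right: str) -> str:
    """Join the two lines with the marker between them."""
    return f"{left} {classify(left, right)} {right}"


if __name__ == "__main__":
    for pair in [("aa", "aa"), ("", ""), ("", "aa"), ("aa", ""), ("aa", "bb")]:
        print(format_row(*pair))
```

The equality test comes first, so two blank lines count as a match rather than as a missing line on either side.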
I may refine this script in subsequent posts to:
- display a set number of characters per string;
- make the string comparisons case insensitive;
- change specific characters before comparison (e.g. tabs to spaces);
- remove blank lines; and
- simplify the code or take a different ASObjC approach.
In testing with Script Geek, the timing result was 14 milliseconds with 84 lines in fileOne and 100 lines in fileTwo with 17 words per line.
use framework "Foundation"
use scripting additions
set fileOne to "/Users/Robert/Working/File One.txt" --set to desired value
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayOne to (stringOne's componentsSeparatedByString:linefeed)
set arrayOneCount to arrayOne's |count|()
set fileTwo to "/Users/Robert/Working/File Two.txt" --set to desired value
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayTwo to (stringTwo's componentsSeparatedByString:linefeed)
set arrayTwoCount to arrayTwo's |count|()
if arrayOneCount is less than or equal to arrayTwoCount then
set shortCount to arrayOneCount
set longCount to arrayTwoCount
else
set shortCount to arrayTwoCount
set longCount to arrayOneCount
end if
set emptyString to current application's NSString's stringWithString:""
set theDifference to current application's NSMutableArray's new()
repeat with i from 0 to (shortCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set stringTwo to (arrayTwo's objectAtIndex:i)
if (stringOne's isEqualToString:stringTwo) is true then
set theString to ((stringOne's stringByAppendingString:" = ")'s stringByAppendingString:stringTwo)
else if (stringOne's isEqualToString:emptyString) is true then
set theString to ((stringOne's stringByAppendingString:" < ")'s stringByAppendingString:stringTwo)
else if (stringTwo's isEqualToString:emptyString) is true then
set theString to ((stringOne's stringByAppendingString:" > ")'s stringByAppendingString:stringTwo)
else if (stringOne's isEqualToString:stringTwo) is false then
set theString to ((stringOne's stringByAppendingString:" | ")'s stringByAppendingString:stringTwo)
else
display dialog "An unexpected error has occurred" buttons {"OK"} cancel button 1 default button 1
end if
(theDifference's addObject:theString)
end repeat
if shortCount is equal to arrayOneCount then
repeat with i from shortCount to (longCount - 1)
set stringTwo to (arrayTwo's objectAtIndex:i)
set theString to ((emptyString's stringByAppendingString:" < ")'s stringByAppendingString:stringTwo)
(theDifference's addObject:theString)
end repeat
else
repeat with i from shortCount to (longCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set theString to ((stringOne's stringByAppendingString:" > ")'s stringByAppendingString:emptyString)
(theDifference's addObject:theString)
end repeat
end if
set theString to (theDifference's componentsJoinedByString:linefeed) as text
The timing result was 24 milliseconds with the same files used with my script in post 1.
use framework "Foundation"
use scripting additions
on main()
set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayOne to stringOne's componentsSeparatedByString:linefeed
set arrayOneCount to arrayOne's |count|()
set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayTwo to stringTwo's componentsSeparatedByString:linefeed
set arrayTwoCount to arrayTwo's |count|()
if arrayOneCount is less than or equal to arrayTwoCount then
set shortCount to arrayOneCount
set longCount to arrayTwoCount
else
set shortCount to arrayTwoCount
set longCount to arrayOneCount
end if
set theDifference to current application's NSMutableArray's new()
set emptyString to current application's NSString's stringWithString:""
repeat with i from 0 to (shortCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
if (stringOne's isEqualToString:stringTwo) is true then
set theString to current application's NSString's stringWithFormat_("| %@ | = | %@ |", substringOne, substringTwo)
else if (stringOne's isEqualToString:emptyString) is true then
set theString to current application's NSString's stringWithFormat_("| %@ | < | %@ |", substringOne, substringTwo)
else if (stringTwo's isEqualToString:emptyString) is true then
set theString to current application's NSString's stringWithFormat_("| %@ | > | %@ |", substringOne, substringTwo)
else if (stringOne's isEqualToString:stringTwo) is false then
set theString to current application's NSString's stringWithFormat_("| %@ | \\| | %@ |", substringOne, substringTwo)
else
display dialog "An unexpected error has occurred" buttons {"OK"} cancel button 1 default button 1
end if
(theDifference's addObject:theString)
end repeat
if shortCount is equal to arrayOneCount then
repeat with i from shortCount to (longCount - 1)
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
set theString to current application's NSString's stringWithFormat_("| %@ | < | %@ |", emptyString, substringTwo)
(theDifference's addObject:theString)
end repeat
else
repeat with i from shortCount to (longCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
set theString to current application's NSString's stringWithFormat_("| %@ | > | %@ |", substringOne, emptyString)
(theDifference's addObject:theString)
end repeat
end if
set theDifference to (theDifference's componentsJoinedByString:linefeed)
set theHeader to "| File One | Comparison | File Two |"
set theFormatter to "| :--- | :---: | :--- |"
set theLinefeed to linefeed
set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theDifference)
set desktopFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
set theFile to desktopFolder's stringByAppendingPathComponent:"File Compare.txt"
theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end main
on getSubstring(theString)
set theRange to theString's rangeOfString:".{0,30}" options:1024 --set 30 to desired value
return (theString's substringWithRange:theRange) --return the truncated string explicitly
end getSubstring
main()
Hi @peavine
This is an interesting one!
I think you could take advantage of the compare: method.
update: see amended script post #6
use framework "Foundation"
use framework "AppKit"
use scripting additions
set theSet to current application's NSCharacterSet's newlineCharacterSet()
set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayOne to stringOne's componentsSeparatedByCharactersInSet:theSet
set theCount to arrayOne's |count|()
set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayTwo to stringTwo's componentsSeparatedByCharactersInSet:theSet
set theCount2 to arrayTwo's |count|()
if theCount2 > theCount then set theCount to theCount2
set theDifference to current application's NSMutableArray's new()
repeat with i from 0 to (theCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
set diff to (substringOne's compare:substringTwo options:129) -- 1 = NSCaseInsensitiveSearch, 128 = NSDiacriticInsensitiveSearch
if diff ≠ 0 and substringOne's |length|() = 0 then set diff to -2
if diff ≠ 0 and substringTwo's |length|() = 0 then set diff to 2
set charDiff to item (diff + 3) of {"<<", "|<", "=", "", ">>"}
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | " & charDiff & " | %@ |", i, substringOne, substringTwo)
(theDifference's addObject:theString)
end repeat
set theDifference to (theDifference's componentsJoinedByString:linefeed)
set theHeader to "|#| File One | Comparison | File Two |"
set theFormatter to "| :--- | :--- | :---: | :--- |"
set theLinefeed to linefeed
set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theDifference)
set desktopFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
set theFile to desktopFolder's stringByAppendingPathComponent:"File Compare.txt"
theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set theWorkspace to current application's NSWorkspace's sharedWorkspace()
theWorkspace's openFile:theFile
on getSubstring(theString)
set theRange to theString's rangeOfString:".{0,30}" options:1024 --set 30 to desired value
return (theString's substringWithRange:theRange)
end getSubstring
[BTW, what markdown app are you using to display the table?]
Jonas. Thanks for looking at my script and for your excellent script suggestion.
I like your use of the compare: method. It simplifies the script and deals with case sensitivity and other possible language issues. Adding line numbers is also a great idea.
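As a rough illustration of what a case- and diacritic-insensitive comparison does (option 129 in the scripts), here is a Python sketch. It only approximates the idea with Unicode normalization and case folding; it is not a port of Foundation's compare:options:, and the helper names `fold` and `loosely_equal` are invented for this example.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks (accents) and fold case."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def loosely_equal(left: str, right: str) -> bool:
    """True when the strings differ only by case or diacritics."""
    return fold(left) == fold(right)


if __name__ == "__main__":
    print(loosely_equal("Résumé", "resume"))
    print(loosely_equal("line one", "line two"))
```

Edge cases (ligatures, locale-specific rules) may be handled differently by Foundation, so treat this only as a mental model.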
I did encounter one possible issue with your script. If the files do not contain the same number of lines, the following error is reported:
***-[_NSArrayM objectAtIndex:]: index 3 beyond bounds [0 … 2]
Also, in the following line, shouldn’t stringOne be compared to stringTwo?
set diff to (substringOne's compare:substringTwo options:129) -- 1 = NSCaseInsensitiveSearch, 128 = NSDiacriticInsensitiveSearch
I used the iA Writer app to display the markdown table. It’s a somewhat expensive writing app that happens to support markdown.
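The out-of-bounds error mentioned above comes from indexing the shorter array past its end. One common way around it, sketched here in Python rather than AppleScript, is to pad the shorter list with empty strings while pairing the lines. The function name `pair_lines` is hypothetical.

```python
from itertools import zip_longest


def pair_lines(lines_one, lines_two):
    """Pair lines by position, padding the shorter file with empty strings."""
    rows = []
    padded = zip_longest(lines_one, lines_two, fillvalue="")
    for number, (left, right) in enumerate(padded, start=1):
        rows.append((number, left, right))  # 1-based line number for display
    return rows


if __name__ == "__main__":
    for row in pair_lines(["a", "b", "c"], ["a"]):
        print(row)
```

The same effect in ASObjC can be had by padding a mutable copy of the shorter array before the loop, or by guarding each objectAtIndex: call.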
To facilitate comparison, it may be desirable to delete leading and trailing whitespace and optionally blank lines (other than the first line) from the text input. To accomplish this, the following handler can be inserted in the script.
set stringOne to cleanString(stringOne) --also for stringTwo
on cleanString(theString)
set theString to theString's stringByReplacingOccurrencesOfString:"(?m)^\\h+|\\h+$" withString:"" options:1024 range:{0, theString's |length|()} --remove leading and trailing whitespace
set theString to theString's stringByReplacingOccurrencesOfString:"(\\R)\\R" withString:"$1" options:1024 range:({0, theString's |length|()}) --also remove blank lines
end cleanString
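For comparison, the same clean-up can be sketched with Python's re module. Python's regex engine does not support the \h and \R classes used above, so this sketch substitutes [ \t] and \r?\n, and it collapses any run of blank lines rather than one pair at a time. The name `clean_text` and the `drop_blank_lines` flag are assumptions made for the example.

```python
import re


def clean_text(text: str, drop_blank_lines: bool = False) -> str:
    """Trim spaces and tabs from both ends of every line; optionally drop blank lines."""
    # (?m) makes ^ and $ match at the start and end of each line.
    text = re.sub(r"(?m)^[ \t]+|[ \t]+$", "", text)
    if drop_blank_lines:
        # Keep the first line break of a run and discard the rest.
        text = re.sub(r"(\r?\n)(?:\r?\n)+", r"\1", text)
    return text


if __name__ == "__main__":
    print(repr(clean_text("  a  \n\tb\t")))
    print(repr(clean_text("a\n\n\nb", drop_blank_lines=True)))
```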
Here are my corrections according to your comments:
use framework "Foundation"
use framework "AppKit"
use scripting additions
set desktopFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
set theSet to current application's NSCharacterSet's newlineCharacterSet()
set fileOne to desktopFolder's stringByAppendingPathComponent:"a.txt"
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayOne to stringOne's componentsSeparatedByCharactersInSet:theSet
set theCount to arrayOne's |count|()
set fileTwo to desktopFolder's stringByAppendingPathComponent:"b.txt"
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set arrayTwo to stringTwo's componentsSeparatedByCharactersInSet:theSet
set theCount2 to arrayTwo's |count|()
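if theCount < theCount2 then set arrayOne to my equalizeArray:arrayOne |count|:theCount2
if theCount > theCount2 then set arrayTwo to my equalizeArray:arrayTwo |count|:theCount
if theCount < theCount2 then set theCount to theCount2 --loop over the longer file so no lines of file two are skipped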
set theDifference to current application's NSMutableArray's new()
repeat with iCount from 0 to (theCount - 1)
set stringOne to (arrayOne's objectAtIndex:iCount)
set substringOne to (my getSubstring:stringOne)
set stringTwo to (arrayTwo's objectAtIndex:iCount)
set substringTwo to (my getSubstring:stringTwo)
set diff to (substringOne's compare:substringTwo options:(129)) -- 1 = NSCaseInsensitiveSearch, 128 = NSDiacriticInsensitiveSearch
if diff ≠ 0 and substringOne's |length|() < substringTwo's |length|() then set diff to -1
if diff ≠ 0 and substringOne's |length|() > substringTwo's |length|() then set diff to 1
if diff ≠ 0 and substringOne's |length|() = 0 then set diff to -2
if diff ≠ 0 and substringTwo's |length|() = 0 then set diff to 2
set charDiff to item (diff + 3) of {"x<", "<", "=", ">", ">x"}
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | " & charDiff & " | %@ |", iCount, substringOne, substringTwo)
(theDifference's addObject:theString)
end repeat
set theDifference to (theDifference's componentsJoinedByString:linefeed)
set theHeader to "|#| File One | Comparison | File Two |"
set theFormatter to "| :--- | :--- | :---: | :--- |"
set theLinefeed to linefeed
set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theDifference)
set theFile to desktopFolder's stringByAppendingPathComponent:"File Compare.txt"
theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
tell application "iA Writer"
activate
open theFile as «class furl»
end tell
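on getSubstring:aString
set theRange to aString's rangeOfString:".{0,30}" options:1024 --1024 = NSRegularExpressionSearch; set 30 to desired value
return (aString's substringWithRange:theRange)
end getSubstring:
on equalizeArray:anArray |count|:aCount
set paddedArray to anArray's mutableCopy() --the API is declared to return an immutable NSArray, so copy before adding
repeat while (paddedArray's |count|()) < aCount --pad only up to the length of the longer file
(paddedArray's addObject:"")
end repeat
return paddedArray
end equalizeArray:|count|: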
The timing result with the test files was 24 milliseconds.
use framework "Foundation"
use scripting additions
on main()
set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set stringOne to cleanString(stringOne) --disable if desired
set arrayOne to stringOne's componentsSeparatedByString:linefeed
set arrayOneCount to arrayOne's |count|()
set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set stringTwo to cleanString(stringTwo) --disable if desired
set arrayTwo to stringTwo's componentsSeparatedByString:linefeed
set arrayTwoCount to arrayTwo's |count|()
if arrayOneCount is less than or equal to arrayTwoCount then
set shortCount to arrayOneCount
set longCount to arrayTwoCount
else
set shortCount to arrayTwoCount
set longCount to arrayOneCount
end if
set theDifference to current application's NSMutableArray's new()
set emptyString to current application's NSString's stringWithString:""
repeat with i from 0 to (shortCount - 1)
set lineNumber to (i + 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
if (stringOne's compare:stringTwo options:129) is 0 then --option 129 is case and diacritic insensitive
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | = | %@ |", lineNumber, substringOne, substringTwo)
else if (stringOne's compare:emptyString options:129) is 0 then
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | < | %@ |", lineNumber, substringOne, substringTwo)
else if (emptyString's compare:stringTwo options:129) is 0 then
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | > | %@ |", lineNumber, substringOne, substringTwo)
else if (stringOne's compare:stringTwo options:129) is not 0 then
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | \\| | %@ |", lineNumber, substringOne, substringTwo)
else
display dialog "An unexpected error has occurred" buttons {"OK"} cancel button 1 default button 1
end if
(theDifference's addObject:theString)
end repeat
if shortCount is equal to arrayOneCount then
repeat with i from shortCount to (longCount - 1)
set lineNumber to (i + 1)
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | < | %@ |", lineNumber, emptyString, substringTwo)
(theDifference's addObject:theString)
end repeat
else
repeat with i from shortCount to (longCount - 1)
set lineNumber to (i + 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
set theString to current application's NSString's stringWithFormat_("| %@ | %@ | > | %@ |", lineNumber, substringOne, emptyString)
(theDifference's addObject:theString)
end repeat
end if
set theDifference to (theDifference's componentsJoinedByString:linefeed)
set theHeader to "| Line | File One | Comparison | File Two |"
set theFormatter to "| :---: | :--- | :---: | :--- |"
set theLinefeed to linefeed
set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theDifference)
set theFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
set theFile to theFolder's stringByAppendingPathComponent:"File Compare.md"
theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end main
on cleanString(theString) --enable or disable any of the following
set theString to theString's stringByReplacingOccurrencesOfString:"(?m)^\\h+|\\h+$" withString:"" options:1024 range:{0, theString's |length|()} --remove leading and trailing whitespace every line
--set theString to theString's stringByReplacingOccurrencesOfString:"^\\s*\\R" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at beginning of string
set theString to theString's stringByReplacingOccurrencesOfString:"\\s*$" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at end of string
--set theString to theString's stringByReplacingOccurrencesOfString:"(\\R)\\s*\\R" withString:"$1" options:1024 range:({0, theString's |length|()}) --remove blank lines except at beginning and end of string
end cleanString
on getSubstring(stringOne)
set theRange to stringOne's rangeOfString:".{0,50}" options:1024 --set 50 to desired value
set stringTwo to (stringOne's substringWithRange:theRange)
if (stringOne's compare:stringTwo options:129) is not 0 then set stringTwo to stringTwo's stringByAppendingString:"..."
return stringTwo
end getSubstring
main()
-- remove empty line(s) at start of text
set theString to theString's stringByReplacingOccurrencesOfString:"\\A\\s+" withString:"" options:1024 range:{0, theString's |length|()}
-- the same at end
set theString to theString's stringByReplacingOccurrencesOfString:"\\s+\\Z" withString:"" options:1024 range:{0, theString's |length|()}
I happened upon an app that does a good job of displaying markdown files with the Quick Look utility. It’s free but occasionally shows a nag screen asking that you buy the developer a coffee. The app is not notarized or signed, which is always a concern. A screenshot of a simple example:
I wrote the scripts in this thread just for practice. However, I thought I would investigate how the sdiff utility works to see if I might modify my script to work similarly. Unfortunately, the operation of sdiff is not clear to me.
The sdiff man page describes its operation as follows:
sdiff displays two files side by side, with any differences between the two highlighted as follows: new lines are marked with ‘>’; deleted lines are marked with ‘<’; and changed lines are marked with ‘|’.
The man page does not explain what constitutes a new line, deleted line, or changed line. I thought an example might help, so I created two test files. File One contained:
line one
no match
line three
no match 1
line five
File Two contained:
line one
no match 2
line three
no match
line five
I couldn’t get sdiff to run with the do shell script command. I instead ran the test command in a Terminal window and got the following:
I renamed your text files to avoid any noise related to spaces in names, and deleted the spaces at the beginning of each line to simplify posting here.
do shell script "cd ~/Desktop; sdiff fileOne fileTwo"
error "line one line one
> no match 2
> line three
no match no match
line three <
no match 1 <
line five line five" number 1
The key, I think, to understanding the output is that unlike many shell utilities, it isn’t acting on a line by line basis. It’s comparing the entire files and trying to line up the lines before performing the diff.
- The line no match 2 is only in fileTwo, so it is returned with a > indicator.
- line three in fileTwo isn’t matched before the next common line, therefore >.
- The no match text is in line 2 of fileOne and line 4 of fileTwo, so that’s a match.
- line three in fileOne, which comes after the matching line above, is only in fileOne at that position and is returned with a < indicator.
- no match 1 is only in fileOne, therefore <.
In essence, the diff hinges on the no match line in each file. Another way to think of it is that there are actually three diffs being performed:
zone above the no match line
no match line
zone below the no match line
Since fileOne has only a single line above while fileTwo has three such lines, there are two > in the above zone. This is reversed in the below zone, where fileOne has three lines and fileTwo has only one; therefore there are two < in the below zone.
NB: error number 1 is part of the output when the files have mismatches. If you redirect the output to a file by appending >fileThree, the file’s contents will match the Terminal output without the error. Script Editor will still throw an error, just without the returned results: error "The command exited with a non-zero status." number 1. It looks more sensible, but it’s actually the same error.
do shell script "cd ~/Desktop; sdiff fileOne fileTwo >fileThree"
It’s a weird command that way, and I’m not sure how to handle it in Script Editor other than by redirecting to a file.
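The non-zero status is by design: the GNU diffutils documentation gives the exit status as 0 when no differences were found, 1 when some were found, and 2 for trouble. Any caller that treats every non-zero status as a failure will therefore report an error whenever the files differ. A minimal Python sketch of a wrapper that accepts status 1 follows; the function names and the file paths passed to them are hypothetical, and it assumes an sdiff binary is on the PATH.

```python
import subprocess


def describe_status(code: int) -> str:
    """Map a diff/sdiff exit status to its meaning (per the GNU documentation)."""
    return {0: "identical", 1: "different"}.get(code, "error")


def run_sdiff(path_one: str, path_two: str):
    """Run sdiff and return (output, meaning); only a status above 1 is a failure."""
    completed = subprocess.run(
        ["sdiff", path_one, path_two],
        capture_output=True,
        text=True,
    )
    if completed.returncode > 1:
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout, describe_status(completed.returncode)
```

For example, `run_sdiff("fileOne", "fileTwo")` would return the side-by-side text along with "different" instead of raising.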
You may find this GNU diffutils documentation helpful. It includes an example that demonstrates what I meant above by a ‘file’ versus ‘line’ approach.
I find it helps to think of it not so much as “here are the differences between these two files” as it is “here are the changes needed to turn the first file into the second file”. You take the input from file 1, then insert|delete|change lines as needed to get the output of file2.
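The "changes needed to turn the first file into the second" view can be tried out with Python's standard difflib module. This is a sketch only: difflib uses its own matching heuristic, not the GNU diff algorithm, so it may line things up differently from sdiff on other inputs. The function name `side_by_side` is made up for the example.

```python
from difflib import SequenceMatcher


def side_by_side(lines_one, lines_two):
    """Return (left, marker, right) rows using sdiff-style markers."""
    rows = []
    matcher = SequenceMatcher(None, lines_one, lines_two, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = lines_one[i1:i2], lines_two[j1:j2]
        if tag == "equal":
            rows += [(l, " ", r) for l, r in zip(left, right)]
        elif tag == "delete":      # lines only in the first file
            rows += [(l, "<", "") for l in left]
        elif tag == "insert":      # lines only in the second file
            rows += [("", ">", r) for r in right]
        else:                      # "replace": pair changed lines, then leftovers
            paired = min(len(left), len(right))
            rows += [(l, "|", r) for l, r in zip(left, right)]
            rows += [(l, "<", "") for l in left[paired:]]
            rows += [("", ">", r) for r in right[paired:]]
    return rows


if __name__ == "__main__":
    one = ["line one", "no match", "line three", "no match 1", "line five"]
    two = ["line one", "no match 2", "line three", "no match", "line five"]
    for left, marker, right in side_by_side(one, two):
        print(f"{left:<12} {marker} {right}")
```

On the two five-line test files from this thread, the sketch anchors on the common no match line and yields two > rows above it and two < rows below it, in line with the sdiff output shown earlier.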
I tested the script contained below with the two test files in post 11, and the script returned the exact same results as the sdiff utility. For example:
The script needs further testing and optimization, but I was encouraged to make this much progress. The sdiff utility has a “|” comparison character, and I don’t know where that would apply. Also, the sdiff utility considers incomplete lines, but my script doesn’t.
use framework "Foundation"
use scripting additions
on main()
set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set stringOne to cleanString(stringOne) --disable if desired
set arrayOne to stringOne's componentsSeparatedByString:linefeed
set arrayOneCount to arrayOne's |count|()
set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set stringTwo to cleanString(stringTwo) --disable if desired
set arrayTwo to stringTwo's componentsSeparatedByString:linefeed
set theDifference to current application's NSMutableArray's new()
set emptyString to current application's NSString's stringWithString:""
set adjustedArrayTwo to arrayTwo's mutableCopy()
repeat with i from 0 to (arrayOneCount - 1)
set stringOne to (arrayOne's objectAtIndex:i)
set substringOne to getSubstring(stringOne)
try
set stringTwo to (arrayTwo's objectAtIndex:i)
set substringTwo to getSubstring(stringTwo)
on error
set stringTwo to emptyString
set substringTwo to emptyString
end try
if (stringOne's compare:stringTwo options:129) is 0 then --fileOne line and fileTwo line match
set theString to current application's NSString's stringWithFormat_("| %@ | = | %@ |", substringOne, substringTwo)
(adjustedArrayTwo's removeObjectAtIndex:0)
(theDifference's addObject:theString)
else if (adjustedArrayTwo's containsObject:stringOne) is false then --fileOne line is not in fileTwo
set theString to current application's NSString's stringWithFormat_("| %@ | < | %@ |", substringOne, emptyString)
(theDifference's addObject:theString)
else --add unmatched fileTwo lines then matched fileTwo lines then exit repeat
set subtractArray to current application's NSMutableArray's new()
repeat with anItem in adjustedArrayTwo
if (stringOne's compare:anItem options:129) is not 0 then
set theString to current application's NSString's stringWithFormat_("| %@ | > | %@ |", emptyString, anItem)
(subtractArray's addObject:anItem)
(theDifference's addObject:theString)
else
set theString to current application's NSString's stringWithFormat_("| %@ | = | %@ |", substringOne, anItem)
(subtractArray's addObject:anItem)
(theDifference's addObject:theString)
exit repeat
end if
end repeat
(adjustedArrayTwo's removeObjectsInArray:subtractArray)
end if
end repeat
repeat with anItem in adjustedArrayTwo --remaining items in arrayTwo
set theString to current application's NSString's stringWithFormat_("| %@ | > | %@ |", emptyString, anItem)
(theDifference's addObject:theString)
end repeat
set theDifference to (theDifference's componentsJoinedByString:linefeed)
set theHeader to "| File One | Comparison | File Two |"
set theFormatter to "| :--- | :---: | :--- |"
set theLinefeed to linefeed
set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theDifference)
set theFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
set theFile to theFolder's stringByAppendingPathComponent:"File Compare.md"
theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end main
on cleanString(theString) --enable or disable any of the following
set theString to theString's stringByReplacingOccurrencesOfString:"(?m)^\\h+|\\h+$" withString:"" options:1024 range:{0, theString's |length|()} --remove leading and trailing whitespace every line
--set theString to theString's stringByReplacingOccurrencesOfString:"^\\s*\\R" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at beginning of string
set theString to theString's stringByReplacingOccurrencesOfString:"\\s*$" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at end of string
--set theString to theString's stringByReplacingOccurrencesOfString:"(\\R)\\s*\\R" withString:"$1" options:1024 range:({0, theString's |length|()}) --remove blank lines except at beginning and end of string
end cleanString
on getSubstring(stringOne)
set theRange to stringOne's rangeOfString:".{0,50}" options:1024 --set 50 to desired value
set stringTwo to (stringOne's substringWithRange:theRange)
if (stringOne's compare:stringTwo options:129) is not 0 then set stringTwo to stringTwo's stringByAppendingString:"..."
return stringTwo
end getSubstring
main()
line one
line two
line three
line four
line fives and dime
How would your script address that docleft is missing a line at the top compared to docright? If your approach is exclusively line by line, then all lines will be considered different. In reality, the majority of two documents’ lines are identical.
I included the minor discrepancies in the last line (the ‘F’ and the plurals) to show when a ‘|’ indicator might appear. Either of the discrepancies is sufficient to cause this.
Ken. Thanks for looking at my script and for the helpful example. My script contained an error, which I’ve fixed. In other respects, I won’t be able to make my script functionally equivalent to sdiff. FWIW, the results from my revised script and from sdiff:
I thought I would add a quick final note to this thread.
The sdiff utility uses the diff utility to compare two files, and the operation of this utility is described on the page linked by Mockman as follows:
When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called hunks. Comparing two identical files yields one sequence of common lines and no hunks, because no lines differ. Comparing two entirely different files yields no common lines and one large hunk that contains all lines of both files. In general, there are many ways to match up lines between two given files. diff tries to minimize the total hunk size by finding large sequences of common lines interspersed with small hunks of differing lines.
Obviously, my AppleScript is not going to match this functionality, and I never should have attempted this. However, I did learn a lot and that’s always a good thing.
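To make the quoted description a little more concrete, the "sequences of lines common to both files" can be found with a textbook longest-common-subsequence table. The Python sketch below is that textbook method, not the algorithm GNU diff actually uses, and the function name `common_lines` is invented here.

```python
def common_lines(lines_one, lines_two):
    """Return one longest common subsequence of the two line lists."""
    m, n = len(lines_one), len(lines_two)
    # table[i][j] = length of the LCS of lines_one[i:] and lines_two[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if lines_one[i] == lines_two[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the shared lines.
    result, i, j = [], 0, 0
    while i < m and j < n:
        if lines_one[i] == lines_two[j]:
            result.append(lines_one[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result
```

Everything not in the returned list falls into a hunk. When several subsequences are equally long, as with the no match and line three lines in the test files above, the choice between them is arbitrary.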
The script included below roughly mimics the functionality of the comm utility, which is described in its man page as:
The comm utility reads file1 and file2, which should be sorted lexically, and produces three text columns as output: lines only in file1; lines only in file2; and lines in both files.
Note should be made of the settings in the cleanString and getSubstring handlers of the script. The timing result using two marginally-different versions of the script as test files was 38 milliseconds.
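The three-column split itself is small enough to sketch in Python for anyone who wants to check results independently of the AppleScript. It follows the same idea as the script (remove one matching line from the second list for each match, so duplicates are counted), but the function name `comm_columns` is hypothetical and it skips the cleaning, sorting, and truncation steps.

```python
def comm_columns(lines_one, lines_two):
    """Return (only in one, only in two, in both), counting duplicate lines."""
    only_two = list(lines_two)   # shrinks as matches are found
    only_one, common = [], []
    for line in lines_one:
        if line in only_two:
            only_two.remove(line)    # removes the first matching occurrence
            common.append(line)
        else:
            only_one.append(line)
    return only_one, only_two, common


if __name__ == "__main__":
    print(comm_columns(["a", "b", "b", "c"], ["b", "c", "d"]))
```

Because each match consumes one line from the second list, a line that appears twice in file one and once in file two ends up once in the common column and once in the first column.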
--revised 2024.08.25
use framework "Foundation"
use scripting additions
on main()
	set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
	set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	if stringOne is (missing value) then display dialog "File One not found" buttons {"OK"} cancel button 1 default button 1
	set stringOne to cleanString(stringOne) --disable if desired
	set arrayOne to (stringOne's componentsSeparatedByString:linefeed)
	set arrayOne to (arrayOne's sortedArrayUsingSelector:"localizedStandardCompare:") --disable if desired and below
	set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
	set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	if stringTwo is (missing value) then display dialog "File Two not found" buttons {"OK"} cancel button 1 default button 1
	set stringTwo to cleanString(stringTwo) --disable if desired
	set arrayTwo to (stringTwo's componentsSeparatedByString:linefeed)
	set arrayTwo to (arrayTwo's sortedArrayUsingSelector:"localizedStandardCompare:") --disable if desired and above
	set arrayOneUnique to current application's NSMutableArray's new()
	set arrayTwoUnique to arrayTwo's mutableCopy()
	set commonArray to current application's NSMutableArray's new()
	repeat with anItem in arrayOne
		if (arrayTwoUnique's containsObject:anItem) is true then --item in both arrays
			(commonArray's addObject:anItem)
			(arrayTwoUnique's removeObjectAtIndex:(arrayTwoUnique's indexOfObject:anItem))
		else --item in arrayOne only
			(arrayOneUnique's addObject:anItem)
		end if
	end repeat
	set theCounts to {arrayOneUnique's |count|(), arrayTwoUnique's |count|(), commonArray's |count|()}
	set theCounts to current application's NSArray's arrayWithArray:theCounts
	set theCount to (theCounts's valueForKeyPath:"@max.self") as integer
	set theTable to current application's NSMutableArray's new()
	set emptyString to current application's NSString's stringWithString:""
	repeat with i from 0 to (theCount - 1)
		try
			set columnOne to getSubstring(arrayOneUnique's objectAtIndex:i)
		on error
			set columnOne to emptyString
		end try
		try
			set columnTwo to getSubstring(arrayTwoUnique's objectAtIndex:i)
		on error
			set columnTwo to emptyString
		end try
		try
			set columnThree to getSubstring(commonArray's objectAtIndex:i)
		on error
			set columnThree to emptyString
		end try
		set aRow to current application's NSString's stringWithFormat_("| %@ | %@ | %@ |", columnOne, columnTwo, columnThree)
		(theTable's addObject:aRow)
	end repeat
	set theTable to (theTable's componentsJoinedByString:linefeed)
	set theHeader to "| File One Only | File Two Only | Both Files |"
	set theFormatter to "| :--- | :--- | :--- |"
	set theLinefeed to linefeed
	set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theTable)
	set theFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
	set theFile to theFolder's stringByAppendingPathComponent:"File Compare.md"
	theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end main
on cleanString(theString) --disable any of the following
	set theString to theString's stringByReplacingOccurrencesOfString:"([*#$>|])" withString:"\\\\$1" options:1024 range:{0, theString's |length|()} --escape markdown characters
	set theString to theString's stringByReplacingOccurrencesOfString:"(?m)^\\h+|\\h+$" withString:"" options:1024 range:{0, theString's |length|()} --remove leading and trailing whitespace every line
	set theString to theString's stringByReplacingOccurrencesOfString:"^\\s*\\R|\\s*$" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at beginning and end of string
	set theString to theString's stringByReplacingOccurrencesOfString:"(\\R)\\s*\\R" withString:"$1" options:1024 range:({0, theString's |length|()}) --remove blank lines except at beginning and end of string
	return theString --explicit return so the handler still works if every line above is disabled
end cleanString
on getSubstring(theString) --truncate lines at 50 characters
	set theRange to theString's rangeOfString:".{0,50}" options:1024
	set theSubstring to (theString's substringWithRange:theRange)
	if (theString's compare:theSubstring options:129) is not 0 then set theSubstring to theSubstring's stringByAppendingString:"..."
	return theSubstring
end getSubstring
main()
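The repeat loop in main() classifies the lines one item at a time. As a sketch of a different ASObjC approach, ordered sets can produce the same three groups with a few method calls. This assumes duplicate lines do not need to be preserved, because a set keeps only one copy of each line, whereas the loop in the script matches duplicates one for one. The sample lists are hypothetical stand-ins for the arrays read from the two files.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical sample data standing in for the lines of the two files
set setOne to current application's NSOrderedSet's orderedSetWithArray:{"apple", "banana", "cherry"}
set setTwo to current application's NSOrderedSet's orderedSetWithArray:{"banana", "cherry", "date"}

-- lines in file one only
set onlyOne to setOne's mutableCopy()
onlyOne's minusOrderedSet:setTwo
-- lines in file two only
set onlyTwo to setTwo's mutableCopy()
onlyTwo's minusOrderedSet:setOne
-- lines in both files
set inBoth to setOne's mutableCopy()
inBoth's intersectOrderedSet:setTwo

set theResult to {(onlyOne's array()) as list, (onlyTwo's array()) as list, (inBoth's array()) as list}
-- theResult is {{"apple"}, {"date"}, {"banana", "cherry"}}
```

An ordered set keeps the original order of the lines, so the output stays sorted if the input arrays were sorted first.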
The comm utility has an option (-3) that suppresses the column of lines common to both files, and the following script mimics this behavior:
use framework "Foundation"
use scripting additions
on main()
	set fileOne to "/Users/Robert/Documents/File One.txt" --set to desired value or replace with dialog
	set stringOne to current application's NSString's stringWithContentsOfFile:fileOne encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	if stringOne is (missing value) then display dialog "File One not found" buttons {"OK"} cancel button 1 default button 1
	set stringOne to cleanString(stringOne) --disable if desired
	set arrayOne to (stringOne's componentsSeparatedByString:linefeed)
	set arrayOne to (arrayOne's sortedArrayUsingSelector:"localizedStandardCompare:") --disable if desired and below
	set fileTwo to "/Users/Robert/Documents/File Two.txt" --set to desired value or replace with dialog
	set stringTwo to current application's NSString's stringWithContentsOfFile:fileTwo encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	if stringTwo is (missing value) then display dialog "File Two not found" buttons {"OK"} cancel button 1 default button 1
	set stringTwo to cleanString(stringTwo) --disable if desired
	set arrayTwo to (stringTwo's componentsSeparatedByString:linefeed)
	set arrayTwo to (arrayTwo's sortedArrayUsingSelector:"localizedStandardCompare:") --disable if desired and above
	set arrayOneUnique to current application's NSMutableArray's new()
	set arrayTwoUnique to arrayTwo's mutableCopy()
	repeat with anItem in arrayOne
		if (arrayTwoUnique's containsObject:anItem) is true then --item in both arrays
			(arrayTwoUnique's removeObjectAtIndex:(arrayTwoUnique's indexOfObject:anItem))
		else --item in arrayOne only
			(arrayOneUnique's addObject:anItem)
		end if
	end repeat
	set theCount to arrayOneUnique's |count|()
	set theOtherCount to arrayTwoUnique's |count|()
	if theOtherCount is greater than theCount then set theCount to theOtherCount
	set theTable to current application's NSMutableArray's new()
	set emptyString to current application's NSString's stringWithString:""
	repeat with i from 0 to (theCount - 1)
		try
			set columnOne to getSubstring(arrayOneUnique's objectAtIndex:i)
		on error
			set columnOne to emptyString
		end try
		try
			set columnTwo to getSubstring(arrayTwoUnique's objectAtIndex:i)
		on error
			set columnTwo to emptyString
		end try
		set aRow to current application's NSString's stringWithFormat_("| %@ | %@ |", columnOne, columnTwo)
		(theTable's addObject:aRow)
	end repeat
	set theTable to (theTable's componentsJoinedByString:linefeed)
	set theHeader to "| File One Only | File Two Only |"
	set theFormatter to "| :--- | :--- |"
	set theLinefeed to linefeed
	set theString to current application's NSString's stringWithFormat_("%@%@%@%@%@", theHeader, theLinefeed, theFormatter, theLinefeed, theTable)
	set theFolder to current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop"
	set theFile to theFolder's stringByAppendingPathComponent:"File Compare.md"
	theString's writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
end main
on cleanString(theString) --disable any of the following
	set theString to theString's stringByReplacingOccurrencesOfString:"([*#$>|])" withString:"\\\\$1" options:1024 range:{0, theString's |length|()} --escape markdown characters
	set theString to theString's stringByReplacingOccurrencesOfString:"(?m)^\\h+|\\h+$" withString:"" options:1024 range:{0, theString's |length|()} --remove leading and trailing whitespace every line
	set theString to theString's stringByReplacingOccurrencesOfString:"^\\s*\\R|\\s*$" withString:"" options:1024 range:{0, theString's |length|()} --remove blank lines at beginning and end of string
	set theString to theString's stringByReplacingOccurrencesOfString:"(\\R)\\s*\\R" withString:"$1" options:1024 range:({0, theString's |length|()}) --remove blank lines except at beginning and end of string
	return theString --explicit return so the handler still works if every line above is disabled
end cleanString
on getSubstring(theString) --truncate lines at 50 characters
	set theRange to theString's rangeOfString:".{0,50}" options:1024
	set theSubstring to (theString's substringWithRange:theRange)
	if (theString's compare:theSubstring options:129) is not 0 then set theSubstring to theSubstring's stringByAppendingString:"..."
	return theSubstring
end getSubstring
main()
A screenshot of script output where the test files were scripts: