Hi,
I need to be alerted as soon as a file is added to a folder by replacing an existing file there. A further script, already written, will then process that file. My problem is that I can’t find a way to trigger a folder action on replacing an existing file. There is nothing suitable in the folder action library.
Is there any way to monitor the contents of a folder, trigger an alert and run a script to process the updated file? It is a critical file, which must be processed immediately. Note that the file name cannot be hard coded because it varies but, once added to the folder (that is being taken care of with a FA), it can be overwritten a few times. That is when I need the alert.
Please, is there a quick and clever solution out there? I need it badly.
Right now you’re doing an action after something has been added to your folder. Wouldn’t it be a solution to move everything to another folder with a folder action? That way you’re ahead of the ‘add’ process, and once the file has been moved you can call your further script.
I’m not quite sure I understand what you are saying:
I don’t have any problem with processing the files, once I know that they arrived and their names;
Moving all files in that folder to another folder is not practical because there can be between 15 and 100 files with a total size of up to 5 GB;
Using a folder action to move them one by one as they arrive is also not practical because the main app expects to find the data in the original folder. Currently, a folder action does process each newly added file. What it can’t do is process updated files.
What I need is an automatic way to monitor the folder continually and trigger an alert/notification as soon as an existing file has been updated/overwritten (even better if it takes care of newly added files as well). The notification must be immediate and include the full path and file name.
You can try using WatchPaths from launchd. It is triggered when a file is added, edited or removed, or simply put, when something changes in the directory. But you have to figure out yourself what happened. That could be done with a cache file, of course, to keep track between runs.
What I meant with my first solution was that when you work with two folders, like input and output, you have some sort of ‘should’ handler instead of a callback handler, which is basically what folder actions are. That has worked very well for me in the past. Some files you do want to overwrite, while for others you don’t, or you prompt the user.
I agree with Shane, for a number of reasons. One of them being concurrency problems and race conditions.
What you are asking for is not that hard to make work most of the time. Making it work at all times, and being able to guarantee that, takes complex programming. You’d better be able to afford that, or have a look at Hazel, a commercial solution that may solve your problems.
The complexity lies in the fact that changes to the directory may happen while the process is still checking the changes that are already there, or when several users address the directory simultaneously.
You could, however, look at my post in Code Exchange; I have no time to help you at the moment. The scheme in Code Exchange involves make and WatchPaths, and requires that you know the file extensions.
By creating dummy files and setting up a dependency hierarchy, you can have make help you track whether files are added, changed or deleted from the folder. But to make this work you’ll have to seriously muck around with make if you are inexperienced with it. You are better off buying Hazel in the first place, if Hazel can do it.
Another scheme would be to use WatchPaths, have a script create a new directory listing when the agent is triggered, and then use the diff and comm tools to figure out what kind of changes have happened.
This is nowhere near getting an alert the moment a file is about to be changed; what you’ll get is a message after it has changed. To be guaranteed an alert before a file changes, you’d better change ownership and privileges the moment the file is added to the folder, so that other processes have to go through a script, which alerts you, to unlock the file for modification.
Edit
This is intriguing stuff, so I will eventually have a look into it, but that will happen “tomorrow” rather than “yesterday”. In other words, when I find time for it and am otherwise inclined. And then I cannot guarantee that I will have any time to help you with your particular problem, nor that my solution will suit you in any way.
Unfortunately, that’s not an option. The folder has been created by and is part of the hard coded setup of a video processing app.
DJ, thanks. I’d like to try this. It sounds like the answer to my prayer. After each change in the folder, I can save a contents list of file names, sizes and date/time stamps. Then, compare it with the previous list to get the changed file.
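The before/after comparison described above could be sketched in shell roughly like this. It is only a sketch under a few assumptions: the function names snapshot and changed_or_added are made up for illustration, file names contain no tabs or newlines, and the listings are stored outside the watched folder so that writing them does not trigger the watcher again. Because stat takes different format flags on macOS (BSD) and on GNU systems, the sketch probes for one and falls back to the other.

```shell
#!/bin/sh
# snapshot DIR: print "name<TAB>size mtime" for each regular file in DIR, sorted.
# Hidden files such as .DS_Store are skipped because "*" does not match them.
snapshot() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if stat -c %s "$f" >/dev/null 2>&1; then
            info=$(stat -c '%s %Y' "$f")    # GNU stat
        else
            info=$(stat -f '%z %m' "$f")    # BSD stat (macOS)
        fi
        printf '%s\t%s\n' "${f##*/}" "$info"
    done | LC_ALL=C sort
}

# changed_or_added OLD NEW: print the names whose line is new or different in NEW,
# i.e. files that were added or whose size or modification time changed.
changed_or_added() {
    LC_ALL=C comm -13 "$1" "$2" | cut -f1
}
```

On each trigger the handler would write a fresh listing with snapshot, pass the previous and the fresh listing to changed_or_added, and then keep the fresh listing as the new baseline. Deleted files are ignored here on purpose, since only added and updated files matter in this thread.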
My “new” problem is that I know nothing about “watchpaths”. I looked up launchd and launchctl but didn’t find anything.
Could you please help me with the “watchpaths” command and/or script triggered on any change in the folder?
I have the same comments as for DJ above. Could you help with the “watchpaths” part?
Sorry, my use of “immediate” was confusing. I meant “immediate after a change”, not before or during. There is no permissions issue here.
Any help will be welcome, even if just for the sake of knowledge.
Hazel would need to be installed on all the Macs, and that just for the “watchpaths” part. I’d very much prefer to roll my own.
I have made a shell script called Agent for you. Put it into a folder named AgenTest first and read up on launch agents. When you are ready, save the enclosed plist as com.mcusr.video.plist in your ~/Library/LaunchAgents folder. It should run when you cd into that folder and enter launchctl load -wF com.mcusr.video.plist
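[code]<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.mcusr.AgenTest</string>
<key>ProgramArguments</key>
<array>
<string>/Users/mcusr/Desktop/AgenTest/Agent</string>
</array>
<key>QueueDirectories</key>
<array>
<string>/Users/mcusr/Desktop/AgenTest</string>
</array>
<key>WatchPaths</key>
<array>
<string>/Users/mcusr/Desktop/AgenTest</string>
</array>
</dict>
</plist>
[/code]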
You must remember to [b]chmod u+x Agent[/b], so it is executable.
[code]#!/bin/bash
# Agent : keeps track of a single folder without subfolders
# it is expected to fire once at a time.
added=0
changed=0
if ! test -f filelist.old ; then
echo "first time"
for i in * ; do
echo $i
if ! test "$i" = "Agent" ; then
echo "Not AGent"
echo "Added $i" >filelist.old
# YOUR ACTIONS
added=1
break
fi
done
echo "continues ;"
for i in * ; do
echo "pass2: "$i
if ! test "$i" = "Agent" ; then
if ! test "$i" = "filelist.old" ; then
grep -qxF "Added $i" filelist.old
if test $? -ne 0 ; then
echo "Added $i" >>filelist.old
# YOUR ACTIONS
added=1
fi
fi
fi
done
else
for i in * ; do
if ! test "$i" = "Agent" ; then
if ! test "$i" = "filelist.old" ; then
grep -qxF "Added $i" filelist.old
if test $? -ne 0 ; then
echo "Added $i" >>filelist.old
echo "Added $i"
# YOUR ACTIONS
added=1
break
fi
fi
fi
done
fi
if test $added -eq 1 ; then
echo "file was added...exiting"
exit 0
fi
# A file hasn't been added but changed if we come here:
# but which?
# we iterate over the files. the one that is newer than filelist.old is the one
# that has changed.
for i in * ; do
if ! test "$i" = "Agent" ; then
if ! test "$i" = "filelist.old" ; then
if test "$i" -nt "filelist.old" ; then
echo "$i has changed"
# YOUR ACTIONS HERE
changed=1
break
fi
fi
fi
done
if test $changed -eq 1 ; then
echo "file was changed... exiting"
# THE LINE BELOW MUST BE UPDATED WITH THE FULL PATH TO THE FOLDER!!!
touch ~/Desktop/AgenTest/filelist.old
exit 0
fi
# Checking for the last event: was a file deleted?
# Each line of filelist.old reads "Added <name>"; strip the prefix to get the name.
sed 's/^Added //' filelist.old | while IFS= read -r i ; do
if ! test "$i" = "Agent" ; then
if ! test "$i" = "filelist.old" ; then
if ! test -e "$i" ; then
echo "deleted $i"
# YOUR ACTIONS
rm filelist.old
# THE LINE BELOW MUST HAVE THE FULL PATH TO THE FOLDER!!!
~/Desktop/AgenTest/Agent
break
fi
fi
fi
done[/code]
How much do you know about launchd? In short: launchd is a bunch of important Unix processes wrapped into a single process, because those different processes had a lot of overlap in functionality. The gains of launchd are less development time and one syntax for the end user across all the different ‘processes’.
The configuration for launchd is in plist format, and different folders are loaded at certain times. plist files in /Library/LaunchAgents and ~/Library/LaunchAgents, for example, are automatically loaded by launchd at the right time. Launch agents, whether in your home folder or in /Library/LaunchAgents, run as the logged-in user; it is the daemons in /Library/LaunchDaemons that run as root. I would advise always storing your agent in your home folder and giving it a working directory that has the right privileges.
One of the functions of launchd is to create an agent that is triggered when something changes in a file. One of the beauties of Unix is that everything is a file; directories are just special files (in which strings, the file names, are associated with files). So when something changes in this file (read: folder) you can execute a command. So when is WatchPaths triggered when the path is a directory?
when a file is inserted into the directory
when a file is removed from the directory
when a file is renamed
when a file is overwritten (read: the inode is changed)
IMPORTANT: WatchPaths on a directory is not triggered when only the contents of a file inside it are changed in place. Some software, however, writes atomically, and that does trigger WatchPaths because the file gets a new inode.
First we can create a script that just does a display dialog, so we know when it is triggered and whether it’s working. We tell the Finder to show the dialog because we will use osascript to run the script later (read: the current application is osascript).
on run
tell application "Finder" to display dialog "Something has changed in your folder"
end run
I’ve saved this script on my desktop as ‘watchpaths.scpt’; we can store it elsewhere later. Then we create a folder named ‘testfolder’ in the home folder, and the user is named flex for now, so you know what you have to change later.
Now we’re going to create our launchd file. We can see in the manual that WatchPaths needs an array of strings. We also add a label (which we can see with the command ‘launchctl list’) and some program arguments. You can find all this information in the man page of launchd.plist, by the way.
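[code]<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.flex.watchpathsexample</string>
<key>ProgramArguments</key>
<array>
<string>osascript</string>
<string>/Users/flex/Desktop/watchpaths.scpt</string>
</array>
<key>WatchPaths</key>
<array>
<string>/Users/flex/testfolder</string>
</array>
</dict>
</plist>
[/code]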
The ProgramArguments need to be separate strings in order to work. For now, just to see if it works, I stored this file on my desktop and named it 'com.flex.watchpathsexample.plist'. To make it all work, the only thing we need to do is load this file into launchd using launchctl, that is, launchctl load followed by the full path to the plist file.
Now just try to add, rename, overwrite or remove a file in your folder. As you can see, the script is triggered every time, just like a folder action. To unload, you only have to type launchctl unload followed by the full path to the plist file again.
Move the plist file to ~/Library/LaunchAgents/ to load it automatically when you log in. And of course change the file locations and names to your liking.
I added the touching of the filelist.old when a file has been changed.
My solution doesn’t handle renaming directly, but treats it as an addition, and the non-existing item will remain in filelist.old until another file is deleted.
If there aren’t any atomic rewrites, then I can make make take care of it, as an amendment (I guess). But then I think the folder gets touched in the end, so after that the script will be executed every 10 seconds.
On the other hand, I have added both QueueDirectories and WatchPaths. The QueueDirectories key should treat the directory as a file, because the inode, in any incarnation, is really a small file, a file that should be updated whenever some of its contents is, like the modified date. I guess it will also be updated when a file is accessed, but if nothing gets updated, then we have slightly more than a no-op. But there is always a price, and this is the price in this case.
DJ, wow, your tutorial is just perfect. Many, many thanks.
I did have some grounding in the usage of launchd, mainly for timing the run of some scripts. Now I know the usage of watchpaths as well.
Watchpaths is not just a trigger, it is a trigger happy hair trigger. Anything touched in the watched folder pulls the trigger. It is so much more versatile than folder actions, and probably more reliable as well. BTW, I don’t understand why Apple haven’t at least included “on replace” in the FA triggers.
I’d add the following to your list of triggers: change labels, change permissions, add custom icon, any change in a bundle contents, including anything causing the creation of or changes to a .DS_Store hidden file. Even, to my surprise, adding/removing/replacing things to/from/in a subfolder, probably because of the subfolder’s size and date stamp changes.
Because it is so trigger happy, one must be careful with the script handling the trigger and with the make-up of the before and after folder contents lists, to specify exactly which of the possible triggers are relevant for getting the name of the desired file. In my case that’s changes to the file name and/or size and/or date/time stamp. I’ll have fun with that.
Now, a few questions:
I don’t quite understand that. In the first sentence, do you mean in single file binary or in bundles? (changes in a bundle do trigger). The part after the “;” seems (to me :P) to contradict the first part. Could you explain please?
If one has more than one unrelated folder to watch (and I do ;)), should I include all folders in the same launchd plist or make separate plist files for each folder? I tend to think that separate plist files lessen the complexity of the scripts handling the trigger.
Finally, googling watchpaths I found many different launchd plist files with a variety of functions and corresponding keys. Could you point me to a good resource to read about xml keys?
Thank you very much for your take as well
I haven’t had time yet to get my teeth into the bash script but, I can see its potential for combining launchd functions.
I’ll come back as soon as I try it
I’m sorry for the confusion, but in my defense, in my language you would use a semicolon instead of words like ‘thus’ and ‘or’ in some contexts. Anyway, it’s still incorrect.
Well, what I was trying to say there is that most processes write data straight into a file when the file gets updated. It’s a simple process of opening the file, removing its contents, writing the new data to it and closing it. When something bad happens during this process, like an error or a power failure, you will probably lose some data. To avoid this there is atomic writing, which stores everything in a temporary file first and then overwrites the old file by moving the new file onto it in the same folder. More and more applications use this method nowadays, TextEdit.app for example.
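To make the difference concrete, here is a minimal shell sketch of an atomic write. The function name atomic_write is hypothetical, and the sketch assumes the temporary file can be created next to the target, so that the final mv is a rename within one filesystem rather than a copy.

```shell
#!/bin/sh
# atomic_write TARGET: replace TARGET with the data read from standard input.
atomic_write() {
    target=$1
    # Create the temporary file in the same folder as the target
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    # Write the complete new contents to the temporary file first
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    # Moving the finished file over the old one swaps the contents in one step
    mv -f "$tmp" "$target"
}
```

A reader of the target sees either the old contents or the new ones, never a half-written file. Note that mktemp creates the file readable by the owner only, so an application doing this for real would also restore the original permissions.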
So why is that so important? You could, for example, use ‘ls -i’ to see each file name and its inode. This way you know between runs which files have been overwritten by a move and which have not. When a file is overwritten by a move, the file name record in the directory file gets a new inode reference, and the same applies to atomic writing (read: overwrite by moving). You mentioned that you want to be notified, or at least take some action, when a file is overwritten/updated. Therefore, depending on how you’re going to tackle this, I wanted to bring this difference into the spotlight. If you’re not going to store the file’s inode, then I guess there is nothing to worry about at all.
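Keeping track of inodes between runs might look like the sketch below. The function names inode_snapshot and replaced_files are invented for this example, and it assumes file names without tabs or newlines. It only reports files that kept their name but received a new inode, which is the overwrite-by-move case; a file edited in place keeps its inode and is not reported.

```shell
#!/bin/sh
# inode_snapshot DIR: print "inode<TAB>name" for each regular file in DIR.
inode_snapshot() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # ls -i prints the inode number first; keep only that field
        ino=$(ls -i "$f" | awk '{ print $1 }')
        printf '%s\t%s\n' "$ino" "${f##*/}"
    done
}

# replaced_files OLD NEW: names present in both snapshots whose inode differs.
replaced_files() {
    awk -F '\t' 'NR == FNR { ino[$2] = $1; next }
                 ($2 in ino) && ino[$2] != $1 { print $2 }' "$1" "$2"
}
```

The handler would save the output of inode_snapshot somewhere outside the watched folder after every run and feed the previous and the current listing to replaced_files on the next trigger.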
That’s your choice, of course. When the job is already running, launchd will queue the next one, and the next. With different jobs (different plist files), too many jobs running at once could slow down your system. Having a queue isn’t always bad.
I would type this in the terminal:
man -t launchd.plist | open -f -a preview
You’ll see the manual page in Preview, which makes it more readable. The second thing I’d advise is opening some plist files in your system folder; there are no better examples than the working ones on your system. To avoid typos you could use a property list editor, or even download a plist editor for launchd like Lingon (not sure if it supports WatchPaths).
Now this should worry me :(, how could I spend years reading man pages in the terminal and not thinking about using preview or another reader :rolleyes: my bad . sound of slaps
Here is an alternative for converting man pages to html. I use Safari to look at html instead of pdf.
Prerequisites
man -aw : finds path to the specified man pages:
Example
man -aw launchd.plist
→ launchd.plist.5.gz
The launchd.plist man page is a file with a .gz extension, not one ending in just a section number like 1, 2, 3 and so on, so we have to open it with gzcat and pipe the output into groff.
If we didn’t have to do that, the command line would have looked like this:
groff -man /usr/share/man//man5/launchd.plist.5 -Thtml >|~/Desktop/tmp.html && open -a "Safari" ~/Desktop/tmp.html
But we have to use this:
gzcat /usr/share/man//man5/launchd.plist.5.gz |groff -man -Thtml >|~/Desktop/tmp.html && open -a "Safari" ~/Desktop/tmp.html
The -man parameter tells groff the input/file should be parsed with the man macro package, the -Thtml tells it that it should generate html as output.
Edit
I think it is worth mentioning in passing that you shouldn’t regard the man page as the primary source of information when it comes to tools from the GNU platform. According to the GNU coding standards, man pages need not be updated or even accurate. So, when you see a reference to an “info file”, you should really use that info file as the source of information for the command. At least on Snow Leopard, not all info files ship with the OS; you can then find them over at gnu.org.
(This doesn’t relate to launchd.plist, or tools Apple develops I guess, but it is worth noting, that there is not a word on the sleep.1 man page for instance, that you can indeed specify a decimal value for specifying a fraction of a second.)
Before going to GNU I would take a look at freebsd.org first. The FreeBSD manuals match Mac OS X; to this day Mac OS X is built on FreeBSD 5, not on GNU/Linux.
As this is (still) an AppleScript board, I’m reading man pages online at apple.com using this script
property baseURL : "https://developer.apple.com/library/mac/#documentation/darwin/reference/manpages/man"
display dialog "Enter page name (optional with section in parentheses)" default answer "cp" buttons {"Cancel", "OK"} default button "OK"
set manCommand to text returned of result
set leftParenthesisOffset to offset of "(" in manCommand
if leftParenthesisOffset = 0 then
set section to word 2 of (do shell script "man " & manCommand)
else
set rightParenthesisOffset to offset of ")" in manCommand
set section to text (leftParenthesisOffset + 1) thru (rightParenthesisOffset - 1) of manCommand
set manCommand to text 1 thru (leftParenthesisOffset - 1) of manCommand
end if
open location baseURL & section & "/" & manCommand & "." & section & ".html"
I like to use bwana at times. And I also have a script that feeds the man pages from Developer.doc.
It also happens that I use Preview, it all comes down to which window configuration I am happy with at the moment.
It might be that I have installed utilities that have overwritten the originals, through MacPorts or something else, but when stuff refers to info, then it is originally from GNU, or is it? Maybe info has crossed over to FreeBSD?
As for FreeBSD 5: FreeBSD, along with Xinu, NetBSD, Mach and NextStep, is correct, but the version would be 4.3, wouldn’t it? Or is that just for the kernel (originally)? Do the tools descend from FreeBSD 5?
By the way, how do you make open open header files with something other than Xcode? And Apple.OpenSource has actually done a good job improving the codebase of some of the FreeBSD libraries. Just mentioning it, since we are all so quick to complain.
A good example of things that have originated from GNU that ships with Mac Os X, is the C compiler (gcc not clang).
Actually FreeBSD is much better in reliability and performance. Sony, Yahoo and Hotmail run their servers on FreeBSD because of that. FreeBSD did copy new features from Darwin, but they did it better and not simply copy paste.
Hi Stefan,
I do like the convenience of the script, as well as the Apple man pages. Their layout is great, with easy navigation via links. The only thing I’m puzzled about, is that they still haven’t updated from 10.7.4. Why? Does it matter?
I actually thought I’d make my stint for turning a random manpage into html into an applescript but it doesn’t seem necessary to do so.
[code]#!/bin/bash
argc=$#
if [ $argc -lt 1 ] ; then
echo "gman takes a man page, if found, and formats it into html."
echo "Usage: gman [-s1…n|-s 1…n] manfile"
exit 2
fi
# Path of the first matching man page; empty if none was found.
a=`man -aw "$@" 2>/dev/null | head -1`
if test "x$a" = x ; then
if [ $argc -eq 1 ] ; then
echo "gman: Can't find $1"
exit 1
elif [ $argc -eq 2 ] ; then
echo "gman: Can't find $2 in section ${1#-s}, try another section."
exit 1
elif [ $argc -eq 3 ] ; then
echo "gman: Can't find $3 in section $2, try another section."
exit 1
else
echo "gman: \"$*\" didn't work for me."
exit 1
fi
fi
# Figures out if it is a normal man page or something else (gz).
b=`echo "$a" | grep "gz"`
t=`mktemp -t gman.1.XXXXXXXXXX`
if test "x$b" = x ; then
groff -mandoc "$a" -Thtml >|"$t.html" 2>/dev/null
else
gzcat -f "$b" | groff -mandoc -Thtml >|"$t.html" 2>/dev/null
fi
echo | qlmanage -px "$t.html" >/dev/null 2>&1 &[/code]
Put it somewhere in your path, make it executable and try, for instance, gman launchd.plist in a terminal window.