mdfind catalog

chris2:

Location exclusion is not easily achieved, since file paths are not part of the metadata of the file (no pedantic rants on this, please :D). There are ways, but they are a bit too ugly to foist on you. :P I would suggest using the -onlyin argument to restrict your search scope (which you should be in the habit of doing; 9 times out of 10, searching everywhere on a machine is a sign of a poorly thought out search. Just sayin’).

Excluding a fileType can be done fairly easily with !=, like kMDItemContentType!=com.apple.mail.email.

Cheers,
Jim Neumann
BLUEFROG

Hi,
Thanks for the reply.
I was using “-interpret” to avoid learning about mdfind. But it looks like there is now no other option and I have to go through this:
http://developer.apple.com/mac/library/documentation/Carbon/Conceptual/SpotlightQuery/SpotlightQuery.html

I want to search for files containing “roll” but exclude files with extensions “.h” & “.js”

set theresult to (do shell script "mdfind   'kMDItemTextContent == \"roll\"  && ((kMDItemFSName != \"*.js\" && kMDItemFSName != \"*.h\") '")

why is this not working?
I don’t get any errors either.

In terminal, this command shows search results

and gives the error:

Also, regarding my original query (about excluding mdfind from certain locations):
Since, mdfind can’t do it, is using grep -v “” the best alternative?

someone please help!!!

Hardly ever has a query of mine gone unanswered in this forum, so I would not mind if one did. But not this one. This is a basic Spotlight query, and my first step in Spotlight scripting. Guide me.

mdfind kMDItemTextContent == "roll" && (kMDItemFSName != "*.js") && (kMDItemFSName != "*.h")

and

mdfind kMDItemTextContent == "roll" && ((kMDItemFSName != "*.js") && (kMDItemFSName != "*.h"))

give me the same error:

From the mdfind manpage on my machine (10.4), it looks like the query needs to be a single argument. Try

mdfind 'kMDItemTextContent == "roll" && (kMDItemFSName != "*.js") && (kMDItemFSName != "*.h")'

The single quotes get the shell to deliver the whole thing as a single argument to mdfind. Also, the parentheses do not seem to be necessary.

The error “-bash: kMDItemFSName: command not found” means that the shell is seeing at least one of the occurrences of “kMDItemFSName” in a place where it is expecting a program name.

None of the ampersands or parentheses are intended for the shell, so they need to be “quoted” (the suggestion above is one way to quote them). When they are not quoted, the shell gives special meaning to some parts of that command string (ampersands, parentheses, double quotes, etc.). && is a conditional command separator in most shell languages (A && B executes B only if A ended with a zero exit code). So (kMDItemFSName != "*.js") && (kMDItemFSName != "*.h") is parsed as a normal shell command list to be executed only if mdfind kMDItemTextContent == "roll" had an exit code of zero.

The parentheses mean sub-shell execution, thus kMDItemFSName != "*.js" is parsed as a shell command. So the shell looks for a program named kMDItemFSName, to which it would give two arguments, != and *.js (the double quotes are used up by the shell’s parsing). Since no program named kMDItemFSName exists, the shell gives the error “kMDItemFSName: command not found”.
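To see the single-argument point without involving Spotlight at all, here is a minimal sketch. count_args is a hypothetical helper invented for this illustration (it is not part of mdfind or of the shell); it only prints how many arguments the shell handed to it.

```shell
# count_args is a made-up helper for illustration: it prints the number of
# arguments the shell passed to it.
count_args() {
    echo "$#"
}

# Single-quoted: the shell passes the whole query through untouched, as one
# argument, which is the form mdfind wants.
count_args 'kMDItemTextContent == "roll" && (kMDItemFSName != "*.js")'   # prints 1

# Unquoted: the shell splits on whitespace and strips the double quotes, so
# the same text arrives as three separate arguments.
count_args kMDItemTextContent == "roll"                                  # prints 3
```

Swapping count_args for mdfind in the first call gives the quoted form suggested above. The && and the parentheses are left out of the unquoted call on purpose, since the shell would act on them instead of passing them along.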

Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari 4.0.3 (4531.9)
Operating System: Mac OS X (10.4)

Chrys;

None of that good advice is different for Leopard. The most important point is that the search parameters must be in single quotes so they will be handed to mdfind in their entirety.

First of all, when I do

in the results, I can see plenty of files having the extension “.h” and “.js”.

I tried the above command as it is. This command gives me no error. But no results either. I also tried changing the search query from “roll” to “r”. Made no difference.

This command gives me no error. But no results either.

correctly returns all files having the extension “.h” or “.js”

To chrys:
A couple of days back, I gave this to a Tiger user through email and he said it worked for him. It does not work for me. It gives me no results or error. Can you please try it?

do shell script "mdfind  " & " \"kMDItemTextContent == 'roll*' && kMDItemFSName != '*\\.c' && kMDItemFSName != '*\\.html'\""

All of the shell commands and the AppleScript code in your last post produced results on my machine, except for mdfind 'kMDItemTextContent == "roll" && (kMDItemFSName == "*.js")'; I do not seem to have any files that match that.

So there is a difference between how mdfind works in Leopard and Tiger, right?

here is the man page in Leopard. http://dl.getdropbox.com/u/872430/mdfind%20man%20page
I don’t understand much of shell.

That Leopard manpage for mdfind only has a few differences from the one on my Tiger machine. The -count option is not present in Tiger’s SYNOPSIS section. The -count, -literal, and -interpret options are not in the Tiger manpage DESCRIPTION section. The Tiger manpage does not have the note at the end of the EXAMPLES section that mentions using mdimport -X to list the available attributes. The Leopard manpage removed a reference to the mdcheckschema manpage in the SEE ALSO section.

So neither manpage really fully explains the syntax of the query language that mdfind expects. It seems that much of the Spotlight query syntax is described in the Query Expression Syntax section of Spotlight Query Programming Guide. But even that is missing some major chunks of information (like the optional use of quote marks, and whether both single and double quote marks are interchangeable). Comparing the Spotlight APIs available in C (Core Foundation) and Objective C (Cocoa), I found that the Comparison of NSPredicate and Spotlight Query Strings section of Predicates Programming Guide fills in some more information, but still does not offer complete coverage of the syntax (at least nothing for the main Spotlight query syntax like BNF Definition of Cocoa Predicates offers for the Predicate syntax).

I am at a loss for the problems you seem to be encountering. You might try rebuilding your Spotlight metadata stores (sudo mdutil -E /), but that seems like a “long shot”.

Try making sure you get all the parentheses matched up and all the quotes properly paired. For example, in post #4 of this thread, you have too many opening parentheses. On my machine, that problem yields an error from mdfind that says “Failed to create query for .”, but you said it gave no error on your machine. You could try using single quotes instead of double quotes inside the query string itself (note that this also changes how you have to quote it to the shell and to AppleScript).

Thanks chrys! A thorough explanation is a hallmark of your posts in this forum.

I am ready to do this once I find some script examples that work on Leopard but do not work for me. Till then, I will read my bash manual to figure out how to use parentheses and quotes.

I think now I am beginning to understand it.
The problem is that “!=” does not work properly (if it works at all). (BLUEFROG :( :( )

This script gives me results which include files with extensions “emlx”, “html”, “htm”, “js”, “webarchive”, etc

set thequery to "test"
set thecmd to "mdfind 'kMDItemTextContent == " & quote & thequery & quote & "c'"
--& " && (kMDItemFSName == " & quote & "*.js" & quote & ")'"
do shell script thecmd

This script gives me results which are files having ".js" extensions only.

set thequery to "test"
set thecmd to "mdfind 'kMDItemTextContent == " & quote & thequery & quote & "c" & " && (kMDItemFSName == " & quote & "*.js" & quote & ")'"
do shell script thecmd

However, when I merely change “==” to “!=” in the script below it returns nothing (no results and no errors):

set thequery to "test"
set thecmd to "mdfind 'kMDItemTextContent == " & quote & thequery & quote & "c" & " && (kMDItemFSName != " & quote & "*.js" & quote & ")'"
do shell script thecmd

I ran the command in Terminal with suitable changes and the result was the same as that produced by applescript.

Ok, I think I have finally got hold of mdfind.
This works.

set thequery to "test"
set thecmd to "mdfind 'kMDItemDisplayName == " & quote & thequery & quote & "wc" & " && (kMDItemContentType != " & quote & "com.netscape.javascript-source" & quote & "c" & ")'"
do shell script thecmd

kMDItemContentType has to be used instead of kMDItemFSName, but I don’t know why. I had originally got kMDItemFSName from the info of a saved search that finds files with the extension “.js”: http://dl.getdropbox.com/u/872430/Picture.png
Now, I used mdls.

chrys, I hope (and wish) that you see this post. As you may have seen in my last two posts in this thread, kMDItemContentType has to be used instead of kMDItemFSName if I want to exclude files with a particular extension. I wish some Leopard user could confirm this.

I just found out that the kMDItemContentType of one of my PDF files was “com.adobe.pdf”, and I realised that something is wrong. What has “adobe” to do with every PDF? So I googled and read this:

in
http://developer.apple.com/macosx/spotlight.html

So I can’t understand what to do? How to exclude files with a particular extension?

UTIs (the values in the kMDItemContentType attribute) denote file formats. Often when a file format has a well known inventor, the UTI for that file format will include the inventor’s name. Adobe invented the PDF file format, so the UTI for PDF files is com.adobe.pdf.

This is not much different from the UTI you were using for JavaScript files. Netscape invented JavaScript, so the UTI for JavaScript source code is com.netscape.javascript-source.

If you want to filter on extensions, then the best attribute to test would seem to be kMDItemFSName. The UTI values of kMDItemContentType are currently derived from a file’s extension, but not always in a one-to-one manner (.cxx, .cpp, and .cc all map to public.c-plus-plus-source (though .C maps to public.c-source); also in the opposite direction, .m is used for Objective C and Matlab files).

So far, your filter construction has been has_some_content && extension_IS_NOT_A && extension_IS_NOT_B. If you really think that the != operator is causing a problem, then you could switch to the logically equivalent has_some_content && ! (extension_IS_A || extension_IS_B). That gets rid of the != by replacing them with ||, parentheses, and !.
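As a rough sketch of that rewrite, the two forms can be assembled from the same pieces in the shell. The variable names are made up for this example, and the ! prefix in the second query is taken from the suggestion above; I have not checked how mdfind treats it on every system, so treat that part as an assumption.

```shell
# Pieces of the filter; the variable names are invented for this sketch.
content='kMDItemTextContent == "roll"'
not_js='kMDItemFSName != "*.js"'
not_h='kMDItemFSName != "*.h"'
is_js='kMDItemFSName == "*.js"'
is_h='kMDItemFSName == "*.h"'

# Form 1: has_some_content && extension_IS_NOT_A && extension_IS_NOT_B
query_neq=$(printf '%s && %s && %s' "$content" "$not_js" "$not_h")

# Form 2: has_some_content && ! (extension_IS_A || extension_IS_B)
# Assumption: the query language accepts a leading ! as negation.
query_not=$(printf '%s && !(%s || %s)' "$content" "$is_js" "$is_h")

printf '%s\n' "$query_neq" "$query_not"

# On a Mac, each one would be passed to mdfind as a single argument:
#   mdfind "$query_neq"
#   mdfind "$query_not"
```

printf with a single-quoted format keeps the ! away from interactive bash history expansion, and the double quotes around the variable keep the whole query as one argument.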

But really, if != is not working, something fundamental has gone wrong. Are you running the latest system update? Have you tried deleting and rebuilding your Spotlight metadata stores (as suggested earlier in the thread)? Have you tried a fresh OS installation? Have you tried a different computer? You should try each of these on a bootable backup of your system before trying it on your main installation.

Many thanks again, chrys

I will test this.

I am ready to do all of this once I find one Leopard user for whom this code works:

set thequery to "test"
set thecmd to "mdfind 'kMDItemTextContent == " & quote & thequery & quote & "c" & " && (kMDItemFSName != " & quote & "*.js" & quote & ")'"
do shell script thecmd

Also, what is your opinion on using “grep -v” to exclude search locations after mdfind gives the results? Do you recommend something else? (I know about “-onlyin”)

Using grep -v to exclude certain locations should work fine. The only caveat is that grep works on lines and, technically, pathnames are not restricted to a single line. Both the linefeed and carriage return characters are valid in pathname components. But such usage is fairly rare, and there is currently no good way to overcome this problem with mdfind (find has a -print0 action for this situation).

If you are going to be using grep, you could use it to do the extension filtering as well:
mdfind roll | egrep -v -e '\.(c|h|html|emlx)$' -e '^/(Volumes/(foo|bar)|Users/fred)/'
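The filter can be tried without Spotlight by feeding it a few made-up paths. This is only a sketch: filter_paths is a hypothetical wrapper name, the sample paths are invented, and the dot in the first pattern is escaped so that it matches a literal dot before the extension.

```shell
# filter_paths is a made-up wrapper around the grep filter: it drops paths
# ending in .c, .h, .html or .emlx, and anything under /Volumes/foo,
# /Volumes/bar or /Users/fred. grep -E is the same as egrep.
filter_paths() {
    grep -E -v -e '\.(c|h|html|emlx)$' -e '^/(Volumes/(foo|bar)|Users/fred)/'
}

# Invented sample paths standing in for mdfind output, one per line.
printf '%s\n' \
    '/Users/alice/notes/roll.txt' \
    '/Users/alice/src/roll.h' \
    '/Users/fred/roll.txt' \
    '/Volumes/foo/roll.txt' \
    '/Users/alice/arch/roll.c.txt' | filter_paths
# Only /Users/alice/notes/roll.txt and /Users/alice/arch/roll.c.txt remain.
```

With real results it would be used as mdfind roll | filter_paths. The $ anchor is why roll.c.txt survives: only the final extension is tested.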