Here is a Spotlight CHM (HTML Help, Compressed HTML) Metadata Importer that I wrote. It is written in Python using chmlib and PyCHM. It takes each HTML page in the CHM file, converts it to plain text, and applies that text to the kMDItemTextContent Spotlight attribute of the file.
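To give a rough idea of the conversion step, here is a minimal sketch in modern Python. It is not the importer's actual code: the names `TextExtractor`, `html_to_text`, and `pages_to_text_content` are made up for illustration, and it assumes the HTML pages have already been pulled out of the CHM file as strings (the importer uses PyCHM for that part).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the plain text of one HTML page (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def pages_to_text_content(pages):
    """Concatenate the text of every page, one page per line.

    The result is the kind of string that would be stored in
    kMDItemTextContent; writing the attribute itself is not shown here.
    """
    return "\n".join(html_to_text(page) for page in pages)
```

Joining chunks with a space is a simplification: text split by inline tags in the middle of a word would gain a stray space, which is usually harmless for search indexing.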
Install Instructions:
]$ wget http://www.mattweber.org/files/chm-metadata-importer.tar.bz2
]$ bzip2 -d chm-metadata-importer.tar.bz2
]$ tar -xvf chm-metadata-importer.tar
]$ sudo mv "CHM Metadata Importer.mdimporter" /Library/Spotlight/
This plugin has only been tested on Intel Macs. Please make sure you have Python 2.5 installed before using this plugin.
Note:
I have been informed of a problem with the importer. It seems that CHMLib was not packaged correctly, so the importer fails on some systems. I have not had time to figure out how to correctly package a library with Xcode yet, but I have found a quick fix.
- Download CHMLib from http://www.jedrea.com/chmlib/
- Extract, and change into the chmlib directory.
- Run ./configure; make; sudo make install
Now the importer should run correctly. You can test it by typing:
mdimport -nfd4 file-to-import.chm
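If the importer still fails after these steps, one thing worth checking is whether the CHMLib shared library is visible to the dynamic loader. The snippet below is only a sketch: the helper name `chmlib_status` is made up, and it assumes the library is installed under the name `chm` (that is, `libchm`), which I believe is what `make install` produces.

```python
from ctypes.util import find_library


def chmlib_status(name="chm"):
    """Report whether lib<name> can be located by the dynamic loader."""
    path = find_library(name)
    if path is None:
        return "lib%s not found by the dynamic loader" % name
    return "lib%s found: %s" % (name, path)


if __name__ == "__main__":
    print(chmlib_status())
```

Run it with the same Python that the importer uses; a "not found" result would suggest the manual install did not land in a standard library path.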
Files:
http://www.mattweber.org/files/chm-metadata-importer.tar.bz2
http://www.mattweber.org/files/chm-metadata-importer-source.tar.bz2
Hi,
I tried your plugin under OSX 10.4.9, Mac Mini Intel.
I have many many CHM format ebooks.
The mdimporter reports bizarre errors whenever it attempts to index a document using this plugin.
What are the errors you are seeing? Please make sure you have Python 2.5 installed also. Thanks.
http://www.python.org/download/
hi Matt,
I got back to your blog today. No, I had not upgraded Python. I did so rather warily today.
I reset Spotlight, but it's still generating an error each time it hits a CHM file, like so:
May 28 04:35:37 gabrielas-computer mdimportserver[426]: Python Metadata Importer: Could not process file ‘/Applications/Adobe Dreamweaver CS3/configuration/Shared/XSLTransform/Help/XMLXSL.chm’ (Exception: CPythonObject passed NULL.)
some more logs
May 28 08:42:59 gabrielas-computer mdimportserver[721]: Python Metadata Importer: Could not process file ‘/Volumes/Data/eBooks/Computer Books/O’Reilly Pack/O’Reilly Pack Folder/O’Reilly – Zero Configuration Networking The Definitive Guide (Dec 2005).chm’ (Exception: Could not get function.)
May 28 08:42:59 gabrielas-computer mdimportserver[721]: *** Assertion failure in -[CPythonSession executeFunction:inScript:parameters:], /Users/matt/Projects/python/CHM Metadata Importer/source/ToxicPython/Source/CPythonSession.m:120
May 28 08:42:59 gabrielas-computer mdimportserver[721]: Python Metadata Importer: Could not process file ‘/Volumes/Data/eBooks/Computer Books/OReilly – Zero Configuration Networking The Definitive Guide – 2005 Dec/OReilly – Zero Configuration Networking The Definitive Guide – 2005 Dec.chm’ (Exception: Could not get function.)
Found the problem. I did not package CHMLib correctly with the .mdimporter file. I do not know how to correctly package a library with Xcode yet, so until I have a chance to figure it out you can install it manually.
1. Download CHMLib from http://www.jedrea.com/chmlib/
2. Extract, and change into the chmlib directory.
3. Run ./configure; make; sudo make install
Now the importer should run correctly. To check that it is working, type:
mdimport -nfd4 file-to-import.chm
I will update this post once I get a chance to figure out how to package a library with Xcode.
Hi Matt, following your new, improved instructions I get a specific problem. Any ideas?
2007-07-28 17:57:13.065 mdimport[12927] Import ‘/Users/scooby/Downloads/Books/Computer/O’Reilly – Google Hacks.chm’ type ‘org.mattweber.compiled-html’ using ‘file://localhost/Library/Spotlight/CHM%20Metadata%20Importer.mdimporter/’
/Library/Spotlight/CHM Metadata Importer.mdimporter/Contents/SharedSupport/chm/chmlib.py:4: RuntimeWarning: Python C API version mismatch for module _chmlib: This Python has API version 1012, module _chmlib has version 1013.
import _chmlib
/Library/Spotlight/CHM Metadata Importer.mdimporter/Contents/SharedSupport/chm/chm.py:34: RuntimeWarning: Python C API version mismatch for module extra: This Python has API version 1012, module extra has version 1013.
import extra
CHM Metadata Importer could not process file “/Users/scooby/Downloads/Books/Computer/O’Reilly – Google Hacks.chm”
This is with Python 2.5.1 and chmlib 0.39. I also tried 0.38 to see if that fixed it, but no!! ;-(
Python 2.5
Here is my problem:
Something about Python has API version 1012, module _chmlib has version 1013.
$ mdimport -nf file-to-import.chm sendmail Cookbook (2003).chm
/Library/Spotlight/CHM Metadata Importer.mdimporter/Contents/SharedSupport/chm/chmlib.py:4: RuntimeWarning: Python C API version mismatch for module _chmlib: This Python has API version 1012, module _chmlib has version 1013.
import _chmlib
/Library/Spotlight/CHM Metadata Importer.mdimporter/Contents/SharedSupport/chm/chm.py:34: RuntimeWarning: Python C API version mismatch for module extra: This Python has API version 1012, module extra has version 1013.
import extra
CHM Metadata Importer could not process file “/Users/noel/Desktop/books/Over 1100 General Computer Ebooks/sendmail Cookbook (2003).chm”
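For anyone comparing logs, the warning says that the interpreter which loaded the module reports an older C API version (1012) than the one the extension was built against (1013). That would be consistent with the importer being run by a different, older Python than the one `_chmlib` was compiled for, though that is only a guess from the message. Here is a small sketch that pulls the numbers out of such a line; the function names `parse_api_mismatch` and `describe` are made up for illustration.

```python
import re

# Matches the RuntimeWarning text shown in the logs above.
PATTERN = re.compile(
    r"Python C API version mismatch for module (?P<module>\S+): "
    r"This Python has API version (?P<interp>\d+), "
    r"module (?P=module) has version (?P<built>\d+)"
)


def parse_api_mismatch(line):
    """Return (module, interpreter_api, module_api), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return (
        match.group("module"),
        int(match.group("interp")),
        int(match.group("built")),
    )


def describe(line):
    """Summarise which side of the mismatch is older."""
    parsed = parse_api_mismatch(line)
    if parsed is None:
        return "no API version mismatch on this line"
    module, interp, built = parsed
    if interp < built:
        relation = "older than"
    elif interp > built:
        relation = "newer than"
    else:
        relation = "the same as"
    return "%s: interpreter C API %d is %s the module's %d" % (
        module, interp, relation, built)
```

Feeding it the `_chmlib` warning line gives `_chmlib: interpreter C API 1012 is older than the module's 1013`.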
Thanks for the heads up, guys. I will look into this problem as soon as I get a chance. I have just started a new job, so I have been extremely busy. Have you tried running this with a Python version prior to 2.5.1?
Thanks again and sorry for the problems,
Matt Weber
Yup, I also tried with Python version 2.4. Same error.
Maybe I’ll look at the source. If I find a solution, I’ll let you know.
Noel, that’s the same problem at this end. Any ideas appreciated, guys! I seem to be collecting quite a few of these pesky CHM books.
3 Tools for fixing all of your CHM EBook issues with OSX Tiger or Leopard.
CHMporter
This is a Spotlight plug-in which indexes the contents of your CHM files.
CHMox
A CHM file viewer.
quickCHM
A CHM Quick Look plug-in for viewing those EBooks with Cover View.