Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files. Swish-e is written in C but has APIs for Perl, PHP and other languages.

This page is about a Python API. Note that this is not the only such API; others can be found on the Python Package Index website.

Please note: this package is not very well tested and may be out of date (it was originally written for Swish-e 2.4). No support is provided.

What's new

An example

Assuming you have a Swish-e index file called 't/swish.idx' and would like to search it for occurrences of 'madrid', the following session would do:

$ python
 Python 2.2.3 (#1, Jul 15 2003, 15:44:20) 
 [GCC 2.95.3 20010125 (prerelease, propolice)] on openbsd3
 Type "help", "copyright", "credits" or "license" for more information.
 >>> # load the module
 >>> import SwishE
 >>> # get a SWISH-E handle on 't/swish.idx'
 >>> handle = SwishE.new('t/swish.idx')
 >>> # get a search object
 >>> search = handle.search('')
 >>> # search for 'madrid'
 >>> results = search.execute('madrid')
 >>> # tell the world how many results we have
 >>> print results.hits()
 >>> # iterate on the results
 >>> for r in results:
 ...    print r.getproperty('swishtitle')
 ... 
 Argentina Centro de Medios Independientes
 Indymedia Barcelona: home
 San Francisco Bay Area Independent Media Center
 Independent Media Center -
 >>> # now looking for 'lluita', we want to sort by title
 >>> search.setSort('swishtitle')
 >>> again = search.execute('lluita')
 >>> for r in again:
 ...    print r.getproperty('swishdocpath')
 ... 
 1.html
 >>> # figure out that sorting isn't of much use with a single match
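
The session above could be wrapped in a small helper. The following is only a sketch and not part of the module: the function name search_property and its parameters are hypothetical, and it relies solely on the calls shown in the session (search, setSort, execute, hits, getproperty). It takes a handle that is already open, so the sketch itself does not need to import SwishE.

```python
def search_property(handle, query, prop='swishtitle', sort_by=None):
    """Run one query and return (hit count, list of property values).

    `handle` is assumed to behave like the object returned by
    SwishE.new(...) in the session above. This helper is hypothetical
    and not part of the SwishE module.
    """
    # Get a search object, as the session does with handle.search('').
    search = handle.search('')
    # Set the sort order before executing, mirroring the session.
    if sort_by is not None:
        search.setSort(sort_by)
    results = search.execute(query)
    # Record the hit count, then collect one property per result.
    hits = results.hits()
    values = [r.getproperty(prop) for r in results]
    return hits, values
```

With the module installed, a call such as search_property(SwishE.new('t/swish.idx'), 'lluita', prop='swishdocpath', sort_by='swishtitle') would be expected to mirror the second half of the session.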

Jean-François Piéronne added a "query" method in version 0.5. With it, a search string can be passed directly to the SwishE.Handle object. For example:

>>> for r in SwishE.new('index.swish-e').query('tags'):
...    print r.getproperty('swishtitle')
On SGML and HTML
HTML 4 Changes
Tables in HTML documents
HTML 4 Specification References
Conformance: requirements and recommendations
Performance, Implementation, and Design Notes
>>>
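
The "query" shortcut also lends itself to running several searches against one handle. The sketch below illustrates that idea under stated assumptions: the name query_property and its parameters are hypothetical, and the only calls it makes on the handle and its results are query and getproperty, as in the example above.

```python
def query_property(handle, terms, prop='swishtitle'):
    """Map each search string in `terms` to a list of property values.

    `handle` is assumed to behave like the object returned by
    SwishE.new(...); this helper is hypothetical and not part of
    the SwishE module.
    """
    found = {}
    for term in terms:
        # handle.query(term) is iterated directly, as in the example above.
        found[term] = [r.getproperty(prop) for r in handle.query(term)]
    return found
```

With the module installed, query_property(SwishE.new('index.swish-e'), ['tags']) would be expected to return a dictionary whose 'tags' entry holds the titles printed above.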

Acknowledgments

Gianluigi Tiesi and Jean-François Piéronne contributed patches to this module. BerliOS hosts this web page and other files for this project. Thanks!

Contact details

<jbrobertson at users berlios de>, or https://developer.berlios.de/users/jbrobertson/.