elfs: (Default)
[personal profile] elfs

It drives me nuts that we in the Django community rely on Solr or Haystack to provide us with full-text search when MySQL provides a perfectly functional full-text search feature, at least at the table level and for modest projects. I understand that not every app runs on MySQL, but mine do, and I’m sure many of you are running exactly that, and could use this technique without modification.

Well, after much digging, I found an article on MercuryTide’s website covering custom QuerySets with FULLTEXT and relevance, and built this library around it.

I used this rather than Django’s internal filter keyword search, because this technique adds an additional aggregated value, the relevance of the search terms to the search. This is useful in sorting the search, something not automatically provided by the QuerySet.filter() mechanism.

You must create the indexes against which the search will be conducted. For performance reasons, if you’re importing a massive collection of data, it’s better to import all of the data and then create the index. More importantly, when you declare that a SearchManager to be used by a Model, you declare it thusly:

class Book(models.Model):
    ...
    objects = SearchManager()

When you do, you must add an index that corresponds to that list of fields:

CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)

Notice how the contents of the index correspond with the contents of the Search Manager.  Or you can automate the process with South:

    def forwards(self, orm):
        db.execute('CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)')

    def backwards(self, orm):
        db.execute('DROP INDEX book_text_index on books_book')

To use the library is fairly trivial. If there is only one index (which can encompass several columns) for any table, you call

books = Book.objects.search('The Metamorphosis').order_by('-relevance')

If there’s more than one index, you specify the index by the list of fields:

books = Book.objects.search('The Metamorphosis', ('title', 'summary')).order_by('-relevance')

Note that that’s a tuple, and must be.

If you specify fields that are not part of a FULLTEXT index, the error message will include lists of viable indices.   It will also tell you if there are no indices.  (Getting that to work was tricky, as it involved database introspection and the decoration of methods, so I’m especially proud of it.)

The library is fully available on my github account: django_mysqlfulltextsearch

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com

Date: 2010-10-23 12:06 am (UTC)
From: [identity profile] wolfwings.livejournal.com
Sadly, most stuff just ignores MySQL's FULLTEXT for a different reason: It only works on MyISAM tables. :-/ InnoDB usage blocks it, and more and more stuff is running/focussing on InnoDB from what I've been seeing since it allows for row-level locking instead of shutting down an entire table to lock it to update it.

This same stuff could be applied to a Sphinx install pretty easilly though, since Sphinx runs a subset of the MySQL command-set for communication with front-end apps.

Date: 2010-10-25 02:54 pm (UTC)
From: [identity profile] en-ki.livejournal.com
Yeah, I like the part where I can have full-text search XOR referential integrity. "Herp a derp", as the kids say.

Profile

elfs: (Default)
Elf Sternberg

December 2025

S M T W T F S
 12345 6
78910111213
14151617181920
21222324252627
28293031   

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Dec. 30th, 2025 08:07 pm
Powered by Dreamwidth Studios