Using DotLucene as DotNetNuke Search Data Store Provider

Having used DotNetNuke (DNN) since version 2, I was really happy to find that version 3 includes search features. I think the DNN team has done a fantastic job in providing such a great framework and APIs.

Yesterday I started playing around with DotNetNuke's SearchDataStoreProvider and ModuleIndexer. I am trying to provide my own implementation using DotLucene, an open source search engine written in C#. It is a port of the famous Jakarta Lucene to .NET, maintained by George Aroush.

Why, you might ask? Because I think DNN search can be made better. Recently I've seen many errors in our DNN sites' event logs caused by the DNN Search Engine Scheduler. The default DNN SearchDataStoreProvider and ModuleIndexer are very database heavy, which forced us to turn off the DNN Search Engine Scheduler. So I would like to create a SearchDataStoreProvider that stores the indexes in the file system instead of the database.

My main goal is to reduce database operations during searching and indexing. For the indexing part there is not much I can do, since most DNN content is database driven. But for searching, I would like to store everything a search needs in the indexes on the file system.

What better way to do that than to use Lucene, a high-performance, full-featured open source search engine library originally developed by Doug Cutting? For those interested in where the name comes from: Lucene is Doug Cutting's wife's middle name.

My early prototype is working so well that it got me really excited. I got wildcard and fuzzy search capabilities as well. As an added bonus, the indexes are compatible with the Java version of Lucene too.
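To make the idea concrete, here is a rough sketch of what "everything a search needs lives in the index file" means. It is plain Python using only the standard library, not DotLucene and not the DNN provider API. The names `build_index` and `search`, the JSON file layout, and the query syntax (`*`, `?`, and a trailing `~`) are all my own assumptions for illustration. Lucene's real index format and its wildcard and fuzzy matching work differently; `fnmatch` and `difflib` are just stand-ins here.

```python
import json
import re
from difflib import get_close_matches
from fnmatch import fnmatchcase
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, index_path):
    """Write an inverted index plus stored display fields to one JSON file.

    `documents` maps a document id to a dict with 'title', 'url' and 'body'.
    Title and URL are stored next to the postings so that a search never
    has to go back to the original data source (the database, in my case).
    """
    postings = {}  # term -> list of document ids containing it
    stored = {}    # document id -> fields needed to display a hit
    for doc_id, doc in documents.items():
        doc_id = str(doc_id)  # JSON object keys are always strings
        stored[doc_id] = {"title": doc["title"], "url": doc["url"]}
        for term in set(tokenize(doc["title"] + " " + doc["body"])):
            postings.setdefault(term, []).append(doc_id)
    payload = {"postings": postings, "stored": stored}
    Path(index_path).write_text(json.dumps(payload))


def search(index_path, query):
    """Return stored fields for documents matching a single-term query.

    Hypothetical syntax: 'term' (exact), 'ter*' or 't?rm' (wildcard),
    'term~' (fuzzy, approximated with difflib similarity).
    """
    index = json.loads(Path(index_path).read_text())
    postings = index["postings"]
    query = query.lower()
    if query.endswith("~"):
        terms = get_close_matches(query[:-1], list(postings), n=5, cutoff=0.75)
    elif "*" in query or "?" in query:
        terms = [t for t in postings if fnmatchcase(t, query)]
    else:
        terms = [query] if query in postings else []
    doc_ids = sorted({doc_id for t in terms for doc_id in postings[t]})
    return [index["stored"][doc_id] for doc_id in doc_ids]
```

The point of the design is the split: `build_index` is the only part that reads the content (it would run from a scheduled job), while `search` only opens the index file and returns the stored title and URL for each hit.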

A few more things that I want to investigate and extend further:
  • Integrate with Jakarta POI and IKVM to index PDF, Word, and Excel documents.
  • Investigate how to do incremental indexing in Lucene.
  • Do more research on fine-tuning Lucene queries.
  • Create an indexing application that runs as a Windows service on a separate box from the web server.