Friday, April 20, 2007

Theory Guy...

This is not really techie but has a somewhat tech-vs-theory war flavor. The following list came out of today's departmental Friday lunch. Top 10 "ways to tell a theory person from a systems person":

10. One gets a job and the other does not.

09. One likes 10000n^2 better than n^3.

08. Space-time is important outside Star Trek.

07. In theory they are the same, but in practice they are not.

06. P is not equal to NP divided by N.

05. n^10 is efficient.

04. SAT (the famous NP-complete problem, also known as Satisfiability) is easy to solve most of the time.

03. (Specific to our department) One can only be found across the street.

02. It's pronounced "Lee-nux" (not "Lai-nux").

01. Computers do more than just email?!

Wednesday, April 04, 2007

TinyBeagle or a Lucene Example

Recently I read this interesting comment in an OSNews article. It tried to briefly summarize what beagle is. I take users' comments very seriously, and this person seems to know some of beagle's internals, so I thought maybe he is right (modulo some factual errors). Maybe the only new things in beagle are the crawler, the GUI and the scheduler; it's mostly a little C# glue code tying together a few third-party apps.

So, I wrote a small Lucene.Net based file indexer and query program. You index by
mono LuceneLocate.exe /path/to/index/dir index /directory/to/index
and query by
mono LuceneLocate.exe /path/to/index/dir query query_term
Pretty simple program, 85 lines of actual code. Incredibly fast performance. Using an external program ('cat') to read the files in a directory (recursively), it indexes 180 files in 0.06 seconds. A query returning 44 results took 0.0015 seconds. It takes 24 MB virtual, 5.3 MB RSS-Shared. No GUI yet. I could have added a scheduler to pause for 10 seconds after every 10 files (5 more lines). This Lucene.Net based crawler and indexer beats beagle in performance but is otherwise nowhere close to beagle.
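For anyone who wants the gist of the index/query split without installing Lucene.Net, here is a rough Python sketch. It is not the LuceneLocate.exe program and it does not use Lucene at all: it swaps in a plain in-memory inverted index saved as JSON, which is only an assumption about what such a tool roughly boils down to. The function names (build_index, save_index, load_index, query) are made up for illustration.

```python
import json
import os
import re

# Assumption: a "term" is a run of letters, digits or underscores.
TOKEN = re.compile(r"[A-Za-z0-9_]+")


def build_index(root):
    """Walk root recursively; map each lowercased term to the files containing it."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort the crawl
            for term in TOKEN.findall(text.lower()):
                index.setdefault(term, set()).add(path)
    return index


def save_index(index, index_file):
    """Write the index to disk as JSON (sets become sorted lists)."""
    with open(index_file, "w", encoding="utf-8") as fh:
        json.dump({term: sorted(paths) for term, paths in index.items()}, fh)


def load_index(index_file):
    """Read back an index written by save_index."""
    with open(index_file, "r", encoding="utf-8") as fh:
        return json.load(fh)


def query(index, term):
    """Return the sorted list of files that contain term (case-insensitive)."""
    return sorted(index.get(term.lower(), ()))
```

Usage would be along the lines of save_index(build_index("/directory/to/index"), "index.json") for the index step and query(load_index("index.json"), "query_term") for the query step. There is no analyzer, ranking or phrase matching here, which is exactly the part a real search library brings.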

Maybe beagle is not a lucene-powered locate. After all, to err is human.