Thursday, January 11, 2007

Seekable LineReader

Recently I needed a way in StreamReader to get the position of a line, store it, and later skip directly to that line. The C# StreamReader API does not have any way of doing this, except calling Read() repeatedly and doing the processing yourself, which is clumsy. Note that StreamReader.BaseStream.Position usually runs ahead of the line you just read, because StreamReader reads from the stream in buffered chunks. I thought this should be a fairly common requirement and indeed, many people have asked the same question on Google Groups and other forums. No good answer though. One reason is that such a thing does not really make sense for arbitrary Streams, since it might not be possible to seek in them.

I needed such a thing desperately, so I created an interface:
using System.Text;

namespace System.IO {

    // A linereader interface
    public interface LineReader {

        // Returns a position marker, which can be used to navigate the lines.
        // Some implementations may only allow moving in the forward direction.
        // Might be different from line number or file offset.
        // Should only be used for traversal.
        long Position {
            get;
            set;
        }

        // Reads and returns the next line, null if EOF
        string ReadLine ();

        // Reads the next line and returns a StringBuilder containing the line.
        // The StringBuilder returned could be the same one used while reading,
        // so it should not be modified and its content might change when
        // ReadLine is next called.
        // This is the most worst horriblest API I ever designed, for sake of speed
        // And that's why this should not be a public API.
        StringBuilder ReadLineAsStringBuilder ();

        // Skips the next line, returns true if successful
        bool SkipLine ();

        // Skips required number of lines; returns actual number of lines skipped
        long SkipLines (long n);

        // Close the reader
        void Close ();
    }
}
Beagle source contains the interface and several implementations.
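
For illustration, here is a minimal sketch of how such a reader could work on a seekable stream. This is not the Beagle implementation; the class name SeekableLineReader and the 4096-byte buffer are made up for this example. It assumes UTF-8 text with "\n" or "\r\n" line endings, and covers only the Position and ReadLine parts of the interface. The trick is to do the buffering by hand, so the reader always knows the byte offset of the next unread line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical sketch, not the Beagle code: a line reader whose Position
// is the byte offset of the next unread line in a seekable stream.
public sealed class SeekableLineReader : IDisposable
{
    private readonly Stream stream;
    private readonly byte[] buffer = new byte[4096]; // arbitrary size
    private int bufferLength;  // number of valid bytes in buffer
    private int bufferIndex;   // index of the next unread byte in buffer
    private long position;     // stream offset of the next unread byte

    public SeekableLineReader(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));
        this.stream = stream;
        this.position = stream.Position;
    }

    // Getting returns the logical offset, not the (read-ahead) stream offset.
    // Setting seeks the stream and throws away whatever was buffered.
    public long Position
    {
        get { return position; }
        set
        {
            stream.Seek(value, SeekOrigin.Begin);
            position = value;
            bufferLength = 0;
            bufferIndex = 0;
        }
    }

    // Returns the next byte, or -1 at end of stream.
    private int NextByte()
    {
        if (bufferIndex >= bufferLength)
        {
            bufferLength = stream.Read(buffer, 0, buffer.Length);
            bufferIndex = 0;
            if (bufferLength <= 0)
            {
                bufferLength = 0;
                return -1;
            }
        }
        position++;
        return buffer[bufferIndex++];
    }

    // Reads and returns the next line without its terminator, null if EOF.
    public string ReadLine()
    {
        var bytes = new List<byte>();
        bool readAnything = false;
        int b;
        while ((b = NextByte()) != -1)
        {
            readAnything = true;
            if (b == '\n')
                break;
            bytes.Add((byte)b);
        }
        if (!readAnything)
            return null;
        // Drop the '\r' of a "\r\n" terminator.
        if (bytes.Count > 0 && bytes[bytes.Count - 1] == (byte)'\r')
            bytes.RemoveAt(bytes.Count - 1);
        return Encoding.UTF8.GetString(bytes.ToArray());
    }

    public void Dispose()
    {
        stream.Dispose();
    }
}
```

To remember a line, read Position just before calling ReadLine and store the value; assigning that value back to Position later makes the next ReadLine return the same line again. Reading byte by byte into a list is slow but keeps the sketch short; a real implementation would scan the buffer for the newline instead.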

Sunday, December 31, 2006

Fasten your seatbelts; we are ready to ship

There was a discussion going on in the beagle mailing list some time ago where I commented that I don't think beagle is newbie-ready, i.e. plug-and-play, yet. Beagle fans did not understand my comment and people replied with why they think I am wrong. I would be glad to be proved wrong; but all of their arguments were about how they had been using beagle since version x.y, how beagle had been shipped and enabled by default for the last z releases of some distribution, and how someone was able to install beagle in a large enterprise. Duh! None of these prove that beagle is newbie-ready. All they do is show that beagle works, and even I know that. 0/100.

I have the feeling that some of the beagle devs and followers live in the garden of Eden, surrounded by a high wall that keeps reality out. Sometimes they should go out in the streets, check the bugzilla of other distros, go through user blogs (which mostly contain complaints about how beagle does not work and how to disable it), and visit some user forums where a lot of the questions are about how to stop beagle from starting at startup. These are laborious jobs and not pleasing. A lot of them contain flames and invalid reasons. But almost always they are started by someone who found beagle causing trouble.

Here are some links which can make your task easy:
I sometimes make the rounds and all I see are "I make a point of uninstalling beagle on all my machines" and "The first thing I did after ... was to uninstall beagle and now my machine is happy". Silly men, how can they not like the doggy!

Saturday, December 30, 2006

Subversion arrives. Finally!

KDE uses SVN for their source code management. GNOME used CVS till yesterday, which means beagle too was managed with CVS. I do not know the technical details, but time and again we did face technical problems with CVS. I was mostly told that life would be easier with SVN and, lo and behold, GNOME has switched to Subversion as its SCM (actually, still switching).

The last time this was tried by the awesome GNOME guys, they later found a glitch and had to cancel the migration. As a result I lost a commit that I made within hours of SVN migration. This time I will play safe and watch it for a few days before committing anything. If everything works out, life should be easier. Joe already cleaned up quite a bit of the unused files and directories, renamed the Evo-mail backend correctly and updated the links et al. A New Year with a clean, new repo. Sweet.

PS: There is one downside though. Joe (and others too) would like to use the SVN commit messages to generate the Changelog file when creating a tarball. Which basically means others cannot look at the Changelog file between releases to figure out what changed (neither I nor Joe updated the Changelog while committing anyway, so this is a lame excuse). The real trouble is that now I cannot write any lame jokes in my commit messages. Life will be serious now. Boo hoo.

Sunday, December 17, 2006

My time with the doggie

Today I read about the Ohloh project (http://ohloh.net) and added my favourite project beagle (http://beagle-project.org) to it. I was curious what it actually does.

It took them nearly 4 hours to download and analyze the source code. But it was worth the wait. It showed some interesting statistics: a codebase of 122,885 LOC, 82 direct contributors (committing in CVS), and 13 of them active in the last 12 months.

Just for a light comparison, Firefox has a codebase of 157,207 LOC, Amarok has 169,288 LOC and (take this) PHP 6.0 has 599,805 LOC.

It was also amusing to see my share in the project: http://ohloh.net/projects/3826/contributors/21154

Thursday, December 14, 2006

beagle 0.2.14

Joe announced the release of beagle 0.2.14 a few hours ago. It's a fascinating and shiny new release containing exciting new features, lots of memory/speed optimizations and many bug-fixes as well. Here are the major ones which are readily visible:
  • Indexes tar, gzipped-tar, bzipped-tar, gzipped and bzipped files, in the filesystem as well as in email attachments. The results show you the exact file in the archive that matched the query.
  • Does some smart tokenizing to allow matching 001234 to a query of 1234, and better matching of file names. No more missing files.
  • Beagle can find and extract data itself using its dozen or more backends. But sometimes it's better for other applications to send data to beagle for indexing. Beagle already had the infrastructure to act as a search/indexing service provider. The release contains example C code to show how to do that; it's pretty simple actually. Obviously python can also be used.
  • A cool signal mechanism which helps to figure out which file is currently being indexed and for how long. This will be helpful if you feel beagle is taking ages to index some file.
  • Uses the Xdg autostart mechanism to auto-start beagle. KDE4 will also implement the xdg autostart mechanism. One more step towards being DE agnostic.
  • The indexing information now explicitly mentions if the initial indexing is in progress. Also, clients now have the option of being notified when the initial indexing ends.
  • Lots of memory fixes. bhale just mentioned in the irc channel: "holy crap, startup RSS for beagled is 15m... beagled is below nautilus in mem usage...  im not believing my eyes :)" Thank you for your myth on how beagle is bloatware.
  • API and beagle-search support to know the total number of documents that matched any query. Not the artificially imposed limit of 100 documents.
Go, get it!

Sunday, December 03, 2006

License to hack-debug-release

I start with the disclaimer that I am not a lawyer. I am also not very careful in reading EULAs (End User License Agreement - a short acronym of a long term usually describing an even longer gibberish english text). When I released my first piece of code (was it JBabel or mGet?), I put it under the GPL (possibly v1). Of course I did not know what I was doing - I was merely completing a formality during the fascinating experience of sharing one's program. With time, and during the development of Beagle, I came to know about the various licenses. It was about this time that the GPLv3 debate started. Also, I started releasing various kde-beagle programs, which required me to figure out the correct licenses for them.

Licenses, I feel, serve two purposes: demand credit where it's due and specify the amount of responsibility.

Authors of some projects just want to release code without claiming any credit. I did so too, only to have someone point out that I actually have to specify the terms under which others can use my code. Since I don't want credit for the code, it's natural not to be held responsible for its damages either. That needs to be specified too. One can either cook up one's own license or just pick one from the sea of licenses (http://www.opensource.org/licenses/). The license I found to do just that is the MIT license.

MIT (X11) License: Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, ...

The license itself is not copyrighted, so it can be modified if needed. A rather free-to-do license, except for this clause following the above text:

... subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

Wikipedia tells me that the latter part is necessary for the copyright laws of the US and other countries. The credit part is followed by the author responsibility part:

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT...

Often, people use this clause to explicitly deny responsibility:

Except as contained in this notice, the name(s) of the above copyright holders shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization.

I have seen projects where the last clause wasn't present, so I assume it is not a necessary clause. I did not use it myself when I released yaBi under the MIT license. The core parts of Beagle and Mono are released under this license.

My other project, kBeagleBar, uses the fancy LGPLv2. I had to use LGPLv2 because I was using some source code from the kde libraries. Note that the LGPL would have allowed me to use the library without any restriction, but I copied some code from a library source file, so I had to use LGPL. Similarly, I released kontrol (distributed as a part of kbeaglebar), which uses code from the GPLed kerry, so it's under GPL too. Also, the GPL and LGPL are copyrighted by the FSF. But then again, I wouldn't dare to modify them myself.

I am looking forward to GPLv3. No, not because I have any particular reason to like it. Because I hope to read it fully, understand it and read the various articles and blogs about it to understand the issues of free software licensing. It's always good to know where my rights end and start.

Even after all this, it will take me some more years to actually read EULAs. They are frightfully convoluted to confuse my little grey cells.

Sunday, November 26, 2006

Bigger than life icon

Icons are small pieces of art I encounter in my everyday life. They are like
apparel: you see them in all forms in every moment of life. They serve
different purposes (keep you warm, keep you cool, keep you comfortable,
whatever), but all of them try to look good (some even make it their prime
agenda). In human-computer interaction, icons are large, easy targets that
serve as better visual cues than textual messages that need to be parsed and
pattern matched. A good icon makes that experience fun and enjoyable.

I started with the usual KDE crystal icon theme, and then tried different
icon sets. The tango icon theme is good but, like the usual gnome colours,
its icons have very low-saturated colours which look a bit drab and boring on
my desktop. I am not looking for extremely gorgeous, highly saturated and
colourful icons; my desktop is not a colour palette, but the art forms I will
stare at a million times over months had better be pleasant to look at.
kde-look.org has quite a few icon themes, and a lot of them are mixed icon
themes. For a while, I tried the KDE Crystal Diamond icons. It has good
icons, except that some of them are very light coloured and white based, so
it is hard to make out the icon contents. Most of the filetype icons and the
media-player (amarok) play/pause icons are like that.

I have heard a lot about Oxygen icons; unfortunately their license does not
allow anyone to package them. Of course, I could download the svg icons
directly from their svn and convert them to an icon theme locally. I need to
install inkscape for that. Interestingly enough, I found a forum post on
sabayonlinux about using oxygen icons in the current kde desktop. The user
gives out the details of how to make an icon theme and later even provides a
tar.gz package to install oxygen icons. Oxygen is a work in progress, so that
user replaced the missing icons with other icons. I think he violated their
license but I have to double check. Anyway, I installed that icon theme and
it's really good. Attention has been paid to every detail. Icons are 3d or 2d
exactly according to their need, those that look good with shadows have them,
and the colours look just ok. All in all, I am pretty satisfied. I am looking
forward to manually checking out the oxygen icons and making a theme out of
them. That way, I will get the icons from oxygen as they are checked in. Good
work, guys.

Saturday, September 23, 2006

Mono.FUSE Filesystem for Digikam Tags

FUSE is cool anyway. And writing FUSE filesystems in C# is even cooler. Adjectives aside, I wrote a FUSE-DigikamFS filesystem for browsing Digikam tags. It's much easier to browse the images organized by tags using konqueror or kuickshow. This way I can also share the tags mount-point through kpf (or some other HTTP server), allowing others to browse my images arranged by both folders (Albums) and tags.

A sample session,

[debajyoti@dbera Tags]$ pwd
/home/debajyoti/Tags
[debajyoti@dbera Tags]$ ls
Artistic/ Favourites/ Fireworks/ Food/
Nature/ People/ Places/ QuickCheck/ Season/
[debajyoti@dbera Tags]$ ls Artistic/
Black-White/ Mood/ Motion/ pa020099.jpg@
pa020102.jpg@ Perspective/ Shadow/


As you see, I use symlinks to overlay the images from tag filesystem. Works pretty cool.

Monday, July 24, 2006

Ed-ed

Who knew that 'vi' was based on the legendary editor 'ed'? Or inspired by it. Being a die-hard vi-fanboy, it took me no time to get used to ed, at least enough to compose/edit/save simple documents. An ed-documentation page at BUCS provided me with a quick howto listing (all) the important commands. Actually, all the commands, since there are only a few of them. Unlike vi, where you have to know thousands of commands. It's wonderful how much simpler life becomes if we focus on functionality alone without caring for presentation.
If you are comfortable with vi (or even with vim or gvim, as long as you don't use the menu too much), I suggest you give ed a try. You'll love it.

(From the documentation page) An ed-quickcard:

All the standard ed commands are listed below, together with a brief description of their function. Those commands which may be given with line addresses are shown with the default values of the addresses in the first column. For example, 1,$ for w means that the write command may be given with one or two line addresses to specify a particular line or a range of lines to be written to a file, and that if no address is given the default address is 1,$ (ie, all the lines in the buffer are written out). The default address . represents the current line and $ represents the last line in the buffer.


The Input Commands

Default Address  Command        Function
.                a text         Append input text after addressed line.
                                The last line input becomes the current
                                line.
.,.              c text         Replace addressed lines with input text.
                                The last line input becomes the current
                                line.
.                i text         Insert input text before the addressed
                                line. The last line input becomes the
                                current line.

The Edit Commands

Default Address  Command        Function
.,.              d              Delete the addressed lines from the
                                buffer. The line after the last deleted
                                line becomes the current line.
                 e file         Delete the buffer contents then read
                                file into the buffer. The last line
                                read in becomes the current line.
                 E file         As for e but no warning is given if a
                                modified buffer has not been written
                                out.
                 f file         Print current remembered filename if
                                file not specified, otherwise set it to
                                file.
1,$              g/R/cmds       Perform cmds on all addressed lines
                                matching regular expression R. Last
                                line in which a match was found becomes
                                the current line.
.,.+1            j              Join together all the addressed lines.
                                The resulting line becomes the current
                                line.
.                kx             Mark the addressed line with the single
                                lower case character name x. The
                                addressed line becomes the current
                                line.
.,.              l              List the addressed lines showing
                                non-printing characters and folding
                                long lines. The last line listed
                                becomes the current line.
.,.              mA             Move addressed lines to follow line
                                whose address is A. The last line moved
                                becomes the current line.
.,.              p              Print the addressed lines. The last
                                line displayed becomes current line.
                 q              Exit from the editor.
                 Q              Exit from the editor with no warning if
                                a modified buffer has not been written
                                out.
$                r file         Read file into the buffer after
                                addressed line. The last line read in
                                becomes the current line.
.,.              s/R/S/{g}      Substitute string S for the regular
                                expression R in the addressed lines. If
                                g specified substitution is made
                                globally throughout addressed lines.
.,.              tA             Copy the addressed lines to follow line
                                whose address is A. Last line of the
                                copy becomes the current line.
.,.              u              Undo the effect of the previous
                                substitute command. The current line is
                                reset to its value before that command.
1,$              v/R/cmds       Perform cmds on all addressed lines not
                                matching regular expression R.
1,$              w file         Write the addressed lines into named
                                file. The current line is not reset.
1,$              W file         Append the addressed lines to the named
                                file. Current line not reset.
                 x              Decrypt or encrypt the text according
                                to an input key. The current line is
                                not reset.
$                =              Print the line number of the addressed
                                line. The current line is not reset.
                 |cmnd          Pass cmnd to the UNIX shell to be
                                executed. Current line is not reset.
                 A              Where A is one of the legal address
                                forms listed above: locate addressed
                                line and display it. The addressed line
                                becomes current line.
.+1              newline        Print the next line. Addressed line
                 /linefeed      becomes the current line.


Obligatory jokes link (not a PJ).