Wednesday, May 31, 2006

3rd Top in closing most bugs for last week

I was playing around in bugzilla creating more saved-searches for quicker access to bugs and found my name listed 3rd for closing most number of bugs in last 7 days on Past week's bug activity link in Gnome bugzilla.

As the link for past-week's-bug-activity keeps changing, here is the list of bugs that I closed, includes evolution-conduits, beagle, connector, gnome-pilot and GAL bugs.

Bug triaging - Part 1

Last week I triaged around 36 bugs in GAL, which is now a retired-component of Evolution. Have to categorize that list of bugs in to fix-for-gods-sake, fix-for-goodness-sake and wont-fix.

Today, triaged around 80 bugs in all including Evolution-conduits, gpilot and Beagle.

Marked around 20 bugs as FIXED/INCOMPLETE/NOTGNOME and 30 bugs to be fixed.

Recurrence - Mind blowing stuff of iCal - 2

Today I committed my recurrence work in evolution-exchange along with other conduit patches. Now, evolution-conduits can sync against exchange and groupwise (needs testing) accounts.

Following links will take you to those committed changes:
1) Calendar conduit
2) Todo conduit
3) Addressbook conduit
4) Recurrence fix in evolution-exchange

Now, that Eureka thing. Yes, the bug was in exchange-calendar code that reads events from the server. According to MSDN, events have some specific property that has values from 0 - 4, which represents events with master object, without master object, detached instances etc. Evolution-exchange was always querying for events with codes 0 and 1, which made the message not to appear in evolution as it didn't even fetch. Chen fixed this thing today.

Recurrence - Mind blowing stuff of iCal

The supplemental patch had a new FIXME to support syncing of instances of a calendar event. In simpleton, recurrences. Recurrence in Evolution has always been a nightmare and especially w.r.t individual backends its more than a nightmare. Chen had made quite good contributions for recurrence in evolution-data-server and some of the other backends. My synchronization stuff cannot be close to complete, if I don't have support for syncing the instances of a recurrence. Today, with a determined mind, was looking at the other backends for similar implementation and finally, got the code in place.

Testing is what makes us happy to see our code working. Well, it was working except for a specific scenario in which we have only (a) instance(s) without any corresponding master object. Hugh... I was tired finding out the roots of the cause and finally got to a point where I could say, "Eureka!! I found the reason". Well, thatz it for today and keep watching to know more about that "Eureka" stuff. ;-)

Exchange synchronization is back.

Just cooked up a patch for todo-conduits to sync with exchange-tasks. All the three, Calendar, Tasks and Addressbook (contact) conduits are now authentication aware that they can authenticate without user interaction. However, user should have saved their account passwords through Evolution.

Patches are as follows:
Calendar and Contacts
Tasks
Supplemental fix for exchange-calendar sync to work. ;-)

Beagle memory optimization - Filters

On 26th November 2005, myself and Debajyothi Bera started discussing about memory consumption of Beagle, especially its filters. DBera gave a link to the heap-buddy output of Source filter, which showed around 2.9M of allocations.

I immediately pulled heap-buddy from the trunk and installed it and ran beagle-extract-content on a 'C#' source-code file of size approx. 130K. The report showed around 2.6M of allocations for extracting contents from that 'C#' file and the size of extracted content was just around 68K. Here is the heap-buddy output before fix:

bhargavi@vvaradhan-lap:~/cvs/beagle/beagle-before-fix/beagled> heap-buddy outfile
SUMMARY

Filename: outfile
Allocated Bytes: 2.6M
Allocated Objects: 68736
GCs: 10
Resizes: 8
Final heap size: 1.3M

Distinct Types: 170
Backtraces: 2572

and individual "type-based-allocations" as follows...

bhargavi@vvaradhan-lap:~/cvs/beagle/beagle-before-fix/beagled> heap-buddy outfile types

Type # Total AvSz AvAge BT#
string 44744 2.0M 47.4 0.2 636
string[] 4460 106k 24.4 0.1 59
char[] 3922 98k 25.7 0.3 59
char 7141 69k 10.0 0.0 8
System.Text.StringBuilder 1940 45k 24.0 0.0 71
System.Collections...t/SimpleEnumerator 1339 31k 24.0 0.0 17
byte[] 21 29k 1422.7 5.6 21
System.Collections.Hashtable/Slot[] 106 26k 251.2 1.8 71
object[] 321 23k 76.2 0.7 182
Beagle.Filters.FilterHtml 695 21k 32.0 3.9 1
System.MonoType 650 16k 25.7 8.1 166
System.Reflection.MonoMethod 293 11k 41.0 7.8 20
System.Xml.NameTable/Entry[] 20 10k 528.0 0.1 13
System.Xml.NameTable/Entry 226 5.3k 24.0 0.4 94
System.IO.FileInfo 65 5.1k 80.0 1.1 2
System.Collections.ArrayList 255 5.0k 20.0 0.8 154
System.Xml.XmlName...eManager/NsScope[] 15 4.9k 336.0 0.1 8
System.Reflection.PropertyInfo[] 116 4.6k 40.9 0.0 10
System.Collections.Hashtable 93 4.4k 48.0 2.1 62
System.Reflection.MethodInfo[] 20 3.9k 199.6 0.0 3
System.Attribute[] 197 3.6k 18.6 0.4 66
System.Xml.Serialization.XmlAttributes 55 3.2k 60.0 0.3 25
System.MonoType[] 201 3.1k 16.0 6.0 8
System.Reflection.MonoCMethod 125 2.6k 21.5 5.3 6
System.Xml.Serialization.TypeData 58 2.5k 44.0 8.9 57

(skipped 145 types)

... check the top 4 types.. string, string[], char, char[] and StringBuilder. These are the types that contributed most of allocations.

A little closer look at the Filter.cs revealed a potential-unnecessary-allocation of string being done. The fix here fixed it. The fix saved around 600K of allocations for filtering the mentioned "C#" file.

Heap-buddy report after the fix:

bhargavi@vvaradhan-lap:~/cvs/beagle/beagle/beagled> heap-buddy outfile

SUMMARY

Filename: outfile
Allocated Bytes: 2.0M
Allocated Objects: 47082
GCs: 8
Resizes: 8
Final heap size: 1.3M

Distinct Types: 170
Backtraces: 2581

.. and "individual-type-based-allocations"..
bhargavi@vvaradhan-lap:~/cvs/beagle/beagle/beagled> heap-buddy outfile types

Type # Total AvSz AvAge BT#
string 32361 1.6M 50.8 0.2 639
char 7141 69k 10.0 0.0 8
char[] 218 33k 157.3 4.4 54
System.Collections...t/SimpleEnumerator 1334 31k 24.0 0.0 17
byte[] 21 29k 1422.7 4.4 21
string[] 685 27k 40.9 0.6 55
System.Collections.Hashtable/Slot[] 107 26k 250.3 1.4 72
object[] 319 23k 75.7 0.6 182
Beagle.Filters.FilterHtml 734 22k 31.6 3.2 1
System.MonoType 652 16k 25.6 6.4 166
System.Reflection.MonoMethod 293 11k 41.0 6.1 20
System.Xml.NameTable/Entry[] 20 10k 528.0 0.1 13
System.Xml.NameTable/Entry 226 5.3k 24.0 0.4 94
System.Collections.ArrayList 253 4.9k 20.0 0.6 152
System.Xml.XmlName...eManager/NsScope[] 15 4.9k 336.0 0.1 8
System.Reflection.PropertyInfo[] 116 4.6k 40.9 0.0 10
System.IO.FileInfo 57 4.5k 80.0 0.1 2
System.Collections.Hashtable 94 4.4k 48.0 1.6 63
System.Reflection.MethodInfo[] 20 3.9k 199.6 0.0 3
System.Attribute[] 197 3.6k 18.6 0.3 66
System.Xml.Serialization.XmlAttributes 55 3.2k 60.0 0.3 25
System.MonoType[] 202 3.2k 16.0 4.5 8
System.Text.StringBuilder 127 3.0k 24.0 0.2 85
System.Reflection.MonoCMethod 122 2.6k 21.5 4.3 6
System.Xml.Serialization.TypeData 58 2.5k 44.0 6.9 57

(skipped 145 types)

When you compare the above two reports, the main contributors for memory allocations were reduced after fix.

The current fix will give some good numbers when run on bigger documents. (Will post them later.)

As a side-effect of this fix, the filters run little faster than they used to be. ;-)

Dtrace and glib

Memory leaks have always been difficult problems to deal it. There are many tools available and most of them preloads the probes and thus adds overhead on the execution of software.

During recent encounters with memory leaks, I thought of trying DTrace, the dynamic tracing tool available from Solaris. It operates on the kernel level probes and thus doesn't add overhead by preloading probes during execution.
It took me a while to get the setup ready for a noble cause. ;)

And bingo!!, got a dtrace script and a supporting perl script that ease the job of printing the stack that leaks memory. Both the scripts are available here. The output of the script was quite satisfactory, however, glib uses memory-pools that doesn't actually free the memory, rather, mark it for re-use and similarly, its allocation routines return pointers from the pool, when available otherwise allocates fresh memory and maintain it in the pool.

The DTrace script in the link actually uses the standard syscall probes, which in this case wouldn't be appropriate. So, I came up with the following script for glib:

#!/usr/sbin/dtrace -s

pid$target:libglib-2.0.so.0:g_malloc:entry
{
self->trace = 1;
self->size = arg0;
}
pid$target:libglib-2.0.so.0:g_malloc:return
/self->trace == 1/
{
printf("Ptr=0x%p Size=%d", arg1, self->size);
ustack();
self->trace = 0;
self->size = 0;
}

pid$target:libglib-2.0.so.0:g_realloc:entry
{
self->trace = 1;
self->size = arg1;
self->oldptr = arg0;
}

pid$target:libglib-2.0.so.0:g_realloc:return
/self->trace == 1/
{
printf("Ptr=0x%p Oldptr=0x%p Size=%d", arg1, self->oldptr,
self->size);
ustack();
self->trace = 0;
self->size = 0;
}

pid$target:libglib-2.0.so.0:g_malloc0:entry
/self->trace == 1/
{
self->trace = 1;
self->size = arg1;
}

pid$target:libglib-2.0.so.0:g_malloc0:return
/self->trace == 1/
{
printf("Ptr=0x%p Size=%d", arg1, self->size);
ustack();
self->trace = 0;
self->size = 0;
}

pid$target:libglib-2.0.so.0:g_free:entry
{
printf("Ptr=0x%p ", arg0);
}


The test program to verify the script:

#include 
#include

int
main ()
{
gchar* str = NULL;
int i;

for (i = 0; i < str =" g_strdup">

DTrace reported "Zero" memory leaks and I sort of convinced that the script works and can be run on bigger application like Evolution.

Travelogue: NCAC - 2005, PSG Tech, Coimbatore.

My travelogue can be reached here.

It was a real fun to talk about Open Source. Just three weeks before this event, the department had inaugurated a lab of Open Source and some of their alumni members from Novell presented talks on Open Source Technologies and distributed OpenSuSE DVDs to the lab and students.

Eureka!! Another Planet discovered.

Isn't the subject amazing? Yep, I guess. Today, while blogging and reading other planets, a chotasa spark jalaa in my mind..oOOO(Why not we have a planet for evolution?)......

jeee booom baaaaaaa....,

Here it is, however, it is as of now running locally on my laptop, will soon move it to a publicly accessible server.

WOW!! I should really appreciate and thank, Jeff Waugh and Scott James Remnant for the Planet stuff. The setup was so easy that it took hardly 5 mins to get my planetevolution up and running. Great work guys.

More enhancements than bugs, isn't it great?

From the list of around 30 pilot-conduit bugs, today I categorized them in to two broad categories, fix-for-god-sake and fix-for-goodness-sake. There was a third category, ask-nice-to-get-fixed, as a goodies for the hacker contributing to it. Most of the bugs decorated the fix-for-goodness-sake category, as many of fix-for-god-sakes' are gone, phew. :-)

Its really the right time to start supporting the new Palm OS databases which, pilot-link 0.12.x supports, in evolution-conduits. First one is Contacts, with so many requests, with one of them falling in to ask-nice-to-get-fixed category. ;-)

A Beagle bite

It feels like void, even after doing lot of coding in evolution & co, till, I get a Beagle bite. May be that Beagle hacking is a real fun. After the memory-optimization work, I took two-bugs filed by our test-hero, David Richards, 323276 and 323312. Though both the bugs are fixed, I haven't checked-in the fix for 323276 as it requires a little debugging-work to be done to get rid of a double free error.

323312 actually brought-up a corner case in the FilterPPT, actually, I would say one of the weirdness of Microsoft's formats. We were reading till EOF of the PPT stream and it was working fine except the document attached with the bug. Some stray bytes were left without reading causing the filter to enter into an infinite-loop.

Guess, will have more beagle-bites this week.