Like most early online communities, the graduating class of 1989 from IIT Kanpur has a Yahoo! group: iitk-89. It was created back in 1999 and was quite active till a few months ago. We discussed stuff that most of our other friends would find uninteresting. Someone would send a link to an article that he (there were girls in our batch, but they rarely participated) liked or found outrageous, and a heated discussion would ensue. Sometimes we collectively solved mathematical puzzles. It was fun.
But then a dispute arose over an offhand comment made by one of the members. Without going into details, I'll only say that this incident polarized the group and changed the nature of the discussion. At this point, one of the members wondered: "it would have been nice if Yahoo! allowed a simple form of expressing likeness/dislikeness of posts". Posting a response to a message you disagree with takes too much energy, is seen as an attack, and is delivered as email to everyone in the group. A click to express agreement or disagreement, aggregated and shown as a count only to those who visit the group pages, would be milder and much more effective. Think of it as the simple yes or no nodding of the head during a normal conversation: these are cues that get picked up and change the conversation in subtle ways before it escalates into a heated and loud verbal exchange.
I kept thinking that adding a capability like this would be very beneficial to Yahoo! group communities. So when the opportunity came this month in the form of Yahoo!'s internal hack day, I coded up YLike, a hack that adds like and dislike buttons. With a little extra work, I was able to make it work on my personal server and make it available to others. Visit the YLike page and give it a try. If you are a member of the iitk-89 group, you can even see my votes for some of the recent messages.
While playing with R, I discovered that I could load the MS Access file of marks directly into R and do simple analysis by issuing R commands against the loaded data. For example, the following is a brief session with the R console, the string "jee2009" being an ODBC DSN, configured via the ODBC Setup icon in the Windows Control Panel, pointing to the downloaded MS Access file.
# load the ODBC plugin
> library(RODBC)
# connect to the MS Access DB
> ch <- odbcConnect('jee2009')
# show table names
> sqlTables(ch)
... list of tables ... (snip)
# show columns of a table
> sqlColumns(ch, 'All Marks')
... list of columns ... (snip)
# show number of rows
> sqlQuery(ch, "SELECT count(1) from `All Marks`")
  Expr1000
1   384977
# show number of rows for boys
> sqlQuery(ch, "SELECT count(1) from `All Marks` where GENDER='M'")
  Expr1000
1   286942
# show number of rows for girls
> sqlQuery(ch, "SELECT count(1) from `All Marks` where GENDER='F'")
  Expr1000
1    98028
# show averages of physics, chemistry, maths and total marks
> sqlQuery(ch, "SELECT AVG(phys), AVG(chem), AVG(math), AVG(mark) from `All Marks`")
  Expr1000 Expr1001 Expr1002 Expr1003
1  7.80696 10.43663  10.1155 28.35909
# quit
> q()
So, there were a total of 384,977 test takers, consisting of 286,942 boys (74.54%) and 98,028 girls (25.46%).
The following table shows some more stats on aggregate and individual marks for Boys and Girls in each of the three subjects:
Metric | Boys: Phys | Boys: Chem | Boys: Maths | Boys: Total | Girls: Phys | Girls: Chem | Girls: Maths | Girls: Total
---|---|---|---|---|---|---|---|---
Minimum | -35.00 | -35.00 | -35.00 | -86.00 | -35.00 | -35.00 | -35.00 | -77.00
Maximum | 156.00 | 132.00 | 156.00 | 424.00 | 144.00 | 124.00 | 146.00 | 362.00
Average | 8.79 | 11.10 | 10.90 | 30.79 | 4.92 | 8.49 | 7.81 | 21.24
Median | 4.00 | 7.00 | 7.00 | 17.00 | 3.00 | 5.00 | 5.00 | 13.00
Std. Dev. | 19.61 | 19.98 | 18.86 | 51.80 | 13.64 | 17.13 | 15.37 | 39.26
Girls seem to be performing worse than boys on almost every metric. I wasn't quite expecting comparable performance, but I was surprised nonetheless! Could this just be a manifestation of the societal stereotype that girls are not supposed to be good at engineering-oriented subjects like maths, physics and chemistry? Or is there some other force at work?
Let us look at the average share of marks in Physics, Chemistry and Maths for all students.
> sqlQuery(ch, "SELECT sum(phys)/sum(mark), sum(chem)/sum(mark), sum(math)/sum(mark) FROM `All Marks`")
   Expr1000  Expr1001  Expr1002
1 0.2752895 0.3680171 0.3566934
Physics marks account for 27.53% of the total, Chemistry for 36.80% and Maths for 35.67%. Restricting entries to boys only gives the corresponding shares as 28.56%, 36.04% and 35.04%. The corresponding figures for girls are 23.17%, 40.02% and 36.81%, implying girls did relatively better in Chemistry than in Physics and Maths.
However, restricting the population to those with total marks within a certain range, say 0-10 (low performing), 100-110 (better than average), 200-210 (significantly better than average) or 300-310 (high performing), doesn't reveal any clear trend, as is evident from the accompanying chart.
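Both restrictions are plain WHERE clauses on the same share-of-marks query. This is my reconstruction of the queries behind the figures, shown here for girls overall and for girls with totals in the 100-110 band:

> sqlQuery(ch, "SELECT sum(phys)/sum(mark), sum(chem)/sum(mark), sum(math)/sum(mark) FROM `All Marks` WHERE GENDER='F'")
> sqlQuery(ch, "SELECT sum(phys)/sum(mark), sum(chem)/sum(mark), sum(math)/sum(mark) FROM `All Marks` WHERE GENDER='F' AND mark >= 100 AND mark <= 110")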
Surprisingly, both boys and girls did better in Maths in the low performing range but not in other ranges. Also, relative performance in different subjects doesn't seem to depend on gender at all.
If a person does well in Maths, is he or she also likely to do well in Chemistry? In Physics?
To answer this, I looked at the correlation coefficients between marks in different subjects for groups of JEE aspirants defined by their total marks, as in the previous section.
> marks <- sqlQuery(ch, "SELECT phys, chem FROM `All Marks` where GENDER='M' AND mark >= 300 AND mark <= 310")
> cor(marks)
           phys       chem
phys  1.0000000 -0.2912179
chem -0.2912179  1.0000000
And here are all the different correlation coefficients:
To my utter amazement, I see no positive correlation; in fact, I see negative correlation in most cases. I was expecting positive correlation for most subject pairs, especially among the high performing students. How is this possible?
Then I tried grouping the population based on marks in a particular subject. The following table shows the correlation coefficients for the groups that got 0-10, 50-60 and 100-110 in Maths.
This shows most correlations as positive. So those who do well in Maths are more likely to show similar performance in other subjects as well. In fact, this is not limited to Maths: I calculated correlations based on Physics and Chemistry groupings as well and found similar results.
The only explanation of these observations I can think of is that a high total is not a good predictor of consistent performance across subjects, whereas marks in a particular subject are. If true, this is very significant, for JEE 2009 based its ranking on total marks and not on marks in a specific subject, even though higher marks in a particular subject are a better predictor of consistent performance! Of course, it is hard to decide which subject to pick for ranking.
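In fact, there is a statistical reason to expect exactly this: selecting records within a narrow band of the total mechanically induces negative correlation between the components, because within the band a high score in one subject must be offset by lower scores in the others. A small R simulation with made-up, independent subject scores (the distributions are my assumption, not the JEE data) demonstrates the effect:

# three independent subject scores: zero true correlation by construction
set.seed(1)
n <- 100000
phys <- rnorm(n, mean = 10, sd = 20)
chem <- rnorm(n, mean = 10, sd = 20)
math <- rnorm(n, mean = 10, sd = 20)
total <- phys + chem + math

cor(phys, chem)  # essentially 0 over the full population

# restrict to a narrow band of total marks, as in the tables above
band <- total >= 100 & total <= 110
cor(phys[band], chem[band])  # around -0.5: strongly negative

Grouping by marks in a single subject, as in the second set of tables, doesn't condition on the sum of the other two, which is consistent with those correlations coming out positive.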
Let us say your client program, running on machine chost, is talking to a server program running on machine shost and listening for connections on port 8000. To capture the request and response traffic in files, you need to do two things:
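One common arrangement (a sketch with netcat, assuming a Unix-like system; the file names and the relay port 8001 are mine) is to run a logging relay and point the client at it instead of at shost:8000:

# create a FIFO to carry the server's responses back to the client
$ mkfifo backpipe
# listen on port 8001 (traditional netcat syntax; BSD netcat uses "nc -l 8001"),
# log the request stream, forward it to shost:8000, log the response stream,
# and loop the responses back to the client through the FIFO
$ nc -l -p 8001 < backpipe | tee request.log | nc shost 8000 | tee response.log > backpipe

With the relay in place, the client talks to port 8001 and the captured bytes land in request.log and response.log.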
You see, it isn't that simple. So, I just picked the round number 10. A bit less than what the official records indicate, a bit more than my real years at HP and pretty close to the average of these two figures.
Besides the obvious aging and graying (or rather, loss) of hair, these 10 years have brought numerous changes: relocation from Bangalore to Bay Area and all its attendant transitions in the lifestyle, addition of Unnati (my younger daughter) to our three member family, fulfilling part of the American dream, naturalization to US citizenship and many others.
My years at HP saw many historically significant events: the spinning off of Agilent, the merger with Compaq, the colorful days of Carly Fiorina and a resurgent HP under Mark Hurd, to name a few. However, these had much less impact on my day to day professional life than events less well known but much closer to what I worked on, and with whom, in HP's software business: the initial excitement and euphoria around E-speak and its subsequent unwinding along with the dotcom bust of 2001 (I personally, and HP as a company, did learn a thing or two from that whole endeavour), the acquisition of Bluestone (a company that developed a J2EE App Server) and its subsequent closure for business reasons, and the rapid expansion of HP's software business through the acquisitions of Peregrine, Mercury Interactive and Opsware in recent years. Each of these touched my professional life in a much more profound way and saw me go through a succession of roles, each building upon the previous one: developer, development manager, product design architect and then solution architect.
Besides the customary project deliveries and customer visits, what I remember most about working for HP is meeting and working with very different, interesting and wonderful people. Attending TechCons, the invite-only annual gathering of HP technologists from all over the world to share ideas and showcase the best of their work, has been another highlight, though the competition to get invited has become much fiercer in recent years.
Projects at work, though interesting and important, weren't quite as exciting and fulfilling as the semi-professional projects at home: assembling a PC in early 2000 from individually purchased parts at the local Fry's, authoring a book on J2EE Security (though the torrid pace of change in technology made it obsolete in less than 5 years), launching a hobby Web 2.0 site which found a mention in the venerable Wall Street Journal, and numerous other smaller projects, including a home radio based on iTunes and an FM transmitter, a modded NSLU2 and this blog.
My latest home project: a Linux based media server that can rip song/book CDs and self-recorded DVDs into shorter clippings and then serve them to the living room TV through the Wii Internet Channel, or to a future internet enabled phone (it will be an iPhone 2.0 or an Android based phone; I haven't made up my mind yet!), over the home network, a combination of PowerLine networking and WiFi access points. An ffmpeg based prototype running on Fedora Core 7 within a VM is almost ready but lacks the usability that 11-year-old Akriti demands for ripping and 7-year-old Unnati demands for viewing.
As you would most certainly agree, these were a wonderful 10 years!
One of my function calls returned a collection of pairs of integers, and I was wondering whether to store each pair as an array of two named values (as in array('value1' => $value1, 'value2' => $value2)) or as a PHP5 class (as in class ValuePair { var $value1; var $value2; }). As the number of pairs could be quite large, I thought I would optimize for memory. Based on experience with compiled languages such as C/C++ and Java, I expected the class based implementation to take less space. As a simple memory measurement program, explained later, shows, this expectation turned out to be misplaced. PHP implements both arrays and objects as hash tables and, in fact, objects require a little more memory than arrays with the same members. In hindsight, this isn't so surprising: compiled languages can convert member accesses to fixed offsets, but that is not possible for dynamic languages.
But what did surprise me was the amount of space used by an array of two elements. Each two-integer array, when placed in another array representing the collection, was using around 300 bytes; the corresponding number for objects was around 350 bytes. Some googling revealed that a single integer value stored within a PHP array uses 68 bytes: 16 bytes for the value structure (zval), 36 bytes for the hash bucket, and 2 x 8 = 16 bytes for memory allocation headers. Since, as the measurements below show, an empty array itself costs around 160 bytes, it is no wonder that an array with two named integer values takes up around 300 bytes (roughly 160 + 2 x 68).
I am not really complaining -- PHP is not designed for writing data intensive programs. After all, how much data are you going to display on a single web page? But it is still nice to know the actual memory usage of variables within your program. What if your PHP program is generating not an HTML page to be rendered in the browser, but a PDF or Excel report to be saved on disk? Would you want your program to exceed the memory limit on a slightly larger data set?
Coming back to the original problem: how should I store a collection of pairs of values? An array of arrays, or an array of objects? For memory optimization, the answer may be neither: keep two parallel arrays, one for each value, as sketched below.
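A minimal sketch of that layout (the variable names are mine): the i-th pair lives at index i of both arrays, so each pair costs two plain integer slots of roughly 68-72 bytes each instead of a ~300-byte inner array or a ~350-byte object.

<?php
// parallel arrays: $firsts[$i] and $seconds[$i] together form the i-th pair
$firsts = array();
$seconds = array();

for ($i = 0; $i < 1000; $i++) {
    $firsts[$i] = $i;        // first value of the i-th pair
    $seconds[$i] = $i + $i;  // second value of the i-th pair
}

// reading the i-th pair back
$i = 42;
echo $firsts[$i] . ", " . $seconds[$i] . "\n"; // prints "42, 84"
?>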
For those who care about the nitty-gritty, here is the program I used for the measurements:
<?php
class EmptyObject { };

class NonEmptyObject {
    var $int1;
    var $int2;
    function NonEmptyObject($a1, $a2) {
        $this->int1 = $a1;
        $this->int2 = $a2;
    }
};

$num = 1000;

$u1 = memory_get_usage();
$int_array = array();
for ($i = 0; $i < $num; $i++) {
    $int_array[$i] = $i;
}
$u2 = memory_get_usage();
$str_array = array();
for ($i = 0; $i < $num; $i++) {
    $str_array[$i] = "$i";
}
$u3 = memory_get_usage();
$arr_array = array();
for ($i = 0; $i < $num; $i++) {
    $arr_array[$i] = array();
}
$u4 = memory_get_usage();
$obj_array = array();
for ($i = 0; $i < $num; $i++) {
    $obj_array[$i] = new EmptyObject();
}
$u5 = memory_get_usage();
$arr2_array = array();
for ($i = 0; $i < $num; $i++) {
    $arr2_array[$i] = array('int1' => $i, 'int2' => $i + $i);
}
$u6 = memory_get_usage();
$obj2_array = array();
for ($i = 0; $i < $num; $i++) {
    $obj2_array[$i] = new NonEmptyObject($i, $i + $i);
}
$u7 = memory_get_usage();

echo "Space Used by int_array: " . ($u2 - $u1) . "\n";
echo "Space Used by str_array: " . ($u3 - $u2) . "\n";
echo "Space Used by arr_array: " . ($u4 - $u3) . "\n";
echo "Space Used by obj_array: " . ($u5 - $u4) . "\n";
echo "Space Used by arr2_array: " . ($u6 - $u5) . "\n";
echo "Space Used by obj2_array: " . ($u7 - $u6) . "\n";
?>

And here is a sample run:
[pankaj@fc7-dev ~]$ php -v
PHP 5.2.4 (cli) (built: Sep 18 2007 08:50:58)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
[pankaj@fc7-dev ~]$ php -C memtest.php
Space Used by int_array: 72492
Space Used by str_array: 88264
Space Used by arr_array: 160292
Space Used by obj_array: 180316
Space Used by arr2_array: 304344
Space Used by obj2_array: 349144
[pankaj@fc7-dev ~]$
So why did I choose this particular title? No, I didn't intend to write everything I know about Ajax. It is just link-bait. Seems to have worked quite well for others. Might work for me as well.
What I really want to do in this post is write a short review of "Ajax -- The Definitive Guide", a book published by O'Reilly. Those who are familiar with O'Reilly's Definitive Guide series know that these books have a reputation for being very comprehensive and all-encompassing on the chosen topic. This certainly seems to be the case for a number of books in this series on my bookshelf, such as "JavaScript: The Definitive Guide" and "SSH, The Secure Shell: The Definitive Guide". But a definitive guide on something like Ajax? It would have to cover a lot of ground, in all its fullness and fine detail, to do justice to the title: the basics of Ajax interactions, (X)HTML, JavaScript, XML, XMLHttpRequest, CSS, DOM, browser idiosyncrasies, Ajax programming styles and design patterns, tips-n-tricks, numerous browser side Ajax libraries such as Prototype, the YUI library, jQuery, etc., and their integration with server side frameworks such as RoR, Drupal, etc. The list is fairly long, if not endless, and each topic is worthy of a book by itself.
On the other hand, the book does provide a good introduction to the basic concepts, is quite readable, includes a lot of source code for non-trivial working programs and lists relevant resources, such as Ajax libraries, frameworks and applications, in its References section. I especially liked the "chat" and "whiteboard" application that allows two or more users to share a whiteboard and chat through their browsers.
Okay, so how does this book compare with other books on the same topic? This is a tough question, for I haven't been paying attention to most books that have come out on the topic. There is an answer, though, and it comes from this Amazon Sales Rank comparison chart:
A higher Sales Rank for an item implies that more people are buying it from Amazon. This doesn't tell you how well a particular book will meet your needs, just that the high ranking items, in general, are being bought by more people than the low ranking ones. The above chart does indicate that Ajax -- The Definitive Guide is outselling its rivals, at least at the time of this review (March 17-18, 2008).
The way Google makes money is actually straightforward: It brokers and publishes advertisements through digital media. ... snip ... Google’s protean appearance is not a reflection of its core business. Rather, it stems from the vast number of complements to its core business. ... snip ... For Google, literally everything that happens on the Internet is a complement to its main business. The more things that people and companies do online, the more ads they see and the more money Google makes. In addition, as Internet activity increases, Google collects more data on consumers’ needs and behavior and can tailor its ads more precisely, strengthening its competitive advantage and further increasing its income. As more and more products and services are delivered digitally over computer networks - entertainment, news, software programs, financial transactions - Google’s range of complements is expanding into ever more industry sectors.
Though this argument appears plausible, I don't think it will withstand critical scrutiny. Not all online activities can be equally monetized through ads. It is well documented that ads alongside search results perform much better than ads on content pages, email messages, online productivity apps, video clips or social networks (to be fair, the verdict on the last two is not in yet). Would a company as focused on effectiveness as Google try to grow the online ad market by doing things that are proven not to be very effective?
In my opinion, Google's core competency is in developing and running highly customized hardware and software systems and they will use this competency to solve mega-problems that others are ill-equipped to address. In the process, they will disrupt a number of established businesses.
In PHP, the statement

preg_match('/Name: (.+), Age: (\d+)/', $text, $matches);

would return 1 on finding a substring that matches the specified pattern, storing the matched name, i.e. the first captured group, in $matches[1], and the matched age, i.e. the second captured group, in $matches[2]. $matches[0] stores the full matched text. Other languages that support regular expressions, and the list of such languages is pretty long, have similar conventions.
Counting the capturing groups to get the index of the captured text works okay with short regular expressions that don't change often. However, counting positions becomes tedious and error prone when the number of groups is large and new groups get introduced, or existing ones removed, as the code evolves.
If you rely only on the documentation accompanying your programming language, such as this regex syntax page for PHP or this Javadoc page for Java, you are not likely to find a better solution to this problem. At least this is what happened to me, for I wrote code with magic indexes all over, till I started reading Jeffrey E.F. Friedl's excellent Mastering Regular Expressions and came across PHP's support for named captures, a mechanism for associating symbolic names with captured groups.
What it essentially means is that I could rewrite the previous statement as
preg_match('/Name: (?P<Name>.+), Age: (?P<Age>\d+)/', $text, $matches);
and access the matched name and age as $matches['Name'] and $matches['Age'], without worrying about introducing (or dropping) groups. This not only improves readability but also makes the code more robust.
At this point one could argue that in this particular case the book was just incidental, for the information on named captures was already available on the Web, as my link shows, and I should just have googled it. Unfortunately, you need to know a little bit about something to search for more. Google and the Web are no good if you don't know what you don't know. This is exactly where I think the book Mastering Regular Expressions really shines. You need to go through it to realize what you didn't know and what you should look for. And be assured that there are enough aspects of regular expressions and their implementations in various languages that you may not know to justify the cost of the book. By the way, named captures are not the only thing I learned from this book. Other things include the 'x' modifier, conditionals within regular expressions, lookaheads and lookbehinds, and many others. No wonder this book is selling almost as well as Programming Perl, 3rd Edition, the all time programming best seller from O'Reilly.
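To illustrate a couple of these in PHP (a toy example of my own, reusing the named captures from above): the 'x' modifier lets a pattern be laid out and commented, and a lookahead matches a position without consuming text.

<?php
$text = 'Name: Asha, Age: 34';

// the 'x' modifier ignores literal whitespace in the pattern,
// so it can be split across lines and commented
preg_match('/Name:\s+(?P<Name>.+),\s+   # the name, captured by name
             Age:\s+(?P<Age>\d+)        # the age
            /x', $text, $matches);
echo $matches['Name'] . "\n";  // prints "Asha"

// a lookahead: match a name only when it is followed by ", Age:",
// without including that part in the match
preg_match('/Name: (?P<Name>\w+)(?=, Age:)/', $text, $matches);
echo $matches['Name'] . "\n";  // prints "Asha"
?>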
I should also add that named captures may not yet be widely available in all languages. In fact, as per the book, Perl doesn't have them, though my research for this post led me to this page and eventually to this page, stating that Perl 5.10 has named captures. The support in Perl 5.10 is even more powerful, making available not only the last match, as we saw in PHP, but all the matches in an array. Java and JavaScript programmers may have to wait longer for named captures, though!
Let us take a look at how all these statistics compare with the Amazon Sales Rank comparison charts at charteous:
No doubt the expanded/revised Freakonomics is doing much better than the copycats. Even the first version (the lower line in the chart) is not doing badly. But I wouldn't call the copycats complete failures; at least not at their current Sales Rank level of between 100 and 1000. It would be interesting to watch this chart over time, though.
There is something else that caught my attention -- the WSJ story compares sales numbers over different time periods: the publication date of Discover Your Inner Economist is Aug. 2, 2007 and that of The Economic Naturalist is May 21, 2007, whereas the reported sales of 119,000 for Freakonomics are since Jan. 1, 2007. So, the copycats may not be doing as badly as a cursory look at the numbers might suggest.
I read the older edition of Freakonomics a few weeks ago and was pretty impressed by the basic notion of the economics of incentives driving human behavior, as well as by the specific case stories. The basic point is easy to understand, but its implications in specific situations are usually non-obvious. The specific stories make the connection and often make for very good reading. I am assuming that what the WSJ calls copycats essentially analyze research and observations in different fields through the lens of economic incentives. If so, I wouldn't consider them copycats at all. In fact, I would buy them, at least the ones that become popular, and read them for the stories.
a.b
aab
aaa
Keep in mind that the ASCII value of '.' is 46, which is less than 97, the ASCII value of 'a'. Note down your arranged list. Now, create a text file list.txt with the above strings on separate lines and sort them on a Linux system using the sort utility with the following command:
$ sort list.txt
Did you get what you were expecting? I didn't. Here is what I was expecting and what I got under three different Linux systems (Fedora Core, Mandrake and Ubuntu):
Expected   sort output
========   ===========
a.b        aaa
aaa        aab
aab        a.b
What is going on here? It looks like sort is simply ignoring the '.' character. It shouldn't, at least not as per the sort man page. There is an option, '-d', to consider only letters, digits and blanks (and hence ignore '.'), but it is not a default option.
Just to confirm that I hadn't made a mistake in my manual sort when arriving at the expected list, I sorted the strings within the PHP command line shell:
php > $a = array("a.b", "aab", "aaa");
php > sort($a);
php > print_r($a);
Array
(
    [0] => a.b
    [1] => aaa
    [2] => aab
)
This output is the same as what I expected. So, no mistake on my part!
And this led me to the question: is GNU sort broken? Or did I miss something? After sifting through the sort man pages on different machines, I noticed this warning on a Fedora Core 6 box:
*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
So, this is what I was missing! By the way, this is not something obvious that I just didn't pay attention to. Rechecking the online man page, something I tend to use more often than the man output on a 20x80 terminal screen, confirmed that the warning wasn't there. Also, none of the machines I had tried, all installed with the US locale, had LC_ALL set to C by default. And keep in mind that I came across the above discrepancy in sort output only after my program for finding the difference of two sorted files failed on certain specific input values. Like most normal folks, I suspected my own program first, and it took a while to suspect the sort output as the culprit.
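Indeed, rerunning the sort with the traditional byte-value ordering produces the list I expected:

$ LC_ALL=C sort list.txt
a.b
aaa
aab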
Sorry for the provocative title -- I found out about the LC_ALL environment variable only while writing this blog post and double checking my facts (one of the few advantages of writing things down) and didn't feel like changing the title. After all, how many of us would think of setting LC_ALL=C before issuing sort! In that sense, GNU sort IS broken.
The main problems with the existing system and goals for the future system identified in the study are:
Of course, these points are not so neatly laid out but are embedded within the story in a typical HBR case study style. I had to read it twice.
Two options are presented to address the current problems and meet future objectives:
As usual, the expert opinions on this case are varied: George C. Halvorson, the chairman and CEO of Kaiser Permanente, is concerned that the CIO of Peachtree is not enthusiastic about SOA and recommends more work around defining the vision and identifying the objectives. Typical CEO speak, but it might help the CIO in better understanding the pros and cons of the two options. Monte Ford, senior VP and CIO at American Airlines, recommends SOA based on his own experience in adopting it. Randy Heffner, a VP at Forrester Research, comments that "by goofing around SOA as a product category instead of looking at it as a methodology, the CIO has missed key perspectives" and recommends SOA. John A. Kastor, a professor of medicine at the Univ. of Maryland School of Medicine, agrees with Peachtree CEO Max that indiscriminate standardization of all medical processes is not the right thing to do, but doesn't pick an option for modernizing the IT infrastructure.
The interesting thing to note is that none of the experts recommends a monolithic enterprise software system.
This post is about John Resig's cross-browser event registration functions, addEvent() and removeEvent(). He wrote these functions in response to an addEvent() recoding contest published at a well-known site for Web developers run by Peter-Paul Koch, with Scott Andrew LePera, Dean Edwards and John Resig himself as co-judges. The recoding contest was itself a response to the wide interest in Koch's blog post "addEvent() considered harmful", which outlined a problem with a widely used addEvent() function published by Scott Andrew LePera. It should also be noted that John Resig's entry was judged the winning entry.
Most web developers are familiar with the names mentioned in the previous paragraph. They have published books, maintain highly visible websites (the Google PageRanks of the websites/blogs maintained by Peter-Paul Koch, Dean Edwards, John Resig and Scott Andrew LePera are 9, 8, 7 and 7, respectively, at the time of this blog post), blog regularly and are generally considered gurus of client side web development.
I add all this background only to make the point that writing cross-browser DOM event handling code is non-trivial and has attracted the attention of the best minds in the field. With the feeling of comfort that comes from being in good hands, one would think that the problem, although considered difficult in the past, has been solved once and for all, and that the solution can be reused without much thought.
At least this is what I thought, till some strange behavior in my AJAX code that used John Resig's winning addEvent() and removeEvent() forced me to analyze each and every line of the whole program, and I discovered a couple of really interesting things about the addEvent() function. But before I get into my discovery, let us take a look at the addEvent() code from John Resig's page:
function addEvent( obj, type, fn ) {
    if ( obj.attachEvent ) {
        obj['e'+type+fn] = fn;
        obj[type+fn] = function() { obj['e'+type+fn]( window.event ); }
        obj.attachEvent( 'on'+type, obj[type+fn] );
    } else
        obj.addEventListener( type, fn, false );
}

As you can see, this code takes on two issues with IE's support for DOM events: (a) IE uses the non-standard method attachEvent() to register event handlers; and (b) it runs the handler code in the global context (i.e. the built-in variable this is set to the window object during handler execution) and not in the context of the element to which the handler is registered.
The removeEvent() code is very similar and doesn't need to be reproduced here.
So, what is the problem? Actually, none whatsoever, at least not until you have an event handler function that is a few tens of lines long and you pass that function as the last argument to addEvent(). If you are like me, you would think that the code uses the function's name, or some kind of address, to create a short string key for storing the handler reference within the DOM element object. But what really happens is that the whole text of the handler function, all those tens of lines of code, becomes part of the key (the keys are 'e' + type + fn and type + fn, with fn converted to its source text). In my code I had a key longer than 2000 characters! This in itself would not be much of a problem if the key were created only once during registration and then used for lookup during handler execution, though even lookups in a hash table with very long string keys probably tax the JavaScript interpreter badly. The killer is that the key gets recreated every time the handler runs. That can be very frequent if the event type is 'mousemove', and can easily result in excessive memory use and sluggish behavior.
"This doesn't sound like an insurmountable problem," you may say, "just wrap your long function within another function that simply invokes the long function. This way the addEvent() code will use the body of the wrapper function for forming the key and avoid creation of long strings."
Actually, this is very similar to what I tried, my motivation being two-fold: reduce the length of the code that gets used as part of the key, and also pass an argument at the time of event handler registration. The wrapper creation function looked something like this:
function create_handler(lfunc, arg1){
    return function(event){
        return lfunc.call(null, event || window.event, arg1);
    }
}

And I used it as follows:
function long_function(event, arg1){
    ... tens of lines of code ...
}
addEvent(obj, 'mousemove', create_handler(long_function, arg1));

which, actually, ended up creating the same fixed text for every wrapper: "function(event){ return lfunc.call(null, event || window.event, arg1); }". As the key is created by concatenating the event type and the function text, the same key gets created for different handlers whenever the event type is the same, causing one registration to overwrite another! This actually happened in my code. So even the winning entry has skeletons in its cupboard. It is not that every use will result in a broken program, but there certainly are situations where it falls short. In fact, this is true of most library functions, and it is always good practice to know not only the interface and purpose but also the underlying assumptions and how the thing actually works. To be fair to the author, John Resig, the recoding contest post had a strict set of requirements, and being reusable under all conditions was not one of them.
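For what it is worth, both the long keys and the collisions go away if the key is derived from a short unique id stamped on the handler function instead of its source text. Here is a sketch of that idea (my own variation, similar in spirit to the handler guid that libraries like jQuery use internally, and not one of the contest entries):

function addEvent( obj, type, fn ) {
    if ( obj.attachEvent ) {
        // stamp each distinct handler once with a short unique id
        if ( !fn.$guid ) fn.$guid = addEvent.guid++;
        var key = type + fn.$guid; // short, and unique per handler
        // the wrapper closes over fn directly, so no long key is
        // rebuilt on every event, and 'this' is fixed up as before
        obj[key] = function() { fn.call( obj, window.event ); };
        obj.attachEvent( 'on' + type, obj[key] );
    } else
        obj.addEventListener( type, fn, false );
}
addEvent.guid = 1;

removeEvent() would then detach obj[type + fn.$guid] in the same way.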