Random notes on software, programming and languages.
By Adrian Kuhn

Pharo Superpower: Respond to any Message


January 19th, 2010

For the fifth week in a row we’re stepping into the Pharo superpowers booth. Today we shall learn how to create objects that respond to any message. That is, objects that respond to a message without implementing a corresponding method. Again, as with sending any message, this superpower can be used for the good (if used with care) and I will thus discuss an example that I consider good use below.

When a message is sent to a Smalltalk object, the message name is looked up in the method dictionary of the object’s class and its superclasses. If a method whose name matches the message is found, that method is executed. However, if no matching method is found a special message is sent to the object, which is

#doesNotUnderstand:

By default, the implementation of #doesNotUnderstand: opens a debugger (or more precisely, the pre-debugger dialog that we all know from test-driven development). However, we are free to override #doesNotUnderstand: and thus respond to any unknown message.

As a (dadaistic) example, let’s implement a Lorem ipsum object

Object subclass: #Lorem instanceVariableNames: 'expects'

with an ipsum constructor

Lorem class >> ipsum
    ^ self new

and the following two methods

Lorem >> initialize
    expects := #(dolor sit amet "to be continued ad nauseam..." nil)

Lorem >> doesNotUnderstand: aMessage
    ^ aMessage selector == expects first
        ifTrue: [ expects := expects allButFirst. self ]
        ifFalse: [ super doesNotUnderstand: aMessage ]

So, if you ever doubted that virtually any English sentence is valid Smalltalk, here is your proof :)

Lorem ipsum dolor sit amet.

This executes as valid Smalltalk code, without ever having implemented a #dolor, #sit or #amet method! If, however, we deviate from the canonical Lorem ipsum sequence, we’ll get the usual MessageNotUnderstood error.

[ Lorem ipsum dolor zork ] should signal: MessageNotUnderstood.

As a more sensible example, let’s consider a list that responds to any message understood by all of its elements.

OrderedCollection subclass: #Group.

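Group >> eachRespondsTo: aSelector
    "An empty group responds to nothing (allSatisfy: alone answers true on an empty collection)."
    ^ self notEmpty and: [ self allSatisfy: [ :each | each respondsTo: aSelector ] ]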

Group >> doesNotUnderstand: aMessage
    ^ (self eachRespondsTo: aMessage selector)
        ifTrue: [ self collect: [ :each | aMessage sendTo: each ] ]
        ifFalse: [ super doesNotUnderstand: aMessage ]

As you can see, the implementation of #doesNotUnderstand: follows the same pattern as above. We check whether we want to handle the message, and if not, we delegate to the default implementation in Object (which will open a pre-debugger dialog).

Keen readers might have already noted a limitation of the above approach: when you override #doesNotUnderstand: but not #respondsTo:, your object will respond to a new message (by means of #doesNotUnderstand:) but still insist that it does not respond to that message when queried with #respondsTo:.

So we’ll have to override #respondsTo: as well

Group >> respondsTo: aSelector
    ^ (super respondsTo: aSelector) or: [ self eachRespondsTo: aSelector ]

It is a sad but true fact, that over 90% of all #doesNotUnderstand: overriders that you’ll find out there do not override #respondsTo: as well—even though they should!

So now our new class is ready for a bunch of expectations (please refer to Phexample for more details on expectation matchers)

g := Group new.
g should not respondTo: #x.
[ g x ] should raise: MessageNotUnderstood.
g add: 2 @ 3.
g add: 3 @ 4.
g add: 1 @ 2.
g should respondTo: #x.
g x should beSameSequence: #(2 3 1).
g y should beSameSequence: #(3 4 2).

BTW, if you are an OSX user and looking for a language that provides this feature by default, take a look at F-Script by Philippe Mougin. F-Script also offers a plethora of awesome features beyond projection of messages, for example it allows you to manipulate the Cocoa objects of any OSX application—at runtime!

As a best practice, you should only override #doesNotUnderstand: and #respondsTo: on your own classes. Just imagine what might happen when two or more stakeholders attempt to override #doesNotUnderstand: in, for example, Collection. Only one of the extensions will eventually remain, leaving the system in an undefined state where each extension has overwritten the other.

If you know more good (or evil, uoarharhar) uses of #doesNotUnderstand: share them in the comments.

Hackety hacking!

Pharo Superpower: Send Any Message


January 12th, 2010

In Pharo Smalltalk you may send any message even if its name is not known at compile time. Sending any message is one of the superpowers that can be used for the good, even when doing application programming, therefore I will discuss best practices in the end.

First of all, recall that “sending a message” is Smalltalk jargon for calling a method. Since sending a message is synchronous in Smalltalk, i.e. it blocks until the receiver returns, it is basically the same as a method call and the actual difference is thus of a philosophical nature only. (There is an implementation difference deep down at the language’s core, but that shall not be discussed today as it does not matter to programmers.)

BTW, this is the fourth post in the Superpower series.

There are many ways to send any message, mainly due to optional and variable arguments which are not well supported by Smalltalk syntax. The most basic form is

object perform: #symbol withArguments: anArray

Let’s consider a real example

string := 'Lorem'.
string perform: #size. "=> 5"
string perform: #at: with: 1. "=> $L"
string perform: #copyFrom:to: with: 2 with: 4. "=> 'ore'"

If the number of arguments is not known at compile time, we may use

string perform: aSymbol withArguments: anArray.
string perform: aSymbol withEnoughArguments: anArray.

The first expects that the array matches the arity (number of arguments) of the target method, the latter will just use as many arguments as are required. This is most useful to send a message with optional arguments.
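For readers more at home in Java, the closest analogue to #perform: is reflection. The following is only a rough sketch for comparison, not a Pharo API: the helper name perform and the class PerformSketch are made up for illustration, and unlike #perform: the Java lookup needs the parameter types spelled out.

```java
import java.lang.reflect.Method;

class PerformSketch {
    // Hypothetical helper: look up a public method by name and parameter
    // types at runtime, then invoke it on the receiver.
    static Object perform(Object receiver, String selector, Class<?>[] types, Object... args) {
        try {
            Method method = receiver.getClass().getMethod(selector, types);
            return method.invoke(receiver, args);
        } catch (ReflectiveOperationException e) {
            // Roughly the counterpart of MessageNotUnderstood.
            throw new IllegalStateException("cannot perform " + selector, e);
        }
    }

    public static void main(String[] unused) {
        String string = "Lorem";
        // Java counterparts of #size, #at: and #copyFrom:to: (Java indices are zero-based).
        System.out.println(perform(string, "length", new Class<?>[0]));
        System.out.println(perform(string, "charAt", new Class<?>[] { int.class }, 0));
        System.out.println(perform(string, "substring", new Class<?>[] { int.class, int.class }, 1, 4));
    }
}
```

As with #perform:, the method name is just a string here, so an IDE's "find callers" will not see these invocations.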

So when is sending any message for the good or for the evil?

Whenever possible, try to avoid using #perform: because it is less readable. When a reader of your program looks at a #perform: it is not obvious which message is being sent at runtime. Also, messages that are sent with #perform: will not be shown when browsing all senders of a message. There is one subtle difference here: if the dynamically sent message is stored somewhere as a symbol, at least that symbol will show up when looking for senders. If, however, the dynamically sent message is composed using string concatenation, it won’t show up at all. It might even seem as if its implementers are never used, which can be very confusing to the reader.

For all the above reasons you should only use #perform: when you have good reason to do so. And if you use it, make sure that the dynamically sent messages are all stored as symbols somewhere else. Best of all, make sure that all code involved in dynamically sending messages is encapsulated by one single class.

I will provide an example from Hapax’s clustering algorithm. When you do hierarchical clustering, there are different ways to link small clusters into large clusters. The call to this linkage method is buried deep down in the internals of the clustering algorithm, so the ClusteringEngine class uses a strategy pattern to pick the right linkage method. The choice of strategy is stored as a symbol in an instance variable and then used as follows

Object subclass: #ClusteringEngine
    instanceVariableNames: 'distanceMatrix dendrogram linkage'
    classVariableNames: ''

ClusteringEngine >> linkage: aSelector
    linkage := aSelector

ClusteringEngine >> linkage
    ^ linkage

ClusteringEngine >> allLinkageSelectors
    ^ #( averageLinkage centroid completeLinkage meanLinkage singleLinkage wardsMethod )

ClusteringEngine >> run
    (distanceMatrix size - 1) timesRepeat:
        [self findMinimum.
        self perform: linkage].

ClusteringEngine >> averageLinkage
    "implementation omitted..."

"et cetera..."

All code is encapsulated within one class such that a reader can find it all in one place when browsing the source code.

Any other use of #perform:, in particular when string concatenation of selectors is involved, is evil and should be limited to library design, if used at all.

A note regarding performance: using #perform: is as fast as sending a message the normal way. So contrary to popular belief there is no performance penalty, at least not in Pharo Smalltalk. In other dialects that do use JIT compilation there might be severe performance penalties though.

Hackety hacking!

Imagine, IDE search so faaaaast that…


January 9th, 2010

Imagine an IDE where search were so fast that it became the sole means of navigation.

In such a system, one would not write //TODO but just this.todo(), since browsing the callers of todo is so fast that it is faster than using a dedicated task view. The system might even be hidden from your Yahoogle query for “fast search in IDE”, since in such a system search is so fast and easy to use that it ceases to be used as a verb. For example, the devs might speak of “browse callers” rather than “search callers” since no intermediate search step is between them and their need.

I can see some of my readers smile now :) …because…

There is such a system. It is little known since it predates the invention of today’s filesystems and has never made the transition to file based software development itself. It is Smalltalk, the old lady of dynamic programming languages.

In the Smalltalk IDE, “browse callers of” and “browse implementers of” are the main means of navigation. Executing these actions opens, in the same instant, a new editor window with all callers (or implementers) of a given method. No spinning wheel, no tree list of search results, no browsing of results even, you are right there and can start editing.

NB: in fact, seasoned Smalltalk devs even omit “browse” and just say “senders of” (in message-oriented languages such as Smalltalk and Ruby, objects don’t call methods but send each other messages) and “implementers of” instead of proper verbs.

For the alert reader: yes, graphical UIs predate hierarchical file systems. And even better, Smalltalk invented graphical windows. But we’ll stop the children’s games here. It does not matter who was first, but who makes the best out of it. And there, the winner is obvious.

The point I want to make is rather that there is a system out there with a 30-year head start in IDE search. So as researchers we can go and learn from the experience of that community, and then use what we learned to advance the state of current IDEs beyond it. Breakpoints, for example, are also just another method call in Smalltalk. You insert a call to #halt() wherever you want, and to view the list of current breakpoints you browse all callers of halt. Again, no need for a dedicated view. As you see, search-driven development simplifies your tool set.

Of course, not all your navigation needs can be satisfied by hyperjumping. Sometimes you need to drill down from top-level packages to methods. To do so Eclipse offers the code browsing perspective, which is however never used because the package explorer view offers the same drill-down capabilities without change of perspective. In Smalltalk we get a code browsing interface as well. In fact, Eclipse inherited that perspective from its predecessor VisualAge which was IBM’s prime Smalltalk IDE before they switched to Java.

So before I start to tell the story of how Eclipse’s elimination of the compilation step was inherited from VisualAge as well, lemme summarize this post.

  • Search so fast that it disappears from the list of “verbs” in your IDE.
  • Search so fast that it is called “browse code” instead.
  • Search so fast that developers, for example, use method calls as TODO markers.
  • Plus a drill-down interface for the remaining navigation needs that are not covered by hyperjumping.

The comparison with compilation is actually quite nice: With Eclipse “compile” and “build” ceased to be used as verbs in Java development. Now devs just execute code, done. This feature was brought to Java from Smalltalk. It would be awesome if we could achieve the same kind of “knowledge transfer” for IDE search.

I’d say that our job as providers of IDE search is only done when search ceases to be used as a verb in software development.

— that said, paper submission for SUITE is open until January 19, 2010.

 

Sneak Peek: Codemap User Study


January 8th, 2010

Here is a preview of a recently submitted paper on Software Cartography. We report on preliminary results of an ongoing user study with both students and professional developers. Results are mixed and led us to revise the assumption that lexical similarity is sufficient to lay out the map. We are now working on a new distance metric that includes the ideal structural proximity proposed by the “Law of Demeter”. Also we are looking into new layout algorithms. For example, anchored MDS would allow developers to rearrange the map according to their system’s architecture.

NB if you are a professional developer from Switzerland, we encourage you to participate in our user study. Please contact David or me for more information.

[Image: Wordle preview of the paper “Towards an Improved Mental Model of Software Developers through Cartographic Visualization”, ICSE NIER 2010]

The preview was created with Wordle, an idea that I owe to Tom Zimmermann.

Pharo Superpower: Change of Class


January 5th, 2010

Smalltalk objects are organized in class hierarchies. But still, an object may change its class membership! Objects are able to move between classes and hierarchies at runtime.

In this post I shall show how to transform a rabbit into a lighthouse!

First let us create two classes Rabbit and Lighthouse with the same format. We use Pharo’s public API to do so.

Object subclass: #Rabbit instanceVariableNames: 'age size'.
Object subclass: #Lighthouse instanceVariableNames: 'latitude longitude'.

Let’s check that both classes are of the same format, but neither is a subclass of the other. (Please refer to Phexample for more information on expectation matchers.)

Rabbit format should = Lighthouse format.
Rabbit should not beKindOf: Lighthouse.
Lighthouse should not beKindOf: Rabbit.

Now, let’s create a rabbit and transform it into an instance of a lighthouse.

r := Rabbit new.
r class should = Rabbit.
r primitiveChangeClassTo: Lighthouse new.
r class should = Lighthouse.

And, abra cadabra, we’ve turned r from a Rabbit into a Lighthouse!

To do so we use the #primitiveChangeClassTo: method. This method expects an instance of the target class as argument. However, this instance is only used to determine the target class of the receiver. (In other Smalltalk dialects, as for example Cincom Smalltalk, #changeClassTo: expects a class rather than an instance. We can only guess why Pharo and its sibling Squeak require an instance of the target class. My guess is that this is required as proof that the target class is a valid class, since otherwise it would not have been possible to create an instance of it.)

Hackety hacking!

Post scriptum: please note that it is not possible to change the class of a point to that of an association even though both have two instance variables. This is because both are so-called “compact” classes with a smaller header. We’ll cover that in another superpowers issue.

How-to Revert Safari to Version 4.0.2 and Thus Fix Inquisitor


January 5th, 2010

Recent versions of Safari don’t play nice with the Inquisitor search plugin. The only sensible fix seems to be to revert Safari to an older version.

Inquisitor is supposed to plug into Safari’s search bar to display a list of search engines instead of Google suggestions. However, after installing Safari 4.0.3 (or higher) that won’t work anymore. Google suggestions are being displayed on top of the Inquisitor drop-down!

Here is how to revert Safari back to version 4.0.2 once you’ve installed the latest version by mistake:

  1. Get a copy of the Safari 4.0.2 installer. This ain’t as simple as it sounds. Apple’s website offers the latest version only, and all usual download sites refer to Apple’s website only. You have to search the interwebs for a file called Safari4.0.2Leo.dmg. I was lucky enough to find a copy on an anonymous website, but Rapidshare or friends should do fine as well.
  2. Once you have the file, you should double-check its SHA-1 digest against the one given in the Apple security announcement APPLE-SA-2009-07-08-1. Execute openssl sha1 Safari4.0.2Leo.dmg on the command line and compare the output to
    SHA1(Safari4.0.2Leo.dmg)= 48676afbb5c5bacac8610ba13f6851d3b266cb69

    If your output is not the same, the file has been tampered with and you should not install it!

  3. Next you’ll have to patch the version number of Safari’s current installation. Otherwise the installer will refuse to revert Safari. To do so open the files
    /System/Library/Frameworks/WebKit.framework/Versions/A/Resources/Info.plist
    /Applications/Safari.app/Contents/version.plist

    and replace the version numbers with lower numbers. Anything lower than 5530.19 should do fine. You might have to change access rights to be able to edit these files.

  4. Now you may finally run the Safari 4.0.2 installer!
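If you prefer to script the digest check from step 2 rather than compare the hex strings by eye, the comparison could be sketched in Java as follows. This is only an illustration: the class name Sha1Check and the helper sha1Hex are made up, the expected digest is the one quoted above, and the file is assumed to sit in the current directory unless a path is passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha1Check {
    // Hypothetical helper: SHA-1 digest of the given bytes as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Digest as quoted in step 2 above.
        String expected = "48676afbb5c5bacac8610ba13f6851d3b266cb69";
        Path file = Paths.get(args.length > 0 ? args[0] : "Safari4.0.2Leo.dmg");
        if (!Files.exists(file)) {
            System.out.println("File not found: " + file);
            return;
        }
        // Reads the whole file into memory, which is fine for an installer image.
        String actual = sha1Hex(Files.readAllBytes(file));
        System.out.println(actual.equals(expected)
                ? "Digest matches."
                : "Digest MISMATCH, do not install: " + actual);
    }
}
```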

 

PS: As an Inquisitor user you might also be interested in turning off Inquisitor’s Yahoo suggestions. Being tracked by Yahoo instead of Google is “vom Regen in die Traufe geraten” as we’d say in German. To cover up your traces, open /etc/hosts and add the following lines

0.0.0.0 sugg.search.yahoo.net
0.0.0.0 clients1.google.com

This blocks the hosts that are used to retrieve both Google and Yahoo suggestions.

Dependency Injection vs. Virtual Classes


December 30th, 2009

Every time you create an instance of a class by using the new keyword, you tightly couple your code to that class, and you will not be able to substitute that particularly implementation with a different one (at least not without recompiling the code).

Taken from an answer on stackoverflow that motivates dependency injection. Dependency injection however is not the real solution, it is rather a workaround for missing virtual classes. In this post I will show you why.

For example in

public class MyClass {
  public String sendMessage(String contents) {
    return new MailService().sendMessage(contents);
  }
}

you cannot use a different MailService. This might not seem limiting at first sight, but think for example of testing. Your tests might be better off using a stub service that does not send actual emails.

This can be, and is being, solved with dependency injection, yes. But the actual problem is rather a language problem: Classes are not virtual. While methods are virtual and thus looked up at runtime, classes are not.

In the following example, we can change the mail service by subclassing and overriding a method. This works because methods are virtual and thus late bound.

public class MyClass {
  public String sendMessage(String contents) {
    return newMailService().sendMessage(contents);
  }
  public MailService newMailService() {
    return new MailService();
  }
}

What a difference a single whitespace can make!

When newMailService is called, that method name is looked up in the context of the calling method. The context of the calling method is its class. Therefore, by subclassing the containing class we can provide a different mail service.
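To make the subclassing step concrete, here is a minimal sketch of how a test could swap in a stub. The bodies of MailService and the classes StubMailService and TestableMyClass are assumptions made up for this illustration; only MyClass follows the snippet above.

```java
// Assumed stand-in for the real service; the actual MailService is not shown in this post.
class MailService {
    String sendMessage(String contents) {
        return "sent: " + contents;
    }
}

// Hypothetical stub that never sends an actual email.
class StubMailService extends MailService {
    @Override
    String sendMessage(String contents) {
        return "stubbed: " + contents;
    }
}

class MyClass {
    public String sendMessage(String contents) {
        return newMailService().sendMessage(contents);
    }

    // Virtual factory method: subclasses may answer a different service.
    public MailService newMailService() {
        return new MailService();
    }
}

// Hypothetical subclass used in tests only.
class TestableMyClass extends MyClass {
    @Override
    public MailService newMailService() {
        return new StubMailService();
    }
}

class FactoryMethodDemo {
    public static void main(String[] args) {
        System.out.println(new MyClass().sendMessage("hello"));
        System.out.println(new TestableMyClass().sendMessage("hello"));
    }
}
```

The call inside sendMessage is unchanged in both cases; only the late-bound lookup of newMailService decides which service gets created.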

Imagine the same would happen with class names. Whenever we call new MailService() then the class name would be looked up in the containing context. Imagine further the lookup would not stop with the class but bubble up to the containing package. Then, by “subclassing” that package we might provide a different mail service.

Subclassing a package is nonsense terminology, let’s rather say that we instantiate a package … and let’s further assume that packages and classes are the same, and that our system is built of nested classes.

public class MyApplication {
  public class MyClass {
    public String sendMessage(String contents) {
      return new "MailService"().sendMessage(contents);
    }
  }
}

I have put the class name of MailService in quotes to remind you that it is the name of a virtual class name and thus looked up in the containing context.

Imagine further that we could pass the binding of classes to a class when creating it, then we could instantiate two versions of our application: one for real, and one for testing.

MyApplication forTesting = new MyApplication(MailService = TestMailer);
MyApplication forReal = new MyApplication(MailService = InterwebMailer);
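Java has no virtual classes, so the two lines above are not valid Java. The closest approximation I can sketch in today's Java is to pass a factory for the class to the outer object and let the inner class look it up there. Everything in this sketch is an assumption for illustration: the Mailer interface, the bodies of TestMailer and InterwebMailer, and the Supplier parameter that stands in for the binding MailService = ….

```java
import java.util.function.Supplier;

// Hypothetical common type of all mail services.
interface Mailer {
    String sendMessage(String contents);
}

class InterwebMailer implements Mailer {
    public String sendMessage(String contents) {
        return "interweb: " + contents;
    }
}

class TestMailer implements Mailer {
    public String sendMessage(String contents) {
        return "test: " + contents;
    }
}

class MyApplication {
    // Stands in for the binding of the virtual class name "MailService".
    private final Supplier<Mailer> mailService;

    MyApplication(Supplier<Mailer> mailService) {
        this.mailService = mailService;
    }

    // Inner class: resolves the mail service in its containing application.
    class MyClass {
        String sendMessage(String contents) {
            return mailService.get().sendMessage(contents);
        }
    }
}

class VirtualClassDemo {
    public static void main(String[] args) {
        MyApplication forTesting = new MyApplication(TestMailer::new);
        MyApplication forReal = new MyApplication(InterwebMailer::new);
        System.out.println(forTesting.new MyClass().sendMessage("hello"));
        System.out.println(forReal.new MyClass().sendMessage("hello"));
    }
}
```

This is of course just dependency injection with a factory again, which is the point of the post: without virtual classes the binding has to be threaded through by hand.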

Testing and production are not the only possible contexts. Imagine for example an XML parser. Typically it creates DOM nodes, but maybe you would prefer your own implementation of nodes. Maybe even the DAO objects of your application? There we go

XMLParser custom = new XMLParser(Node = Application.DAO);
XMLDocument doc = custom.new XMLDocument("example.xml");

NB, the second line is actually valid Java syntax: this is how you create an instance of an inner class when you are not in the same file as its outer class.

— So this is how dependency injection should really work!

And it is not even that far from reality. Both Beta and Newspeak (the new language by Gilad Bracha) do have virtual classes and both do have nested classes, and in both languages what I just described is how you decouple your packages.

And, hup hup hup, another design pattern has been unmasked as a missing language feature :)
