Random notes on software, programming and languages.
By Adrian Kuhn

Archive for the ‘Smalltalk’ Category

Pharo Superpower: Respond to any Message


Tuesday, January 19th, 2010

For the fifth week in a row we’re stepping into the Pharo superpowers booth. Today we shall learn how to create objects that respond to any message. That is, objects that respond to a message without implementing a corresponding method. Again, as with sending any message, this superpower can be used for the good (if used with care) and I will thus discuss an example that I consider good use below.

When a message is sent to a Smalltalk object, the message name is looked up in the method dictionary of the object’s class and its superclasses. If a method whose name matches the message is found, that method is executed. However, if no matching method is found, a special message is sent to the object instead, which is

#doesNotUnderstand:

By default, the implementation of #doesNotUnderstand: opens a debugger (or more precisely, the pre-debugger dialog that we all know from test-driven development). However, we are free to override #doesNotUnderstand: and thus respond to any unknown message.

As a (dadaistic) example, let’s implement a Lorem ipsum object

Object subclass: #Lorem instanceVariableNames: 'expects'

with an ipsum constructor

Lorem class >> ipsum
    ^ self new

and the following two methods

Lorem >> initialize
    expects := #(dolor sit amet "to be continued ad nauseam..." nil)

Lorem >> doesNotUnderstand: aMessage
    ^ aMessage selector == expects first
        ifTrue: [ expects := expects allButFirst. self ]
        ifFalse: [ super doesNotUnderstand: aMessage ]

So, if you ever doubted that virtually any English sentence is valid Smalltalk, here is your proof :)

Lorem ipsum dolor sit amet.

This executes as valid Smalltalk code, without ever having implemented any #dolor, #sit or #amet method! If, however, we deviate from the canonical Lorem ipsum sequence, we’ll get the usual MessageNotUnderstood error.

[ Lorem ipsum dolor zork ] should signal: MessageNotUnderstood.
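
For readers who think in Java, the closest analogue I can come up with is a dynamic proxy. This is only a rough sketch under a few assumptions: the LoremIpsum interface is made up for this example (Java insists that the selectors are declared somewhere), and UnsupportedOperationException stands in for MessageNotUnderstood.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical interface: unlike Smalltalk, Java needs the selectors declared up front.
interface LoremIpsum {
    LoremIpsum dolor();
    LoremIpsum sit();
    LoremIpsum amet();
}

// Plays the role of #doesNotUnderstand: every call on the proxy lands in invoke().
final class Lorem implements InvocationHandler {
    private final Deque<String> expects =
        new ArrayDeque<>(List.of("dolor", "sit", "amet"));

    static LoremIpsum ipsum() {
        return (LoremIpsum) Proxy.newProxyInstance(
            LoremIpsum.class.getClassLoader(),
            new Class<?>[] { LoremIpsum.class },
            new Lorem());
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        // Accept the call only if it is the next word of the canonical sequence.
        if (method.getName().equals(expects.peekFirst())) {
            expects.removeFirst();
            return proxy; // answer the receiver, like "self" in the Smalltalk version
        }
        // Note: toString/hashCode/equals on the proxy also land here and are rejected.
        throw new UnsupportedOperationException(
            "message not understood: " + method.getName());
    }
}
```

With this in place, Lorem.ipsum().dolor().sit().amet() runs through, while Lorem.ipsum().dolor().amet() throws. None of the three methods is implemented anywhere, but the interface still has to list them, which is exactly the part that Smalltalk lets you skip.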

As a more sensible example, let’s consider a list that responds to any message understood by all its elements.

OrderedCollection subclass: #Group.

Group >> eachRespondsTo: aSelector
    ^ self allSatisfy: [ :each | each respondsTo: aSelector ]

Group >> doesNotUnderstand: aMessage
    ^ (self eachRespondsTo: aMessage selector)
        ifTrue: [ self collect: [ :each | aMessage sendTo: each ] ]
        ifFalse: [ super doesNotUnderstand: aMessage ]

As you can see, the implementation of #doesNotUnderstand: follows the same pattern as above. We check whether we want to handle the message, and if not, we delegate to the default implementation in Object (which will open a pre-debugger dialog).

Keen readers might have already noted a limitation of the above approach: when you override #doesNotUnderstand: but not #respondsTo:, your object will respond to a new message (by means of #doesNotUnderstand:) but still insist that it does not respond to that message when queried with #respondsTo:.

So we’ll have to override #respondsTo: as well

Group >> respondsTo: aSelector
    ^ (super respondsTo: aSelector) or: [ self eachRespondsTo: aSelector ]

It is a sad but true fact that over 90% of all #doesNotUnderstand: overriders that you’ll find out there do not override #respondsTo: as well, even though they should!

So now our new class is ready for a bunch of expectations (please refer to Phexample for more details on expectation matchers)

g := Group new.
g should not respondTo: #x.
[ g x ] should raise: MessageNotUnderstood.
g add: 2 @ 3.
g add: 3 @ 4.
g add: 1 @ 2.
g should respondTo: #x.
g x should beSameSequence: #(2 3 1).
g y should beSameSequence: #(3 4 2).

BTW, if you are an OSX user and looking for a language that provides this feature by default, take a look at F-Script by Philippe Mougin. F-Script also offers a plethora of awesome features beyond projection of messages, for example it allows you to manipulate the Cocoa objects of any OSX application—at runtime!

As a best practice, you should only override #doesNotUnderstand: and #respondsTo: on your own classes. Just imagine what might happen when two or more stakeholders attempt to override #doesNotUnderstand: in, for example, Collection: only one of the extensions will eventually remain, leaving the system in an undefined state with overloaded extensions.

If you know more good (or evil, uoarharhar) uses of #doesNotUnderstand: share them in the comments.

Hackety hacking!

Pharo Superpower: Send Any Message


Tuesday, January 12th, 2010

In Pharo Smalltalk you may send any message even if its name is not known at compile time. Sending any message is one of the superpowers that can be used for the good, even when doing application programming, therefore I will discuss best practices at the end.

First of all, recall that “sending a message” is Smalltalk jargon for calling a method. Since sending a message is synchronous in Smalltalk, i.e. it blocks until the receiver returns, it is basically the same as a method call, and the actual difference is thus of a philosophical nature only. (There is an implementation difference deep down at the language’s core, but that shall not be discussed today as it does not matter to programmers.)

BTW, this is the fourth post in the Superpower series.

There are many ways to send any message, mainly due to optional and variable arguments which are not well supported by Smalltalk syntax. The most basic form is

object perform: #symbol withArguments: anArray

Let’s consider a real example

string := 'Lorem'.
string perform: #size. "=> 5"
string perform: #at: with: 1. "=> $L"
string perform: #copyFrom:to: with: 2 with: 4. "=> 'ore'"

If the number of arguments is not known at compile time, we may use

string perform: aSymbol withArguments: anArray.
string perform: aSymbol withEnoughArguments: anArray.

The first expects that the array matches the arity (number of arguments) of the target method; the latter uses only as many arguments as are required. This is most useful to send a message with optional arguments.
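
For comparison, here is roughly what #perform:withArguments: looks like when spelled out with Java reflection. Treat it as a sketch: Perform and its perform method are hypothetical helpers written for this post, the lookup matches on name and arity only (so it gets confused by overloads of the same arity), and UnsupportedOperationException again stands in for MessageNotUnderstood.

```java
import java.lang.reflect.Method;

// Hypothetical helper mimicking "receiver perform: selector withArguments: args".
final class Perform {
    static Object perform(Object receiver, String selector, Object... args) {
        // Look the method up by name and arity only (an assumption of this sketch).
        for (Method m : receiver.getClass().getMethods()) {
            if (m.getName().equals(selector) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(receiver, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new UnsupportedOperationException("message not understood: " + selector);
    }
}
```

Mind that Java indices are zero-based, so the three sends from above become Perform.perform("Lorem", "length"), Perform.perform("Lorem", "charAt", 0) and Perform.perform("Lorem", "substring", 1, 4), answering 5, 'L' and "ore" respectively.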

So when is sending any message for the good or for the evil?

Whenever possible, try to avoid using #perform: because it is less readable. When a reader of your program looks at a #perform:, it is not obvious which message is being sent at runtime. Also, messages that are sent with #perform: will not be shown when browsing all senders of a message. There is one subtle difference here: if the dynamically sent messages are stored somewhere as symbols, at least those symbols will show up when looking for senders. If, however, the dynamically sent message is composed using string concatenation, it won’t show up at all. It might even seem as if its implementers are never used, which can be very confusing to the reader.

For all the above reasons you should only use #perform: when you have good reason to do so. And if you use it, make sure that the dynamically sent messages are all stored as symbols somewhere else. Best of all, make sure that all code involved in dynamically sending messages is encapsulated by one single class.

I will provide an example from Hapax’s clustering algorithm. When you do hierarchical clustering, there are different ways to link small clusters into large clusters. The call to this linkage method is buried deep down in the internals of the clustering algorithm, so the ClusteringEngine class uses a strategy pattern to pick the right linkage method. The choice of strategy is stored as a symbol in an instance variable and then used as follows

Object subclass: #ClusteringEngine
    instanceVariableNames: 'distanceMatrix dendrogram linkage'
    classVariableNames: ''

ClusteringEngine >> linkage: aSelector
    linkage := aSelector

ClusteringEngine >> linkage
    ^ linkage

ClusteringEngine >> allLinkageSelectors
    ^ #( averageLinkage centroid completeLinkage meanLinkage singleLinkage wardsMethod )

ClusteringEngine >> run
    (distanceMatrix size - 1) timesRepeat:
        [self findMinimum.
        self perform: linkage].

ClusteringEngine >> averageLinkage
    "implementation omitted..."

"et cetera..."

All code is encapsulated within one class such that a reader can find it all in one place when browsing the source code.

Any other use of #perform:, in particular when string concatenation of selectors is involved, is evil and should be limited to library design, if used at all.

A note regarding performance: using #perform: is as fast as sending a message the normal way. So, contrary to popular belief, there is no performance penalty, at least not in Pharo Smalltalk. (In other dialects that do use JIT compilation there might be severe performance penalties, though.)

Hackety hacking!

Imagine, IDE search so faaaaast that…


Saturday, January 9th, 2010

Imagine an IDE where search were so fast that it became the sole means of navigation.

In such a system, one would not write //TODO but just this.todo(), since browsing the callers of todo is so fast that it is faster than using a dedicated task view. The system might even be hidden from your Yahoogle query for “fast search in IDE”, since in such a system search is so fast and easy to use that it ceases to be used as a verb. For example, the devs might speak of “browse callers” rather than “search callers” since no intermediate search step is between them and their need.

I can see some of my readers smile now :) …because…

There is such a system. It is little known since it predates the invention of today’s filesystems and has never made the transition to file based software development itself. It is Smalltalk, the old lady of dynamic programming languages.

In the IDE of Smalltalk “browse callers of” and “browse implementers of” are the main means of navigation. Executing these actions opens, in the same instant, a new editor window with all callers (or implementers) of a given method. No spinning wheel, no tree list of search results, no browsing of results even: you are right there and can start editing.

NB: in fact, seasoned Smalltalk devs even omit “browse” and just say “senders of” (in message-oriented languages such as Smalltalk and Ruby, objects don’t call methods but send each other messages) and “implementers of” instead of proper verbs.

For the alert reader: yes, graphical UIs predate hierarchical file systems. And even better, Smalltalk invented graphical windows. But we’ll stop the children’s games here. It does not matter who was first, but who makes the best out of it. And there, the winner is obvious.

The point I want to make is rather that there is a system out there with a 30-year head start in IDE search. So as researchers we can go and learn from the experience of that community, and then use what we learned to advance the state of current IDEs beyond it. Breakpoints, for example, are also just another method call in Smalltalk. You insert a call to #halt() wherever you want, and to view the list of current breakpoints you browse all callers of halt. Again, no need for a dedicated view. As you see, search-driven development simplifies your tool set.

Of course, not all your navigation needs can be satisfied by hyperjumping. Sometimes you need to drill down from top-level packages to methods. To do so, Eclipse offers the code browsing perspective, which is however never used because the package explorer view offers the same drill-down capabilities without a change of perspective. In Smalltalk we get a code browsing interface as well. In fact, Eclipse inherited that perspective from its predecessor VisualAge, which was IBM’s prime Smalltalk IDE before they switched to Java.

So before I start to tell the story of how Eclipse’s elimination of the compilation step was inherited from VisualAge as well, lemme summarize this post.

  • Search so fast that it disappears from the list of “verbs” in your IDE.
  • Search so fast that it is called “browse code” instead.
  • Search so fast that developers, for example, use method calls as TODO markers.
  • Plus a drill-down interface for the remaining navigation needs that are not covered by hyperjumping.

The comparison with compilation is actually quite nice: With Eclipse “compile” and “build” ceased to be used as verbs in Java development. Now devs just execute code, done. This feature was brought to Java from Smalltalk. It would be awesome if we could achieve the same kind of “knowledge transfer” for IDE search.

I’d say that our job as providers of IDE search is only done when search ceases to be used as a verb in software development.

— that said, paper submission for SUITE is open until January 19, 2010.

 

Pharo Superpower: Change of Class


Tuesday, January 5th, 2010

Smalltalk objects are ordered by class hierarchies. But still, an object may change its class membership! Objects are able to move between classes and hierarchies at runtime.

In this post I shall show how to transform a rabbit into a light house!

First let us create two classes Rabbit and Lighthouse with the same format. We use Pharo’s public API to do so.

Object subclass: #Rabbit instanceVariableNames: 'age size'.
Object subclass: #Lighthouse instanceVariableNames: 'latitude longitude'.

Let’s check that both classes are of the same format, but neither is a subclass of the other. (Please refer to Phexample for more information on expectation matchers.)

Rabbit format should = Lighthouse format.
Rabbit should not beKindOf: Lighthouse.
Lighthouse should not beKindOf: Rabbit.

Now, let’s create a rabbit and transform it into an instance of a light house.

r := Rabbit new.
r should beKindOf: Rabbit.
r primitiveChangeClassTo: Lighthouse new.
r should beKindOf: Lighthouse.

And, abra cadabra, we’ve turned r from a Rabbit into a Lighthouse!

To do so we use the #primitiveChangeClassTo: method. This method expects an instance of the target class as argument. However, this instance is only used to determine the target class of the receiver. (In other Smalltalk dialects, for example Cincom Smalltalk, #changeClassTo: expects a class rather than an instance. We can only guess why Pharo and its sibling Squeak require an instance of the target class. My guess is that this is required as proof that the target class is a valid class, since otherwise it would not have been possible to create an instance of it.)

Hackety hacking!

Post scriptum: please note that it is not possible to change the class of a point to that of an association, even though both have two instance variables. This is because both are so-called “compact” classes with a smaller header. We’ll cover that in another superpowers issue.

Dependency Injection vs. Virtual Classes


Wednesday, December 30th, 2009

“Every time you create an instance of a class by using the new keyword, you tightly couple your code to that class, and you will not be able to substitute that particular implementation with a different one (at least not without recompiling the code).”

Taken from an answer on stackoverflow that motivates dependency injection. Dependency injection, however, is not the real solution; it is rather a workaround for missing virtual classes. In this post I will show you why.

For example in

public class MyClass {
  public String sendMessage(String contents) {
    return new MailService().sendMessage(contents);
  }
}

you cannot use a different MailService. This might not seem limiting at first sight, but think for example of testing. Your tests might be better off using a stub service that does not send actual emails.

This can be, and is being, solved with dependency injection, yes. But the actual problem is rather a language problem: classes are not virtual. While methods are virtual and thus looked up at runtime, classes are not.

In the following example, we can change the mail service by subclassing and overriding a method. This works because methods are virtual and thus late bound.

public class MyClass {
  public String sendMessage(String contents) {
    return newMailService().sendMessage(contents);
  }
  public MailService newMailService() {
    return new MailService();
  }
}

What a difference a single whitespace can make!

When newMailService is called, that method name is looked up in the context of the calling method. The context of the calling method is its class. Therefore, by subclassing the containing class we can provide a different mail service.
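
To make the override concrete, here is a minimal sketch of such a test-time subclass. StubMailService and MyClassUnderTest are made-up names, and MailService gets a trivial body here only so that the example is self-contained.

```java
// Stand-in for the real service; the body is an assumption of this sketch.
class MailService {
    String sendMessage(String contents) {
        return "sent: " + contents;
    }
}

// Hypothetical stub that does not send anything.
class StubMailService extends MailService {
    @Override
    String sendMessage(String contents) {
        return "stubbed: " + contents;
    }
}

class MyClass {
    public String sendMessage(String contents) {
        return newMailService().sendMessage(contents);
    }

    // Virtual, hence late bound: subclasses decide which service gets created.
    protected MailService newMailService() {
        return new MailService();
    }
}

// Hypothetical subclass used by the tests.
class MyClassUnderTest extends MyClass {
    @Override
    protected MailService newMailService() {
        return new StubMailService();
    }
}
```

Here new MyClass().sendMessage("hi") answers "sent: hi", whereas new MyClassUnderTest().sendMessage("hi") answers "stubbed: hi", without touching a single line of sendMessage.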

Imagine the same would happen with class names. Whenever we call new MailService() then the class name would be looked up in the containing context. Imagine further the lookup would not stop with the class but bubble up to the containing package. Then, by “subclassing” that package we might provide a different mail service.

Subclassing a package is nonsense terminology, let’s rather say that we instantiate a package … and let’s further assume that packages and classes are the same, and that our system is built of nested classes.

public class MyApplication {
  public class MyClass {
    public String sendMessage(String contents) {
      return new "MailService"().sendMessage(contents);
    }
  }
}

I have put the class name of MailService in quotes to remind you that it is the name of a virtual class and thus looked up in the containing context.

Imagine further that we could pass the binding of classes to a class when creating it, then we could instantiate two versions of our application: one for real, and one for testing.

Application forTesting = new Application(MailService = TestMailer);
Application forReal = new Application(MailService = InterwebMailer);
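
Plain Java has no virtual classes, so the two lines above do not compile. The closest approximation I can offer is to hand the outer class a constructor reference, which is of course nothing but dependency injection done by hand. In the sketch below the Mailer interface and the bodies of TestMailer and InterwebMailer are assumptions made for the sake of the example.

```java
import java.util.function.Supplier;

// Hypothetical common interface of the two mailers.
interface Mailer {
    String sendMessage(String contents);
}

class TestMailer implements Mailer {
    public String sendMessage(String contents) {
        return "test: " + contents;
    }
}

class InterwebMailer implements Mailer {
    public String sendMessage(String contents) {
        return "interweb: " + contents;
    }
}

class Application {
    // Plays the role of the "MailService" binding of the enclosing context.
    private final Supplier<Mailer> mailService;

    Application(Supplier<Mailer> mailService) {
        this.mailService = mailService;
    }

    // Inner class: resolves the binding through its enclosing Application.
    class MyClass {
        String sendMessage(String contents) {
            return mailService.get().sendMessage(contents);
        }
    }
}
```

With that, new Application(TestMailer::new).new MyClass().sendMessage("hi") answers "test: hi", and the same call on new Application(InterwebMailer::new) answers "interweb: hi". The lookup bubbles up from the inner class to the enclosing instance, which is at least a faint echo of the virtual class lookup described above.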

Testing and production are not the only possible contexts. Imagine for example an XML parser. Typically it creates DOM nodes, but maybe you would prefer your own implementation of nodes. Maybe even the DAO objects of your application? There we go

XMLParser custom = new XMLParser(Node = Application.DAO);
XMLDocument doc = custom.new XMLDocument("example.xml");

NB, the second line is actually valid Java syntax: this is how you create an instance of an inner class when you are not in the same file as its outer class.

— So this is how dependency injection should really work!

And it is not even that far from reality. Both Beta and Newspeak (the new language by Gilad Bracha) have virtual classes and nested classes, and in both languages what I just described is how you decouple your packages.

And, hup hup hup, another design pattern has been unmasked as a workaround for a missing language feature :)

Pharo Superpower: Use Anything as Class


Monday, December 28th, 2009

In Pharo Smalltalk, not only can you create anonymous classes at runtime, you can use anything as a class. You can create objects whose class is not a class. Mind boggling, ain’t it?

So if an object’s class is not a class, what is it then? Recall that in Smalltalk all classes are objects, thus if a class is not a class it is at least an object. When I first discovered this superpower I thought “this must be a bug in the virtual machine”. However, the Blue Book of Smalltalk-80 confirms that this is by design. The virtual machine of Smalltalk does not require that classes should inherit from Behavior.

In this post, I shall use an instance of Interval to create a new object whose class is … well, an instance of Interval rather than a proper class.

The choice of Interval is not a coincidence. In fact, we may only use objects as classes that have at least three instance variables. The first instance variable must refer to the superclass (which need not be a class either, but to keep things simple we’ll use Object in our example), the second instance variable must refer to a method dictionary, and the third instance variable must encode a magic number that specifies the class format.

g := Interval basicNew.
g instVarAt: 1 put: Object.
g instVarAt: 2 put: MethodDictionary new.
g instVarAt: 3 put: Object format.

Next we compile a method that implements primitive #70 into Interval. Primitive #70 can be used to create new instances. So we can use primitive #70 to create an instance of g.

Interval compile: 'primitive70 <primitive: 70>'.
gg := g primitive70.

Let’s verify that gg is really an instance of g. (Please refer to Phexample for more information on expectation matchers.)

gg class should beSameAs: g.
gg class should not beKindOf: Behavior.

Now we can add methods to the dictionary in g’s second instance variable and they become available on gg. We’ll add a method #zork that returns self.

methods := g instVarAt: 2.
methods at: #zork put: CompiledMethod toReturnSelf.
gg zork should = gg.

Unfortunately we cannot write gg should respondTo: #zork since g is not a real class and thus gg cannot send #canUnderstand: to g. Also you might not be able to print or inspect gg for the same reason, depending on the version of Pharo you are using.

Hackety hacking!

Pharo Superpower: Create Anonymous Class


Monday, December 21st, 2009

One of the superpowers in Pharo Smalltalk is to create new classes at runtime. Actually, whenever you accept a class definition in the class browser, that very definition is evaluated to create a new class. And since all development in Smalltalk happens eo ipso at runtime, the accepted class definition creates a class at runtime. Superpower for the masses, it can be done.

In this post, I shall cover how to create anonymous classes at runtime. As an example, we’ll create an anonymous subclass of Point that extends Point with a color attribute.

To create an anonymous class, you’ll first need to create an anonymous metaclass. (Hey, nobody said superpowers ain’t confusing!) The newly created metaclass needs a superclass, a method dictionary and a magic format number. Computation of format numbers is explained later.

m := Metaclass new.
m superclass: Point class.
m methodDict: MethodDictionary new.
m setFormat: 156.

Now we can create the actual anonymous class. Classes are instances of their metaclass. For each metaclass there is only one instance, thus if you send #new twice an error is thrown. As above, the newly created class needs a superclass, a method dictionary and a magic format number.

c := m new.
c superclass: Point.
c methodDict: MethodDictionary new.
c setFormat: 136.

Now we can create an instance of the anonymous class. We’ll verify the new instance to check that it actually meets our expectations. (Please refer to Phexample for more information on expectation matchers.)

p := c x: 3 y: 4.
p asString should = '3@4'.
p class should = c.
p class class should = m.
p should beKindOf: Point.

Next we’ll create accessors for the additional instance variable of c, which shall be named color. And we’ll also override #printOn: to report the color.

c setInstVarNames: { 'color' }.
c compile: 'color
    ^color'.
c compile: 'color: aString
    color := aString'.
c compile: 'printOn: out
    super printOn: out.
    out nextPutAll: '' is ''.
    out nextPutAll: color'.

Again, we’ll verify our instance.

p color should = nil.
p color: #yellow.
p color should = #yellow.
p asString should = '3@4 is yellow'.
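
If you want to compare this with Java: the nearest relative is the anonymous inner class, although it is fixed at compile time rather than assembled at runtime, so take the following as a loose analogy only. The Point class and the ColoredPoints helper are written just for this sketch.

```java
// Minimal stand-in for Smalltalk's Point, printing itself as x@y.
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        return x + "@" + y;
    }
}

// Hypothetical helper answering instances of an anonymous subclass of Point.
final class ColoredPoints {
    static Point of(int x, int y, String color) {
        // The subclass has no name; the captured color plays the role of the extra variable.
        return new Point(x, y) {
            @Override
            public String toString() {
                return super.toString() + " is " + color;
            }
        };
    }
}
```

ColoredPoints.of(3, 4, "yellow") prints as "3@4 is yellow" and is still an instance of Point. What Java cannot do is add the color accessors to that class after the fact, which is precisely where the Pharo version shines.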

I promised to explain how to compute the magic numbers above. The format number encodes both the number of instance variables and the type of an object. For example, an object can be indexable or not. All this is stuffed into a 32-bit number. To compute the format number, we’ll use a method in ClassBuilder that does the bit-fiddling for us.

metaformat := ClassBuilder new
    computeFormat: #normal instSize: 0
    forSuper: Point class ccIndex: 0.
format := ClassBuilder new
    computeFormat: #normal instSize: 1
    forSuper: Point ccIndex: 0.

Important for us are the parameters instSize: and forSuper:, which expect the number of instance variables and the superclass of the to-be-created class. Please note that the number of instance variables should not include inherited instance variables, but only the number of instance variables to be added: zero for our metaclass, and one for our colored point class.

Hackety hacking!
