Thursday, January 20, 2022

The Principles of Deferring to a Superclass, or How to Do super() Right

In a previous post I showed how Python's implementation of super() is unpredictable and therefore impossible to use reliably.  What are the theoretical principles on which a correct implementation should be built?

I have written before on Design by Contract, a practical methodology for correct object-oriented design.  In a nutshell, it requires every class to have a class invariant, which specifies the acceptable states of objects of that class.  For example, a container class must ensure that its size as reported (or stored in a class field) is always non-negative.  The invariant must be maintained at all times, except when inside the implementation of a class method.  In the absence of concurrent executions of methods on the same object, it is sufficient to guarantee that the constructor establishes the invariant, and that every method maintains it.  In addition, each method has a precondition, which specifies the legal configurations (object state and argument values) in which it is possible to call the method, and a postcondition, which specifies the state in which the method must leave the object, as a function of its previous state and the argument values.  The precondition is the responsibility of the caller; for example, it is illegal to call pop() on a stack when its empty() method returns true.  The postcondition is the responsibility of the method implementation; for example, after pushing an element onto the stack, it must be the top element of the stack, and all other elements must stay the same.
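
To make these terms concrete, here is a minimal sketch of the stack example with its contract checked at run time.  The Stack class, its _size field, and the use of plain assert statements are illustrative assumptions; Python has no built-in contract support, and this is not taken from any library.

```python
class Stack:
    """A stack whose contract is checked at run time with assert statements."""

    def __init__(self):
        self._items = []
        self._size = 0
        self._check_invariant()      # the constructor establishes the invariant

    def _check_invariant(self):
        # Class invariant: the stored size is non-negative and matches the contents.
        assert self._size >= 0
        assert self._size == len(self._items)

    def empty(self):
        return self._size == 0

    def push(self, item):
        old_items = list(self._items)
        self._items.append(item)
        self._size += 1
        # Postcondition: item is the top element, all other elements are unchanged.
        assert self._items[-1] is item
        assert self._items[:-1] == old_items
        self._check_invariant()      # every method maintains the invariant

    def pop(self):
        # Precondition (the caller's responsibility): the stack is not empty.
        assert not self.empty(), 'pop() called on an empty stack'
        item = self._items.pop()
        self._size -= 1
        self._check_invariant()
        return item


stack = Stack()
stack.push('a')
stack.push('b')
print(stack.pop())    # b
```

Calling pop() on an empty Stack fails the precondition assertion, which puts the blame on the caller rather than on the implementation.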

From the point of view of the programmer who calls any operation on an object, the relevant contract is that of the static type of the object, since the dynamic type can't be known when writing the code; however, it must be a subtype of the static type.  The Design by Contract methodology follows Liskov's Substitution Principle, which implies that a subclass must obey the contracts of all its superclasses.  Therefore, relying on the static type is always correct.

However, in the Python implementation of super(), this isn't true.  As we saw in the previous post, a super call can invoke an implementation from a class that is not a superclass of the static type of the receiver object.  In the last example shown there, the call super(ContainerMixin, self).__init__(**properties) in the class Document could invoke a method of an unrelated class (SomeMixin).  Under such behavior, it is impossible to know the contract of the called method, and therefore impossible to write correct code.  The only way to enable developers to write correct and reliable code is to restrict super calls to static superclasses; that is, superclasses of the class in which the super call appears, regardless of the dynamic type of the receiver object.
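
The effect is easy to reproduce in a few lines.  The classes below are hypothetical and much smaller than the Document example from the previous post, but they show the same thing: the super() call written in Left reaches Right, which is not a superclass of Left.

```python
class Base:
    def describe(self):
        return ['Base']


class Left(Base):
    def describe(self):
        # Statically, the only superclass of Left is Base.
        return ['Left'] + super().describe()


class Right(Base):
    def describe(self):
        return ['Right'] + super().describe()


class Both(Left, Right):
    pass


print(Left().describe())                   # ['Left', 'Base']
print(Both().describe())                   # ['Left', 'Right', 'Base']
print([c.__name__ for c in Both.__mro__])  # ['Both', 'Left', 'Right', 'Base', 'object']
```

The author of Left could only have had the contract of Base.describe() in mind, yet for a receiver of type Both the same call runs Right.describe() first, because super() follows the method resolution order of the dynamic type.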

In addition, the super mechanism in Python (and most other object-oriented languages, including Java and C++) allows diagonal calls, in which one method can call a different method on the super object.  This is rarely done in practice, and should be disallowed in principle, since it bypasses the contract of the called method.  The implementation of a method can use implementations provided by superclasses for the same method, and will need to be mindful of their contracts; but if it needs a service from another method of the same object, it should call the implementation provided for that object, which is the only one that is responsible for the correct contract.  (Remember that the dynamic type of the object, which determines its contract, is not known when writing the code.)
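
As an illustration of why a diagonal call is dangerous, consider this sketch.  The Account and AuditedAccount classes are invented for the example; the contract of AuditedAccount.deposit() is that every deposit is recorded in the log.

```python
class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def transfer_in(self, amount):
        self.deposit(amount)


class AuditedAccount(Account):
    def __init__(self):
        super().__init__()
        self.log = []

    def deposit(self, amount):
        # Same-method super call: extends the inherited behavior.
        self.log.append(amount)
        super().deposit(amount)

    def transfer_in(self, amount):
        # Diagonal call: transfer_in() invokes the superclass deposit(),
        # bypassing AuditedAccount.deposit() and its logging.
        super().deposit(amount)


account = AuditedAccount()
account.deposit(10)
account.transfer_in(5)
print(account.balance)   # 15
print(account.log)       # [10]
```

The balance is correct but the log has silently lost an entry.  Had transfer_in() called self.deposit(amount), the implementation responsible for the object's contract would have run.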

Bertrand Meyer, the author of the Design by Contract methodology, embodied this methodology in the programming language Eiffel.  In Eiffel, the super construct is called Precursor, and calls the implementation of the method in which it appears in the immediate (static) superclass, if there is exactly one such.  Otherwise, Precursor needs to be decorated with the name of one of the static superclasses, whose implementation is to be invoked.  This adheres to the principles above: it is based on static relationships between classes, and so the relevant contract is always known; and it prohibits diagonal calls.

This behavior can be implemented in Python, although a more efficient implementation will have to be done inside the compiler.  Before we show the implementation, let's look at a particularly difficult example.  This is taken from Meyer's book, Object-Oriented Software Construction (2nd edition), and is based on an earlier example from Stroustrup's book The C++ Programming Language.

In this example, we have a Window class, describing a window or widget in a graphical user interface.  One of the important methods of this class is draw(), which draws the contents of the window on the screen.  Some windows have borders, and some have menus.  These are described by subclasses WindowWithBorder and WindowWithMenu, respectively.  The implementation of the draw() method in each of these subclasses first needs to draw the window, using the method in the parent Window class, and then add the border or menu.  The implementation of these classes could look like this, translated into Python syntax:

class Window:
    def draw(self):
        print('Window.draw()')


class WindowWithBorder(Window):
    def draw_border(self):
        print('WindowWithBorder.draw_border()')

    def draw(self):
        print('WindowWithBorder.draw()')
        precursor()
        self.draw_border()


class WindowWithMenu(Window):
    def draw_menu(self):
        print('WindowWithMenu.draw_menu()')

    def draw(self):
        print('WindowWithMenu.draw()')
        precursor()
        self.draw_menu()

Running the following test will print the output shown:

wb = WindowWithBorder()
wb.draw()
print('***')
wm = WindowWithMenu()
wm.draw()

Output:

WindowWithBorder.draw()
Window.draw()
WindowWithBorder.draw_border()
***
WindowWithMenu.draw()
Window.draw()
WindowWithMenu.draw_menu()

This demonstrates that the correct methods are called, in the correct order.

But what about windows that have both borders and menus?  These would be described by a class WindowWithBorderAndMenu, which should inherit from both WindowWithBorder and WindowWithMenu.  Unfortunately, its draw() method must not call the implementations in its immediate superclasses, since that will call Window.draw() twice; this is likely to undo the effects of the call from the previous superclass, and may well cause other issues.  Instead, the implementation of draw() in this class should call the Window implementation, followed by draw_border() and draw_menu().

class WindowWithBorderAndMenu(WindowWithBorder,
                              WindowWithMenu):
    def draw(self):
        print('WindowWithBorderAndMenu.draw()')
        precursor_of(Window)()
        self.draw_border()
        self.draw_menu()

The line precursor_of(Window)() shows how to call the implementation in a specific superclass, using a different function, precursor_of, which takes the appropriate class object as a parameter and returns the method implementation from that class.

Eiffel prohibits calling a method implementation that is not in one of the immediate superclasses, based on the consideration that jumping over an implementation is likely to miss important parts of the implementation.  However, in this case this is justified, and it is possible to do this in Eiffel by inheriting yet again from Window, this time directly.  My implementation of precursor in Python does allow such jumps, although this can be changed.

The following test of WindowWithBorderAndMenu will give the output shown:

wbm = WindowWithBorderAndMenu()
wbm.draw()

Output:

WindowWithBorderAndMenu.draw()
Window.draw()
WindowWithBorder.draw_border()
WindowWithMenu.draw_menu()
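
For contrast, here is a sketch of the naive version that the text above warns against, in which the combined class calls draw() in both of its immediate superclasses.  To keep it self-contained it uses explicit calls such as Window.draw(self) instead of precursor(), and records calls in a list (the calls list and the Naive class name are assumptions made for this illustration).

```python
calls = []


class Window:
    def draw(self):
        calls.append('Window.draw()')


class WindowWithBorder(Window):
    def draw(self):
        Window.draw(self)
        calls.append('draw_border()')


class WindowWithMenu(Window):
    def draw(self):
        Window.draw(self)
        calls.append('draw_menu()')


class NaiveWindowWithBorderAndMenu(WindowWithBorder, WindowWithMenu):
    def draw(self):
        # Each immediate superclass draws the whole window again.
        WindowWithBorder.draw(self)
        WindowWithMenu.draw(self)


NaiveWindowWithBorderAndMenu().draw()
print(calls)
# ['Window.draw()', 'draw_border()', 'Window.draw()', 'draw_menu()']
```

Window.draw() runs twice, and the second run happens after the border has been drawn.  The precursor_of(Window) version draws the window exactly once.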

How does this magic happen?  The short answer is "with some reflection."  Getting more technical, we use Python's inspect module to find the frame that called the precursor() function.  From that frame we can extract the name of the method, as well as the target of the call (that is, the receiver object), and the __bases__ field of its class gives the list of superclasses.  The precursor() function will only work if there is a single superclass, in which case it will call the implementation of the called method from that superclass (or whatever that superclass defers to).
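
The following is a simplified sketch of that idea, not the code from the repo.  It assumes that the receiver is the first parameter of the calling method, and it finds the static class by searching the receiver's classes for the one that defines the running code object; the real implementation may differ in both respects.

```python
import inspect


def precursor(*args, **kwargs):
    """Call the same-named method in the unique static superclass of the caller."""
    frame = inspect.currentframe().f_back    # frame of the calling method
    try:
        code = frame.f_code
        # Assumption: the receiver is the first positional parameter (self).
        receiver = frame.f_locals[code.co_varnames[0]]
    finally:
        del frame                            # avoid a reference cycle
    name = code.co_name
    # Find the class whose body defines the running method.  This is the static
    # class of the call site, not the dynamic type of the receiver.
    for owner in type(receiver).__mro__:
        if getattr(vars(owner).get(name), '__code__', None) is code:
            break
    else:
        raise TypeError(f'{name}() is not defined by a class of the receiver')
    if len(owner.__bases__) != 1:
        raise TypeError(f'{owner.__name__} does not have a unique superclass')
    return getattr(owner.__bases__[0], name)(receiver, *args, **kwargs)


trace = []


class Window:
    def draw(self):
        trace.append('Window.draw')


class WindowWithBorder(Window):
    def draw(self):
        trace.append('WindowWithBorder.draw')
        precursor()


class WindowWithMenu(Window):
    def draw(self):
        trace.append('WindowWithMenu.draw')
        precursor()


class Both(WindowWithBorder, WindowWithMenu):
    pass                                     # inherits WindowWithBorder.draw


Both().draw()
print(trace)    # ['WindowWithBorder.draw', 'Window.draw']
```

Even though the receiver is a Both object, the call in WindowWithBorder.draw() goes to Window, its static superclass.  With the built-in super() it would have gone to WindowWithMenu.draw() instead.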

The precursor_of() function is similar, but instead of searching for a unique superclass it accepts the superclass as an argument.  It then returns an internal function that will invoke the correct method implementation when it is called.  (This is why there is another set of parentheses in the precursor_of(Window)() call in the code above; this is where you would provide any arguments to the super method.)
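
Again as a simplified sketch rather than the repo's code, precursor_of() could look like this.  The same assumptions apply: the receiver is taken to be the first parameter of the calling method, and the static class is found by matching code objects.

```python
import inspect


def precursor_of(superclass):
    """Return a function that runs the calling method as implemented for superclass."""
    frame = inspect.currentframe().f_back    # frame of the calling method
    try:
        code = frame.f_code
        # Assumption: the receiver is the first positional parameter (self).
        receiver = frame.f_locals[code.co_varnames[0]]
    finally:
        del frame                            # avoid a reference cycle
    name = code.co_name
    # The static class is the one whose body defines the running method.
    owner = next(cls for cls in type(receiver).__mro__
                 if getattr(vars(cls).get(name), '__code__', None) is code)
    if owner is superclass or not issubclass(owner, superclass):
        raise TypeError(f'{superclass.__name__} is not a superclass of {owner.__name__}')
    method = getattr(superclass, name)

    def invoke(*args, **kwargs):
        # Arguments for the super method are supplied here, at the second call.
        return method(receiver, *args, **kwargs)

    return invoke


trace = []


class Window:
    def draw(self):
        trace.append('Window.draw')


class WindowWithBorder(Window):
    def draw_border(self):
        trace.append('draw_border')


class WindowWithMenu(Window):
    def draw_menu(self):
        trace.append('draw_menu')


class WindowWithBorderAndMenu(WindowWithBorder, WindowWithMenu):
    def draw(self):
        precursor_of(Window)()               # jumps over both immediate superclasses
        self.draw_border()
        self.draw_menu()


WindowWithBorderAndMenu().draw()
print(trace)    # ['Window.draw', 'draw_border', 'draw_menu']
```

Returning a closure keeps the choice of superclass separate from the arguments of the call, which is why the call site has two pairs of parentheses.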

You can find the full implementation in my GitHub repo yishaif/parseltongue, in the extensions/super directory; tests are in extensions/tests.

Of course, using reflection for every super call is not advisable.  This implementation is just an example of how this could work; an efficient implementation would have to be part of the Python compiler.

But what if you perversely want a static super implementation that can still perform diagonal calls?  I don't want to encourage you to do such a thing, but it can certainly be done.  Here is the windowing example using this style.  I am using the functions static_super() and static_super_of() instead of super(), so as not to interfere with Python's built-in function.  The first is for the case in which the class has a single immediate superclass, and the second is for multiple inheritance, and takes a parameter that specifies the class whose implementation is to be invoked.  The only changes from the previous example are the super calls inside the draw() methods.

class Window:
    def draw(self):
        print('Window.draw()')


class WindowWithBorder(Window):
    def draw_border(self):
        print('WindowWithBorder.draw_border()')

    def draw(self):
        print('WindowWithBorder.draw()')
        static_super().draw()
        self.draw_border()


class WindowWithMenu(Window):
    def draw_menu(self):
        print('WindowWithMenu.draw_menu()')

    def draw(self):
        print('WindowWithMenu.draw()')
        static_super().draw()
        self.draw_menu()


class WindowWithBorderAndMenu(WindowWithBorder, WindowWithMenu):
    def draw(self):
        print('WindowWithBorderAndMenu.draw()')
        static_super_of(Window).draw()
        self.draw_border()
        self.draw_menu()

Here are the tests, with the output:

wb = WindowWithBorder()
wb.draw()
print('***')
wm = WindowWithMenu()
wm.draw()
print('***')
wbm = WindowWithBorderAndMenu()
wbm.draw()

Output:

WindowWithBorder.draw()
Window.draw()
WindowWithBorder.draw_border()
***
WindowWithMenu.draw()
Window.draw()
WindowWithMenu.draw_menu()
***
WindowWithBorderAndMenu.draw()
Window.draw()
WindowWithBorder.draw_border()
WindowWithMenu.draw_menu()

As you can see, the results are exactly the same.  However, this form enables diagonal calls, since the static_super() object can invoke any method defined in the chosen superclass.
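
To show what such a diagonal call looks like, here is a rough sketch of a static_super_of() built around a proxy object.  This is an illustration only, not the implementation in the repo: the _StaticSuper class is hypothetical, the receiver is assumed to be the first parameter of the calling method, only ordinary instance methods are handled, and the superclass check is done against the receiver's dynamic type for brevity.

```python
import functools
import inspect


class _StaticSuper:
    """Proxy that looks every method up in one fixed superclass."""

    def __init__(self, receiver, superclass):
        self._receiver = receiver
        self._superclass = superclass

    def __getattr__(self, name):
        # Any method of the chosen superclass can be requested, not only the
        # one that is currently running; this is what permits diagonal calls.
        # Assumption: name refers to an ordinary instance method.
        return functools.partial(getattr(self._superclass, name), self._receiver)


def static_super_of(superclass):
    frame = inspect.currentframe().f_back    # frame of the calling method
    try:
        code = frame.f_code
        # Assumption: the receiver is the first positional parameter (self).
        receiver = frame.f_locals[code.co_varnames[0]]
    finally:
        del frame                            # avoid a reference cycle
    # Simplification: checks the receiver's dynamic type, not the static class.
    if not isinstance(receiver, superclass):
        raise TypeError(f'receiver is not an instance of {superclass.__name__}')
    return _StaticSuper(receiver, superclass)


class Window:
    def draw(self):
        return 'Window.draw'

    def title(self):
        return 'Window.title'


class FancyWindow(Window):
    def title(self):
        return 'FancyWindow.title'

    def draw(self):
        # Diagonal call: draw() asks the superclass for title(),
        # bypassing FancyWindow.title().
        return static_super_of(Window).title()


print(FancyWindow().draw())    # Window.title
```

The call is still resolved statically, so it always reaches Window.title(), but it ignores the override that is responsible for the title contract of a FancyWindow.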

The code for static_super() can be found in the same GitHub repo.  I will go into the details of the implementation in the next post.
