Sunday, December 25, 2022

Metaprogramming in Python 1: Introduction

Contents

Part 1: Introduction 

Part 2: An Interning Decorator

See also the discussion of how to do super() in Python correctly; the implementation uses inspection to find the correct super method to apply.

Series Introduction

Metaprogramming is (among other things) a way of programming general techniques that operate by transforming a target program in various ways.  Python has many features that are implemented using metaprogramming, including static and class methods, abstract classes and methods, data classes, properties, and context managers.

Python also exposes metaprogramming facilities that enable Python developers to create their own features.  These include an elaborate data model that has many hooks for modifying the standard behavior of the language, and decorators, which are a handy shorthand mechanism for applying new features to user code.

In this series of posts I will present several new and useful features and show how they can be implemented using Python's metaprogramming facilities.

A Little History

Metaprogramming, in the sense of programs operating on other programs, includes assemblers, compilers, linkers, and loaders, and therefore harks back to the early days of computing.  Metaprogramming in the more restricted sense used in this series, of language facilities that enable modifying the default behavior of the language itself, also has its roots in one of the earliest high-level languages, Lisp.  Unlike other early languages, which focused on scientific or business processing, Lisp was all about symbolic processing, and was one of the major languages used for Artificial Intelligence research for several decades.

Because of its focus on symbolic processing, a list is the basic data structure of Lisp (hence the name of the language, a contraction of list processing).  A Lisp program is also a list (with other lists nested inside it), so that writing Lisp interpreters and manipulating Lisp programs in Lisp is easy and natural.  The creators and many of the users of Lisp were also interested in language design, and created many features and language dialects, both by modifying the Lisp interpreter and by using metaprogramming.  These efforts came together in the early 1980s in a specification for Common Lisp, an attempt to collect the best features of previous dialects into a single language.  The Common Lisp Object System (CLOS) exposed its implementation through the CLOS Meta-Object Protocol, which enabled extensive customizations of the behavior of the basic language mechanisms.

Lisp, and Common Lisp in particular, had a strong influence on many newer languages, including Python, which follows the tradition of providing powerful metaprogramming capabilities.  This series will demonstrate how these can be used to add useful features to the language in a way that is very easy to use.

Metaprogramming

There are many definitions of metaprogramming; very generally, it refers to programs that operate on other programs (https://cs.lmu.edu/~ray/notes/metaprogramming/).  In the context of this series, it refers to a set of mechanisms for changing the standard behavior of a programming language in flexible ways.  The major mechanisms we will use are:

  • dynamically and programmatically adding or modifying class methods;
  • modifying the behavior of class constructors so that they don't necessarily create objects of the class, or even any new objects;
  • modifying classes, methods, or functions using decorators; and
  • inspecting classes, methods, and functions to find relevant properties at runtime.

Adding or Modifying Class Methods

The methods of a class can be added or modified like any other class attribute, using the built-in functions setattr() and getattr().  These functions do the same job as dot notation (as in "abc".capitalize()).  Unlike dot notation, you can use them to get and set attributes whose names are not fixed in the source code.  We will use these to create or modify the behavior of classes according to the feature we are implementing.
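As a minimal sketch of this idea, the following adds two methods to a class under names that are computed at runtime.  The Greeter class and the make_greeting helper are hypothetical names invented for this illustration.

```python
class Greeter:
    def __init__(self, name):
        self.name = name


def make_greeting(word):
    """Build a function usable as a method (it takes self first)."""
    def greet(self):
        return f"{word}, {self.name}!"
    return greet


# The method names are computed at runtime, so dot notation cannot be used.
for word in ("hello", "goodbye"):
    setattr(Greeter, f"say_{word}", make_greeting(word.capitalize()))

g = Greeter("Ada")
method_name = "say_" + "hello"
print(getattr(g, method_name)())   # Hello, Ada!
print(g.say_goodbye())             # Goodbye, Ada!
```

Once added, the new methods are indistinguishable from methods written in the class body, which is why say_goodbye can also be called with ordinary dot notation.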

Modifying Class Constructors

Many of Python's metaprogramming facilities operate through "magic" methods, whose names have double underscores at the beginning and end (and are therefore called "dunder methods"). Most of these are part of Python's data model.

Changing the behavior of class constructors is done in Python through the magic method __new__().  This is a static method that is automatically invoked during the processing of a constructor call.  By default, it creates a new object, which is then initialized by the __init__() method.  However, you can override this method to return any object of any class, including previously created objects.  Python calls __init__() only when __new__() returns an instance of the class being constructed, and in that case calls it even if the returned object already existed.
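The following sketch shows a constructor that hands back a previously created object instead of making a new one.  The Color class and its _instances cache are hypothetical names used only for illustration; this is one simple way to do it, not the only one.

```python
class Color:
    """Hypothetical class whose constructor reuses existing objects."""
    _instances = {}  # maps a color name to the object created for it

    def __new__(cls, name):
        # Return the previously created object for this name, if any.
        existing = cls._instances.get(name)
        if existing is not None:
            return existing
        obj = super().__new__(cls)
        cls._instances[name] = obj
        return obj

    def __init__(self, name):
        # Runs on every constructor call, even when __new__ returned
        # an old object, so it must be safe to repeat.
        self.name = name


a = Color("red")
b = Color("red")
c = Color("blue")
print(a is b)  # True
print(a is c)  # False
```

Because __init__() runs again on the reused object, any initialization that should happen only once needs to be guarded or moved into __new__().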

Decorators

Decorators are a very powerful Python mechanism that can apply arbitrary code to a class, method, or function object and modify or replace it to customize its behavior.  One of the best-known decorators in Python is @staticmethod, which turns a method defined inside a class into a static method (one that does not take the self object as a parameter).  A decorator is a function that takes the decorated class, method, or function, and returns another one to replace the original.  Syntactically, it is written as the name of the decorator preceded by an at-sign (@); the decorator is placed before the class, method, or function definition it modifies.
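To make this concrete, here is a small sketch of a user-defined decorator.  The name count_calls and the calls attribute are hypothetical, chosen for this example; functools.wraps is the standard-library helper that copies the original function's name and docstring onto the replacement.

```python
import functools


def count_calls(func):
    """Hypothetical decorator: replace func with a wrapper that counts calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


square(3)
square(4)
print(square.calls)     # 2
print(square.__name__)  # square
```

After the definition, the name square refers to the wrapper returned by count_calls, not to the function that was originally written.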

The meaning of a decorator written this way is to replace the defined element by whatever the decorator function returns when given the result of the original definition.  For example, the effect of the definition

from dataclasses import dataclass

@dataclass
class Foo:
    f1: int
    f2: str

will first create the class Foo with two annotated fields, f1 and f2.  Without the decorator, these annotations are only recorded in the class (in Foo.__annotations__); they create no attributes, either on the class or on its instances.  The decorator, however, corresponds to the following additional statement:

Foo = dataclass(Foo)

The dataclass function examines the class object it receives (the one with the two annotated fields), and modifies it by (among other things) creating a method __init__(self, f1: int, f2: str); this new method creates two instance-level fields in every object of the class.
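The equivalence can be checked directly by applying dataclass as an ordinary function.  In this sketch the undecorated class is given the hypothetical name Plain so that it can be examined before and after the call.

```python
from dataclasses import dataclass


class Plain:
    f1: int
    f2: str


# Before decoration the field names appear only in the annotations.
print(sorted(Plain.__annotations__))   # ['f1', 'f2']
print(hasattr(Plain, "f1"))            # False

# Applying the decorator by hand, as in Foo = dataclass(Foo).
Foo = dataclass(Plain)
print(Foo is Plain)                    # True: same class object, modified

foo = Foo(1, "a")
print(vars(foo))                       # {'f1': 1, 'f2': 'a'}
```

With its default arguments, dataclass returns the same class object it was given, now equipped with the generated __init__() and other methods.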

The dataclass decorator can take various arguments that change the way it works on the given class, making it very flexible.  For more details, see the documentation.

Inspection

Inspection (also called introspection) is the ability of a program to investigate properties of itself.  This can include its global data (variables, classes, functions), the methods of a class, the arguments of a function or method, and much more.  This is useful for metaprogramming because the behavior being implemented may depend on these properties of the parts of the program it modifies.

As one example, modifying the behavior of object initialization, as done by the __init__() method, depends on whether or not a class defines this method.  Finding such a method definition in the class, as described above, is a simple case of inspection.
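One possible way to perform this check is sketched below.  The classes and the defines_init helper are hypothetical names; vars() and inspect.signature() are standard Python facilities, and using vars() here is a choice made for this example so that an inherited __init__() is not counted.

```python
import inspect


class WithInit:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y


class WithoutInit:
    pass


def defines_init(cls):
    """True if cls itself defines __init__, rather than inheriting it."""
    return "__init__" in vars(cls)


print(defines_init(WithInit))      # True
print(defines_init(WithoutInit))   # False

# inspect.signature reports the parameters of a function or method.
params = inspect.signature(WithInit.__init__).parameters
print(list(params))                # ['self', 'x', 'y']
print(params["y"].default)         # 0
```

Note that hasattr(WithoutInit, "__init__") would return True, because every class inherits __init__() from object; looking in the class's own namespace is what distinguishes the two cases.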

For a more complex example, see the series of posts on how to do super() in Python correctly and the corresponding implementation.

Putting Things Together

When adding a feature to the language, you will often have to create new classes, methods, or functions, or modify existing ones.  Your code will often need to override some of the magic dunder methods.  You will then probably deliver this functionality in the form of a decorator, like dataclass.  Subsequent posts will illustrate how this is done.
