Part of my Python FAQ.
How does @property work? Why does it call my __getattr__? What’s a “descriptor”?
Python offers several ways to hook into attribute access; that is, there are several ways you can affect what happens when someone does obj.foo to your object.
The most boring behavior is that the object has a foo attribute (perhaps set in __init__), or the class has a foo method or attribute of its own.
If you need total flexibility, there are the magic methods __getattr__ and __getattribute__, which can return a value depending on the attribute name.
Somewhere between these two extremes lie descriptors. A descriptor handles the attribute lookup for a single attribute, but can otherwise run whatever code it wants.
Properties are very simple descriptors. If you haven’t used them before, they look like this:
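Here’s a minimal sketch of such a property. The Whatever class and the twice_n name come from the text below; the n attribute and the setter are assumptions added for illustration.

```python
class Whatever(object):
    def __init__(self, n):
        self.n = n

    @property
    def twice_n(self):
        # Runs on every read of obj.twice_n; no parentheses at the call site.
        return self.n * 2

    @twice_n.setter
    def twice_n(self, value):
        # Runs on every assignment to obj.twice_n.
        self.n = value // 2


obj = Whatever(5)
print(obj.twice_n)
obj.twice_n = 8
print(obj.n)
```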
This does some stuff to create a descriptor object named twice_n, which jumps in whenever code tries to use the twice_n attribute of a Whatever object. In the case of @property, you can then have things that look like plain attributes but act like methods. But descriptors are a bit more powerful.
How they work
A descriptor is just an object; there’s nothing inherently special about it. Like many powerful Python features, they’re surprisingly simple. To get the descriptor behavior, only three conditions need to be met:
- You have a new-style class. (In Python 3, every class is new-style; in Python 2, the class must inherit from object.)
- It has some object as a class attribute.
- That object’s class has the appropriate special descriptor method.
Note very carefully that these conditions are in terms of classes. In particular, a descriptor will not work if it’s assigned to an object instead of a class, and an object is not a descriptor if you assign the object a function named __get__. Descriptors are all about modifying behavior for classes, not individual objects!
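To make the class-versus-object point concrete, here is a small sketch. Every name in it (Ten, OnClass, OnInstance, Plain, HoldsFake) is hypothetical and exists only for this illustration.

```python
class Ten(object):
    """Hypothetical descriptor whose reads always produce 10."""
    def __get__(self, instance, owner):
        return 10


class OnClass(object):
    value = Ten()  # class attribute: the descriptor protocol applies


class OnInstance(object):
    def __init__(self):
        self.value = Ten()  # instance attribute: just a stored object


class Plain(object):
    pass


# Attaching a function named __get__ to an object does not make it a
# descriptor, because Python looks the method up on the object's class.
fake = Plain()
fake.__get__ = lambda instance, owner: 10


class HoldsFake(object):
    value = fake


print(OnClass().value)
print(OnInstance().value)
print(HoldsFake().value is fake)
```

Only the first read goes through __get__; the other two hand back the stored objects untouched.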
Ahem. So, about those special descriptor methods. There are three of them, and your object can implement whichever ones it needs. Assuming this useless setup:
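Something like the following would do. DescriptorClass is an assumed name for the descriptor’s class; OwnerClass, descriptor, and obj are the names used in the list of methods.

```python
class DescriptorClass(object):
    # The special methods discussed in the text would be defined here.
    pass


class OwnerClass(object):
    descriptor = DescriptorClass()


obj = OwnerClass()
```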
You can implement these methods, sometimes called the “descriptor protocol”:
- __get__(self, instance, owner) hooks into reading, for both an object and the class itself. obj.descriptor will call descriptor.__get__(obj, OwnerClass). OwnerClass.descriptor will call descriptor.__get__(None, OwnerClass). Here, it’s polite to just return self, so you can still get at the descriptor object like a regular class attribute.
- __set__(self, instance, value) hooks into writing. obj.descriptor = 5 will call descriptor.__set__(obj, 5).
- __delete__(self, instance) hooks into deletion. del obj.descriptor will call descriptor.__delete__(obj). Note this is not the same as __del__; that’s something different entirely.
A minor point of confusion here: the descriptor is triggered by touching attributes on obj, but inside these methods, self is the descriptor object itself, not obj.
You can implement any combination of these you like, and whichever you implement will be triggered. This may or may not be what you want. For example, if you only implement __set__, you won’t get a write-only attribute; reading obj.descriptor will act as normal and produce your descriptor object.
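The three hooks can be watched in action with a sketch like this one. Recorder, SetOnly, and OtherOwner are hypothetical names, and the value 42 is arbitrary.

```python
class Recorder(object):
    """Hypothetical descriptor that records every hook it receives."""
    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access: hand back the descriptor itself
        self.calls.append(("get", instance, owner))
        return 42

    def __set__(self, instance, value):
        self.calls.append(("set", instance, value))

    def __delete__(self, instance):
        self.calls.append(("delete", instance))


class OwnerClass(object):
    descriptor = Recorder()


class SetOnly(object):
    """Implements only __set__, so reads are not intercepted."""
    def __set__(self, instance, value):
        instance.__dict__["_seen"] = value


class OtherOwner(object):
    descriptor = SetOnly()


obj = OwnerClass()
obj.descriptor          # triggers __get__(obj, OwnerClass)
obj.descriptor = 5      # triggers __set__(obj, 5)
del obj.descriptor      # triggers __delete__(obj)
recorder = OwnerClass.descriptor  # __get__(None, OwnerClass) returns self

other = OtherOwner()
other.descriptor = 5    # goes through SetOnly.__set__
```

Note that self.calls lives on the descriptor, so it is shared by every OwnerClass instance; per-object state has to be stored on instance instead.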
Writing a descriptor
Talking about descriptors involves juggling several classes and instances. Let’s try a simple example, instead: recreating property.
First, the read-only behavior.
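A minimal sketch of a read-only version follows. Demo and attribute are the names used in the surrounding text; the class name prop, the get_func attribute, and the return value 133 are assumptions.

```python
class prop(object):
    def __init__(self, get_func):
        # Remember the decorated function for later.
        self.get_func = get_func

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class: politely return the descriptor itself.
            return self
        # Accessed on an object: call the stored function on that object.
        return self.get_func(instance)


class Demo(object):
    @prop
    def attribute(self):
        return 133


print(Demo().attribute)
```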
This code sneaks the descriptor in using a decorator. Remember that decorators can be rewritten as regular function calls. The class definition is roughly equivalent to this:
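Spelled out without the decorator syntax, it might look like this. The prop class is repeated so the snippet runs on its own; as before, prop, get_func, and 133 are assumed names and values.

```python
class prop(object):
    def __init__(self, get_func):
        self.get_func = get_func

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.get_func(instance)


class Demo(object):
    def attribute(self):
        return 133
    # Rebind the name: the plain function is replaced by a prop wrapping it.
    attribute = prop(attribute)
```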
So the descriptor, attribute, is just an object wrapping a single function. When code reads from Demo().attribute, the descriptor calls its stored function on the Demo instance and passes along the return value.
(The instance has to be passed in manually because the function isn’t being called as a method. If you refer to them within a class body directly, methods are just regular functions; the method magic only kicks in later, when they’re looked up through the finished class or one of its instances. It’s complicated.)
With this implementation, code could still do obj.attribute = 3 and the descriptor would be shadowed. Want setter behavior, too? No problem; add a __set__.
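One way the read/write version could look is sketched below. The names readwrite, set_func, and setter are the ones discussed in the text; prop, readonly, _value, and the error message are assumptions for illustration.

```python
class prop(object):
    def __init__(self, get_func):
        self.get_func = get_func

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.get_func(instance)

    def __set__(self, instance, value):
        # Delegate to whichever setter is currently stored.
        return self.set_func(instance, value)

    def set_func(self, instance, value):
        # Default setter: behave like a read-only attribute.
        raise AttributeError("can't set attribute")

    def setter(self, set_func):
        # Store the decorated function, then return the descriptor so it
        # stays bound to the decorated name.
        self.set_func = set_func
        return self


class Demo(object):
    _value = 133

    @prop
    def readonly(self):
        return 133

    @prop
    def readwrite(self):
        return self._value

    @readwrite.setter
    def readwrite(self, value):
        self._value = value


demo = Demo()
demo.readwrite = 5
print(demo.readwrite)
```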
Look at all this crazy stuff going on. Take it a step at a time.
The new __set__ method is pretty much the same as before: it calls a stored function on the given instance.
The setter method makes the @readwrite.setter decoration work. It stores the function, and then returns itself. Remember, it’s a decorator, so whatever it returns will end up assigned to the decorated function’s name, readwrite. The class definition is equivalent to:
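Roughly, that is. In this sketch the prop class is repeated so the snippet runs on its own, and _set is a hypothetical temporary name standing in for the second decorated function; the real decorator syntax never binds such a name.

```python
class prop(object):
    def __init__(self, get_func):
        self.get_func = get_func

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.get_func(instance)

    def __set__(self, instance, value):
        return self.set_func(instance, value)

    def set_func(self, instance, value):
        raise AttributeError("can't set attribute")

    def setter(self, set_func):
        self.set_func = set_func
        return self


class Demo(object):
    _value = 133

    def readwrite(self):
        return self._value
    readwrite = prop(readwrite)

    def _set(self, value):
        self._value = value
    # setter() returns the same prop object, so the name is unchanged.
    readwrite = readwrite.setter(_set)
    del _set
```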
Don’t be fooled: it looks like there are two readwrite functions, but the class ends up with a single object that happens to contain two functions.
I include a default setter function, set_func, so that properties are read-only unless the class specifies otherwise. It’s got three arguments because it’s a regular method: calling it with (instance, value) will tack the descriptor object on as the first argument.
This is most of the way to an exact clone of Python’s builtin property type, and it’s only a handful of very short methods.
Potential uses
Properties are an obvious use, but they’re built in, so why would you care about descriptors otherwise?
Maybe you wouldn’t. It’s metaprogramming, after all, so you either know you need it or can’t imagine why you ever would. I’ve used them a couple times, though, and I’ve seen them in the wild enough. Some examples:
- Pyramid includes a nifty decorator-descriptor, @reify. It acts like @property, except that the function is only ever called once; after that, the value is cached as a regular attribute. This gives you lazy attribute creation on objects that are meant to be immutable. It’s handy enough that I’ve wished it were in the standard library more than once.
- SQLAlchemy’s ORM classes rely heavily on descriptors: SomeTableClass.column == 3 is actually using a descriptor that overloads a bunch of operators.
- If you’re writing a class with a lot of properties that all do similar work, you can write your own descriptor class to factor out the logic, rather than writing a bunch of similar property functions that all call more methods.
- If you find yourself writing a __getattr__ with a huge stack of ifs or attribute name parsing or similar, consider writing a descriptor instead.
- Ever wonder how, exactly, self gets passed to a method call? Well, methods are just these class attributes that do something special when accessed via an object… surprise, methods are descriptors!
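Two of these ideas fit in a short sketch. The lazy class below is a hypothetical, simplified take on the caching idea, not Pyramid’s actual implementation; Report and Greeter are made-up classes. The second half pokes at a plain function’s own __get__ to show where self comes from.

```python
class lazy(object):
    """Hypothetical caching descriptor in the spirit of the first bullet."""
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.func(instance)
        # Only __get__ is defined, so an instance attribute with the same
        # name takes priority on every later read.
        instance.__dict__[self.func.__name__] = value
        return value


class Report(object):
    calls = 0

    @lazy
    def total(self):
        Report.calls += 1
        return sum(range(10))


report = Report()
first = report.total   # runs the function and caches the result
second = report.total  # plain attribute lookup; the function is skipped


class Greeter(object):
    def hello(self):
        return "hello"


g = Greeter()
raw_function = Greeter.__dict__["hello"]  # the function as stored on the class
bound = raw_function.__get__(g, Greeter)  # what g.hello does behind the scenes
```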
Descriptors and AttributeError
One final gotcha. A __get__ method is allowed to raise an AttributeError if it wants to express that the attribute doesn’t exist. Python will then fall back to __getattr__ as usual.
Consider this, then:
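A sketch of the trap, using a property as the descriptor. The Sneaky class, the status name, and the __getattr__ fallback are hypothetical; log.debg is the typo discussed in the next paragraph.

```python
import logging

log = logging.getLogger("descriptor_demo")


class Sneaky(object):
    @property
    def status(self):
        # Typo: Logger has no attribute "debg", so this raises AttributeError.
        log.debg("reading status")
        return "ok"

    def __getattr__(self, name):
        # Called whenever normal lookup fails with AttributeError.
        return "fallback for %s" % name


result = Sneaky().status
print(result)
```

The typo never surfaces as an error here: the AttributeError raised inside the property is treated as "no such attribute", and __getattr__ quietly answers instead.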
log.debg probably doesn’t exist, so that code will raise an AttributeError… which Python will take to mean the descriptor is saying it doesn’t exist. This is probably not what you want. Be very careful with attribute access inside a descriptor, especially for classes that also implement __getattr__.
Conclusion
- property is cool.
- Descriptors are cool.
- They aren’t hard to write, if you can keep self and instance straight.
- They only work as class attributes!
Further reading
- The Python documentation on descriptors. Short, to the point, and totally useless for explaining what these things are.
- The Python HowTo on descriptors. Rather more useful.
- Perhaps also read up on __getattr__ and __getattribute__.
- The implementation of reify is a nice example, and short enough that you may want to just paste it into your own project.