Part of my Python FAQ.
How does @property work? Why does it call my __getattr__? What’s a “descriptor”?
Python offers several ways to hook into attribute access; that is, there are several ways you can affect what happens when someone does obj.foo to your object.
The most boring behavior is that the object has a foo attribute (perhaps set in __init__), or the class has a foo method or attribute of its own.
If you need total flexibility, there are the magic methods __getattr__ and __getattribute__, which can return a value depending on the attribute name.
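To make the flexible end of that spectrum concrete, here is a minimal sketch of __getattr__. The Shouter class and what it returns are made up for illustration. Note that __getattr__ only runs after normal lookup has failed, whereas __getattribute__ runs on every access.

```python
class Shouter(object):
    """Hypothetical class that invents a value for any unknown attribute."""

    def __init__(self):
        self.real = "i exist"

    def __getattr__(self, name):
        # Only called when normal lookup fails to find `name`.
        if name.startswith("_"):
            raise AttributeError(name)
        return name.upper() + "!"


obj = Shouter()
print(obj.real)   # found normally, so __getattr__ is never consulted
print(obj.hello)  # not found normally, so __getattr__ computes a value
```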
Somewhere between these two extremes lie descriptors. A descriptor handles the attribute lookup for a single attribute, but can otherwise run whatever code it wants.
Properties are very simple descriptors. If you haven’t used them before, they look like this:
    class Whatever(object):
        def __init__(self, n):
            self.n = n

        @property
        def twice_n(self):
            return self.n * 2

        @twice_n.setter
        def twice_n(self, new_n):
            self.n = new_n / 2

    obj = Whatever(2)
    print(obj.n)        # 2
    print(obj.twice_n)  # 4
    obj.twice_n = 10
    print(obj.n)        # 5 (5.0 on Python 3, where / is true division)
The two decorators build a single descriptor object named twice_n, which jumps in whenever code tries to use the twice_n attribute of a Whatever object. In the case of @property, you can then have things that look like plain attributes but act like methods. But descriptors are a bit more powerful.
How they work
A descriptor is just an object; there’s nothing inherently special about it. Like many powerful Python features, they’re surprisingly simple. To get the descriptor behavior, only three conditions need to be met:
- You have a new-style class.
- It has some object as a class attribute.
- That object’s class has the appropriate special descriptor method.
Note very carefully that these conditions are in terms of classes. In particular, a descriptor will not work if it’s assigned to an object instead of a class, and an object does not become a descriptor just because you assign a function named __get__ to the object itself; the method has to live on the object’s class. Descriptors are all about modifying behavior for classes, not individual objects!
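Both halves of that warning can be checked in a few lines. This is only a sketch; Ten, OnClass, OnInstance, Plain, and Holder are hypothetical names invented for the demonstration.

```python
class Ten(object):
    """Hypothetical descriptor that always produces 10."""

    def __get__(self, instance, owner):
        return 10


class OnClass(object):
    value = Ten()            # class attribute: the descriptor protocol applies


class OnInstance(object):
    def __init__(self):
        self.value = Ten()   # instance attribute: just an ordinary object


on_class = OnClass().value        # __get__ ran, so this is 10
on_instance = OnInstance().value  # __get__ never ran, so this is the Ten object


class Plain(object):
    pass


fake = Plain()
fake.__get__ = lambda instance, owner: 10   # set on the object, not on its class


class Holder(object):
    value = fake


print(on_class)
print(isinstance(on_instance, Ten))
print(Holder().value is fake)   # the hand-attached __get__ is ignored
```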
Ahem. So, about those special descriptor methods. There are three of them, and your object can implement whichever ones it needs. Assuming this useless setup:
    class OwnerClass(object):
        descriptor = DescriptorClass()

    obj = OwnerClass()
You can implement these methods, sometimes called the “descriptor protocol”:
- __get__(self, instance, owner) hooks into reading, for both an object and the class itself. obj.descriptor will call descriptor.__get__(obj, OwnerClass). OwnerClass.descriptor will call descriptor.__get__(None, OwnerClass). In that second case, it’s polite to just return self, so you can still get at the descriptor object like a regular class attribute.
- __set__(self, instance, value) hooks into writing. obj.descriptor = 5 will call descriptor.__set__(obj, 5).
- __delete__(self, instance) hooks into deletion. del obj.descriptor will call descriptor.__delete__(obj). Note this is not the same as __del__; that’s something different entirely.
A minor point of confusion here: the descriptor is triggered by touching attributes on obj, but inside these methods, self is the descriptor object itself, not obj.
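To see the three hooks fire, here is a small sketch that records every call. Recorder, the calls list, and the "_stored" key are all assumptions made up for this example; only the method signatures come from the protocol itself.

```python
calls = []  # filled in by the descriptor as its hooks are triggered


class Recorder(object):
    """Hypothetical descriptor that logs how each hook was invoked."""

    def __get__(self, instance, owner):
        # `self` is the Recorder; `instance` is the object being read from.
        calls.append(("get", instance, owner))
        if instance is None:
            return self
        return instance.__dict__.get("_stored")

    def __set__(self, instance, value):
        calls.append(("set", instance, value))
        instance.__dict__["_stored"] = value

    def __delete__(self, instance):
        calls.append(("delete", instance))
        instance.__dict__.pop("_stored", None)


class OwnerClass(object):
    descriptor = Recorder()


obj = OwnerClass()
obj.descriptor = 5                # triggers __set__(obj, 5)
value = obj.descriptor            # triggers __get__(obj, OwnerClass)
on_class = OwnerClass.descriptor  # triggers __get__(None, OwnerClass)
del obj.descriptor                # triggers __delete__(obj)
```

Reaching into OwnerClass.__dict__["descriptor"] is a way to get the raw descriptor object without triggering __get__ at all.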
You can implement any combination of these you like, and whichever you implement will be triggered. This may or may not be what you want. For example, if you only implement __set__, you won’t get a write-only attribute; obj.descriptor will act as normal and produce your descriptor object.
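A quick sketch of that __set__-only case, with a hypothetical WriteHook descriptor that stashes the value under a made-up "_written" key:

```python
class WriteHook(object):
    """Hypothetical descriptor that implements only __set__."""

    def __set__(self, instance, value):
        instance.__dict__["_written"] = value


class Owner(object):
    attr = WriteHook()


obj = Owner()
obj.attr = 42                   # goes through WriteHook.__set__
written = obj._written          # the value the hook stored
read_back = obj.attr            # no __get__, so this is the descriptor itself

print(written)
print(read_back is Owner.__dict__["attr"])
```

For a genuinely write-only attribute, you would also need a __get__ that raises AttributeError.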
Writing a descriptor
Talking about descriptors involves juggling several classes and instances. Let’s try a simple example instead: recreating property.
First, the read-only behavior.
    class prop(object):
        def __init__(self, get_func):
            self.get_func = get_func

        def __get__(self, instance, owner):
            # Accessed on the class itself: hand back the descriptor.
            if instance is None:
                return self

            return self.get_func(instance)

    class Demo(object):
        @prop
        def attribute(self):
            return 133

    print(Demo().attribute)  # 133
This code sneaks the descriptor in using a decorator. Remember that decorators can be rewritten as regular function calls. The class definition is roughly equivalent to this:
    def getter(self):
        return 133

    class Demo(object):
        attribute = prop(getter)
So the descriptor, attribute, is just an object wrapping a single function. When code reads from Demo().attribute, the descriptor calls its stored function on the Demo instance and passes along the return value.
(The instance has to be passed in manually because the function isn’t being called as a method. Inside a class body, methods are just regular functions; the method magic only happens later, when the function is looked up as an attribute of the class or one of its instances. It’s complicated.)
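The “just regular functions” part is easy to poke at by hand. In this sketch (Greeter is a made-up class), pulling the function straight out of the class __dict__ skips any method magic, and calling the function’s own __get__ produces the bound method you would normally get from obj.greet.

```python
class Greeter(object):
    def greet(self):
        return "hello from %s" % type(self).__name__


obj = Greeter()

# Fetching from the class __dict__ bypasses attribute lookup,
# so this is the plain function with no instance attached.
raw = Greeter.__dict__["greet"]

# Calling __get__ by hand binds the instance, like obj.greet does.
bound = raw.__get__(obj, Greeter)

print(raw(obj))   # plain function: instance passed manually
print(bound())    # bound method: instance supplied automatically
```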
With this implementation, code could still do obj.attribute = 3 and the descriptor would be shadowed. Want setter behavior, too? No problem; add a __set__.
    class prop(object):
        # __init__ and __get__ same as before...

        def __set__(self, instance, value):
            self.set_func(instance, value)

        def setter(self, set_func):
            # Stored on the descriptor instance, shadowing the default below.
            self.set_func = set_func
            return self

        def set_func(self, instance, value):
            # Default setter: read-only unless setter() replaced it.
            raise TypeError("can't set me")

    class Demo(object):
        _value = None

        @prop
        def readwrite(self):
            return self._value

        @readwrite.setter
        def readwrite(self, value):
            self._value = value

        @prop
        def readonly(self):
            return 133

    obj = Demo()
    print(obj.readwrite)   # None
    obj.readwrite = 'foo'
    print(obj.readwrite)   # foo
    print(obj.readonly)    # 133
    obj.readonly = 'bar'   # TypeError!
Look at all this crazy stuff going on. Take it a step at a time.
The new __set__ method works much like __get__ did: it calls a stored function on the given instance.
The setter method makes the @readwrite.setter decoration work. It stores the function and then returns the descriptor itself. Remember, it’s a decorator, so whatever it returns will end up assigned to the decorated function’s name, readwrite. Inside the class body, the definition is equivalent to:
    def func1(self):
        return self._value

    readwrite = prop(func1)

    def func2(self, value):
        self._value = value

    readwrite = readwrite.setter(func2)
Don’t be fooled: it looks like there are two readwrite functions, but the class ends up with a single object that happens to contain two functions.
I include a default setter function, set_func, so that properties are read-only unless the class specifies otherwise. It’s got three arguments because it’s a regular method: calling it with (instance, value) will tack the descriptor object on as the first argument.
This is most of the way to an exact clone of Python’s builtin property type, and it’s only a handful of very short methods.
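For reference, here are the pieces above assembled into one runnable sketch. The __delete__ hook, deleter method, and del_func default are my own additions, mirroring the setter pattern; they are an assumption about how you might extend it, not part of the example so far.

```python
class prop(object):
    def __init__(self, get_func):
        self.get_func = get_func

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.get_func(instance)

    def __set__(self, instance, value):
        self.set_func(instance, value)

    def __delete__(self, instance):
        self.del_func(instance)

    def setter(self, set_func):
        self.set_func = set_func
        return self

    def deleter(self, del_func):
        # Assumed extension: same trick as setter(), for deletion.
        self.del_func = del_func
        return self

    def set_func(self, instance, value):
        raise TypeError("can't set me")

    def del_func(self, instance):
        raise TypeError("can't delete me")


class Demo(object):
    _value = None

    @prop
    def readwrite(self):
        return self._value

    @readwrite.setter
    def readwrite(self, value):
        self._value = value

    @readwrite.deleter
    def readwrite(self):
        self._value = None

    @prop
    def readonly(self):
        return 133


obj = Demo()
obj.readwrite = "foo"
after_set = obj.readwrite      # "foo"
del obj.readwrite
after_delete = obj.readwrite   # None

try:
    obj.readonly = "bar"
    refused = False
except TypeError:
    refused = True
```

Two differences from the real thing are worth knowing: the builtin property raises AttributeError rather than TypeError when there is no setter, and its setter() returns a new property object instead of modifying the existing one.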
Potential uses
Properties are an obvious use, but they’re built in, so why would you care about descriptors otherwise?
Maybe you wouldn’t. It’s metaprogramming, after all, so you either know you need it or can’t imagine why you ever would. I’ve used them a couple times, though, and I’ve seen them in the wild enough. Some examples:
- Pyramid includes a nifty decorator-descriptor, @reify. It acts like @property, except that the function is only ever called once; after that, the value is cached as a regular attribute. This gives you lazy attribute creation on objects that are meant to be immutable. It’s handy enough that I’ve wished it were in the standard library more than once.
- SQLAlchemy’s ORM classes rely heavily on descriptors: SomeTableClass.column == 3 is actually using a descriptor that overloads a bunch of operators.
- If you’re writing a class with a lot of properties that all do similar work, you can write your own descriptor class to factor out the logic, rather than writing a bunch of similar property functions that all call more methods.
- If you find yourself writing a __getattr__ with a huge stack of ifs or attribute name parsing or similar, consider writing a descriptor instead.
- Ever wonder how, exactly, self gets passed to a method call? Well, methods are just these class attributes that do something special when accessed via an object… surprise, methods are descriptors!
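The caching trick from the first bullet fits in a dozen lines. This is my own rough sketch of the idea, not Pyramid’s actual code; cached, Report, and the computations counter are hypothetical names. It relies on the shadowing behavior described earlier: with no __set__ defined, an instance attribute of the same name wins over the descriptor.

```python
class cached(object):
    """Hypothetical reify-style descriptor: compute once, then cache on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.func(instance)
        # No __set__ is defined, so this instance attribute shadows the
        # descriptor and later reads never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


class Report(object):
    computations = 0   # counts how often the expensive function runs

    @cached
    def total(self):
        Report.computations += 1
        return sum(range(1000))


report = Report()
first = report.total    # runs the function and caches the result
second = report.total   # served straight from report.__dict__
print("%s %s %s" % (first, second, Report.computations))
```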
Descriptors and AttributeError
One final gotcha. A __get__ method is allowed to raise an AttributeError if it wants to express that the attribute doesn’t exist. Python will then fall back to __getattr__ as usual.
Consider this, then:
    def __get__(self, instance, owner):
        log.debg("i'm in a descriptor!")
        # do stuff...
log.debg probably doesn’t exist, so that code will raise an AttributeError… which Python will take to mean the descriptor is saying the attribute doesn’t exist. This is probably not what you want. Be very careful with attribute access inside a descriptor, especially for classes that also implement __getattr__.
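Here is a small sketch of how quietly this can go wrong. Buggy and Fallback are made-up classes; the typo (prefx for prefix) is deliberate and stands in for log.debg.

```python
class Buggy(object):
    """Hypothetical descriptor with a typo inside __get__."""

    def __init__(self):
        self.prefix = "value: "

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Deliberate typo: `prefx` raises AttributeError on the descriptor.
        return self.prefx + instance.name


class Fallback(object):
    shout = Buggy()

    def __init__(self, name):
        self.name = name

    def __getattr__(self, attr):
        # Runs whenever normal lookup ends in AttributeError,
        # including one raised by accident inside a descriptor.
        return "fallback for %s" % attr


obj = Fallback("example")
result = obj.shout
print(result)   # the bug is swallowed; no traceback appears
```

Without the __getattr__ on Fallback, the same access would at least surface an AttributeError, though its message would point at prefx rather than shout.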
Conclusion
- property is cool.
- Descriptors are cool.
- They aren’t hard to write, if you can keep self and instance straight.
- They only work as class attributes!
Further reading
- The Python documentation on descriptors. Short, to the point, and totally useless for explaining what these things are.
- The Python HowTo on descriptors. Rather more useful.
- Perhaps also read up on __getattr__ and __getattribute__.
- The implementation of reify is a nice example, and short enough that you may want to just paste it into your own project.