Post: When you want to see the internal code of Python builtins (feat. subclasses)
While looking into an open-source project as I had some exploring to do, I discovered for the first time that classes inheriting from object have a reserved method called subclasses. Let’s search for the code as a form of self-reflection… Although I can imagine how it might be implemented, I decided to post for those who may not know how to view the code of built-in objects.
With commonly used PyCharm, if you try to jump to the definition of builtin, it will possibly show something like this:
def __subclasses__(self, *args, **kwargs): # real signature unknown
""" Return a list of immediate subclasses. """
pass
That’s it!… But of course, this can’t be the real source code. If you check the path of the opened source code, you would realize it includes python_stubs, which is fake, containing only the signature for the purpose of autocompletion (this is my guess).
To actually see the built-ins, you need to check the source code of the Python interpreter you are using. Most likely, it is CPython, which can be obtained from the official website.
You don’t need to read everything. If you only search for the necessary parts…
(/Objects/typeobject.c.h)
...
return type_mro_impl(self);
}
PyDoc_STRVAR(type___subclasses____doc__,
"__subclasses__($self, /)\\n"
"--\\n"
#define TYPE___SUBCLASSES___METHODDEF \\
{"__subclasses__", (PyCFunction)type___subclasses__, METH_NOARGS, type___subclasses____doc__},
static PyObject *
...
You can find where it is easily. Since what I found is a header, and not the body I was curious about. Upon reading the source file along with the found header…
(/Objects/typeobject.c)
...
#define TYPE___SUBCLASSES___METHODDEF \\
{"__subclasses__", (PyCFunction)type___subclasses__, METH_NOARGS, type___subclasses____doc__},
static PyObject *
type___subclasses___impl(PyTypeObject *self);
static PyObject *
type___subclasses__(PyTypeObject *self, PyObject *Py_UNUSED(ignored))
{
return type___subclasses___impl(self);
}
There are two method signatures: one for creating the PyMethodDef
struct using macros and the other for the function. Perhaps, defining it separately with impl is because the call stack process, which goes from PyObject_CallMethod
-> callmethod
-> _PyObject_CallFunctionVa
, calls with self in vargs to use only self…? I’m not exactly sure.
Anyhow, if you look into the body of type___subclasses___impl
, it’s shown as follows:
/*[clinic input]
type.__subclasses__
Return a list of immediate subclasses.
[clinic start generated code]*/
static PyObject *
type___subclasses___impl(PyTypeObject *self)
/*[clinic end generated code: output=eb5eb54485942819 input=5af66132436f9a7b]*/
{
PyObject *list, *raw, *ref;
Py_ssize_t i;
list = PyList_New(0);
if (list == NULL)
return NULL;
raw = self->tp_subclasses;
if (raw == NULL)
return list;
assert(PyDict_CheckExact(raw));
i = 0;
while (PyDict_Next(raw, &i, NULL, &ref)) {
assert(PyWeakref_CheckRef(ref));
ref = PyWeakref_GET_OBJECT(ref);
if (ref != Py_None) {
if (PyList_Append(list, ref) < 0) {
Py_DECREF(list);
return NULL;
}
}
}
return list;
}
The tags at the top are directives from Python’s C code preprocessor, Argument clinic, which is used for generating docstrings to handle boilerplate…
Anyway, the body essentially makes a list from the values of the object, which appears to be a dictionary, called tp_subclasses
of self.
Further in the same typeobject.c
, you can see that when a subclass is created, it adds to a dictionary in the base class object as follows:
static int
add_subclass(PyTypeObject *base, PyTypeObject *type)
{
int result = -1;
PyObject *dict, *key, *newobj;
dict = base->tp_subclasses;
if (dict == NULL) {
base->tp_subclasses = dict = PyDict_New();
if (dict == NULL)
return -1;
}
assert(PyDict_CheckExact(dict));
key = PyLong_FromVoidPtr((void *) type);
if (key == NULL)
return -1;
newobj = PyWeakref_NewRef((PyObject *)type, NULL);
if (newobj != NULL) {
result = PyDict_SetItem(dict, key, newobj);
Py_DECREF(newobj);
}
Py_DECREF(key);
return result;
}
From this, it’s clear that when subclassing, it adds to a dictionary created in the base class object.