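Introduction

In the CPython source, integers are implemented by the PyLongObject struct. This differs from Python 2, which had a separate PyIntObject and split integers into two types, int and long. Python 3 merges the two into a single int type. The following interpreter sessions show the difference: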

# Python 2.7 interpreter
>>> a = 0xfffffff
>>> type(a)
<type 'int'>
>>> type(a * a)
<type 'long'>

# Python 3.10 interpreter
>>> a = 0xfffffff
>>> type(a)
<class 'int'>
>>> type(a * a)
<class 'int'>
>>> type(a * a * a)
<class 'int'>

PyLongObject

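PyLongObject is CPython's implementation of the integer type, and it handles small and large integers with a single representation. Integers are among the most heavily used types in any program. In Python 2, PyIntObject was essentially a wrapper around a C long. PyLongObject instead stores the value as a variable-length array of digits, which is why a Python 3 int has no fixed upper bound.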

[Include/longobject.h]
typedef struct _longobject PyLongObject; /* Revealed in longintrepr.h */

longobject.h only declares the type. As the comment says, the struct itself is defined in another file, so follow it there.

[Include/longintrepr.h]
struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

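The ob_digit array holds the integer's value. It stores the absolute value split into fixed-width digits, least significant digit first; the digit count and the sign are carried by ob_size in PyObject_VAR_HEAD.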
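
To make the role of ob_digit more concrete, here is a minimal Python sketch of the digit decomposition. It is a model, not CPython's code: the helper names to_digits and from_digits are made up for illustration, and the digit width is read from sys.int_info, which assumes a CPython interpreter.

```python
import sys

# Assumption: CPython, where sys.int_info reports the internal digit width
# (30 bits on typical 64-bit builds, 15 bits on some others).
SHIFT = sys.int_info.bits_per_digit
MASK = (1 << SHIFT) - 1


def to_digits(n):
    """Hypothetical helper: digits of abs(n) in base 2**SHIFT, least significant first."""
    n = abs(n)
    digits = []
    while n:
        digits.append(n & MASK)   # low SHIFT bits become the next digit
        n >>= SHIFT               # move on to the remaining high bits
    return digits


def from_digits(digits):
    """Hypothetical helper: rebuild the non-negative value from its digit list."""
    return sum(d << (SHIFT * i) for i, d in enumerate(digits))


value = 0xfffffff ** 3
digits = to_digits(value)
print(len(digits), digits)
print(from_digits(digits) == value)
```

Each list element corresponds to one slot of ob_digit, so a larger value simply needs a longer array.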

An integer object's metadata lives in its type object. For PyLongObject, that type object is PyLong_Type.

[Objects/longobject.c]
PyTypeObject PyLong_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "int",                                      /* tp_name */
    offsetof(PyLongObject, ob_digit),           /* tp_basicsize */
    sizeof(digit),                              /* tp_itemsize */
    0,                                          /* tp_dealloc */
    0,                                          /* tp_vectorcall_offset */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_as_async */
    long_to_decimal_string,                     /* tp_repr */
    &long_as_number,                            /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    (hashfunc)long_hash,                        /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE |
        Py_TPFLAGS_LONG_SUBCLASS |
        _Py_TPFLAGS_MATCH_SELF,                 /* tp_flags */
    long_doc,                                   /* tp_doc */
    0,                                          /* tp_traverse */
    0,                                          /* tp_clear */
    long_richcompare,                           /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    long_methods,                               /* tp_methods */
    0,                                          /* tp_members */
    long_getset,                                /* tp_getset */
    0,                                          /* tp_base */
    0,                                          /* tp_dict */
    0,                                          /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    0,                                          /* tp_init */
    0,                                          /* tp_alloc */
    long_new,                                   /* tp_new */
    PyObject_Del,                               /* tp_free */
};
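
Several of these slots can be observed from Python without reading C. The sketch below assumes a CPython interpreter: it reads tp_name, tp_basicsize and tp_itemsize through the type's attributes, then checks how much sys.getsizeof grows when a value needs one more digit. The variable names are illustrative only.

```python
import sys

# Attributes on the type object expose the corresponding C slots.
print(int.__name__)        # tp_name
print(int.__basicsize__)   # tp_basicsize: size of the fixed header part
print(int.__itemsize__)    # tp_itemsize: size of one digit

# Assumption: CPython, where sys.int_info describes the digit layout.
bits = sys.int_info.bits_per_digit
one_digit = (1 << bits) - 1     # largest value that fits in a single digit
two_digits = 1 << bits          # smallest value that needs a second digit

# Crossing the digit boundary should cost exactly one more item.
growth = sys.getsizeof(two_digits) - sys.getsizeof(one_digit)
print(growth)
```

The actual byte counts depend on the platform and build, so none are quoted here.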

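Every type carries a fair amount of metadata. tp_name is the type name, which is what type() prints. The listing above is the complete metadata for PyLongObject: memory size, documentation, type name, supported operations and so on. The numeric operations are collected in long_as_number. long_doc holds the documentation string for the type, which can be read through the __doc__ attribute.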

>>> a = 1
>>> a.__doc__
"int([x]) -> integer\nint(x, base=10) -> integer\n\nConvert a number or string to an integer, or return 0 if no arguments\nare given.  If x is a number, return x.__int__().  For floating point\nnumbers, this truncates towards zero.\n\nIf x is not a number or if base is given, then x must be a string,\nbytes, or bytearray instance representing an integer literal in the\ngiven base.  The literal can be preceded by '+' or '-' and be surrounded\nby whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.\nBase 0 means to interpret the base from the string as an integer literal.\n>>> int('0b100', base=0)\n4"

The remaining fields are not covered one by one here; consult the source if you need them. Next, look at the operations PyLongObject supports:

[Objects/longobject.c]
static PyNumberMethods long_as_number = {
    (binaryfunc)long_add,       /*nb_add*/
    (binaryfunc)long_sub,       /*nb_subtract*/
    (binaryfunc)long_mul,       /*nb_multiply*/
    long_mod,                   /*nb_remainder*/
    long_divmod,                /*nb_divmod*/
    long_pow,                   /*nb_power*/
    (unaryfunc)long_neg,        /*nb_negative*/
    long_long,                  /*tp_positive*/
    (unaryfunc)long_abs,        /*tp_absolute*/
    (inquiry)long_bool,         /*tp_bool*/
    (unaryfunc)long_invert,     /*nb_invert*/
    long_lshift,                /*nb_lshift*/
    long_rshift,                /*nb_rshift*/
    long_and,                   /*nb_and*/
    long_xor,                   /*nb_xor*/
    long_or,                    /*nb_or*/
    long_long,                  /*nb_int*/
    0,                          /*nb_reserved*/
    long_float,                 /*nb_float*/
    0,                          /* nb_inplace_add */
    0,                          /* nb_inplace_subtract */
    0,                          /* nb_inplace_multiply */
    0,                          /* nb_inplace_remainder */
    0,                          /* nb_inplace_power */
    0,                          /* nb_inplace_lshift */
    0,                          /* nb_inplace_rshift */
    0,                          /* nb_inplace_and */
    0,                          /* nb_inplace_xor */
    0,                          /* nb_inplace_or */
    long_div,                   /* nb_floor_divide */
    long_true_divide,           /* nb_true_divide */
    0,                          /* nb_inplace_floor_divide */
    0,                          /* nb_inplace_true_divide */
    long_long,                  /* nb_index */
};

long_as_number lists the numeric operations that PyLongObject supports. Take long_and as an example of a concrete implementation: it implements the & operator for two integer objects.

static PyObject *
long_and(PyObject *a, PyObject *b)
{
    PyObject *c;
    CHECK_BINOP(a, b);
    c = long_bitwise((PyLongObject*)a, '&', (PyLongObject*)b);
    return c;
}
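
From the Python side, the nb_and slot is what the & operator and int.__and__ end up calling. The following sketch assumes CPython and shows both spellings, plus what happens when one operand is not an int: the slot hands back NotImplemented, and the operator then raises TypeError once no other type handles the operation. Variable names are illustrative.

```python
import operator

a, b = 0b1100, 0b1010

# Three spellings of the same operation.
via_operator = a & b
via_dunder = int.__and__(a, b)
via_module = operator.and_(a, b)
print(bin(via_operator), bin(via_dunder), bin(via_module))

# A non-int operand is rejected by the slot itself, not by raising:
# it returns the NotImplemented singleton so Python can try the other operand.
rejected = int.__and__(a, 1.5)
print(rejected)

# With no fallback available, the operator reports a TypeError.
try:
    a & 1.5
    raised = False
except TypeError:
    raised = True
print(raised)
```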

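Small and large integers

"Small" and "large" do not have their everyday meaning here. A small integer is one that falls inside a range hard-coded in CPython, and anything outside that range is a large integer. By default the small integers are -5 through 256 inclusive, that is, the half-open interval [-5, 257).

Why small and large integers are distinguished

Why make this distinction? In real programs, small values such as 1, 10 and 100 are used far more often than large ones. Without a special mechanism, these frequently used small integers would be allocated and freed over and over again. That lowers execution speed and fragments memory, which hurts Python's overall performance.

Python therefore applies an object pool to small integers: each one is created once, kept alive for the lifetime of the interpreter, and shared wherever it is needed. If necessary, you can change the small-integer range by editing the CPython source and recompiling, so that the range better fits your own workload.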

[Include/internal/pycore_interp.h]
#define _PY_NSMALLPOSINTS           257
#define _PY_NSMALLNEGINTS           5
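
The effect of the pool can be observed from Python with the is operator. The sketch below is specific to CPython and relies on an implementation detail, so treat its result as an observation rather than a language guarantee. The helper shares_cached_object is a made-up name; it builds the same value twice at run time (through int(str(n)), which avoids compile-time constant sharing) and checks whether both results are the same object.

```python
def shares_cached_object(n):
    """Hypothetical helper: True if two independently built ints equal to n
    turn out to be one and the same object."""
    a = int(str(n))   # built at run time, not a shared compile-time constant
    b = int(str(n))
    return a is b


# Probe values on both sides of the boundaries defined by the macros above.
for n in (-6, -5, 0, 256, 257, 10**6):
    print(n, shares_cached_object(n))
```

Because identity of equal integers depends on this cache, application code should compare integers with == and never with is.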

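Initializing the small integer pool

At startup, Python builds the whole pool of small integer objects in one go, so that frequent use of small integers does not cost performance later.

The pool is built in _PyLong_Init. Near the top of the file, two macros pick up the bounds of the small-integer range. A for loop then creates every small integer and caches it in memory, so that later code can use these objects directly.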

[Objects/longobject.c]
#define NSMALLNEGINTS           _PY_NSMALLNEGINTS
#define NSMALLPOSINTS           _PY_NSMALLPOSINTS

int
_PyLong_Init(PyInterpreterState *interp)
{
    for (Py_ssize_t i=0; i < NSMALLNEGINTS + NSMALLPOSINTS; i++) {
        sdigit ival = (sdigit)i - NSMALLNEGINTS;
        int size = (ival < 0) ? -1 : ((ival == 0) ? 0 : 1);

        PyLongObject *v = _PyLong_New(1);
        if (!v) {
            return -1;
        }

        Py_SET_SIZE(v, size);
        v->ob_digit[0] = (digit)abs(ival);

        interp->small_ints[i] = v;
    }
    return 0;
}
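
The index arithmetic in this loop is easy to model in Python. The sketch below is a simplified stand-in, not the interpreter's code: a plain list plays the role of interp->small_ints, and the two lookup helpers mirror what the IS_SMALL_INT and get_small_int names used by the C source are assumed to do.

```python
NSMALLNEGINTS = 5     # same values as the macros in pycore_interp.h
NSMALLPOSINTS = 257

# Slot i holds the value i - NSMALLNEGINTS, so slot 0 is -5 and the last slot is 256.
small_ints = [i - NSMALLNEGINTS for i in range(NSMALLNEGINTS + NSMALLPOSINTS)]


def is_small_int(ival):
    """Model of the range check: -5 <= ival < 257."""
    return -NSMALLNEGINTS <= ival < NSMALLPOSINTS


def get_small_int(ival):
    """Model of the lookup: shift the value by the number of negative slots."""
    if not is_small_int(ival):
        raise ValueError("not in the small integer range")
    return small_ints[ival + NSMALLNEGINTS]


print(len(small_ints), small_ints[0], small_ints[-1])
print(get_small_int(-5), get_small_int(0), get_small_int(256))
```

The offset by NSMALLNEGINTS is what lets negative values live in an array whose indexes start at zero.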

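Creating integer objects

For small integers, Python improves performance by caching them in the small integer pool. For large integers, the cost of allocating and freeing memory cannot be avoided in the same way. To reduce that cost, Python uses a memory pool, which is not covered in detail here.

CPython offers several ways to create an integer object.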

[Include/longobject.h]
PyAPI_FUNC(PyObject *) PyLong_FromLong(long);
PyAPI_FUNC(PyObject *) PyLong_FromUnsignedLong(unsigned long);
PyAPI_FUNC(PyObject *) PyLong_FromSize_t(size_t);
PyAPI_FUNC(PyObject *) PyLong_FromSsize_t(Py_ssize_t);
PyAPI_FUNC(PyObject *) PyLong_FromDouble(double);
...
PyAPI_FUNC(PyObject *) PyLong_FromString(const char *, char **, int);
#ifndef Py_LIMITED_API
PyAPI_FUNC(PyObject *) PyLong_FromUnicodeObject(PyObject *u, int base);
PyAPI_FUNC(PyObject *) _PyLong_FromBytes(const char *, Py_ssize_t, int);
#endif

Below, PyLong_FromLong serves as a walkthrough of how an integer object is created.

[Objects/longobject.c]
PyObject *
PyLong_FromLong(long ival)
{
    PyLongObject *v;
    unsigned long abs_ival;
    unsigned long t;  /* unsigned so >> doesn't propagate sign bit */
    int ndigits = 0;
    int sign;

    if (IS_SMALL_INT(ival)) {
        return get_small_int((sdigit)ival);
    }

    if (ival < 0) {
        /* negate: can't write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
        abs_ival = (unsigned long)ival;
        sign = ival == 0 ? 0 : 1;
    }

    /* Fast path for single-digit ints */
    if (!(abs_ival >> PyLong_SHIFT)) {
        v = _PyLong_New(1);
        if (v) {
            Py_SET_SIZE(v, sign);
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival, unsigned long, digit);
        }
        return (PyObject*)v;
    }

#if PyLong_SHIFT==15
    /* 2 digits */
    if (!(abs_ival >> 2*PyLong_SHIFT)) {
        v = _PyLong_New(2);
        if (v) {
            Py_SET_SIZE(v, 2 * sign);
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival & PyLong_MASK, unsigned long, digit);
            v->ob_digit[1] = Py_SAFE_DOWNCAST(
                abs_ival >> PyLong_SHIFT, unsigned long, digit);
        }
        return (PyObject*)v;
    }
#endif

    /* Larger numbers: loop to determine number of digits */
    t = abs_ival;
    while (t) {
        ++ndigits;
        t >>= PyLong_SHIFT;
    }
    v = _PyLong_New(ndigits);
    if (v != NULL) {
        digit *p = v->ob_digit;
        Py_SET_SIZE(v, ndigits * sign);
        t = abs_ival;
        while (t) {
            *p++ = Py_SAFE_DOWNCAST(
                t & PyLong_MASK, unsigned long, digit);
            t >>= PyLong_SHIFT;
        }
    }
    return (PyObject *)v;
}

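As the code shows, creating an integer object starts with the small-integer check. A small integer is fetched straight from the small integer pool. Otherwise, space is allocated according to the magnitude of the value, the digits are filled in, and the new object is returned.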
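
One detail worth isolating from PyLong_FromLong is that the size it stores is signed: the number of digits multiplied by -1, 0 or 1. The sketch below models only that calculation for the 3.10 layout shown above. The function name ob_size_for is made up, the digit width comes from sys.int_info (assuming CPython), and the digit count is derived from bit_length rather than the shift loop in the C code. Unlike the C function, it is not limited to the range of a C long.

```python
import sys

# Assumption: CPython, where sys.int_info reports the internal digit width.
SHIFT = sys.int_info.bits_per_digit


def ob_size_for(ival):
    """Hypothetical helper: the signed size the 3.10 layout would store for ival."""
    sign = (ival > 0) - (ival < 0)                    # -1, 0 or 1
    ndigits = -(-abs(ival).bit_length() // SHIFT)     # ceiling division
    return sign * ndigits


for ival in (0, 1, -1, (1 << SHIFT) - 1, 1 << SHIFT, -(1 << (2 * SHIFT))):
    print(ival, ob_size_for(ival))
```

Zero needs no digits at all in this model, which matches the size of 0 that the C code assigns when ival is 0.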

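Reference counting and reclamation

Every object in Python is built on PyObject, including the integer type described above. The relevant source is listed below: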

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;

#define PyObject_VAR_HEAD      PyVarObject ob_base;

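ob_refcnt in PyObject records the number of references to the object. As most Python users know, CPython's memory reclamation is built primarily on this count: when an object's reference count drops to zero, its memory is released right away. Reference cycles, which never reach zero on their own, are handled separately by the cyclic garbage collector.

When a new reference to an object is taken, _Py_INCREF is called and the count goes up by one.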

static inline void _Py_INCREF(PyObject *op)
{
#if defined(Py_REF_DEBUG) && defined(Py_LIMITED_API) && Py_LIMITED_API+0 >= 0x030A0000
    // Stable ABI for Python 3.10 built in debug mode.
    _Py_IncRef(op);
#else
    // Non-limited C API and limited C API for Python 3.9 and older access
    // directly PyObject.ob_refcnt.
#ifdef Py_REF_DEBUG
    _Py_RefTotal++;
#endif
    op->ob_refcnt++;
#endif
}
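
The count that _Py_INCREF maintains can be watched from Python with sys.getrefcount. The sketch below assumes a standard CPython build; the class Node and the variable names are illustrative. Absolute numbers are not meaningful, because the call itself holds a temporary reference, so only the differences are compared.

```python
import sys


class Node:
    """Throwaway class so the object is not shared with anything else."""


obj = Node()
base = sys.getrefcount(obj)          # baseline, includes the call's own temporary reference

alias = obj                          # one more name bound to the same object
after_alias = sys.getrefcount(obj)

container = [obj, obj]               # a list holding the object twice
after_list = sys.getrefcount(obj)

del alias, container                 # drop the extra references again
after_del = sys.getrefcount(obj)

print(after_alias - base, after_list - base, after_del - base)
```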

When a reference is dropped, _Py_DECREF is called and ob_refcnt goes down by one. If the count reaches zero, _Py_Dealloc is called to release the object.

static inline void _Py_DECREF(
#if defined(Py_REF_DEBUG) && !(defined(Py_LIMITED_API) && Py_LIMITED_API+0 >= 0x030A0000)
    const char *filename, int lineno,
#endif
    PyObject *op)
{
#if defined(Py_REF_DEBUG) && defined(Py_LIMITED_API) && Py_LIMITED_API+0 >= 0x030A0000
    // Stable ABI for Python 3.10 built in debug mode.
    _Py_DecRef(op);
#else
    // Non-limited C API and limited C API for Python 3.9 and older access
    // directly PyObject.ob_refcnt.
#ifdef Py_REF_DEBUG
    _Py_RefTotal--;
#endif
    if (--op->ob_refcnt != 0) {
#ifdef Py_REF_DEBUG
        if (op->ob_refcnt < 0) {
            _Py_NegativeRefcount(filename, lineno, op);
        }
#endif
    }
    else {
        _Py_Dealloc(op);
    }
#endif
}
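
The "freed as soon as the count reaches zero" behaviour can be checked from Python with a weak reference, which observes an object without keeping it alive. This is a sketch under the assumption of a standard CPython build; the class Resource and the function name are made up for the demonstration.

```python
import weakref


class Resource:
    """Throwaway class; plain instances support weak references."""


def refcount_drop_frees_immediately():
    obj = Resource()
    probe = weakref.ref(obj)      # does not add to the reference count
    alias = obj                   # second strong reference

    del obj                       # count drops, but alias still holds the object
    still_alive = probe() is not None

    del alias                     # last strong reference gone
    freed = probe() is None       # the weak reference is cleared on deallocation
    return still_alive, freed


print(refcount_drop_frees_immediately())
```

No call to the gc module is needed here, since no reference cycle is involved.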
