Python matrix operation efficiency: the fastest way to do boolean matrix operations (python, performance)

Only a few small changes in compute are needed (it still relies on _compute_nb, defined further below): the mask n is applied to m in advance, so the result is no longer scaled by the number of ones in n.

def compute(m, n):
    m = np.asarray(m)
    n = np.asarray(n)
    # Apply mask n in advance
    m2 = m & n
    # Pack booleans into uint8 for more efficient bitwise operations
    # Also transpose for better caching (maybe?)
    mb = np.packbits(m2.T, axis=1)
    # Table with number of ones in each uint8
    num_bits = (np.arange(256)[:, np.newaxis] & (1 << np.arange(8))).astype(bool).sum(1)
    # Allocate output array
    out = np.zeros((m2.shape[1], m2.shape[1]), np.int32)
    # Do the counting with Numba
    _compute_nb(mb, num_bits, out)
    # Make output symmetric
    out = out + out.T
    # Add values in diagonal
    out[np.diag_indices_from(out)] = m2.sum(0)
    # No scaling by the number of ones in n: the mask is already applied
    return out
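As a cross-check for this masked variant, a plain NumPy reference can be written as a single matrix product. This is only a sketch: compute_masked_reference is a hypothetical helper name that does not appear in the original answer, and it assumes n has shape (rows, 1) so that it broadcasts against the rows of m.

```python
import numpy as np

def compute_masked_reference(m, n):
    # Hypothetical helper: NumPy-only reference for the masked variant.
    m = np.asarray(m, dtype=bool)
    n = np.asarray(n, dtype=bool)
    # Keep only the rows of m selected by the mask n (assumed shape (rows, 1)).
    m2 = (m & n).astype(np.int32)
    # Entry (i, j) counts the rows where columns i and j are both True.
    return m2.T @ m2

m = np.array([[True, True, False],
              [False, True, True],
              [True, True, False]])
n = np.array([[True], [False], [True]])
print(compute_masked_reference(m, n))
# [[2 2 0]
#  [2 2 0]
#  [0 0 0]]
```

The diagonal of the product equals the column sums of the masked matrix, which is what the Numba version writes there explicitly.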

I would use Numba with a few tricks. First, you only need to do half of the column-wise operations, because the other half is repeated. Second, you can pack the booleans into bytes, so that each & operates on eight values at once. Third, you can run the loop over columns in parallel (here with Numba's prange). Overall, you could do it like this:

import numpy as np
import numba as nb

def compute(m, n):
    m = np.asarray(m)
    n = np.asarray(n)
    # Pack booleans into uint8 for more efficient bitwise operations
    # Also transpose for better caching (maybe?)
    mb = np.packbits(m.T, axis=1)
    # Table with number of ones in each uint8
    num_bits = (np.arange(256)[:, np.newaxis] & (1 << np.arange(8))).astype(bool).sum(1)
    # Allocate output array
    out = np.zeros((m.shape[1], m.shape[1]), np.int32)
    # Do the counting with Numba
    _compute_nb(mb, num_bits, out)
    # Make output symmetric
    out = out + out.T
    # Add values in diagonal
    out[np.diag_indices_from(out)] = m.sum(0)
    # Scale by number of ones in n
    out *= n.sum()
    return out

@nb.njit(parallel=True)
def _compute_nb(mb, num_bits, out):
    # Go through each pair of columns without repetitions (i < j only)
    for i in nb.prange(mb.shape[0] - 1):
        for j in range(i + 1, mb.shape[0]):
            # Count common bits
            v = 0
            for k in range(mb.shape[1]):
                v += num_bits[mb[i, k] & mb[j, k]]
            out[i, j] = v

# Test
m = np.array([[ True,  True, False,  True],
              [False,  True,  True,  True],
              [False, False, False, False],
              [False,  True, False, False],
              [ True,  True, False, False]])
# n needs one row per row of m and four True values to give the result
# printed below; the last row is an assumed reconstruction
n = np.array([[ True],
              [False],
              [ True],
              [ True],
              [ True]])
out = compute(m, n)
print(out)
# [[ 8  8  0  4]
#  [ 8 16  4  8]
#  [ 0  4  4  4]
#  [ 4  8  4  8]]
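The ideas used in compute and _compute_nb (bit packing, a 256-entry table of bit counts, and visiting each pair of columns only once) can also be checked without Numba. The following is a minimal NumPy-only sketch, not the answer's implementation: pair_counts_packed is a hypothetical name used for illustration, it leaves out the scaling by n, and it is much slower than the Numba version because the pair loop runs in Python.

```python
import numpy as np

def pair_counts_packed(m):
    # Hypothetical NumPy-only illustration of the packing and lookup-table steps.
    m = np.asarray(m, dtype=bool)
    # One row of packed bytes per column of m; padding bits are zeros.
    mb = np.packbits(m.T, axis=1)
    # num_bits[b] is the number of set bits in the byte value b.
    num_bits = (np.arange(256)[:, np.newaxis] & (1 << np.arange(8))).astype(bool).sum(1)
    cols = m.shape[1]
    out = np.zeros((cols, cols), np.int32)
    # Fill only the strict upper triangle: each pair (i, j) with i < j once.
    for i in range(cols - 1):
        for j in range(i + 1, cols):
            out[i, j] = num_bits[mb[i] & mb[j]].sum()
    # Mirror the upper triangle, then set the diagonal to the column sums.
    out = out + out.T
    out[np.diag_indices_from(out)] = m.sum(0)
    return out

rng = np.random.default_rng(0)
m = rng.random((37, 6)) > 0.5  # 37 rows: not a multiple of 8, so padding is exercised
expected = m.T.astype(np.int32) @ m.astype(np.int32)
print(np.array_equal(pair_counts_packed(m), expected))  # True
```

Because packbits pads the last byte with zeros, the padding never contributes to a count, which is why the row count does not have to be a multiple of eight.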

For a quick comparison, here is a small benchmark against the original loop and the NumPy-only methods (timings taken with IPython's %timeit):

import numpy as np

# Original loop
def compute_loop(m, n):
    out = np.zeros((m.shape[1], m.shape[1]), np.int32)
    for i in range(m.shape[1]):
        for j in range(m.shape[1]):
            result = m[:, i] & m[:, j]
            out[i, j] = np.sum(result & n)
    return out

# Divakar methods
def compute2(m, n):
    return np.einsum('ij,ik,lm->jk', m, m.astype(int), n)

def compute3(m, n):
    return np.einsum('ij,ik->jk', m, m.astype(int)) * n.sum()

def compute4(m, n):
    return np.tensordot(m, m.astype(int), axes=(0, 0)) * n.sum()

def compute5(m, n):
    return m.T.dot(m.astype(int)) * n.sum()

# Make random data
np.random.seed(0)
m = np.random.rand(1000, 100) > .5
n = np.random.rand(1000, 1) > .5
print(compute(m, n).shape)
# (100, 100)

%timeit compute(m, n)
# 768 µs ± 17.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit compute_loop(m, n)
# 11 s ± 1.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit compute2(m, n)
# 7.65 s ± 1.06 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit compute3(m, n)
# 23.5 ms ± 1.53 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit compute4(m, n)
# 8.96 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit compute5(m, n)
# 8.35 ms ± 266 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
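All the fast variants replace the inner np.sum(result & n) of the original loop with a multiplication by n.sum(). The reason is broadcasting: result has shape (rows,) and n has shape (rows, 1), so result & n is the (rows, rows) outer AND, and its sum factors into the product of the two counts of True values. A short check of that identity, using small made-up arrays rather than the benchmark data:

```python
import numpy as np

rng = np.random.default_rng(1)
result = rng.random(20) > 0.5      # shape (20,), plays the role of m[:, i] & m[:, j]
n = rng.random((20, 1)) > 0.5      # shape (20, 1), same layout as n in the benchmark

outer = result & n                 # broadcasts to shape (20, 20)
print(outer.shape)                 # (20, 20)
# Summing the outer AND equals the product of the two counts of True values.
print(outer.sum() == result.sum() * n.sum())  # True
```

This also explains most of the cost of compute_loop: every pair of columns builds a full (rows, rows) temporary array just to sum it.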
