
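Dlib is a C++ toolkit that is widely used in industry and academia. Its open-source license allows it to be used free of charge in any application. Dlib also provides bindings for other programming languages, such as Python.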






pip install dlib


>>> import dlib
>>> dlib.DLIB_USE_CUDA

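However, when running inference with the face_recognition library on a multi-GPU machine, I ran into abnormal GPU memory allocation (see the anomalous 253MiB entries in the nvidia-smi listing below) as well as errors such as `illegal memory access was encountered`.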

|    3   N/A  N/A     46077      C   ...envs/test/bin/python     4409MiB |
|    3   N/A  N/A     46115      C   ...envs/test/bin/python     8569MiB |
|    3   N/A  N/A     46165      C   ...envs/test/bin/python     5822MiB |
|    3   N/A  N/A     46176      C   ...envs/test/bin/python      253MiB |
|    3   N/A  N/A     46186      C   ...envs/test/bin/python      253MiB |
|    3   N/A  N/A     46192      C   ...envs/test/bin/python      253MiB |
|    4   N/A  N/A     46089      C   ...envs/test/bin/python     4409MiB |
|    4   N/A  N/A     46118      C   ...envs/test/bin/python     8569MiB |
|    4   N/A  N/A     46176      C   ...envs/test/bin/python     5569MiB |
|    6   N/A  N/A     46100      C   ...envs/test/bin/python     4409MiB |
|    6   N/A  N/A     46125      C   ...envs/test/bin/python     8569MiB |
|    6   N/A  N/A     46186      C   ...envs/test/bin/python     5569MiB |
|    7   N/A  N/A     46111      C   ...envs/test/bin/python     4409MiB |
|    7   N/A  N/A     46150      C   ...envs/test/bin/python     8569MiB |
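The anomalous rows are easier to spot when the listing is grouped by PID. The following is a minimal sketch, not part of the original investigation: the helper name `gpus_by_pid` and the regular expression are illustrative assumptions, and the sample rows are copied from the listing above. It shows that PIDs 46176 and 46186 each hold 253MiB on GPU 3 in addition to their main allocation on another GPU.

```python
import re
from collections import defaultdict

# Matches nvidia-smi process rows such as:
# |    3   N/A  N/A     46176      C   ...envs/test/bin/python      253MiB |
ROW = re.compile(
    r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+\S+\s+(\S+)\s+(\d+)MiB\s+\|$"
)

def gpus_by_pid(nvidia_smi_rows):
    """Map each PID to a list of (gpu_index, used_mib) pairs."""
    usage = defaultdict(list)
    for line in nvidia_smi_rows.splitlines():
        match = ROW.match(line.strip())
        if match:
            gpu, pid, _name, mib = match.groups()
            usage[int(pid)].append((int(gpu), int(mib)))
    return dict(usage)

# A few rows copied from the listing above.
rows = """
|    3   N/A  N/A     46176      C   ...envs/test/bin/python      253MiB |
|    3   N/A  N/A     46186      C   ...envs/test/bin/python      253MiB |
|    4   N/A  N/A     46176      C   ...envs/test/bin/python     5569MiB |
|    6   N/A  N/A     46186      C   ...envs/test/bin/python     5569MiB |
"""

# Report every process that holds memory on more than one GPU.
for pid, entries in gpus_by_pid(rows).items():
    if len(entries) > 1:
        print(pid, entries)
```

A process that appears on more than one GPU is a hint that some library in that process opened a context on a device other than the intended one.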
while calling cudnnFindConvolutionForwardAlgorithm( context(),
descriptor(data), (const cudnnFilterDescriptor_t)filter_handle,
(const cudnnConvolutionDescriptor_t)conv_handle,
descriptor(dest_desc), num_possible_algorithms, &num_algorithms,
perf_results.data()) in file /tmp/pip-install-fz7s/dlib_537e10d/dlib/cuda/cudnn_dlibapi.cpp:819.
code: 2, reason: CUDA Resources could not be allocated.
6846:CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
6847:For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
6889:2022-11-04 18:08:52,712 CUDA error: an illegal memory access was encountered
6890:CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
6891:For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
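The log itself suggests `CUDA_LAUNCH_BLOCKING=1` for debugging. One way to apply it from Python is sketched below; the helper name `enable_cuda_launch_blocking` is a hypothetical one introduced here, and the sketch assumes the variable is set before any CUDA-backed library initializes CUDA in the process.

```python
import os

def enable_cuda_launch_blocking(env=os.environ):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given environment mapping.

    With this variable set, kernel launches run synchronously, so an error
    is reported at the call that caused it rather than at a later API call.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env

enable_cuda_launch_blocking()

# Import CUDA-backed libraries only after this point, for example:
# import face_recognition
```

Synchronous launches slow execution down, so this setting is meant for debugging runs only.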



>>> import dlib
>>> dlib.DLIB_USE_CUDA



git clone https://github.com/davisking/dlib.git
cd dlib
python setup.py install --no DLIB_USE_CUDA


>>> import dlib
>>> dlib.DLIB_USE_CUDA
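The same check can be scripted so that it also works in environments where dlib is missing. This is a small sketch; `dlib_cuda_status` is a hypothetical helper name, and the only dlib attribute it relies on is `dlib.DLIB_USE_CUDA`, as used in the session above.

```python
def dlib_cuda_status():
    """Return dlib.DLIB_USE_CUDA as a bool, or None if dlib cannot be imported."""
    try:
        import dlib
    except ImportError:
        return None
    return bool(dlib.DLIB_USE_CUDA)

status = dlib_cuda_status()
if status is None:
    print("dlib is not installed in this environment")
elif status:
    print("dlib was built with CUDA support")
else:
    print("dlib was built without CUDA support (CPU only)")
```

After a rebuild with `--no DLIB_USE_CUDA`, the expected result is the CPU-only branch.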


import numpy as np
import face_recognition
img = np.zeros((100,100,3)).astype(np.uint8)
face_encodings = face_recognition.face_encodings(img, known_face_locations=[[10, 50, 50, 10]], model="small")


face_encodings = face_recognition.face_encodings(img[:, : , ::-1], known_face_locations=[[10, 50, 50, 10]], model="small")


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data1/miniconda3/envs/torch1.12/lib/python3.7/site-packages/face_recognition/api.py", line 214, in face_encodings
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
  File "/data1/miniconda3/envs/torch1.12/lib/python3.7/site-packages/face_recognition/api.py", line 214, in <listcomp>
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
TypeError: compute_face_descriptor(): incompatible function arguments. The following argument types are supported:
    1. (self: _dlib_pybind11.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),numpy.uint8], face: _dlib_pybind11.full_object_detection, num_jitters: int = 0, padding: float = 0.25) -> _dlib_pybind11.vector
    2. (self: _dlib_pybind11.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),numpy.uint8], num_jitters: int = 0) -> _dlib_pybind11.vector
    3. (self: _dlib_pybind11.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),numpy.uint8], faces: _dlib_pybind11.full_object_detections, num_jitters: int = 0, padding: float = 0.25) -> _dlib_pybind11.vectors
    4. (self: _dlib_pybind11.face_recognition_model_v1, batch_img: List[numpy.ndarray[(rows,cols,3),numpy.uint8]], batch_faces: List[_dlib_pybind11.full_object_detections], num_jitters: int = 0, padding: float = 0.25) -> _dlib_pybind11.vectorss
    5. (self: _dlib_pybind11.face_recognition_model_v1, batch_img: List[numpy.ndarray[(rows,cols,3),numpy.uint8]], num_jitters: int = 0) -> _dlib_pybind11.vectors

Invoked with: <_dlib_pybind11.face_recognition_model_v1 object at 0x7f746912a370>, array([[[0, 0, 0],[0, 0, 0],[0, 0, 0],
img = np.ascontiguousarray(img[:, :, ::-1])
face_encodings = face_recognition.face_encodings(img, known_face_locations=[[10, 50, 50, 10]], model="small")
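The difference between the failing call and the working one can be inspected with NumPy alone. The sketch below uses the same all-zero test image; it shows that `img[:, :, ::-1]` is a view with a negative stride rather than a C-contiguous array, which is presumably why the dlib binding rejects it, and that `np.ascontiguousarray` returns a contiguous copy with the same values.

```python
import numpy as np

img = np.zeros((100, 100, 3), dtype=np.uint8)

# Reversing the channel axis returns a view with a negative stride, not a copy.
flipped = img[:, :, ::-1]
print(flipped.flags["C_CONTIGUOUS"])  # False
print(flipped.strides)                # (300, 3, -1)

# np.ascontiguousarray copies the data into a fresh C-ordered buffer.
fixed = np.ascontiguousarray(flipped)
print(fixed.flags["C_CONTIGUOUS"])    # True
print(fixed.strides)                  # (300, 3, 1)
```

The copy costs one extra pass over the image, which is usually negligible next to the cost of computing the face encoding.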




2022-11-04 15:32:20.964078: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:723] failed to record completion event; therefore, failed to create inter-stream dependency
2022-11-04 15:32:20.964146: E tensorflow/stream_executor/cuda/cuda_driver.cc:1183] failed to enqueue async memcpy from host to device: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered; GPU dst: 0x7f6b56e84a00; host src: 0x55bf1eaa4cc0; size: 602112=0x93000
2022-11-04 15:32:20.964171: E tensorflow/stream_executor/stream.cc:334] Error recording event in stream: Error recording CUDA event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2022-11-04 15:32:20.964341: E tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-11-04 15:32:20.964352: F tensorflow/core/common_runtime/device/device_event_mgr.cc:221] Unexpected Event status: 1




