2021.08.23 study notes: the relationship between PyTorch and Torch, and the role of Torchvision

**PyTorch**:
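PyTorch is an open-source Python machine learning library based on Torch, used for applications such as natural language processing. In January 2017, Facebook AI Research (FAIR) released PyTorch, building on Torch. It is a Python-based scientific computing package that provides two high-level features: 1. tensor computation with strong GPU acceleration (similar to NumPy); 2. deep neural networks built on an automatic differentiation system.
PyTorch's predecessor is Torch. Its low-level layer is the same as the Torch framework's, but much of the rest was rewritten in Python, which makes it more flexible, adds support for dynamic graphs, and provides a Python interface. Developed by the Torch7 team, it is a Python-first deep learning framework that offers strong GPU acceleration and also supports dynamic neural networks.
PyTorch can be seen both as NumPy with GPU support and as a powerful deep neural network library with automatic differentiation. Besides Facebook, it has been adopted by organizations such as Twitter, CMU, and Salesforce.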
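The two views described above (NumPy with GPU support, plus automatic differentiation) can be sketched in a few lines. This is only an illustration: the array values and variable names are made up for the example, and the code falls back to the CPU when no CUDA device is present.

```python
import numpy as np
import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors behave much like NumPy arrays and can be created from them.
a = torch.from_numpy(np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)).to(device)
b = torch.ones(2, 2, device=device)
c = a @ b  # matrix product, computed on `device`

# Autograd: track operations on x, then differentiate y = sum(x^2) w.r.t. x.
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
y = (x ** 2).sum()
y.backward()  # fills x.grad with dy/dx = 2 * x

product = c.cpu().tolist()
gradient = x.grad.cpu().tolist()
print(product)
print(gradient)
```

The same code runs unchanged on CPU or GPU because every tensor is created on `device`.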
**Basic environment**: a PC, a high-performance NVIDIA graphics card (optional), and Ubuntu.

**Differences and connections between PyTorch and Torch**

**torchvision**
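The torchvision library is a commonly used image-processing package in the PyTorch ecosystem. It can be used to load image and video datasets (torchvision.datasets), preprocess images (torchvision.transforms), import pretrained models (torchvision.models), and build and save images (torchvision.utils).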

Among these, the transforms module preprocesses images with operations such as normalization (normalize), resizing (resize), and flipping (flip).

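In practice these steps are usually applied as a sequence, in which case Compose (transforms.Compose) can chain the preprocessing operations together.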

Many toolkits built on PyTorch are very handy, for example torchtext for natural language, torchaudio for audio, and torchvision for images and video.

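torchvision contains commonly used datasets, models, transform functions, and so on. As of version 0.5.0 (the current version when this was written) it includes tools for image classification, semantic segmentation, object detection, instance segmentation, keypoint detection, and video classification, and Mask R-CNN functionality is included as well. The PyTorch version of mask-rcnn supports torchvision only up to 0.2.*; from 0.3.0 onward, Mask R-CNN is part of torchvision itself.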

**The Image class of the Python imaging library PIL and its methods**
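Images are usually read as RGB, but an image is not necessarily stored in RGB format, in which case it needs to be converted. The blog post credited below explains this in great detail; saved here for future reference.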
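A small sketch of that conversion with PIL's `Image.convert`. The in-memory images are hypothetical stand-ins for files that would normally be opened with `Image.open`.

```python
import numpy as np
from PIL import Image

# Small in-memory images stand in for files on disk (hypothetical inputs).
# Note that PIL sizes are (width, height).
gray = Image.new("L", (4, 2), color=128)                    # 8-bit grayscale
rgba = Image.new("RGBA", (4, 2), color=(10, 20, 30, 255))   # RGB plus alpha

# convert() returns a new image in the requested mode; the original is unchanged.
gray_rgb = np.asarray(gray.convert("RGB"))  # height x width x 3, dtype uint8
rgba_rgb = np.asarray(rgba.convert("RGB"))  # alpha channel is dropped

print(gray.mode, "->", gray_rgb.shape)
print(rgba.mode, "->", rgba_rgb.shape)
```

After conversion both arrays share the same height x width x 3 layout, whatever mode the source image used.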

    import numpy as np
    import torch
    from PIL import Image

    # _M_RGB2YUV and _M_YUV2RGB are 3x3 conversion matrices that are
    # not shown in this excerpt; they must be defined before use.


    def convert_PIL_to_numpy(image, format):
        """
        Convert PIL image to numpy array of target format.

        Args:
            image (PIL.Image): a PIL image
            format (str): the format of output image

        Returns:
            (np.ndarray): also see `read_image`
        """
        if format is not None:
            # PIL only supports RGB, so convert to RGB and flip channels over below
            conversion_format = format
            if format in ["BGR", "YUV-BT.601"]:
                conversion_format = "RGB"
            image = image.convert(conversion_format)
        image = np.asarray(image)
        # PIL squeezes out the channel dimension for "L", so make it HWC
        if format == "L":
            image = np.expand_dims(image, -1)
        # handle formats not supported by PIL
        elif format == "BGR":
            # flip channels if needed
            image = image[:, :, ::-1]
        elif format == "YUV-BT.601":
            image = image / 255.0
            image = np.dot(image, np.array(_M_RGB2YUV).T)
        return image


    def convert_image_to_rgb(image, format):
        """
        Convert an image from given format to RGB.

        Args:
            image (np.ndarray or Tensor): an HWC image
            format (str): the format of input image, also see `read_image`

        Returns:
            (np.ndarray): (H,W,3) RGB image in 0-255 range, can be either float or uint8
        """
        if isinstance(image, torch.Tensor):
            image = image.cpu().numpy()
        if format == "BGR":
            image = image[:, :, [2, 1, 0]]
        elif format == "YUV-BT.601":
            image = np.dot(image, np.array(_M_YUV2RGB).T)
            image = image * 255.0
        else:
            if format == "L":
                image = image[:, :, 0]
            image = image.astype(np.uint8)
            image = np.asarray(Image.fromarray(image, mode=format).convert("RGB"))
        return image


    def read_image(file_name, format=None):
        # Convert inside the `with` block: PIL loads pixel data lazily,
        # so the file must still be open when the image is converted.
        with open(file_name, "rb") as f:
            image = Image.open(f)
            return convert_PIL_to_numpy(image, format)
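The two conversions that PIL does not handle, BGR and YUV-BT.601, are plain array operations, and a round trip makes that easier to see. In the sketch below the matrix values are the commonly cited BT.601 coefficients and are an assumption, since the excerpt does not show its own `_M_RGB2YUV`; the inverse is computed with `np.linalg.inv` rather than hard-coded, and all variable names are illustrative.

```python
import numpy as np

# BT.601 RGB -> YUV coefficients (an assumption: the excerpt does not show its matrix).
RGB2YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])
YUV2RGB = np.linalg.inv(RGB2YUV)  # inverse computed numerically, not hard-coded

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(2, 3, 3)).astype(np.float64)  # HWC image, 0-255

# BGR is RGB with the channel axis reversed; reversing twice restores RGB.
bgr = rgb[:, :, ::-1]
rgb_from_bgr = bgr[:, :, [2, 1, 0]]

# YUV: scale to [0, 1], multiply every pixel by the matrix, then undo both steps.
yuv = np.dot(rgb / 255.0, RGB2YUV.T)
rgb_from_yuv = np.dot(yuv, YUV2RGB.T) * 255.0

bgr_ok = np.array_equal(rgb_from_bgr, rgb)
yuv_ok = np.allclose(rgb_from_yuv, rgb)
print(bgr_ok, yuv_ok)
```

`np.dot` with an HWC array and a transposed 3x3 matrix applies the matrix to each pixel's channel vector, which is why no explicit loop over pixels is needed.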

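Copyright notice: this is an original article by CSDN blogger "Leemboy", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/leemboy/article/details/83792729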

This post focuses on convert, which converts an image from one format to another.

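Copyright notice: this is an original article by CSDN blogger "icamera0", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/icamera0/article/details/50843172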