Implementing Four Classic Convolution Operators in PyTorch 2.x

OpenCV学堂

Published 2026-04-02 17:45:42



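PyTorch 2.x is a recent major version of PyTorch that provides many new features and improvements. This article shows how to implement four common convolution operations in PyTorch 2.x:

Standard convolution (Conv2d)
Depthwise separable convolution
Transposed convolution
Dilated convolution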

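1. Standard Convolution (Conv2d)

Standard convolution is the most common convolution operation. It slides a set of learned kernels over the input to extract feature maps.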

import torch
import torch.nn as nn
# Define a standard 2D convolution layer
conv_layer = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
# Create an input tensor (batch size, channels, height, width)
input_tensor = torch.randn(8, 3, 32, 32)
# Forward pass
output_tensor = conv_layer(input_tensor)
print(output_tensor.shape)  # torch.Size([8, 64, 32, 32])
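
The (8, 64, 32, 32) shape above follows from the output-size formula given in the PyTorch `nn.Conv2d` documentation. The sketch below works through that arithmetic in plain Python. The helper name `conv2d_out_size` is hypothetical and introduced only for illustration; it assumes a square kernel and the same padding on both sides.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output size of a 2D convolution along one spatial dimension.

    Hypothetical helper (not part of PyTorch). It mirrors the formula
    documented for nn.Conv2d:
        floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Settings from the standard convolution example: 3x3 kernel, stride 1, padding 1
print(conv2d_out_size(32, kernel_size=3, stride=1, padding=1))  # 32

# Assumed variation for comparison: stride 2 halves the spatial size
print(conv2d_out_size(32, kernel_size=3, stride=2, padding=1))  # 16
```

With a 3x3 kernel, stride 1, and padding 1, the height and width are preserved, which is why only the channel count changes from 3 to 64.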

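2. Depthwise Separable Convolution

Depthwise separable convolution factorizes a standard convolution into a depthwise convolution (one spatial filter per input channel) followed by a pointwise (1x1) convolution that mixes channels.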

import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # Depthwise: groups=in_channels gives one spatial filter per input channel
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, groups=in_channels)
        # Pointwise: 1x1 convolution that mixes channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Define a depthwise separable convolution layer
ds_conv_layer = DepthwiseSeparableConv(in_channels=3, out_channels=64)
# Create an input tensor (batch size, channels, height, width)
input_tensor = torch.randn(8, 3, 32, 32)
# Forward pass
output_tensor = ds_conv_layer(input_tensor)
print(output_tensor.shape)  # torch.Size([8, 64, 32, 32])
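
One way to see what the factorization buys is to count parameters. As a rough check, the sketch below compares a standard 3x3 convolution with a depthwise plus pointwise pair for the same 3 -> 64 channel setting. `count_params` is a hypothetical helper, and the `nn.Sequential` pair is only a stand-in for the `DepthwiseSeparableConv` class above.

```python
import torch.nn as nn


def count_params(module):
    """Total number of learnable parameters (hypothetical helper)."""
    return sum(p.numel() for p in module.parameters())


in_channels, out_channels = 3, 64

# Standard 3x3 convolution: 3*64*3*3 weights + 64 biases
standard = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

# Depthwise (one 3x3 filter per input channel) followed by pointwise (1x1)
separable = nn.Sequential(
    nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1, groups=in_channels),
    nn.Conv2d(in_channels, out_channels, kernel_size=1),
)

standard_params = count_params(standard)    # 1728 + 64 = 1792
separable_params = count_params(separable)  # (27 + 3) + (192 + 64) = 286
print(standard_params, separable_params)
```

Ignoring biases, the separable form needs roughly 1/out_channels + 1/k^2 of the weights of the standard form for a k x k kernel, so the saving becomes more pronounced as channel counts grow.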

3. Transposed Convolution

Transposed convolution is used for upsampling, that is, for increasing the spatial resolution of a feature map.

import torch
import torch.nn as nn
# Define a transposed convolution layer
trans_conv_layer = nn.ConvTranspose2d(in_channels=64, out_channels=3, kernel_size=4, stride=2, padding=1)
# Create an input tensor (batch size, channels, height, width)
input_tensor = torch.randn(8, 64, 16, 16)
# Forward pass
output_tensor = trans_conv_layer(input_tensor)
print(output_tensor.shape)  # torch.Size([8, 3, 32, 32])
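
The doubling from 16 to 32 can be checked against the output-size formula in the PyTorch `nn.ConvTranspose2d` documentation. The following is a minimal sketch in plain Python; `conv_transpose2d_out_size` is a hypothetical helper name, and it assumes a square kernel and symmetric padding.

```python
def conv_transpose2d_out_size(size, kernel_size, stride=1, padding=0,
                              output_padding=0, dilation=1):
    """Output size of a 2D transposed convolution along one spatial dimension.

    Hypothetical helper (not part of PyTorch). It mirrors the formula
    documented for nn.ConvTranspose2d:
        (size - 1)*stride - 2*padding + dilation*(kernel_size - 1)
        + output_padding + 1
    """
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Settings from the transposed convolution example: 4x4 kernel, stride 2, padding 1
print(conv_transpose2d_out_size(16, kernel_size=4, stride=2, padding=1))  # 32

# Assumed variation: a 2x2 kernel with stride 2 and no padding also doubles the size
print(conv_transpose2d_out_size(16, kernel_size=2, stride=2, padding=0))  # 32
```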

4. Dilated Convolution

Dilated convolution enlarges the receptive field by inserting gaps between the kernel elements, without adding parameters.

import torch
import torch.nn as nn
# Define a dilated convolution layer
dil_conv_layer = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=2, dilation=2)
# Create an input tensor (batch size, channels, height, width)
input_tensor = torch.randn(8, 3, 32, 32)
# Forward pass
output_tensor = dil_conv_layer(input_tensor)
print(output_tensor.shape)  # torch.Size([8, 64, 32, 32])
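
The choice of padding=2 for dilation=2 is not arbitrary: a 3x3 kernel with dilation 2 spans a 5x5 window, so it needs the padding a 5x5 kernel would need to keep the spatial size. The sketch below tabulates this for a few dilation rates. Both helper names are hypothetical, and the padding rule assumes stride 1 and an odd kernel size.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of a dilated kernel along one dimension (hypothetical helper)."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size, dilation):
    """Padding that keeps the spatial size at stride 1 (hypothetical helper).

    Assumes an odd kernel_size so the padding is symmetric.
    """
    return dilation * (kernel_size - 1) // 2


# 3x3 kernel at several dilation rates: (dilation, effective size, padding)
for d in (1, 2, 4):
    print(d, effective_kernel_size(3, d), same_padding(3, d))
# 1 3 1
# 2 5 2
# 4 9 4
```

The number of weights stays at 3x3 per channel pair in every row; only the area the kernel covers changes.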

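Mastering these four classic convolution operators is very useful for computer vision work: they appear in models for image classification, object detection, instance segmentation, pose estimation, and semantic segmentation.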

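This article is part of the Tencent Cloud self-media syndication program and is shared from a WeChat official account.

Originally published: 2025-04-06. In case of infringement, please contact cloudcommunity@tencent.com for removal.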
