with torch.no_grad(): inside this context manager, every tensor produced by a computation has requires_grad set to False. Even if a tensor x has requires_grad=True, anything computed from x inside with torch.no_grad() has requires_grad=False.
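
As a minimal sketch of the behaviour described above (the tensor names x, y, z are illustrative only), the same expression can be evaluated outside and inside the context:

```python
import torch

x = torch.ones(3, requires_grad=True)

# Outside the context, results derived from x track gradients.
y = x * 2

# Inside torch.no_grad(), results do not track gradients, even though x does.
with torch.no_grad():
    z = x * 2

print(y.requires_grad, z.requires_grad, x.requires_grad)
```

Note that x itself keeps requires_grad=True; only the results computed inside the block are detached from the graph.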

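model.state_dict() method

state_dict() is a method of torch.nn.Module. It returns a plain Python dictionary that maps each layer's parameter names to the corresponding tensors (for example, the weights and biases of each layer in the model).
It is also the basis of one of PyTorch's ways to save and load a model: the returned OrderedDict stores the names of the network's components together with their parameters.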

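Notes:

Only layers that hold parameters (or registered buffers) appear in the model's state_dict, e.g. convolutional and linear layers; layers without parameters, such as pooling layers, have no entry in the dictionary.
state_dict() is convenient for inspecting the weights and biases of a given layer, and it is also what you use when saving a model.
Inspecting the weights and biases of a Module's layers:

e.g. iterating over a dictionary iterates over its keys by default, so param below is actually the key (the parameter name).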

for param in model.state_dict():
    print(param, '\t', model.state_dict()[param].size())
'''
results:
	conv1.weight	torch.Size([32, 3, 3, 3])
	conv1.bias		torch.Size([32])
'''
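
To make the point about parameter-free layers concrete, here is a small sketch; the toy network and its layer sizes are assumptions chosen only for illustration. The pooling and flatten layers contribute no keys:

```python
import torch.nn as nn

# A toy network: conv -> pool -> flatten -> linear (sizes are arbitrary).
toy = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Keys are "<index>.<parameter name>"; indices 1 and 2 are absent.
keys = list(toy.state_dict().keys())
conv_weight_shape = toy.state_dict()["0.weight"].shape
print(keys, conv_weight_shape)
```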

The optimizer's state_dict() method

An optimizer object also has a state_dict() method. It contains the optimizer's state and the hyperparameters in use (e.g. lr, momentum, weight_decay).

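from torch.optim import SGD

optimizer = SGD(model.parameters(), lr=0.001, momentum=0.9)
for var in optimizer.state_dict():
    print(var, '\t', optimizer.state_dict()[var])
'''
results (abridged; the exact keys depend on the PyTorch version):
	state	{}
	param_groups	[{'lr': 0.001, 'momentum': 0.9, ..., 'params': [0, 1, ...]}]
'params' holds the indices of the parameters, not their values.
'''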

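torch.where(condition, x, y) function

condition is a boolean tensor, and x and y are tensors of the same (or broadcastable) shape. For each position, the result takes the element from x where the condition holds and the element from y where it does not.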

a = torch.randn(3, 4)
print(a)
b = torch.arange(12, dtype=torch.float).reshape(3, 4)
print(b)
print(torch.where(a > 0, a, b))
-----------------------------------
tensor([[ 0.8974,  1.1078, -0.8711,  0.9044],
        [ 0.1937, -0.3344, -0.1034, -0.0874],
        [-0.4632, -1.5329,  1.0019, -0.8950]])
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
tensor([[ 0.8974,  1.1078,  2.0000,  0.9044],
        [ 0.1937,  5.0000,  6.0000,  7.0000],
        [ 8.0000,  9.0000,  1.0019, 11.0000]])

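torch.cuda and the .cuda() method

CUDA: Compute Unified Device Architecture.

In essence: CUDA is NVIDIA's parallel computing platform for running general-purpose computation on the GPU; PyTorch uses it to run deep learning models on the GPU and to move data between CPU and GPU.

Quick rule of thumb: send the model and every tensor involved in the computation to the GPU!

PyTorch does not use the GPU automatically, even on a machine that has one; you must request it explicitly in your program. Calling model.cuda() moves the model to the GPU, but this approach is not recommended.

Prefer model.to(device) for sending both data and model to the GPU. It states explicitly which compute resource to use, which matters especially when there are several GPUs.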

# Ways to select the CUDA GPU
 
# Option 1: use os.environ to restrict which GPUs are visible to the process
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1' # make GPUs 0 and 1 visible
-------------------------------------------------------
# Option 2: use a "device"; afterwards call .to(device) on whatever should live on the GPU
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu") # use GPU 1

# Ways to move the model to the GPU
# Option 1: model.cuda()
-------------------------------------------------------
# Option 2: model.to(device)

torch.cuda.is_available()

Returns True if the current environment can actually use a GPU through CUDA.

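tensor.to(device) / model.to(device)

For a tensor, returns a copy on the device given by device (e.g. a GPU), so that subsequent operations on it run there; for a model, moves its parameters to that device.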
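
A device-agnostic sketch of this pattern follows; the layer and batch sizes are arbitrary assumptions, and the code falls back to the CPU when no GPU is present:

```python
import torch
import torch.nn as nn

# Pick the GPU when available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)   # moves the module's parameters to device
x = torch.randn(8, 4).to(device)     # returns a tensor that lives on device

out = model(x)                       # the computation runs on device
print(out.device, out.shape)
```

Model and input must be on the same device, otherwise the forward call raises a RuntimeError.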

torch.cuda.empty_cache()

Releases the unused GPU memory held by PyTorch's caching allocator.

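Running Python code on the GPU

import torch
# Define a simple custom function
def add(a, b):
    return a + b
# Compile the Python function to TorchScript
add_script = torch.jit.script(add)
# A scripted function has no .cuda() method; it runs on whatever device its inputs live on
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Call the function on GPU tensors (falls back to the CPU when no GPU is available)
x = torch.ones(5, device=device)
y = torch.ones(5, device=device)
z = add_script(x, y)
print(z)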

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

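Cause: a NumPy array lives in host (CPU) memory, so a tensor stored on the GPU cannot be converted to NumPy directly.

Fix: copy the tensor from the GPU back to the host first with tensor.cpu().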
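
A short sketch of the usual conversion chain, assuming a tensor that may also require gradients (in which case detach() is needed before numpy()); the variable names are illustrative:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t = torch.ones(3, requires_grad=True, device=device)

# detach() drops the autograd history, cpu() copies to host memory,
# numpy() then exposes the host data as a NumPy array.
arr = (t * 2).detach().cpu().numpy()
print(arr)
```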

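torch files

.pt, .pth, .pkl files

PyTorch files are commonly saved with the suffixes .pt, .pth, or .pkl. There is no practical difference between them; the suffix is a matter of personal preference.

torch.save() and torch.load() functions

torch.save() is generally used to save a file/model, and torch.load() to load one.

Note: in the first form below, torch.save() stores only the model's parameters (its state_dict), not the model itself!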

# Save the model
torch.save(model.state_dict(), 'mymodel.pth')  # saves only the weights, not the model structure

# Load the model
model = My_model(*args, **kwargs)  # the model structure must be defined again here
model.load_state_dict(torch.load('mymodel.pth'))  # fill that structure with the stored parameters
model.eval()
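
The state_dict round trip can be checked end to end with a throwaway model. This is only a sketch: the nn.Linear model and the temporary file location are assumptions made for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(3, 1)
path = os.path.join(tempfile.mkdtemp(), "mymodel.pth")

# Save only the parameters.
torch.save(model.state_dict(), path)

# Rebuild the same structure, then load the stored parameters into it.
restored = nn.Linear(3, 1)
restored.load_state_dict(torch.load(path))
restored.eval()

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```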

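Saving the whole model

# Save the model
torch.save(model, 'mymodel.pth')  # saves the whole model object

# Load the model
# No need to rebuild the structure, but the model's class definition must be importable;
# recent PyTorch releases may also require torch.load(..., weights_only=False) here
model = torch.load('mymodel.pth')
model.eval()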

torch.nn

The conventional way to build a model

torch.nn.Module is the base class of all networks; a DNN model inherits from it.

import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

        self.linear1=nn.Linear(1,10)
        self.activation1=nn.ReLU()
        self.linear2=nn.Linear(10,100)
        self.activation2=nn.ReLU()
        self.linear3=nn.Linear(100,10)
        self.activation3=nn.ReLU()
        self.linear4=nn.Linear(10,1)
    def forward(self,x):
        out=self.linear1(x)
        out=self.activation1(out)
        out=self.linear2(out)
        out=self.activation2(out)
        out=self.linear3(out)
        out=self.activation3(out)
        out=self.linear4(out)
        return out

model = Model()

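torch.nn.Parameter(Tensor)

Background: when a network has some custom design, it may need extra parameters that are learned and updated together with the rest of the network during training, ending up at their optimal values.

In essence: torch.nn.Parameter(Tensor) takes a tensor as input and produces a parameter, e.g. a weight matrix W. It turns a fixed, non-trainable tensor into a trainable one (a parameter), and assigning it as an attribute binds the parameter to the module.

requires_grad=True means the parameter is trainable (it receives gradients and gets updated); False means it is left unchanged by training.

nn.Parameter is a subclass of Tensor, so its computation history is recorded and it takes part in backpropagation. If a Tensor is a Parameter, it is automatically added to the model's parameter list when assigned as a module attribute. So when writing a custom layer with parameters, define them as Parameter. Besides defining a Parameter directly, you can use ParameterList and ParameterDict to define a list and a dictionary of parameters, respectively.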

class MyListDense(nn.Module):
    def __init__(self):
        super(MyListDense, self).__init__()
        self.params = nn.ParameterList([nn.Parameter(torch.randn(4, 4))
                                         for i in range(3)])
        self.params.append(nn.Parameter(torch.randn(4, 1)))

    def forward(self, x):
        for i in range(len(self.params)):
            x = torch.mm(x, self.params[i])
        return x
net = MyListDense()
print(net)

class MyDictDense(nn.Module):
    def __init__(self):
        super(MyDictDense, self).__init__()
        self.params = nn.ParameterDict({
                'linear1': nn.Parameter(torch.randn(4, 4)),
                'linear2': nn.Parameter(torch.randn(4, 1))
        })
        self.params.update({'linear3': nn.Parameter(torch.randn(4, 2))}) # add a new entry

    def forward(self, x, choice='linear1'):
        return torch.mm(x, self.params[choice])

net = MyDictDense()
print(net)

import torch
from torch import nn

# Convolution operation (2-D cross-correlation)
def corr2d(X, K): 
    h, w = K.shape
    X, K = X.float(), K.float()
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i: i + h, j: j + w] * K).sum()
    return Y

# 2-D convolution layer
class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super(Conv2D, self).__init__()
        self.weight = nn.Parameter(torch.randn(kernel_size))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

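torch.nn.ModuleList()

nn.ModuleList() is a list-like container of modules. It imposes no execution order on them and does not implement a forward() method.

About nn.ModuleList():

ModuleList can hold several sub-modules. Instead of writing a separate forward for each of them, you can store them in one ModuleList and iterate over it in a single forward.
ModuleList is a subclass of Module, so when a Module uses it, the contents are automatically recognised as sub-modules, and the parameters of the nn.Module objects inside an nn.ModuleList are added to the network's parameters.
ModuleList makes the network structure flexible.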

import torch
import torch.nn as nn
 
class testNet(nn.Module):
    def __init__(self):
        super(testNet, self).__init__()
        self.layers = nn.ModuleList([
            nn.Linear(100,50),
            nn.Linear(50,25),
        ]) 
    
    def forward(self, x):
        # ModuleList has no forward(), so apply the layers one by one
        for layer in self.layers:
            x = layer(x)
 
        return x
 
testnet = testNet()
input_x = torch.ones(100)
output_x = testnet(input_x) 
print(output_x)

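torch.nn.Module

children() and modules() methods

Module.children() and Module.modules() both return the components that make up a network.

children() returns only the outermost elements (the immediate child modules).
modules() returns all elements: the module itself and every sub-module, recursively.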
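
The difference is easiest to see on a nested container. In this sketch the nested nn.Sequential is an assumption chosen purely for illustration:

```python
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 8),
    nn.Sequential(nn.ReLU(), nn.Linear(8, 2)),
)

# children(): only the two top-level entries.
n_children = len(list(net.children()))

# modules(): net itself, Linear, the inner Sequential, ReLU, Linear.
n_modules = len(list(net.modules()))

print(n_children, n_modules)
```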

Module.parameters() method

model.parameters() returns the model's learnable parameters (those updated by gradients); note that it returns a generator.

for param in model.parameters():
    print(param.size())
'''
results:
	torch.Size([32, 3, 3, 3])
'''

Compared with state_dict(), parameters() has no keys; it is just a generator over the bare parameters.

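next() function

next() is a Python built-in, not part of torch; it returns the next item of an iterator, e.g. next(model.parameters()).

iter() function

iter() is also a Python built-in; it creates an iterator from an iterable.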

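torch.manual_seed() function

Sets the seed for generating random numbers (on the CPU and, in current releases, on CUDA devices too), so that results can be reproduced on the next run.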
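
A minimal sketch of what reproducibility means here: re-seeding with the same value (0 is an arbitrary choice) replays the same random numbers.

```python
import torch

torch.manual_seed(0)
a = torch.rand(3)

torch.manual_seed(0)   # same seed again
b = torch.rand(3)      # same values as a

c = torch.rand(3)      # no re-seeding: the generator has moved on

print(torch.equal(a, b), torch.equal(a, c))
```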

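model.train()

Enables batch normalization and dropout in their training behaviour.

If the model contains BN or dropout layers, call model.train() during training so that each BN layer uses the mean and variance of the current batch. For dropout, model.train() means a random subset of activations is dropped at each step while the parameters are updated.

model.eval()

Switches batch normalization and dropout to their inference behaviour.

If the model contains BN or dropout layers, call model.eval() at test time. model.eval() makes each BN layer use the running mean and variance accumulated over the training data, i.e. the BN statistics stay fixed during testing. For dropout, model.eval() uses all network connections, i.e. no neurons are randomly dropped.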
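
The effect of the two modes on dropout can be observed directly. This sketch uses a bare nn.Dropout layer rather than a full model, which is an assumption made to keep it short:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
y_train = drop(x)   # elements are zeroed at random; survivors are scaled by 1/(1-p) = 2

drop.eval()
y_eval = drop(x)    # dropout is a no-op in eval mode

print(y_train.unique(), torch.equal(y_eval, x))
```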

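torch.nn.Sequential()

A quick way to assemble a neural network.

nn.Sequential is a sequential container: modules are added to it in the order in which they are passed to the constructor.

The layers of a network defined with nn.Sequential() are chained in the order they are defined, so each layer's output must match the next layer's input. nn.Sequential implements the forward method.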

import torch
import torch.nn as nn
 
class testNet(nn.Module):
    def __init__(self):
        super(testNet, self).__init__()
        self.combine = nn.Sequential(
            nn.Linear(100,50),
            nn.Linear(50,25),
        ) 
    
    def forward(self, x):
        x = self.combine(x)
 
        return x
 
testnet = testNet()
input_x = torch.ones(100)
output_x = testnet(input_x) 
print(output_x)

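torch.nn.Linear()

nn.Linear(in_features, out_features) defines a fully connected layer in a network.

The layer holds a weight matrix W, stored with shape [out_features, in_features]; the layer computes y = xW^T + b.

The input and output of a fully connected layer are usually 2-D tensors of shape [batch_size, size], unlike convolutional layers, which expect 4-D inputs and outputs.

torch.nn.Linear(in_features, out_features, bias=True)

- in_features: the size of each input sample, i.e. size in the input [batch_size, size].

- out_features: the size of each output sample, i.e. the output is a 2-D tensor of shape [batch_size, output_size], where output_size is the number of neurons in the layer.

In terms of inputs and outputs, the layer transforms a [batch_size, in_features] tensor into a [batch_size, out_features] tensor.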

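import torch
import torch.nn as nn

connected_layer = nn.Linear(in_features=64*64*3, out_features=1)
# The input image has shape [64, 64, 3]
input = torch.randn(1, 64, 64, 3)
# Flatten the 4-D tensor to 2-D before feeding it to the fully connected layer
input = input.view(1, 64*64*3)
output = connected_layer(input) # call the fully connected layer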
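
As a quick check of how nn.Linear stores its weight, the following sketch (sizes 5 and 3 are arbitrary) compares the layer's output with the explicit formula x W^T + b:

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=5, out_features=3)
x = torch.randn(2, 5)                  # [batch_size, in_features]

y = layer(x)                           # [batch_size, out_features]
manual = x @ layer.weight.t() + layer.bias

# The weight is stored as [out_features, in_features].
print(layer.weight.shape, y.shape)
```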

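torch normalization layers

torch.nn.LayerNorm() layer normalization

torch.nn.LayerNorm(normalized_shape, eps = 1e-5, elementwise_affine = True, device=None, dtype=None)

normalized_shape: an int, a list, or a torch.Size, e.g. torch.Size([3, 4])
eps: float, added to the denominator when normalizing the input, to avoid division by zero
elementwise_affine: bool, whether the layer has a learnable per-element scale and shift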
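
A small sketch of what LayerNorm does with normalized_shape=4 (the input shape is an arbitrary assumption): each slice along the last dimension ends up with mean close to 0 and variance close to 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 3, 4)

ln = nn.LayerNorm(4)        # normalize over the last dimension (size 4)
y = ln(x)

mean = y.mean(dim=-1)                  # close to 0 for every slice
var = y.var(dim=-1, unbiased=False)    # close to 1 for every slice
print(mean, var)
```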

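torch.nn.BatchNorm1d() batch normalization

torch.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)
Normalizes along the num_features dimension, which helps against vanishing gradients. If the input tensor has shape (N, C, L), num_features is C; if the input has shape (N, L), num_features is L. Here N is the batch size, C the number of channels, and L the feature dimension (data length).
eps: float, added to the denominator for numerical stability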
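
The two accepted input layouts can be tried with one layer. This is a sketch with arbitrary sizes; in training mode the per-feature batch mean of the output is close to 0.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(num_features=3)   # module starts in training mode

x2 = torch.randn(8, 3)       # 2-D input: second dimension is num_features
x3 = torch.randn(8, 3, 5)    # 3-D input (N, C, L): C is num_features

y2 = bn(x2)
y3 = bn(x3)

mean2 = y2.mean(dim=0)         # per-feature mean over the batch
mean3 = y3.mean(dim=(0, 2))    # per-channel mean over batch and length
print(mean2, mean3)
```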

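torch.nn.BatchNorm2d() batch normalization

torch.nn.BatchNorm3d() batch normalization

torch.nn.Dropout(p=0.5, inplace=False)

Guards against overfitting by zeroing each element of the input tensor with probability p during training.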

torch.nn.Dropout(p=0.5, inplace=False)

p (float) - probability of an element to be zeroed. Default 0.5
inplace (bool) - If set to True, will do this operation in-place. Default: False

m = nn.Dropout(p=0.2)
input = torch.randn(20, 16)
output = m(input)

torch.nn.functional package

torch.nn.functional.conv1d() 1-D convolution

torch.nn.functional.conv1d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor

input – input tensor of shape (minibatch, in_channels, iW)

weight – filters of shape (out_channels, in_channels / groups, kW)

bias – optional bias of shape (out_channels). Default: None

stride – the stride of the convolving kernel. Can be a single number or a one-element tuple (sW,). Default: 1

>>> inputs = torch.randn(33, 16, 30)
>>> filters = torch.randn(20, 16, 5)
>>> F.conv1d(inputs, filters)

torch.nn.functional.conv2d() 2-D convolution

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor

input – input tensor of shape (minibatch, in_channels, iH, iW)

weight – filters of shape (out_channels, in_channels / groups, kH, kW)

bias – optional bias tensor of shape (out_channels). Default: None

stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1

padding – implicit paddings on both sides of the input. Can be a single number or a tuple (padH, padW). Default: 0

>>> # With square kernels and equal stride
>>> filters = torch.randn(8, 4, 3, 3)
>>> inputs = torch.randn(1, 4, 5, 5)
>>> F.conv2d(inputs, filters, padding=1)

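torch.nn.Conv1d() 1-D convolution

class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

in_channels (int) – number of channels in the input signal. In text classification this is the word-embedding dimension
out_channels (int) – number of channels produced by the convolution. There are as many 1-D filters as out_channels
kernel_size (int or tuple) – size of the convolution kernel. The kernel has size (k,); its second dimension is determined by in_channels, so each filter actually has size kernel_size * in_channels
stride (int or tuple, optional) – stride of the convolution
padding (int or tuple, optional) – number of layers of zeros added to each side of the input
dilation (int or tuple, optional) – spacing between kernel elements
groups (int, optional) – number of blocked connections from input channels to output channels
bias (bool, optional) – if bias=True, a learnable bias is added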

conv1 = nn.Conv1d(in_channels=256, out_channels=100, kernel_size=2)
input = torch.randn(32,35,256)
# batch_size x text_len x embedding_size -> batch_size x embedding_size x text_len
input = input.permute(0,2,1)
out = conv1(input)
print(out.size())  # torch.Size([32, 100, 34])

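torch.nn.Conv2d() 2-D convolution

Applies a 2-D convolution over an input signal composed of several input planes.

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, ...)
- in_channels: number of channels in the input image
- out_channels: number of channels produced by the convolution
- kernel_size (int or tuple): size of the convolution kernel, e.g. 1 or (2, 3)
- stride: stride, default 1
- padding: padding, default 0.

x = torch.randn(3,1,5,4)  # (batch_size, channels, height, width): 3 samples per batch; 1 channel, i.e. the depth of the current layer; image height 5; image width 4
conv = torch.nn.Conv2d(1, 4, (2,3))  # in_channels=1; out_channels=4, i.e. 4 filters; kernel size (2, 3)
res = conv(x)  # res.shape is (3, 4, 4, 2)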

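torch.nn.functional.dropout()

Randomly sets the values of the input tensor to 0 with probability p.

A trick proposed by Hinton to prevent overfitting during training.

torch.nn.functional.dropout(input, p=0.5, training=True, inplace=False)

Difference between torch.nn.Dropout and torch.nn.functional.dropout

nn.Dropout must first be defined as a layer in __init__() before it can be used;

F.dropout, by contrast, can be called directly as a function inside forward().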

import torch
import torch.nn as nn

class Model1(nn.Module):
    # Model 1 using functional dropout
    def __init__(self, p=0.0):
        super().__init__()
        self.p = p

    def forward(self, inputs):
        # pass self.training so that model.eval() switches dropout off
        return nn.functional.dropout(inputs, p=self.p, training=self.training)

class Model2(nn.Module):
    # Model 2 using dropout module
    def __init__(self, p=0.0):
        super().__init__()
        self.drop_layer = nn.Dropout(p=p)

    def forward(self, inputs):
        return self.drop_layer(inputs)
model1 = Model1(p=0.5) # functional dropout 
model2 = Model2(p=0.5) # dropout module
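————————————————
Copyright notice: this is an original article by the CSDN blogger 「图灵的喵」, licensed under CC 4.0 BY-SA; when reposting, please include the link to the original and this notice.
Original link: https://blog.csdn.net/r1254/article/details/91867736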

torch.nn.functional.log_softmax()

Simply put, log_softmax takes the log of softmax.

import torch.nn.functional as F

F.log_softmax(self.linear(x), dim=-1)

Note that log_softmax is faster and numerically more stable than applying log to softmax as two separate operations.

torch.nn.init

Weight initialization methods

xavier_uniform_(): uniform distribution
xavier_normal_(): normal distribution
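
A brief sketch of using one of these initializers on a layer's weight; the layer sizes are arbitrary. With the default gain of 1, xavier_uniform_ samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)):

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(100, 50)

# The trailing underscore means the tensor is modified in place.
nn.init.xavier_uniform_(layer.weight)

bound = math.sqrt(6.0 / (100 + 50))          # 0.2 for these sizes
max_abs = layer.weight.abs().max().item()    # never exceeds the bound
print(bound, max_abs)
```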

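torch.nn activation functions

torch.nn.Sigmoid()

Keeping tensor values in a sensible range helps ensure the data never contains nan or inf.

If no 0 appears, the next step will not produce inf
If no inf appears, the next step will not produce nan

For comparison: Tanh has range (-1, 1), ReLU has range [0, +∞), LeakyReLU has range R, and ELU has range (-α, +∞).

nn.Sigmoid maps the real line R to (0, 1), so zeros in the input tensor are handled sensibly (sigmoid(0) = 0.5).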

m = nn.Sigmoid()
input = torch.randn(2)
output = m(input)

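torch.nn.Softmax()

# For a 2-D input: dim=0 applies softmax down each column; dim=1 applies softmax along each row

torch.nn.functional.softmax is a plain function and does not need to be set up in the model's __init__(); the module form nn.Softmax is created there, as in the following fragment.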

class NCECriterion(nn.Module):
    def __init__(self, nce_m, eps):
        super(NCECriterion, self).__init__()
        self.nce_m = nce_m
        self.eps = eps
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=0)

    def forward(self, x, labels):
        pred_prob = self.softmax(x)
        # (the rest of forward is not shown in the source)

torch.nn.ReLU(inplace=False)

ReLU(x)=max(0,x)

  >>> m = nn.ReLU()
  >>> input = torch.randn(2)
  >>> output = m(input)

torch.nn.LeakyReLU(negative_slope=0.01, inplace=False)

m = nn.LeakyReLU(0.1)
input = torch.randn(2)
output = m(input)

torch.nn.Softplus(beta=1, threshold=20)

Softplus is a smooth approximation of the ReLU function and can be used to constrain an output to always be positive.

m = nn.Softplus()
input = torch.randn(2)
output = m(input)

torch.optim optimizers

Adam()

optimizer.step() method

optimizer.step() updates all the parameters registered with the optimizer.

scheduler.step() is what PyTorch uses to update the optimizer's learning rate; it is usually called once per epoch.

import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=args.lr, weight_decay=1e-4)
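
To show where optimizer.step() and scheduler.step() sit in a training loop, here is a minimal sketch. The toy data, the SGD optimizer, and the StepLR schedule are all assumptions made for the example, not the author's setup:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate after every epoch.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

x = torch.tensor([[1.0], [2.0], [3.0]])
y = 2 * x

for epoch in range(3):
    optimizer.zero_grad()                         # clear old gradients
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()                               # compute gradients
    optimizer.step()                              # update the parameters
    scheduler.step()                              # update the learning rate

lr_after = optimizer.param_groups[0]["lr"]        # 0.1 * 0.5 ** 3
print(lr_after)
```

optimizer.step() is called before scheduler.step(), which is the order PyTorch expects.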

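Creating tensors in torch

requires_grad

The requires_grad attribute of a tensor

If set to True, all operations on the tensor are tracked. Once the computation is finished, calling backward() computes all gradients automatically, i.e. the tensor is differentiated automatically, and every tensor computed from it also has requires_grad=True.
The default is False, in which case no gradient is computed for it during backpropagation, which saves GPU memory.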
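
A minimal autograd sketch of the above, using y = sum(x^2), whose gradient with respect to x is 2x (the values are arbitrary):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()     # y is computed from x, so y.requires_grad is True
y.backward()           # fills x.grad with dy/dx = 2x

print(y.requires_grad, x.grad)
```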

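torch.tensor()

torch.tensor() creates a tensor.

Note that torch.tensor takes its data as a (nested) list: a 2-D tensor needs two levels of brackets, [[...], [...]].

import numpy as np
import torch

arr = np.ones((3, 3))
t = torch.tensor(arr)

# or create an empty tensor directly (torch.tensor() needs a data argument)
t = torch.tensor([])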

torch.Tensor

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

Tensor types are defined for both the CPU and the GPU; seven of each are listed below. The default torch.Tensor is torch.FloatTensor.

CPU tensor types: torch.FloatTensor, torch.DoubleTensor, torch.ByteTensor, torch.CharTensor, torch.ShortTensor, torch.IntTensor, torch.LongTensor
GPU tensor types: torch.cuda.FloatTensor, torch.cuda.DoubleTensor, torch.cuda.ByteTensor, torch.cuda.CharTensor, torch.cuda.ShortTensor, torch.cuda.IntTensor, torch.cuda.LongTensor

A tensor can be built from a list or sequence

>>> torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
1 2 3
4 5 6
[torch.FloatTensor of size 2x3]

A tensor can also be created from a size

>>> torch.IntTensor(2, 4).zero_()
0 0 0 0
0 0 0 0
[torch.IntTensor of size 2x4]

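torch.from_numpy()

torch.from_numpy() converts a numpy array to a tensor. The two share memory, so modifying the tensor also changes the original array.

import torch

x = torch.Tensor(2, 3)  # create an (uninitialised) 2x3 Tensor

# convert the Tensor to a numpy array
y = x.numpy()

# convert the numpy array to a Tensor
z = torch.from_numpy(y)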

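torch.randn() and torch.rand()

torch.randn(*sizes, out=None) returns a tensor of random numbers drawn from the standard normal distribution (mean 0, variance 1, i.e. Gaussian white noise).
torch.rand(*sizes, out=None) returns a tensor of random numbers drawn from the uniform distribution on [0, 1). The shape of the tensor is given by the sizes arguments.

a=torch.randn(3)  # a 1-D tensor
b=torch.randn(1,3)  # a 2-D tensor
print(a)
print(b)
torch.mean(a)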

torch.randperm(n)

torch.randperm(n) returns a tensor containing a random permutation of the integers 0 to n-1

torch.randperm(4)

-----------------------------
tensor([3, 1, 2, 0])
torch.Tensor

torch.ones(*sizes, out=None)

torch.ones() returns a tensor filled with 1, with the shape given by the variable arguments sizes.

 torch.ones(2, 3)
 
 1  1  1
 1  1  1

torch.zeros(*sizes, out=None)

torch.zeros() returns a tensor filled with the scalar 0, with the shape given by the variable arguments sizes.

torch.zeros(2, 3)
 
 0  0  0
 0  0  0

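Tensor.zero_()

Fills the tensor with zeros in place. Commonly used to reset gradients during weight updates.

torch.eye(n)

Creates a 2-D tensor with ones on the diagonal and zeros everywhere else.

torch.normal()

torch.normal(mean, std, *, generator=None, out=None)

Returns a tensor of random numbers drawn from separate normal distributions whose means and standard deviations are given by mean and std.

It can be used to simulate "noise" in real data.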

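torch tensor calculation

NOTE: indexing works like in numpy. The result of a (basic) indexing operation shares memory with the original data, so modifying one modifies the other. If that is not wanted, make a copy first, e.g. with clone().

Addition

torch.add(input, other, *, alpha=1, out=None)

torch.add() multiplies every element of the tensor other by the scalar alpha, adds the result to the input tensor input, and returns a new tensor, i.e. out = input + alpha * other.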
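
A short worked instance of out = input + alpha * other, with arbitrary values:

```python
import torch

a = torch.tensor([1.0, 2.0])
b = torch.tensor([10.0, 20.0])

# 1 + 2*10 = 21, 2 + 2*20 = 42
out = torch.add(a, b, alpha=2)
print(out)
```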

torch.sum()

torch.sum(input, dim) sums the input tensor along one dimension

input: the input tensor
dim: the dimension to sum over. E.g. for a tensor of shape (m, n), dim=0 sums over the first dimension and gives shape (n,); likewise dim=1 gives shape (m,). Pass keepdim=True to keep the summed dimension with size 1.

a = torch.ones((2, 3))
print(a)
# tensor([[1., 1., 1.],
#         [1., 1., 1.]])

a1 =  torch.sum(a)
a2 =  torch.sum(a, dim=0)  # sum over dimension 0
a3 =  torch.sum(a, dim=1)  # sum over dimension 1

tensor(6.)
tensor([2., 2., 2.])
tensor([3., 3.])

Subtraction

torch.sub()

torch.sub(input, other, *, alpha=1, out=None) -> Tensor.

Subtracts other, scaled by alpha, from input, i.e. out = input - alpha * other.

The method form is tensor.sub(other), e.g. parameter.data.sub(other)

>>> a = torch.tensor((1, 2))
>>> b = torch.tensor((0, 1))
>>> c = torch.sub(a, b, alpha=2)
>>> print(c)
tensor([1, 0]) 
'''
From the formula: c1 = a1 - alpha*b1 = 1 - 2*0 = 1; c2 = a2 - alpha*b2 = 2 - 2*1 = 0
'''

Tensor.sub_()

The in-place form of torch.sub(): it overwrites the original data and returns the same tensor

>>> a = torch.tensor((1, 2))
>>> b = torch.tensor((0, 1))
>>> a.sub_(b, alpha=2)
>>> print(a)
tensor([1, 0]) 
'''
Calling sub_() as a method on a overwrites a's previous value with the result of torch.sub(a, b, alpha=2)
'''

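Multiplication

torch.mm() and torch.mul()

torch.mul(a, b) multiplies a and b element by element; e.g. if a has shape (1,2) and b has shape (1,2), the result again has shape (1,2).
torch.mm(a, b) is the matrix product of a and b.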
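
The two products side by side on the same pair of 2x2 matrices (values chosen arbitrarily for this sketch):

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

elementwise = torch.mul(a, b)   # [[1*5, 2*6], [3*7, 4*8]]
matrix = torch.mm(a, b)         # rows of a times columns of b

print(elementwise)
print(matrix)
```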

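torch.matmul()

torch.matmul() covers almost every case of matrix/vector multiplication; the rule it applies depends on the dimensions of the two tensors: 1d * 1d, 2d * 2d, 1d * 2d, 2d * 1d, and 3d and above.

If both tensors are one-dimensional, i.e. torch.Size([n]), the result is the dot product of the two vectors, the same as torch.dot(); the two vectors must have the same number of elements.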

>>> vec1 = torch.tensor([1, 2, 3])
>>> vec2 = torch.tensor([2, 3, 4])
>>> torch.matmul(vec1, vec2)
tensor(20)
>>> torch.dot(vec1, vec2)
tensor(20)

# The two 1-D tensors must have the same number of elements!
>>> vec1 = torch.tensor([1, 2, 3])
>>> vec2 = torch.tensor([2, 3, 4, 5])
>>> torch.matmul(vec1, vec2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: inconsistent tensor size, expected tensor [3] and src [4] to have the same number of elements, but got 3 and 4 elements respectively

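If both arguments are 2-D tensors, the result is the matrix product, the same as torch.mm(); the shapes must satisfy the condition for matrix multiplication, i.e. (n×m) * (m×p) = (n×p)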

>>> arg1 = torch.tensor([[1, 2], [3, 4]])
>>> arg1
tensor([[1, 2],
        [3, 4]])
>>> arg2 = torch.tensor([[-1], [2]])
>>> arg2
tensor([[-1],
        [ 2]])
>>> torch.matmul(arg1, arg2)
tensor([[3],
        [5]])
>>> torch.mm(arg1, arg2)
tensor([[3],
        [5]])

>>> arg2 = torch.tensor([[-1], [2], [1]])
>>> torch.matmul(arg1, arg2)					# the shapes must be compatible for matrix multiplication
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x2 and 3x1)

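If the first argument is a 1-D vector and the second a 2-D tensor, a dimension is prepended to the vector, the matrix multiplication is carried out, and the added dimension is removed afterwards.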

>>> arg1 = torch.tensor([-1, 2])
>>> arg2 = torch.tensor([[1, 2], [3, 4]])
>>> torch.matmul(arg1, arg2)
tensor([5, 6])

>>> arg1 = torch.unsqueeze(arg1, 0)			# prepend a dimension to the 1-D tensor
>>> arg1.shape
torch.Size([1, 2])
>>> ans = torch.mm(arg1, arg2)				# matrix multiplication
>>> ans
tensor([[5, 6]])
>>> ans = torch.squeeze(ans, 0)				# remove the added dimension
>>> ans
tensor([5, 6])

If the first argument is a 2-D tensor and the second a 1-D vector, the result is the matrix-vector product, the same as torch.mv().
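
The 2d * 1d case has no example in the text, so here is a small sketch with arbitrary values:

```python
import torch

m = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
v = torch.tensor([-1.0, 2.0])

# Row 0: 1*(-1) + 2*2 = 3; row 1: 3*(-1) + 4*2 = 5
via_matmul = torch.matmul(m, v)
via_mv = torch.mv(m, v)

print(via_matmul, via_mv)
```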

Element-wise tensor product: Hadamard product

>>> a = torch.Tensor([[1,2], [3,4], [5, 6]])
>>> a
tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
>>> a.mul(a)
tensor([[ 1.,  4.],
        [ 9., 16.],
        [25., 36.]])
 
# a*a is equivalent to a.mul(a)

torch.pow()

torch.pow() raises each element of the input to a power.

a = torch.randn(4)
a  # tensor([ 0.4331,  1.2475,  0.6834, -0.2791])
torch.pow(a, 2)  # tensor([ 0.1875,  1.5561,  0.4670,  0.0779])

exp = torch.arange(1., 5.)  # tensor([ 1.,  2.,  3.,  4.])
a = torch.arange(1., 5.)  # tensor([ 1.,  2.,  3.,  4.])
torch.pow(a, exp)  # tensor([   1.,    4.,   27.,  256.])

torch.exp()

torch.exp() -> Tensor. Returns a new tensor with the exponential of the elements of the input tensor.

torch.log()

torch.log() is the natural logarithm (base e)

h_1_log = torch.log(h_1)
h_2_log = torch.log(1 - h_2)

torch.bmm(X, Y)

Batch matrix multiplication: multiplying (n, a, b) by (n, b, c) gives (n, a, c)

X = torch.ones((2, 1, 4))
Y = torch.ones((2, 4, 6))

torch.bmm(X, Y).shape

-----------------------
torch.Size([2, 1, 6])

Division

torch.div()

>>> a = torch.randn(5)
>>> a
tensor([ 0.3810,  1.2774, -0.2972, -0.3719,  0.4637])
>>> torch.div(a, 0.5)
tensor([ 0.7620,  2.5548, -0.5944, -0.7439,  0.9275])

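torch.sqrt() square root

torch.sqrt(input, out=None) -> Tensor: returns the element-wise square root of the input tensor. Note that sqrt(x) and its derivative have different domains: [0, ∞) for the former and (0, ∞) for the latter.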

>>> a = torch.randn(4)
>>> a
tensor([-2.0755,  1.0226,  0.0831,  0.4806])
>>> torch.sqrt(a)
tensor([    nan,  1.0112,  0.2883,  0.6933])

torch.mean()

torch.mean() returns the mean of the input tensor, optionally along a given dimension.

import torch

x = torch.Tensor([1, 2, 3, 4, 5, 6]).view(2, 3)
y_0 = torch.mean(x, dim=0)
y_1 = torch.mean(x, dim=1)
print(x)
print(y_0)
print(y_1)
----------------------
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([2.5000, 3.5000, 4.5000])
tensor([2., 5.])

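advanced mechanisms

Broadcasting

When an element-wise operation is applied to two tensors of different shapes, broadcasting may be triggered: elements are replicated as needed so that the two tensors have the same shape, and the operation is then applied element-wise.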

x = torch.arange(1, 3).view(1, 2)
print(x)
y = torch.arange(1, 4).view(3, 1)
print(y)
print(x + y)
----------------------------------
tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])

torch tensor operation

torch tensor type conversion

Converting a tensor to long, float, int

import torch

tensor = torch.randn(3, 5)
print(tensor)

# tensor.long() casts the tensor to long
long_tensor = tensor.long()
print(long_tensor)

# tensor.int() casts the tensor to int
int_tensor = tensor.int()
print(int_tensor)

# tensor.float() casts the tensor to float
float_tensor = tensor.float()
print(float_tensor)

Converting between tensors and numpy arrays

a = torch.ones(5)

# tensor -> numpy array
b = a.numpy()
print(b)

# numpy array -> tensor
a = torch.from_numpy(b)

torch.logical_not(): negating a bool tensor

from itertools import combinations

# `labels` is assumed to be a 1-D tensor; the fragment runs once per label value
for label in labels.unique():
    label_mask = (labels == label)  # bool tensor, (100,), e.g. [True, False, True, True]
    label_indices = torch.where(label_mask)[0]  # indices with this label, e.g. tensor([0, 2, 3])
    if len(label_indices) < 2:
        continue
    negative_indices = torch.where(torch.logical_not(label_mask))[0]  # indices with any other label, e.g. tensor([1])
    anchor_pos_list = list(combinations(label_indices.tolist(), 2))  # all pairs of same-label indices, e.g. [(23, 66), (23, 79), (66, 79)]
    anchor_pos_list = torch.tensor(anchor_pos_list)  # convert to a tensor so it can be sliced, shape (3, 2)

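torch tensor shape manipulation

tensor.view()

tensor.view(a, b) reshapes the tensor to a rows and b columns, similar to reshape() in numpy.

A dimension of -1 means that its size is inferred from the other dimensions.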

x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8) # -1 means this size is inferred from the other dimensions
print(x.size(), y.size(), z.size())
---------------------------------------
torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])

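Note: the tensor returned by view() shares memory with the source tensor; changing one changes the other.

To keep the source tensor and the transformed tensor independent (no shared memory), one might reach for torch.reshape(), but that function does not guarantee that it returns a copy.

The recommended approach is to create a copy with clone() first and then change its shape with view().

Another benefit of clone() is that it is recorded in the computation graph, i.e. gradients flowing back to the copy also reach the source tensor.
If a tensor has only one element, .item() returns that value as a Python number.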
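
The memory-sharing behaviour can be verified with a few lines. This is only a sketch; the tensor names are illustrative:

```python
import torch

x = torch.zeros(2, 3)

# view() shares storage: writing through v changes x.
v = x.view(6)
v[0] = 1.0

# clone() first, then view(): writing through c leaves x alone.
c = x.clone().view(6)
c[1] = 5.0

# item() extracts the value of a one-element tensor as a Python number.
s = torch.tensor([3.5]).item()

print(x, s)
```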

torch.sort()

torch.sort(input, dim=-1, descending=False, stable=False, *, out=None)

input (Tensor) – the input tensor
Similar in form to numpy.ndarray
dim (int, optional) – the dimension to sort along
For 2-D data: dim=0 sorts within each column, dim=1 sorts within each row; the default is dim=-1 (the last dimension)
descending (bool, optional) – controls the sorting order (ascending or descending)
descending=True sorts from largest to smallest, descending=False from smallest to largest; the default is descending=False
A tuple of (sorted_tensor, sorted_indices) is returned,
where the sorted_indices are the indices of the elements in the original input tensor.

import torch
x = torch.randn(3,4)
x  # initial values, never modified
tensor([[-0.9950, -0.6175, -0.1253,  1.3536],
        [ 0.1208, -0.4237, -1.1313,  0.9022],
        [-1.1995, -0.0699, -0.4396,  0.8043]])
sorted, indices = torch.sort(x)  # sort each row in ascending order
sorted
tensor([[-0.9950, -0.6175, -0.1253,  1.3536],
        [-1.1313, -0.4237,  0.1208,  0.9022],
        [-1.1995, -0.4396, -0.0699,  0.8043]])
indices
tensor([[0, 1, 2, 3],
        [2, 1, 0, 3],
        [0, 2, 1, 3]])

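torch.inverse()

torch.inverse(input, *, out=None) -> Tensor: returns the inverse of the input square matrix.

tensor.repeat()

repeat() gives the number of times to repeat the tensor along each dimension

loop = torch.tensor(node).repeat(neighbor_sum, 1)  # tile node neighbor_sum times along dim 0 and once along dim 1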
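
A self-contained version of the line above, with a hypothetical node tensor and a repeat count of 4 standing in for neighbor_sum:

```python
import torch

node = torch.tensor([1, 2, 3])

# A 1-D tensor of size 3 repeated (4, 1) times becomes a (4, 3) tensor
# whose rows are all copies of node.
loop = node.repeat(4, 1)

print(loop.shape)
print(loop)
```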

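torch.cat()

torch.cat() with dim=0 concatenates along the first dimension of tensor.shape, so the size of dimension 0 changes; dim=1 (the second dimension) or dim=-1 (the last dimension) concatenates along that dimension instead. E.g. for shapes (3,2) and (3,2): dim=0 -> (6,2); dim=1 -> (3,4).

>>> import torch
>>> A=torch.ones(2,3)    # a 2x3 tensor (matrix)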
>>> A
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
>>> B=2*torch.ones(4,3)  # a 4x3 tensor (matrix)
>>> B
tensor([[ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.]])
>>> C=torch.cat((A,B),0)  # concatenate along dimension 0 (rows)
>>> C
tensor([[ 1.,  1.,  1.],
         [ 1.,  1.,  1.],
         [ 2.,  2.,  2.],
         [ 2.,  2.,  2.],
         [ 2.,  2.,  2.],
         [ 2.,  2.,  2.]])
>>> C.size()
torch.Size([6, 3])
>>> D=2*torch.ones(2,4) # a 2x4 tensor (matrix)
>>> C=torch.cat((A,D),1)# concatenate along dimension 1 (columns)
>>> C
tensor([[ 1.,  1.,  1.,  2.,  2.,  2.,  2.],
        [ 1.,  1.,  1.,  2.,  2.,  2.,  2.]])
>>> C.size()
torch.Size([2, 7])

torch tensor transpose

torch.t()

torch.t() transposes a 2-D matrix

>>> import torch
>>> x = torch.Tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
>>> x
 
  1   2   3   4   5
  6   7   8   9  10
[torch.FloatTensor of size 2x5]

>>> x.t()
 
  1   6
  2   7
  3   8
  4   9
  5  10
[torch.FloatTensor of size 5x2]

torch.transpose()

Transposes a tensor in PyTorch.

transpose() can only swap two dimensions at a time.

torch.transpose(input, dim0, dim1) -> Tensor

input (Tensor) - the input tensor, required
dim0 (int) - the first dimension to be swapped, required
dim1 (int) - the second dimension to be swapped, required

>>> x = torch.randn(2, 3)
>>> x
tensor([[ 1.0028, -0.9893,  0.5809],
        [-0.1669,  0.7299,  0.4942]])
>>> torch.transpose(x, 0, 1)
tensor([[ 1.0028, -0.1669],
        [-0.9893,  0.7299],
        [ 0.5809,  0.4942]])

torch.permute()

torch's general dimension-reordering function; x.permute(0, 2, 1) corresponds to tf.transpose(tensor, [0, 2, 1])

torch.permute(input, dims) -> Tensor

input (tensor) - the input tensor
dims (tuple of python: int) - the desired ordering of dimensions
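
A quick sketch of permute on a 3-D tensor (the shape is arbitrary); swapping only the last two dimensions gives the same result as transpose(1, 2):

```python
import torch

x = torch.randn(2, 3, 4)

# New dimension order: keep dim 0, then old dim 2, then old dim 1.
y = x.permute(0, 2, 1)

same_as_transpose = torch.equal(y, x.transpose(1, 2))
print(y.shape, same_as_transpose)
```

Like view(), permute() returns a view of the same data rather than a copy.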

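torch tensor squeeze functions

torch.squeeze() and torch.unsqueeze()

torch.squeeze(input, dim) compresses the shape of the data: it removes dimension dim if its size is 1 (with no dim given, every dimension of size 1 is removed)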

a=torch.randn(1,1,3)
print(a.shape)
b=torch.squeeze(a)
print(b.shape)
c=torch.squeeze(a,0)
print(c.shape)
---------------
torch.Size([1, 1, 3])
torch.Size([3])
torch.Size([1, 3])

torch.unsqueeze(input, dim) expands the shape of the data: it inserts a dimension of size 1 at position dim.

a=torch.randn(1,3)
print(a.shape)
b=torch.unsqueeze(a,0)
print(b.shape)
--------------------
torch.Size([1, 3])
torch.Size([1, 1, 3])

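tensor.size()

tensor.size() returns the shape of the tensor

tensor.size(0) returns the size of dimension 0

torch.numel()

torch.numel() returns the number of elements in a tensor, as an int.

torch.eq()

torch.eq() returns a Boolean tensor.

It compares two tensors element by element: where the two elements at the same position are equal the result is True, otherwise False.

torch rounding functions

torch.floor(): rounds down

torch.ceil(): rounds up

Tensor.floor_(), Tensor.ceil_(): the in-place versions

torch.round(): rounds to the nearest integer (halves go to the nearest even integer)

torch.trunc(): keeps the integer part

torch.frac(): keeps the fractional part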
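
The differences between these functions show up on negative numbers and on halves. A small sketch with hand-picked values:

```python
import torch

x = torch.tensor([-1.5, -0.5, 0.5, 1.5, 2.7])

floor = torch.floor(x)    # toward negative infinity
ceil = torch.ceil(x)      # toward positive infinity
rounded = torch.round(x)  # nearest integer, halves to the nearest even integer
trunc = torch.trunc(x)    # toward zero
frac = torch.frac(x)      # x - trunc(x), keeps the sign of x

print(floor, ceil, rounded, trunc, frac)
```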

Creating sparse matrices (SparseTensor) in torch

torch.sparse_coo_tensor()

References: "pytorch daily study 13 (torch.sparse_coo_tensor()): creating sparse matrices" (Fluid_ray's blog, CSDN); "[PyTorch series 5]: PyTorch basics - sparse matrices and how to create them" (51CTO blog)

import torch

indices = torch.tensor([[4, 2, 1], [2, 0, 2]])
values = torch.tensor([3, 4, 5], dtype=torch.float32)
x = torch.sparse_coo_tensor(indices=indices, values=values, size=[5, 5])
x

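b = x.to_dense()  # convert the COO sparse matrix to a dense matrix

torch.sparse.FloatTensor(i, v, shape)

COO (coordinate format) is the simplest sparse matrix format. It stores a sparse matrix as triples (not the triples of a knowledge graph; just a way of representing a sparse matrix).

torch.sparse.FloatTensor defines a COO sparse matrix: i is the index tensor, v the values, and shape the size. It is a legacy constructor; torch.sparse_coo_tensor() above is the current interface.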

torch SparseTensor operations

SparseTensor multiplication

torch.sparse.mm(mat1: sparse tensor, mat2: torch.Tensor)

Sparse matrix multiplication; returns a torch.Tensor.

Note: the first matrix is a sparse tensor, the second a dense Tensor.
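
A minimal sketch of a sparse-times-dense product; the indices, values, and shapes are made up for the example. The result is compared against the ordinary dense product:

```python
import torch

# Sparse 2x3 matrix [[0, 0, 3], [4, 0, 5]] in COO form.
indices = torch.tensor([[0, 1, 1], [2, 0, 2]])   # row indices, column indices
values = torch.tensor([3.0, 4.0, 5.0])
sp = torch.sparse_coo_tensor(indices, values, size=(2, 3))

dense = torch.ones(3, 2)

out = torch.sparse.mm(sp, dense)      # [[3, 3], [9, 9]]
reference = sp.to_dense() @ dense     # same product computed densely

print(out)
```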

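torch.utils

utils is short for utility, i.e. "tools".

torch.utils.data.Dataset

The class that represents a custom dataset. Users define their own dataset class by subclassing it, and a subclass is required to override the two magic methods __len__() and __getitem__().

__len__(): returns the size of the dataset
__getitem__(): returns the item of the dataset at a given index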
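
A minimal custom dataset might look like the following sketch. The class name SquaresDataset and its contents are hypothetical, chosen only to show the two required methods and how a DataLoader consumes them:

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Hypothetical dataset of (x, x^2) pairs for x = 0 .. n-1."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        # Size of the dataset.
        return len(self.x)

    def __getitem__(self, idx):
        # One sample: the input and its target.
        return self.x[idx], self.x[idx] ** 2


ds = SquaresDataset(10)
loader = DataLoader(ds, batch_size=4, shuffle=False)

# 10 samples in batches of 4 give batch sizes 4, 4, 2.
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(len(ds), ds[3], batch_sizes)
```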

torch_scatter package

from torch_scatter import

torch_sparse package

torch-sparse: ERROR: Failed building wheel for torch-sparse

"Building wheel for torch-sparse (setup.py) ... error" when installing torch_sparse via pip3 · Issue #5015 · pyg-team/pytorch_geometric · GitHub

pip install torch-sparse==0.6.13 -f https://pytorch-geometric.com/whl/torch-1.10.0+cu113.html

torch_sparse.SparseTensor

Torch Error

TypeError: cannot unpack non-iterable NoneType object

Cause: a single None was assigned to several variables at once, typically because a function returned None where a tuple was expected.
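
The error is easy to reproduce in plain Python. In this sketch, split_pair is a hypothetical helper whose fall-through path implicitly returns None:

```python
def split_pair(values):
    """Hypothetical helper: returns a pair only when given exactly two items."""
    if len(values) == 2:
        return values[0], values[1]
    # Falls through here for any other length and implicitly returns None.


try:
    a, b = split_pair([1, 2, 3])   # unpacking None raises TypeError
except TypeError as exc:
    message = str(exc)

print(message)
```

The usual fix is to make every code path of the function return a value of the expected shape, or to check for None before unpacking.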

Error: the loss becomes NaN or INF (infinite)

See: "Reasons why the loss is NaN or infinite (INF) when training a PyTorch model" (ytusdc's blog, CSDN)

Source: https://blog.csdn.net/qq_33419476/article/details/127510347