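I have a wrapper around a Hugging Face model. Inside this wrapper I have several encoders, which are mostly a series of embeddings. In the forward method of the wrapped model, I want to call forward on each encoder in a loop, but I get this error: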
Traceback (most recent call last):
File "/home/pouramini/mt5-comet/comet/train/train.py", line 1275, in <module>
run()
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/click/core.py", line 716, in __call__
return self.main(*args, **kwargs)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/click/core.py", line 696, in main
rv = self.invoke(ctx)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/click/core.py", line 1060, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/click/core.py", line 889, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/click/core.py", line 534, in invoke
return callback(*args, **kwargs)
File "/home/pouramini/mt5-comet/comet/train/train.py", line 1069, in train
result = wrapped_model(**batch)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pouramini/mt5-comet/comet/transformers_ptuning/ptuning_wrapper.py", line 135, in forward
prompt_embeds = encoder(prompt_input_ids,\
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pouramini/mt5-comet/comet/transformers_ptuning/ptuning_wrapper.py", line 238, in forward
return self.embedding(prompt_token_ids)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
return F.embedding(
File "/home/pouramini/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2043, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument index in method wrapper_index_select)

Here is the code that causes the error:
for encoder in self.prompt_encoders:
    #encoder = self.prompt_encoders[0]
    wlog.info("********** offset: %s, length: %s", encoder.id_offset, encoder.length)
    prompt_token_fn = encoder.get_prompt_token_fn()
    encoder_masks = prompt_token_fn(input_ids)
    wlog.info("Encoder masks: %s", encoder_masks)
    if encoder_masks.any():
        #find input ids for prompt tokens
        prompt_input_ids = input_ids[encoder_masks]
        wlog.info("Prompt Input ids: %s", prompt_input_ids)
        # call forwards on prompt encoder whose outputs are prompt embeddings
        prompt_embeds = encoder(prompt_input_ids,\
            prompt_ids).to(device=inputs_embeds.device)

However, if I use only the CPU as the device, the code runs. Also, with a single encoder the code runs on CUDA, but with multiple encoders it seems that every encoder has to be moved to the device, and I don't know how to do that.
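The traceback ends in the embedding lookup, which suggests the embedding weight and the index tensor are on different devices. A minimal diagnostic sketch for checking this is shown below. The helper `devices_of` and the two `nn.Embedding` stand-ins are hypothetical and not part of the original code; in the real wrapper one would pass each element of `self.prompt_encoders` instead.

```python
import torch
from torch import nn


def devices_of(module: nn.Module) -> set:
    """Return the device strings that hold this module's parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


# Hypothetical stand-ins for the prompt encoders in the question.
encoders = [nn.Embedding(10, 4), nn.Embedding(20, 4)]
input_ids = torch.tensor([1, 2, 3])

for i, encoder in enumerate(encoders):
    # If these two devices differ, the embedding lookup raises the RuntimeError above.
    print(f"encoder {i}: weights on {devices_of(encoder)}, input_ids on {input_ids.device}")
```

Logging this just before the failing call should show which encoder, if any, is still on the CPU.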

Posted on 2021-12-03 13:55:18
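Based on the comments, I added the following code before training:
wrapped_model.to(device=device)
for encoder in wrapped_model.prompt_encoders:
    encoder.to(device=device)

Interestingly, when there is a single encoder, or a list of encoders that contains only one encoder, I don't need to put it on the device explicitly, but for a list of several encoders it seems I must.
The reason may be that I was moving the single encoder to the device inside the forward function.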
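
One plausible explanation, which the post does not confirm, is that `prompt_encoders` is stored as a plain Python list. PyTorch does not register modules held in a plain list as submodules, so `wrapped_model.to(device)` would skip them, whereas `nn.ModuleList` registers them. The sketch below illustrates the difference; the two wrapper classes are hypothetical and only mimic the attribute name from the question.

```python
import torch
from torch import nn


class PlainListWrapper(nn.Module):
    """Hypothetical wrapper that keeps its encoders in a plain Python list."""

    def __init__(self):
        super().__init__()
        # Not registered as submodules: .to(), .parameters(), state_dict() ignore them.
        self.prompt_encoders = [nn.Embedding(10, 4), nn.Embedding(10, 4)]


class ModuleListWrapper(nn.Module):
    """Hypothetical wrapper that keeps its encoders in an nn.ModuleList."""

    def __init__(self):
        super().__init__()
        # Registered as submodules: .to() moves them together with the wrapper.
        self.prompt_encoders = nn.ModuleList([nn.Embedding(10, 4), nn.Embedding(10, 4)])


plain = PlainListWrapper()
registered = ModuleListWrapper()

# The plain list hides the encoder weights from the wrapper.
n_plain = len(list(plain.parameters()))
n_registered = len(list(registered.parameters()))
print(n_plain, n_registered)

# Falls back to the CPU when no GPU is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
registered.to(device)
encoder_devices = {p.device.type for p in registered.parameters()}
print(encoder_devices)
```

If that assumption holds for the real wrapper, switching to `nn.ModuleList` would make the explicit per-encoder loop unnecessary. It would also let an optimizer built from `wrapped_model.parameters()` see the encoder weights.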
https://stackoverflow.com/questions/70212272