Resolving the DynamicGRUV2 unsupported-operator error triggered by torch.nn.GRU on Ascend NPU

Prerequisites

  • Chip: Ascend 910B2
  • Driver: 25.0.RC1.1
  • CANN: 8.2.RC1
  • PyTorch: 2.6.0
  • PTA (torch_npu): 7.1.0
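The stack above can be sanity-checked at runtime before training. The helper below is a minimal sketch; the minimum versions it checks against are illustrative assumptions, not an official Ascend compatibility matrix.

```python
# Minimal runtime sanity check for the software stack listed above.
# The minimum versions here are illustrative, not an official matrix.

def version_tuple(v):
    """Turn a dotted version string like '2.6.0' into a comparable tuple.

    Non-numeric components such as 'RC1' compare as 0, which is good
    enough for a coarse minimum-version check.
    """
    return tuple(int(p) if p.isdigit() else 0 for p in v.split("."))

def check_min(name, installed, minimum):
    """Print and return whether `installed` satisfies `minimum`."""
    ok = version_tuple(installed) >= version_tuple(minimum)
    print(f"{name}: {installed} (>= {minimum}: {'OK' if ok else 'TOO OLD'})")
    return ok

# Versions from the environment used in this article
assert check_min("PyTorch", "2.6.0", "2.1.0")
assert check_min("torch_npu (PTA)", "7.1.0", "6.0.0")
```

In a real script you would read the installed versions from `torch.__version__` and `torch_npu.__version__` instead of hard-coding them.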

Symptom

The training code involves the torch.nn.GRU module, which triggers an unsupported-operator error for the DynamicGRUV2 operator, as shown below:

[ERROR] xxxx-xx-xx-xx:xx:xx (PID:xxxxxx, Device:0, RankID:-1) ERR00100 PTA call acl api failed.
EZ3002: [PID: xxxxxx] xxxx-xx-xx-xx:xx:xx Optype [DynamicGRUV2] of Ops kernel [AIcoreEngine] is unsupported. Reason: [tbe-custom]:op type DynamicGRUV2 is not found in this op store.[tbe-custom]:op type DynamicGRUV2 is not found in this op store.[Static shape check]:data type DT_FLOAT of input [x] is not supported. All supported data type and format of tensor input0.x is: Data Type: {DT_FLOAT16,DT_FLOAT16,DT_FLOAT16,DT_FLOAT16}Format:{FRACTAL_NZ,FRACTAL_NZ,FRACTAL_NZ,FRACTAL_NZ}.
        Possible Cause: The operator type is unsupported in the operator information library due to specification mismatch.
        Solution: Submit an issue to request for support at https://gitee.com/ascend, or remove this type of operators from your model.
        TraceBack (most recent call last):
        No supported Ops kernel and engine are found for [DynamicGRUV21], optype [DynamicGRUV2].
        Assert ((SelectEngine(node_ptr, exclude_engines, is_check_support_success, op_info)) == ge::SUCCESS) failed[FUNC:operator()][FILE:engine_place.cc][LINE:148]
        RunAllSubgraphs failed, graph=online.[FUNC:RunAllSubgraphs][FILE:engine_place.cc][LINE:122]
        build graph failed, graph id:0, ret:4294967295[FUNC:BuildModelWithGraphId][FILE:ge_generator.cc][LINE:1624]
        [Build][SingleOpModel]call ge interface generator.BuildSingleOpModel failed. ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
        [Build][Op]Fail to build op model[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]
        build op model failed, result = 500002[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]

The operator documentation on the official website also shows that the nn.GRU operator is not supported.

Solution

Add the DynamicGRUV2 operator to the binary-compilation blacklist, and make sure both the inputs and the outputs of the nn.GRU module are float16; the nn.GRU module then executes successfully:

import torch
import torch.nn as nn
import torch_npu

# Use pre-compiled binary kernels instead of online (JIT) compilation
torch_npu.npu.set_compile_mode(jit_compile=False)
# Exclude DynamicGRUV2 from the binary kernel path, where the op is missing
torch_npu.npu.set_option({"NPU_FUZZY_COMPILE_BLACKLIST": "DynamicGRUV2"})

device = "npu:0"
# DynamicGRUV2 only supports float16, so cast both the input and the module
x = torch.randn(101, 1, 16).to(device).to(torch.float16)  # (seq_len, batch, input_size)
gru = nn.GRU(16, 16, 2, batch_first=False).to(device).to(torch.float16)
out, _ = gru(x)
print(out)
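Since every caller of the GRU now has to remember the float16 cast, it can help to wrap the module so inputs are cast to the parameter dtype and outputs are cast back automatically. The `DtypeAdapter` class below is my own illustrative helper, not a torch_npu API; the demo runs on CPU with a float32 module and float64 input, which exercises the same casting pattern as the float16 NPU case (float16 GRU cannot run on CPU).

```python
import torch
import torch.nn as nn

class DtypeAdapter(nn.Module):
    """Cast inputs to the wrapped module's parameter dtype and cast
    outputs back to the caller's dtype. Illustrative helper for the
    float16-only DynamicGRUV2 case; not a torch_npu API."""

    def __init__(self, module):
        super().__init__()
        self.module = module
        # dtype of the wrapped module's parameters (float16 on the NPU)
        self.param_dtype = next(module.parameters()).dtype

    def forward(self, x):
        orig_dtype = x.dtype
        out = self.module(x.to(self.param_dtype))
        # nn.GRU returns a tuple (output, h_n); cast each tensor back
        if isinstance(out, tuple):
            return tuple(t.to(orig_dtype) for t in out)
        return out.to(orig_dtype)

# CPU demo: float32 GRU, float64 caller
gru = DtypeAdapter(nn.GRU(16, 16, 2))
x = torch.randn(5, 1, 16, dtype=torch.float64)
out, h = gru(x)
print(out.dtype)  # the caller gets float64 back
```

On the NPU you would wrap the float16 GRU the same way, so the rest of the model can keep working in float32 while only the GRU runs in float16.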