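Where to download the large-model files:

ModelScope community (魔搭社区)

You can collect all the download links in a shell script and run it in the background.

For example, with the following in g.sh: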

wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/configuration.json
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/vocab.json
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/tokenizer.json
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/tokenizer_config.json
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/README.md
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/generation_config.json
wget https://www.modelscope.cn/models/Qwen/Qwen2.5-72B-Instruct/resolve/master/LICENSE 

Run nohup ./g.sh > ./g.log & to download in the background.
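As a rough sketch of how such a link list could be generated rather than typed by hand: the helper below prints one URL per file name, following the resolve/master layout seen in g.sh. The function name model_urls and the file urls.txt are hypothetical, and the wget flags in the trailing comment (-c to resume a partial file, -i to read URLs from a file) are an alternative to g.sh, not what was used here.

```shell
# Print one download URL per file name, using the URL layout from g.sh.
# The body runs in a subshell, so the helper variables do not leak.
model_urls() (
    repo="$1"
    shift
    for name in "$@"; do
        printf 'https://www.modelscope.cn/models/%s/resolve/master/%s\n' "$repo" "$name"
    done
)

# Possible usage (not executed here; urls.txt is a hypothetical file name):
#   model_urls Qwen/Qwen2.5-72B-Instruct configuration.json vocab.json > urls.txt
#   nohup wget -c -i urls.txt > g.log 2>&1 &
```

Resuming with -c only helps if the server honors range requests, so the log still needs checking afterwards.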

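Once the download finishes, be sure to check g.log.

cat g.log | grep save should show a saved line for every file. The problem I ran into was an incomplete file, which led to baffling errors later on.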
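The manual grep can miss a single absent file in a long list. A minimal sketch of an automated comparison follows, assuming wget reports each finished file on a log line that contains the file name and the word "saved". The function name missing_downloads is hypothetical.

```shell
# List files that the download script requested but that have no "saved" line in the log.
# The body runs in a subshell, so the helper variables do not leak.
missing_downloads() (
    script="$1"
    log="$2"
    # Extract every URL from the script, one per line.
    grep -oE 'https?://[^[:space:]]+' "$script" | while read -r url; do
        name="${url##*/}"   # file name = last path component of the URL
        # Escape regex metacharacters so the name is matched literally.
        pattern=$(printf '%s' "$name" | sed 's/[][\.*^$+?(){}|]/\\&/g')
        # Require a non-filename character on both sides, so that config.json
        # is not satisfied by a line about generation_config.json.
        if ! grep -Eq "(^|[^[:alnum:]_.-])${pattern}([^[:alnum:]_.-]|\$).*saved" "$log"; then
            echo "$name"
        fi
    done
)

# Possible usage: missing_downloads g.sh g.log
```

Empty output means every requested file has a saved line. It says nothing about whether the saved file is the right size.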


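MindIE references:

Product version information - Release notes - MindIE 1.0.RC3 development documentation - Ascend community

MindIE Service Deployment — MindSpore master documentation

Using MindIE from the Huawei image repository:

Ascend image repository details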

Problems encountered:

Forum threads on the Huawei ARM NPU environment:

1. https://www.hiascend.com/forum/thread-02105166327838594152-1-1.html

2. https://www.hiascend.com/forum/thread-0290164564141262022-1-1.html

3. https://www.hiascend.com/forum/thread-0296165487498000083-1-1.html

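Errors at startup and how they were handled (resolved: the cause was an incompletely downloaded model file; the fix is described at the end):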

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/driver:$LD_LIBRARY_PATH
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
./bin/mindieservice_daemon: error while loading shared libraries: libsecurec.so: cannot open shared object file: No such file or directory

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# find / -name libsecurec.so
find: '/sys/kernel/slab/dentry/cgroup/dentry(9886462:user@0.service)': No such file or directory
/usr/local/Ascend/ascend-toolkit/latest/aarch64-linux/lib64/libsecurec.so
/usr/local/Ascend/ascend-toolkit/8.0.RC2/aarch64-linux/lib64/libsecurec.so
/usr/local/Ascend/driver/lib64/common/libsecurec.so
find: '/proc/749/map_files': Permission denied
find: '/proc/750/map_files': Permission denied
find: '/proc/760/map_files': Permission denied
find: '/proc/761/map_files': Permission denied
find: '/proc/1199/map_files': Permission denied
find: '/proc/1227/map_files': Permission denied
find: '/proc/1228/map_files': Permission denied
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/driver/bin/setenv.bash
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/ascend-toolkit/set_env.sh
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# echo $LD_LIBRARY_PATH
/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/driver:
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
./bin/mindieservice_daemon: error while loading shared libraries: libboost_thread.so.1.82.0: cannot open shared object file: No such file or directory
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# find ./ -name libboost_thread.so.1.82.0
./lib/libboost_thread.so.1.82.0
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# pwd
/usr/local/Ascend/mindie/latest/mindie-service
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ll
total 56
drwxr-x--- 1 root root 4096 Nov  7 10:54 ./
drwxr-x--- 1 root root 4096 Aug 31 00:59 ../
dr-xr-x--- 2 root root 4096 Aug 31 00:59 bin/
drwxr-x--- 1 root root 4096 Nov  7 10:54 conf/
drwxr-x--- 3 root root 4096 Aug 31 00:59 include/
drwxr-x--- 2 root root 4096 Aug 31 00:59 lib/
drwxr-x--- 1 root root 4096 Nov  6 16:45 logs/
drwx------ 3 root root 4096 Aug 31 00:59 scripts/
drwx------ 6 root root 4096 Aug 31 00:59 security/
-r-xr-x--- 1 root root 1654 Aug 31 00:59 set_env.sh*
-r--r----- 1 root root  100 Aug 31 00:59 version.info
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source set_env.sh 
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)
Python runtime state: finalizing (tstate=0x0000ffffb801d1b0)

Daemon wait pid with 2379, status 134Daemon is killing...
Killed
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# env | grep ASCEND_HOME_PATH
ASCEND_HOME_PATH=/usr/local/Ascend/ascend-toolkit/latest
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# env | grep ATB_SPEED_HOME_PATH
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# env | grep ATB_HOME_PATH
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)
Python runtime state: finalizing (tstate=0x0000ffffb001d1b0)

Daemon wait pid with 4336, status 134Daemon is killing...
Killed
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# echo $ATB_SPEED_HOME_PATH

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/ascend-toolkit/set_env.sh
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# echo $ATB_SPEED_HOME_PATH

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/nnal/atb/set_env.sh
root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/mindie/latest/mindie-llm/set_env.sh 
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# echo $ATB_SPEED_HOME_PATH

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# source /usr/local/Ascend/llm_model/set_env.sh
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# echo $ATB_SPEED_HOME_PATH
/usr/local/Ascend/llm_model
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# env | grep ATB_HOME_PATH
ATB_HOME_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# env | grep ATB_SPEED_HOME_PATH
ATB_SPEED_HOME_PATH=/usr/local/Ascend/llm_model
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
2024-11-07 11:14:10,906 [INFO] [pid: 5615] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': None, 'use_host_chooser': True, 'bind_cpu': True}
2024-11-07 11:14:12,689 [INFO] [pid: 5615] cpu_binding.py-206: rank_id: 0, device_id: 0, numa_id: 2, shard_devices: [0], cpus: [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]
2024-11-07 11:14:12,691 [INFO] [pid: 5615] cpu_binding.py-231: process 5615, new_affinity is [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], cpu count 24
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-11-07 11:14:13,560 [INFO] [pid: 5615] logging.py-53: model_runner.quantize: None
, model_runner.kv_quant: None
, model_runner.dytpe: torch.bfloat16
2024-11-07 11:14:13,560 [INFO] [pid: 5615] logging.py-53: Rank table file location: 
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
2024-11-07 11:14:19,719 [INFO] [pid: 5615] dist.py-94: initialize_distributed has been Set
2024-11-07 11:14:19,722 [INFO] [pid: 5615] logging.py-53: init tokenizer done: Qwen2TokenizerFast(name_or_path='/home/apulis-dev/teamdata/qwen2.5-72B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
        151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)
Python runtime state: finalizing (tstate=0x0000ffffa401d6d0)

Daemon wait pid with 5615, status 134Daemon is killing...
Killed
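The environment scripts sourced one by one in the session above can be gathered into a single file. This is only a sketch: the paths and their order are taken from the session, not from official guidance, and the helper name unset_vars is hypothetical. It is meant to be sourced (for example with . ./mindie_env.sh, a hypothetical file name) so that the variables stay in the current shell.

```shell
# Environment scripts in the order they were sourced in the session above.
# Any script missing on this machine is skipped with a note on stderr.
for env_script in \
    /usr/local/Ascend/driver/bin/setenv.bash \
    /usr/local/Ascend/ascend-toolkit/set_env.sh \
    /usr/local/Ascend/mindie/latest/mindie-service/set_env.sh \
    /usr/local/Ascend/nnal/atb/set_env.sh \
    /usr/local/Ascend/mindie/latest/mindie-llm/set_env.sh \
    /usr/local/Ascend/llm_model/set_env.sh
do
    if [ -f "$env_script" ]; then
        . "$env_script"
    else
        echo "skipped (not found): $env_script" >&2
    fi
done

# Print the names of the given variables that are unset or empty.
# The body runs in a subshell, so the helper variables do not leak.
unset_vars() (
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
)

# The three variables inspected in the session; no output means all are set.
unset_vars ASCEND_HOME_PATH ATB_HOME_PATH ATB_SPEED_HOME_PATH
```

In the session, sourcing these scripts cleared the missing shared-library errors, but the GIL crash persisted even with all three variables set, so a clean result here does not rule out other causes.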

 

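Root cause found: the model files had not been downloaded completely.

A careful look at the download log (cat g.log | grep save) showed that file 00012 was missing from the saved entries. Comparing file sizes, it was clearly smaller than the other model files.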
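The size comparison can be sketched as a small helper that flags any file whose size is less than half that of the largest file matching the same pattern. The function name flag_small_files and the one-half threshold are assumptions for illustration, and a legitimately smaller final shard would also be flagged, so the output is a hint to re-check rather than proof of a broken download.

```shell
# Print "size name" for files matching a glob whose size is below half of the largest match.
# The body runs in a subshell, so the helper variables do not leak.
flag_small_files() (
    dir="$1"
    glob="$2"
    max=0
    # First pass: find the largest file size among the matches.
    for f in "$dir"/$glob; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$max" ]; then
            max=$size
        fi
    done
    # Second pass: report files that are less than half of that size.
    for f in "$dir"/$glob; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))
        if [ $((size * 2)) -lt "$max" ]; then
            printf '%s %s\n' "$size" "${f##*/}"
        fi
    done
)

# Possible usage (the '*.safetensors' pattern is an assumption about the weight file names):
#   flag_small_files /home/apulis-dev/teamdata/qwen2.5-72B-Instruct '*.safetensors'
```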

Output once the files were complete:

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/llm_model# more /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
{
    "OtherParam" :
    {
        "ResourceParam" :
        {
            "cacheBlockSize" : 128
        },
        "LogParam" :
        {
            "logLevel" : "Info",
            "logPath" : "logs/mindservice.log"
        },
        "ServeParam" :
        {
            "ipAddress" : "127.0.0.1",
            "managementIpAddress" : "127.0.0.2",
            "port" : 1025,
            "managementPort" : 1026,
            "maxLinkNum" : 1000,
            "httpsEnabled" : false,
            "tlsCaPath" : "security/ca/",
            "tlsCaFile" : ["ca.pem"],
            "tlsCert" : "security/certs/server.pem",
            "tlsPk" : "security/keys/server.key.pem",
            "tlsPkPwd" : "security/pass/mindie_server_key_pwd.txt",
            "tlsCrl" : "security/certs/server_crl.pem",
            "managementTlsCaFile" : ["management_ca.pem"],
            "managementTlsCert" : "security/certs/management_server.pem",
            "managementTlsPk" : "security/keys/management_server.key.pem",
            "managementTlsPkPwd" : "security/pass/management_mindie_server_key_pwd.txt",
            "managementTlsCrl" : "security/certs/management_server_crl.pem",
            "kmcKsfMaster" : "tools/pmt/master/ksfa",
            "kmcKsfStandby" : "tools/pmt/standby/ksfb",
            "multiNodesInferPort" : 1120,
            "interNodeTLSEnabled" : true,
            "interNodeTlsCaFile" : "security/ca/ca.pem",
            "interNodeTlsCert" : "security/certs/server.pem",
            "interNodeTlsPk" : "security/keys/server.key.pem",
            "interNodeTlsPkPwd" : "security/pass/mindie_server_key_pwd.txt",
            "interNodeKmcKsfMaster" : "tools/pmt/master/ksfa",
            "interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb"
        }
    },
    "WorkFlowParam" :
    {
        "TemplateParam" :
        {
            "templateType" : "Standard",
            "templateName" : "Standard_llama"
        }
    },
    "ModelDeployParam" :
    {
        "engineName" : "mindieservice_llm_engine",
        "modelInstanceNumber" : 1,
        "tokenizerProcessNumber" : 8,
        "maxSeqLen" : 2560,
        "npuDeviceIds" : [[0]],
        "multiNodesInferEnabled" : false,
        "ModelParam" : [
            {
                "modelInstanceType" : "Standard",
                "modelName" : "qwen",
                "modelWeightPath" : "/home/apulis-dev/teamdata/qwen2.5-72B-Instruct",
                "worldSize" : 1,
                "cpuMemSize" : 5,
                "npuMemSize" : 8,
                "backendType" : "atb",
                "pluginParams" : ""
            }
        ]
    },
    "ScheduleParam" :
    {
        "maxPrefillBatchSize" : 8,
        "maxPrefillTokens" : 8192,
        "prefillTimeMsPerReq" : 150,
        "prefillPolicyType" : 0,

        "decodeTimeMsPerReq" : 50,
        "decodePolicyType" : 0,

        "maxBatchSize" : 8,
        "maxIterTimes" : 512,
        "maxPreemptCount" : 0,
        "supportSelectBatch" : false,
        "maxQueueDelayMicroseconds" : 5000
    }
}
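Since the failure traced back to the weight files, it can help to confirm which directory the daemon will actually read. The sketch below pulls the modelWeightPath string out of config.json with sed. It is not a JSON parser and assumes the one-key-per-line layout shown above; the function name json_string_value is hypothetical.

```shell
# Print the first string value found for a key in a JSON file laid out one key per line.
# The body runs in a subshell, so the helper variables do not leak.
json_string_value() (
    key="$1"
    file="$2"
    sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1
)

# Possible usage, run from the mindie-service directory:
#   weights=$(json_string_value modelWeightPath conf/config.json)
#   [ -d "$weights" ] && ls -l "$weights"
```

In the logs, the tokenizer's name_or_path matches this modelWeightPath value, which is a quick way to confirm the daemon picked up the intended directory.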

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ll
total 60
drwxr-x--- 1 root root 4096 Nov  7 12:18 ./
drwxr-x--- 1 root root 4096 Aug 31 00:59 ../
dr-xr-x--- 2 root root 4096 Aug 31 00:59 bin/
drwxr-x--- 1 root root 4096 Nov  7 10:54 conf/
drwxr-x--- 3 root root 4096 Aug 31 00:59 include/
drwxr-x--- 2 root root 4096 Aug 31 00:59 lib/
drwx------ 1 root root 4096 Nov  7 12:18 logs/
drwx------ 3 root root 4096 Aug 31 00:59 scripts/
drwx------ 6 root root 4096 Aug 31 00:59 security/
-r-xr-x--- 1 root root 1654 Aug 31 00:59 set_env.sh*
-r--r----- 1 root root  100 Aug 31 00:59 version.info
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ./bin/mindieservice_daemon 
2024-11-07 12:44:23,000 [INFO] [pid: 22202] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': None, 'use_host_chooser': True, 'bind_cpu': True}
2024-11-07 12:44:24,617 [INFO] [pid: 22202] cpu_binding.py-206: rank_id: 0, device_id: 0, numa_id: 2, shard_devices: [0], cpus: [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]
2024-11-07 12:44:24,620 [INFO] [pid: 22202] cpu_binding.py-231: process 22202, new_affinity is [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], cpu count 24
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-11-07 12:44:25,393 [INFO] [pid: 22202] logging.py-53: model_runner.quantize: None
, model_runner.kv_quant: None
, model_runner.dytpe: torch.bfloat16
2024-11-07 12:44:25,394 [INFO] [pid: 22202] logging.py-53: Rank table file location: 
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
2024-11-07 12:44:31,197 [INFO] [pid: 22202] dist.py-94: initialize_distributed has been Set
2024-11-07 12:44:31,199 [INFO] [pid: 22202] logging.py-53: init tokenizer done: Qwen2TokenizerFast(name_or_path='/home/apulis-dev/teamdata/qwen2.5-72B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
        151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
2024-11-07 12:44:42,406 [INFO] [pid: 22202] logging.py-53: NPUSocInfo(soc_name='', soc_version=202, need_nz=True)
2024-11-07 12:44:42,435 [INFO] [pid: 22202] flash_causal_qwen2.py-52: >>>> qwen_DecoderModel is called.

root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest# cd /usr/local/Ascend/llm_model
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/llm_model# ll
total 392
drwxr-xr-x 1 root root   4096 Nov  7 12:14 ./
drwxr-xr-x 1 root root   4096 Aug 31 00:59 ../
-rw-r--r-- 1 root root  20514 Jul 22 15:42 README.md
drwxr-xr-x 1 root root   4096 Jul 22 15:42 atb_llm/
-rw-r--r-- 1 root root 308196 Jul 22 15:42 atb_llm-0.0.1-py3-none-any.whl
drwxr-xr-x 1 root root   4096 Nov  7 11:20 examples/
drwxr-xr-x 2 root root   4096 Jul 22 15:42 lib/
-rw-r--r-- 1 root root   4426 Jul 22 15:42 public_address_statement.md
drwxr-xr-x 3 root root   4096 Jul 22 15:42 requirements/
-rw-r--r-- 1 root root   2156 Jul 22 15:38 set_env.sh
-rw-r--r-- 1 root root    461 Jul 22 15:42 setup.py
drwxr-xr-x 3 root root   4096 Aug 31 00:59 tests/
-rw-r--r-- 1 root root    180 Jul 22 15:42 version.info
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/llm_model# python examples/run_pa.py --model_path /home/apulis-dev/teamdata/qwen2.5-72B-Instruct
2024-11-07 12:48:48,604 [INFO] [pid: 23427] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': None, 'use_host_chooser': True, 'bind_cpu': True}
2024-11-07 12:48:50,888 [INFO] [pid: 23427] cpu_binding.py-206: rank_id: 0, device_id: 0, numa_id: 2, shard_devices: [0], cpus: [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]
2024-11-07 12:48:50,890 [INFO] [pid: 23427] cpu_binding.py-231: process 23427, new_affinity is [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], cpu count 24
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-11-07 12:48:51,855 [INFO] [pid: 23427] logging.py-53: model_runner.quantize: None
, model_runner.kv_quant: None
, model_runner.dytpe: torch.bfloat16
2024-11-07 12:48:51,855 [INFO] [pid: 23427] logging.py-53: Rank table file location: 
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
2024-11-07 12:48:57,366 [INFO] [pid: 23427] dist.py-94: initialize_distributed has been Set
2024-11-07 12:48:57,369 [INFO] [pid: 23427] logging.py-53: init tokenizer done: Qwen2TokenizerFast(name_or_path='/home/apulis-dev/teamdata/qwen2.5-72B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
        151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
2024-11-07 12:48:57,533 [INFO] [pid: 23427] logging.py-53: NPUSocInfo(soc_name='', soc_version=202, need_nz=True)
2024-11-07 12:48:57,568 [INFO] [pid: 23427] flash_causal_qwen2.py-52: >>>> qwen_DecoderModel is called.



Starting the service in the background:
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest# cd mindie-service/
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# ll
total 64
drwxr-x--- 1 root root 4096 Nov  7 12:44 ./
drwxr-x--- 1 root root 4096 Aug 31 00:59 ../
dr-xr-x--- 2 root root 4096 Aug 31 00:59 bin/
drwxr-x--- 1 root root 4096 Nov  7 10:54 conf/
drwxr-x--- 3 root root 4096 Aug 31 00:59 include/
drwxr-x--- 3 root root 4096 Nov  7 12:44 kernel_meta/
drwxr-x--- 2 root root 4096 Aug 31 00:59 lib/
drwx------ 1 root root 4096 Nov  7 12:44 logs/
drwx------ 3 root root 4096 Aug 31 00:59 scripts/
drwx------ 6 root root 4096 Aug 31 00:59 security/
-r-xr-x--- 1 root root 1654 Aug 31 00:59 set_env.sh*
-r--r----- 1 root root  100 Aug 31 00:59 version.info
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# nohup ./bin/mindieservice_daemon > output.log 2>&1 &
[2] 25894
root@dev-535f74b8-a02a-4d8e-aa28-1234c09547ff-jdbp8:/usr/local/Ascend/mindie/latest/mindie-service# tail -f output.log 
nohup: ignoring input
2024-11-07 13:01:03,735 [INFO] [pid: 25918] env.py-55: {'use_ascend': True, 'max_memory_gb': None, 'reserved_memory_gb': 3, 'skip_warmup': False, 'visible_devices': None, 'use_host_chooser': True, 'bind_cpu': True}
2024-11-07 13:01:05,385 [INFO] [pid: 25918] cpu_binding.py-206: rank_id: 0, device_id: 0, numa_id: 2, shard_devices: [0], cpus: [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]
2024-11-07 13:01:05,387 [INFO] [pid: 25918] cpu_binding.py-231: process 25918, new_affinity is [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], cpu count 24
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-11-07 13:01:06,122 [INFO] [pid: 25918] logging.py-53: model_runner.quantize: None
, model_runner.kv_quant: None
, model_runner.dytpe: torch.bfloat16
2024-11-07 13:01:06,122 [INFO] [pid: 25918] logging.py-53: Rank table file location: 
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
2024-11-07 13:01:11,742 [INFO] [pid: 25918] dist.py-94: initialize_distributed has been Set
2024-11-07 13:01:11,744 [INFO] [pid: 25918] logging.py-53: init tokenizer done: Qwen2TokenizerFast(name_or_path='/home/apulis-dev/teamdata/qwen2.5-72B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
        151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
        151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
2024-11-07 13:01:11,942 [INFO] [pid: 25918] logging.py-53: NPUSocInfo(soc_name='', soc_version=202, need_nz=True)
2024-11-07 13:01:11,967 [INFO] [pid: 25918] flash_causal_qwen2.py-52: >>>> qwen_DecoderModel is called.

Check the logs for the error:

A single card does not have enough device memory:

RuntimeError: NPU out of memory. Tried to allocate 464.00 MiB (NPU 0; 21.02 GiB total capacity; 18.90 GiB already allocated; 18.90 GiB current active; 887.61 MiB free; 18.91 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
2024-11-07 14:53:55,195 [ERROR] model.py:33 - [Model]   >>> return initialize error result: {'status': 'error', 'npuBlockNum': '0', 'cpuBlockNum': '0'}
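
The error is consistent with simple arithmetic: the startup log shows torch.bfloat16 and shard_devices: [0], so the whole model is being placed on one card with 21.02 GiB of capacity. A rough sketch of the estimate follows. The parameter count (about 72 billion for Qwen2.5-72B) and 2 bytes per bfloat16 parameter are assumptions, min_cards_for_weights is a hypothetical helper name, and KV cache and activations are ignored, so the real requirement is higher.

```shell
#!/usr/bin/env bash
# Rough lower bound on the number of NPUs needed just to hold the weights.
# Assumptions (not taken from the log): ~72 billion parameters,
# 2 bytes per parameter (bfloat16). KV cache and activations are ignored.

min_cards_for_weights() {
    local params_billion=$1   # parameter count, in billions
    local bytes_per_param=$2  # 2 for bfloat16 / float16
    local card_gib=$3         # total memory per card, in GiB
    awk -v p="$params_billion" -v b="$bytes_per_param" -v c="$card_gib" 'BEGIN {
        weights_gib = p * 1e9 * b / (1024 ^ 3)
        cards = int(weights_gib / c)
        if (cards * c < weights_gib) cards++   # round up to a whole card
        printf "%.1f GiB of weights, at least %d cards\n", weights_gib, cards
    }'
}

# 21.02 GiB is the "total capacity" reported in the error message above.
min_cards_for_weights 72 2 21.02
# prints: 134.1 GiB of weights, at least 7 cards
```

This is only a lower bound. Tensor-parallel sizes are usually chosen so that they divide the model evenly, so the next workable card count may be larger than the number printed here.
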
root@dev-8242526b-01f2-4a54-b89d-f6d9c57c692d-qjhpf:/usr/local/Ascend/mindie/latest/mindie-service/logs# ll
total 96
drwx------ 1 root root  4096 Nov  7 15:36 ./
drwxr-x--- 1 root root  4096 Nov  7 15:36 ../
-rw-r----- 1 root root 16278 Nov  7 15:36 mindie_audit.log
-rw-r----- 1 root root 16620 Nov  7 15:36 mindservice.log
-rw-r--r-- 1 root root  2079 Nov  7 12:18 pythonlog.log.17538
-rw-r--r-- 1 root root  1763 Nov  7 13:41 pythonlog.log.1761
-rw-r--r-- 1 root root     0 Nov  7 12:44 pythonlog.log.22202
-rw-r--r-- 1 root root  1763 Nov  7 13:43 pythonlog.log.2366
-rw-r--r-- 1 root root  1763 Nov  7 11:00 pythonlog.log.2379
-rw-r--r-- 1 root root     0 Nov  7 13:00 pythonlog.log.25918
-rw-r--r-- 1 root root     0 Nov  7 13:06 pythonlog.log.27337
-rw-r--r-- 1 root root     0 Nov  7 13:17 pythonlog.log.30201
-rw-r--r-- 1 root root  1763 Nov  7 13:47 pythonlog.log.3124
-rw-r--r-- 1 root root  3313 Nov  7 14:53 pythonlog.log.33308
-rw-r--r-- 1 root root  1763 Nov  7 11:11 pythonlog.log.4336
-rw-r--r-- 1 root root  3313 Nov  7 15:17 pythonlog.log.45779
-rw-r--r-- 1 root root  2079 Nov  7 11:14 pythonlog.log.5615
-rw-r--r-- 1 root root  2351 Nov  7 15:32 pythonlog.log.65298
-rw-r--r-- 1 root root   150 Nov  7 14:06 pythonlog.log.6647
-rw-r--r-- 1 root root     0 Nov  7 15:36 pythonlog.log.69882
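
With this many per-process log files, many of them empty, it can help to list only the ones that actually contain an error. The sketch below assumes the pythonlog.log.* naming seen in the listing above; the search strings are a guess at what is worth matching, and find_error_logs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the names of per-process logs that mention an error.
# The "pythonlog.log.*" pattern comes from the directory listing above;
# the search strings are an assumption and can be adjusted.

find_error_logs() {
    local log_dir=$1
    # -l prints only the names of matching files, -i ignores case,
    # -E enables the "a|b" alternation. Empty files never match.
    grep -l -i -E 'error|out of memory' "$log_dir"/pythonlog.log.* 2>/dev/null || true
}

# Scan the directory given as the first argument, or the current directory.
find_error_logs "${1:-.}"
```

Run it from the logs directory, or pass the directory as the first argument. It does not look at mindservice.log or mindie_audit.log, which are worth checking separately.
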

Reference: MindIE Service Deployment (MindSpore master documentation)
