I love this section: it explains the principles behind the most popular text-generation methods in NLP.

We explain how text decoding works using MindNLP.

Auto-regressive language model: the model predicts each token conditioned on the tokens generated before it.

The decoding methods supplied by MindNLP:

Greedy search: at every step, simply pick the single most probable next token.
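Conceptually, greedy decoding is just a repeated argmax. A minimal sketch over a toy next-token probability table (the vocabulary and numbers are invented purely for illustration):

```python
# Toy next-token probability table; words and numbers are made up.
probs = {
    '<s>': {'I': 0.6, 'The': 0.4},
    'I': {'enjoy': 0.5, 'like': 0.3, 'walk': 0.2},
    'enjoy': {'walking': 0.7, 'running': 0.3},
    'walking': {'</s>': 1.0},
}

def greedy_decode(start='<s>', max_len=10):
    token, output = start, []
    for _ in range(max_len):
        # Greedy search: always take the single most probable next token.
        token = max(probs[token], key=probs[token].get)
        if token == '</s>':
            break
        output.append(token)
    return output

print(greedy_decode())  # ['I', 'enjoy', 'walking']
```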

Beam search:

Keep the several most probable hypotheses (num_beams of them, e.g. num_beams = 2) at each step, and finally return the highest-scoring sequence:
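Here is a pure-Python sketch of the idea with num_beams = 2 over a toy table (all words and probabilities are invented). Note how beam search recovers the high-probability sequence "A dog", which greedy search, committing to "The" at the first step, would miss:

```python
# Toy next-token probability table; everything here is made up.
probs = {
    '<s>': {'The': 0.5, 'A': 0.5},
    'The': {'dog': 0.4, 'cat': 0.6},
    'A': {'dog': 0.9, 'cat': 0.1},
    'dog': {'</s>': 1.0},
    'cat': {'</s>': 1.0},
}

def beam_search(num_beams=2, max_len=5):
    # Each beam is (token sequence, score = product of step probabilities).
    beams = [(['<s>'], 1.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == '</s>':          # finished beams carry over
                candidates.append((tokens, score))
                continue
            for tok, p in probs[tokens[-1]].items():
                candidates.append((tokens + [tok], score * p))
        # Keep only the num_beams highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
        if all(t[-1] == '</s>' for t, _ in beams):
            break
    return beams[0][0]

print(beam_search())  # ['<s>', 'A', 'dog', '</s>']
```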

from mindnlp.transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('iiBcai/gpt2', mirror='modelscope')
model = GPT2LMHeadModel.from_pretrained('iiBcai/gpt2', pad_token_id=tokenizer.eos_token_id, mirror='modelscope')

input_ids = tokenizer.encode('I enjoy walking with my cute dog', return_tensors='ms')

# Beam search with 5 beams, stopping early once all beams are finished.
beam_output = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    early_stopping=True
)
print('Output:\n' + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))
print(100 * '-')

# Add n-gram blocking: no 2-gram may appear twice in the output.
beam_output = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    no_repeat_ngram_size=2,
    early_stopping=True
)
print('Beam search with ngram, Output:\n' + 100 * '-')
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))
print(100 * '-')
# Return the 5 best beams instead of only the single best one.
beam_outputs = model.generate(
    input_ids,
    max_length=50,
    num_beams=5,
    no_repeat_ngram_size=2,
    num_return_sequences=5,
    early_stopping=True
)

print('return_num_sequences, Output:\n' + 100 * '-')
for i, beam_output in enumerate(beam_outputs):
    print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=True)))
print(100 * '-')

These settings already give noticeably better output.

n-gram: an n-gram model predicts a word from the n - 1 words preceding it, making it a type of Markov model.

no_repeat_ngram_size: forbid any n-gram from appearing more than once, by setting the probability of every token that would complete a repeated n-gram to 0.
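The blocking logic can be sketched in a few lines (here with n = 2; the example sequence is made up): given the last n - 1 generated tokens, ban every token that would recreate an n-gram already present in the sequence.

```python
# Sketch of no_repeat_ngram_size-style blocking for n-grams.
def banned_tokens(generated, n=2):
    # The current (n-1)-gram prefix that the next token would extend.
    prefix = tuple(generated[-(n - 1):])
    banned = set()
    # Ban every token that followed this same prefix earlier in the text.
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned

# 'my dog' already occurred, so after 'my' the token 'dog' is banned.
print(banned_tokens(['my', 'dog', 'is', 'my']))  # {'dog'}
```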

Sampling:

Randomly draw the next word from the model's current conditional distribution.

temperature: at a high temperature the candidate words' probabilities are flattened and look more alike (so the choice is more random); at a low temperature their probabilities differ more sharply, so sampling concentrates on the top words.
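The effect of temperature is simply a rescaling of the logits before the softmax. A small self-contained sketch (the logits are made up):

```python
import math

# Softmax with temperature: dividing made-up logits by a larger
# temperature flattens the distribution; a smaller one sharpens it.
def softmax_t(logits, temperature):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_t(logits, 0.5))  # sharp: the top token dominates
print(softmax_t(logits, 2.0))  # flat: probabilities are much closer
```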

import mindspore

mindspore.set_seed(1234)
# Pure sampling from the full distribution (top_k=0 disables top-k filtering).
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=0,
    temperature=0.7
)
print('Output:\n' + 100 * '-')
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))

Top-K sampling: keep only the K most probable words, renormalize their probabilities, and sample from them.

Top-P (nucleus) sampling:

Keep the smallest set of top words whose cumulative probability exceeds p, then renormalize and sample from that set.

Combining top-k and top-p:
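Conceptually, the combined filter first keeps the k most probable words, then shrinks that set further to the smallest prefix whose cumulative probability reaches p, and renormalizes. A sketch over a toy distribution (all words and numbers are invented):

```python
# Sketch of combined top-k + top-p (nucleus) filtering.
def filter_top_k_top_p(probs, k, p):
    # Step 1: keep the k most probable tokens.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Step 2: keep the smallest prefix whose cumulative probability >= p.
    kept, cum = [], 0.0
    for token, prob in items:
        kept.append((token, prob))
        cum += prob
        if cum >= p:
            break
    # Step 3: renormalize the survivors into a proper distribution.
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

probs = {'walking': 0.4, 'running': 0.3, 'sleeping': 0.2, 'coding': 0.1}
# Keeps 'walking' and 'running' (0.4 + 0.3 >= 0.6), renormalized.
print(filter_top_k_top_p(probs, k=3, p=0.6))
```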

# Combine top-k and top-p filtering, returning 3 independent samples.
sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=5,
    top_p=0.95,
    num_return_sequences=3
)
print('Output:\n' + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    print("{}: {}".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))
