(experimental) Dynamic Quantization on an LSTM Word Language Model

Introduction

Quantization involves converting the weights and activations of your model from float to int, which can result in a smaller model size and faster inference with only a small hit to accuracy.
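
For intuition, here is a minimal sketch of that float-to-int mapping on a single tensor (not part of the tutorial itself; the scale and zero point below are arbitrary illustrative values):

import torch

# Map a float tensor to int8 values using a scale and zero point, then back again.
x = torch.randn(4)                                  # float32 values
scale, zero_point = 0.05, 0                         # illustrative quantization parameters
xq = torch.quantize_per_tensor(x, scale, zero_point, torch.qint8)
print(xq.int_repr())                                # the underlying int8 representation
print(xq.dequantize())                              # approximate reconstruction of x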

In this tutorial, we apply the simplest form of quantization - dynamic quantization - to an LSTM-based next-word prediction model, closely following the word language model from the PyTorch examples.

# imports
import os
from io import open
import time

import torch
import torch.nn as nn
import torch.nn.functional as F

1. Define the model

Here we define the LSTM model architecture, following the model in the word language model example.

class LSTMModel(nn.Module):
    """Container module with an encoder, a recurrent module, and a decoder."""

    def __init__(self, ntoken, ninp, nhid, nlayers, dropout=0.5):
        super(LSTMModel, self).__init__()
        self.drop = nn.Dropout(dropout)
        self.encoder = nn.Embedding(ntoken, ninp)
        self.rnn = nn.LSTM(ninp, nhid, nlayers, dropout=dropout)
        self.decoder = nn.Linear(nhid, ntoken)

        self.init_weights()

        self.nhid = nhid
        self.nlayers = nlayers

    def init_weights(self):
        initrange = 0.1
        self.encoder.weight.data.uniform_(-initrange, initrange)
        self.decoder.bias.data.zero_()
        self.decoder.weight.data.uniform_(-initrange, initrange)

    def forward(self, input, hidden):
        emb = self.drop(self.encoder(input))
        output, hidden = self.rnn(emb, hidden)
        output = self.drop(output)
        decoded = self.decoder(output)
        return decoded, hidden

    def init_hidden(self, bsz):
        weight = next(self.parameters())
        return (weight.new_zeros(self.nlayers, bsz, self.nhid),
                weight.new_zeros(self.nlayers, bsz, self.nhid))

2. Load in the text data

Next, we load the Wikitext-2 dataset into a Corpus, once again following the preprocessing from the word language model example.

class Dictionary(object):
    def __init__(self):
        self.word2idx = {}
        self.idx2word = []

    def add_word(self, word):
        if word not in self.word2idx:
            self.idx2word.append(word)
            self.word2idx[word] = len(self.idx2word) - 1
        return self.word2idx[word]

    def __len__(self):
        return len(self.idx2word)

class Corpus(object):
    def __init__(self, path):
        self.dictionary = Dictionary()
        self.train = self.tokenize(os.path.join(path, 'train.txt'))
        self.valid = self.tokenize(os.path.join(path, 'valid.txt'))
        self.test = self.tokenize(os.path.join(path, 'test.txt'))

    def tokenize(self, path):
        """Tokenizes a text file."""
        assert os.path.exists(path)
        # Add words to the dictionary
        with open(path, 'r', encoding="utf8") as f:
            for line in f:
                words = line.split() + ['<eos>']
                for word in words:
                    self.dictionary.add_word(word)

        # Tokenize file content
        with open(path, 'r', encoding="utf8") as f:
            idss = []
            for line in f:
                words = line.split() + ['<eos>']
                ids = []
                for word in words:
                    ids.append(self.dictionary.word2idx[word])
                idss.append(torch.tensor(ids).type(torch.int64))
            ids = torch.cat(idss)

        return ids

model_data_filepath = 'data/'

corpus = Corpus(model_data_filepath + 'wikitext-2')
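
As a quick optional sanity check (not part of the original example), we can look at the vocabulary and split sizes of the loaded corpus:

# Vocabulary size and number of tokens in each split
print('vocabulary size:', len(corpus.dictionary))
print('train/valid/test tokens:',
      corpus.train.size(0), corpus.valid.size(0), corpus.test.size(0))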

3. Load the pretrained model

This is a tutorial on dynamic quantization, a quantization technique that is applied after a model has been trained. Therefore, we simply load some pretrained weights into this model architecture; these weights were obtained by training for five epochs using the default settings in the word language model example.

ntokens = len(corpus.dictionary)

model = LSTMModel(
    ntoken = ntokens,
    ninp = 512,
    nhid = 256,
    nlayers = 5,
)

model.load_state_dict(
    torch.load(
        model_data_filepath + 'word_language_model_quantize.pth',
        map_location=torch.device('cpu')
        )
    )

model.eval()
print(model)

Out:

LSTMModel(
  (drop): Dropout(p=0.5, inplace=False)
  (encoder): Embedding(33278, 512)
  (rnn): LSTM(512, 256, num_layers=5, dropout=0.5)
  (decoder): Linear(in_features=256, out_features=33278, bias=True)
)

Now we generate some text to ensure that the pretrained model is working properly - similarly to before, we follow the word language model example here.

input_ = torch.randint(ntokens, (1, 1), dtype=torch.long)  # random starting word index
hidden = model.init_hidden(1)                              # hidden state for batch size 1
temperature = 1.0
num_words = 1000

with open(model_data_filepath + 'out.txt', 'w') as outf:
    with torch.no_grad():  # no tracking history
        for i in range(num_words):
            output, hidden = model(input_, hidden)
            word_weights = output.squeeze().div(temperature).exp().cpu()
            word_idx = torch.multinomial(word_weights, 1)[0]  # sample the next word
            input_.fill_(word_idx)                            # feed it back as the next input

            word = corpus.dictionary.idx2word[word_idx]

            outf.write(str(word.encode('utf-8')) + ('\n' if i % 20 == 19 else ' '))

            if i % 100 == 0:
                print('| Generated {}/{} words'.format(i, 1000))

with open(model_data_filepath + 'out.txt', 'r') as outf:
    all_output = outf.read()
    print(all_output)

Out:

| Generated 0/1000 words
| Generated 100/1000 words
| Generated 200/1000 words
| Generated 300/1000 words
| Generated 400/1000 words
| Generated 500/1000 words
| Generated 600/1000 words
| Generated 700/1000 words
| Generated 800/1000 words
| Generated 900/1000 words
b'and' b'O' b'\xe2\x80\x99' b'Gacy' b',' b'and' b'then' b'defined' b'that' b'next' b'novel' b'succeeded' b'large' b'property' b',' b'so' b'neither' b'number' b'is' b'currently'
b'a' b'identical' b'planet' b'by' b'stiff' b'culture' b'.' b'Mosley' b'may' b'settle' b'in' b'non' b'@-@' b'bands' b'for' b'the' b'beginning' b'of' b'its' b'home'
b'stations' b',' b'being' b'also' b'in' b'charge' b'for' b'two' b'other' b'@-@' b'month' b'ceremonies' b'.' b'The' b'first' b'Star' b'Overseas' b'took' b'to' b'have'
b'met' b'its' b'leadership' b'for' b'investigation' b'such' b'as' b'Discovered' b'lbw' b',' b'club' b',' b'<unk>' b',' b'<unk>' b',' b'or' b'Crac' b"'Malley" b','
b'although' b'with' b'the' b'other' b'victory' b',' b'assumes' b'it' b'.' b'(' b'not' b'containment' b'to' b'a' b'recent' b'problem' b')' b'.' b'His' b'traditional'
b'scheme' b'process' b'is' b'proceeded' b'outdoor' b'in' b'overweight' b'clusters' b';' b'God' b'Davis' b'was' b'interested' b'on' b'her' b'right' b'touring' b',' b'although' b'they'
b'had' b'previously' b'previously' b'risen' b'near' b'eclipse' b'in' b'his' b'work' b'by' b'the' b'latter' b'@-@' b'perspective' b'.' b'During' b'the' b'release' b'of' b'Bell'
b',' b'the' b'first' b'promotional' b'mention' b'included' b'a' b'Magnetic' b'seam' b'was' b'put' b'into' b'Shakespeare' b"'s" b'Special' b'Company' b'is' b'katra' b'than' b'chops'
b'@-@' b'up' b'history' b'for' b'frets' b'of' b'actions' b'.' b'<eos>' b'Until' b'arrival' b',' b'Griffin' b'wrote' b'that' b'a' b'"' b'sense' b'"' b'included'
b'especially' b'declining' b'individual' b'forces' b',' b'though' b'are' b'stronger' b'<unk>' b'.' b'According' b'to' b'lessen' b'very' b'role' b',' b'Ceres' b'believed' b'he' b'each'
b'conflicted' b'pump' b'fight' b'follows' b'the' b'malignant' b'polynomial' b'to' b'make' b'Albani' b'.' b'The' b'nobility' b'found' b'a' b'spinners' b'from' b'a' b'special' b'to'
b'vertical' b'@-@' b'term' b'crimes' b',' b'and' b'the' b'Neapolitan' b'apparent' b'<unk>' b'show' b'forcing' b'no' b'of' b'the' b'worst' b'traditions' b'of' b'tallest' b'<unk>'
b'teacher' b'+' b'green' b'crushing' b',' b'with' b'4' b'%' b',' b'and' b'560' b'doctrines' b',' b'with' b'other' b'Asian' b'assistance' b'<unk>' b'.' b'The'
b'game' b'is' b'unadorned' b',' b'especially' b'or' b'steadily' b'favoured' b'according' b'to' b'its' b'inside' b',' b'leading' b'to' b'the' b'removal' b'of' b'gauges' b'.'
b'vanishing' b',' b'a' b'jagged' b'race' b'rested' b'with' b'be' b'rich' b'if' b'these' b'legislation' b'remained' b'together' b'.' b'The' b'anthology' b'and' b'initially' b'regularly'
b'Cases' b'Cererian' b'and' b'acknowledge' b'individual' b'being' b'poured' b'with' b'the' b'Chicago' b'melee' b'.' b'Europium' b',' b'<unk>' b',' b'and' b'Lars' b'life' b'for'
b'electron' b'plumage' b',' b'will' b'deprive' b'themselves' b'.' b'The' b'<unk>' b'gryllotalpa' b'behave' b'have' b'Emerald' b'doubt' b'.' b'When' b'limited' b'cubs' b'are' b'rather'
b'attempting' b'to' b'address' b'.' b'Two' b'birds' b'as' b'being' b'also' b'<unk>' b',' b'such' b'as' b'"' b'<unk>' b'"' b',' b'and' b'possessing' b'criminal'
b'spots' b',' b'lambskin' b'ponderosa' b'mosses' b',' b'which' b'might' b'seek' b'to' b'begin' b'less' b'different' b'delineated' b'techniques' b'.' b'Known' b',' b'on' b'the'
b'ground' b',' b'and' b'only' b'cooler' b',' b'first' b'on' b'other' b'females' b'factory' b'in' b'mathematics' b'.' b'Pilgrim' b'alone' b'has' b'a' b'critical' b'substance'
b',' b'probably' b'in' b'line' b'.' b'He' b'used' b'a' b'<unk>' b',' b'with' b'the' b'resin' b'being' b'transported' b'to' b'the' b'12th' b'island' b'during'
b'the' b'year' b'of' b'a' b'mixture' b'show' b'that' b'it' b'is' b'serving' b';' b'they' b'are' b'headed' b'by' b'prone' b'too' b'species' b',' b'rather'
b'than' b'the' b'risk' b'of' b'carbon' b'.' b'In' b'all' b'other' b'typical' b',' b'faith' b'consist' b'of' b'<unk>' b'whereas' b'<unk>' b'when' b'quotes' b'they'
b'Abrams' b'restructuring' b'vessels' b'.' b'It' b'also' b'emerged' b'even' b'when' b'any' b'lack' b'of' b'birds' b'has' b'wide' b'pinkish' b'structures' b',' b'directing' b'a'
b'chelicerae' b'of' b'amputated' b'elementary' b',' b'only' b'they' b'on' b'objects' b'.' b'A' b'female' b'and' b'a' b'female' b'Leisler' b'@-@' b'shaped' b'image' b'for'
b'51' b'@.@' b'5' b'm' b'(' b'5' b'lb' b')' b'Frenchman' b'2' b'at' b'sea' b'times' b'is' b'approximately' b'2' b'years' b'ago' b',' b'particularly'
b'behind' b'reducing' b'Trujillo' b"'s" b'and' b'food' b'specific' b'spores' b'.' b'Males' b'fibrous' b'females' b'can' b'be' b'severely' b'gregarious' b'.' b'The' b'same' b'brood'
b'behind' b'100' b'minutes' b'after' b'it' b'is' b'estimated' b'by' b'damaging' b'the' b'nest' b'base' b',' b'with' b'some' b'other' b'rare' b'birds' b'and' b'behavior'
b',' b'no' b'transport' b'and' b'Duty' b'demand' b'.' b'Two' b'rare' b'chicks' b'have' b'from' b'feed' b'engage' b'to' b'come' b'with' b'some' b'part' b'of'
b'nesting' b'.' b'The' b'1808' b'to' b'be' b'reduced' b'to' b'Scots' b'and' b'fine' b'stones' b'.' b'There' b'they' b'also' b'purple' b'limitations' b'of' b'certain'
b'skin' b'material' b'usually' b'move' b'during' b'somewhat' b'.' b'A' b'mothers' b'of' b'external' b'take' b'from' b'poaching' b',' b'typically' b'have' b'people' b'processes' b'and'
b'toll' b';' b'while' b'bird' b'plumage' b'differs' b'to' b'Fight' b',' b'they' b'may' b'be' b'open' b'after' b'<unk>' b',' b'thus' b'rarely' b'their' b'<unk>'
b'for' b'a' b'emotional' b'circle' b'.' b'Rough' b'Dahlan' b'probably' b'suggested' b'how' b'they' b'impose' b'their' b'cross' b'of' b'relapse' b'where' b'they' b'changed' b'.'
b'They' b'popularisation' b'them' b'of' b'their' b'<unk>' b',' b'charming' b'by' b'limited' b'or' b'Palestinians' b'the' b'<unk>' b'<unk>' b'.' b'Traffic' b'of' b'areas' b'headed'
b',' b'and' b'their' b'push' b'will' b'articulate' b'.' b'<eos>' b'<unk>' b'would' b'be' b'criticized' b'by' b'protein' b'rice' b',' b'particularly' b'often' b'rather' b'of'
b'the' b'cellular' b'extent' b'.' b'They' b'could' b'overlap' b'forward' b',' b'and' b'there' b'are' b'no' b'governing' b'land' b',' b'they' b'do' b'not' b'find'
b'it' b'.' b'In' b'one' b'place' b',' b'reddish' b'kakapo' b'(' b'kakapo' b'<unk>' b')' b'might' b'be' b'performed' b'that' b'conduct' b',' b'stadia' b','
b'gene' b'or' b'air' b',' b'noise' b',' b'and' b'offensive' b'or' b'skin' b',' b'which' b'may' b'be' b'commercially' b'organized' b'strong' b'method' b'.' b'In'
b'changing' b',' b'Chen' b'and' b'eukaryotes' b'were' b'Membrane' b'spiders' b'in' b'larger' b'growth' b',' b'by' b'some' b'regions' b'.' b'If' b'up' b'about' b'5'
b'%' b'of' b'the' b'males' b',' b'there' b'are' b'displays' b'that' b'shift' b'the' b'bird' b'inclination' b'after' b'supreme' b'<unk>' b'to' b'move' b'outside' b'tests'
b'.' b'The' b'aim' b'of' b'Mouquet' b'Sites' b'is' b'faster' b'as' b'an' b'easy' b'asteroid' b',' b'with' b'ocean' b'or' b'grey' b',' b'albeit' b','
b'as' b'they' b'they' b'CBs' b',' b'and' b'do' b'not' b'be' b'performed' b',' b'greatly' b'on' b'other' b'insects' b',' b'they' b'can' b'write' b'chromosomes'
b',' b'and' b'planners' b',' b'galericulata' b'should' b'be' b'a' b'bird' b'.' b'Also' b'on' b'a' b'holodeck' b'they' b'were' b'divine' b'out' b'of' b'bare'
b'handwriting' b'.' b'Unlike' b'this' b',' b'they' b'makes' b'only' b'anything' b'a' b'variation' b'of' b'skin' b'skeletons' b'further' b'.' b'They' b'have' b'to' b'be'
b'able' b'under' b'their' b'herding' b'tree' b',' b'or' b'dart' b'.' b'When' b'many' b'hypothesis' b'(' b'plant' b',' b'they' b'were' b'@-@' b'looped' b'aged'
b'play' b')' b'is' b'very' b'clear' b'as' b'very' b'on' b'comparison' b'.' b'<eos>' b'Furthermore' b',' b'Wikimania' b'decorations' b'@-@' b'sponsored' b'naming' b'hydrogen' b'when'
b'the' b'kakapo' b'commenced' b',' b'they' b'are' b'slowly' b'on' b'heavy' b'isolation' b'.' b'Sometimes' b'that' b'Larssen' b'leave' b'gently' b',' b'they' b'usually' b'made'
b'short' b'care' b'of' b'feral' b'or' b'any' b'dual' b'species' b'.' b'<eos>' b'Further' b'males' b'that' b'outfitting' b',' b'when' b'there' b'are' b'two' b'envelope'
b'shorter' b'flocks' b'to' b'be' b'males' b'ideally' b'they' b'are' b'highly' b'emission' b'.' b'<eos>' b'As' b'of' b'danger' b',' b'taking' b'in' b'one' b'of'
b'the' b'other' b'surviving' b'structure' b'of' b'Ceres' b'can' b'be' b'rebuffed' b'to' b'be' b'caused' b'by' b'any' b'combination' b'of' b'food' b'or' b'modified' b'its'

It's no GPT-2, but it looks like the model has started to learn the structure of language!

We're almost ready to demonstrate dynamic quantization. We just need to define a few more helper functions:

bptt = 25
criterion = nn.CrossEntropyLoss()
eval_batch_size = 1

# create test data set
def batchify(data, bsz):
    # Work out how cleanly we can divide the dataset into bsz parts.
    nbatch = data.size(0) // bsz
    # Trim off any extra elements that wouldn't cleanly fit (remainders).
    data = data.narrow(0, 0, nbatch * bsz)
    # Evenly divide the data across the bsz batches.
    return data.view(bsz, -1).t().contiguous()

test_data = batchify(corpus.test, eval_batch_size)

# Evaluation functions
def get_batch(source, i):
    seq_len = min(bptt, len(source) - 1 - i)
    data = source[i:i+seq_len]
    target = source[i+1:i+1+seq_len].view(-1)
    return data, target

def repackage_hidden(h):
    """Wraps hidden states in new Tensors, to detach them from their history."""

    if isinstance(h, torch.Tensor):
        return h.detach()
    else:
        return tuple(repackage_hidden(v) for v in h)

def evaluate(model_, data_source):
    # Turn on evaluation mode which disables dropout.
    model_.eval()
    total_loss = 0.
    hidden = model_.init_hidden(eval_batch_size)
    with torch.no_grad():
        for i in range(0, data_source.size(0) - 1, bptt):
            data, targets = get_batch(data_source, i)
            output, hidden = model_(data, hidden)
            hidden = repackage_hidden(hidden)
            output_flat = output.view(-1, ntokens)
            total_loss += len(data) * criterion(output_flat, targets).item()
    return total_loss / (len(data_source) - 1)

4. Test dynamic quantization

Finally, we can call torch.quantization.quantize_dynamic on the model! Specifically,

  • We specify that we want the nn.LSTM and nn.Linear modules in our model to be quantized

  • We specify that we want the weights to be converted to int8 values

import torch.quantization

quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8
)
print(quantized_model)

Out:

LSTMModel(
  (drop): Dropout(p=0.5, inplace=False)
  (encoder): Embedding(33278, 512)
  (rnn): DynamicQuantizedLSTM(
    512, 256, num_layers=5, dropout=0.5
    (_all_weight_values): ModuleList(
      (0): PackedParameter()
      (1): PackedParameter()
      (2): PackedParameter()
      (3): PackedParameter()
      (4): PackedParameter()
      (5): PackedParameter()
      (6): PackedParameter()
      (7): PackedParameter()
      (8): PackedParameter()
      (9): PackedParameter()
    )
  )
  (decoder): DynamicQuantizedLinear(
    in_features=256, out_features=33278
    (_packed_params): LinearPackedParams()
  )
)

The model looks the same; how does this benefit us? First, we see a significant reduction in model size:

def print_size_of_model(model):
    torch.save(model.state_dict(), "temp.p")
    print('Size (MB):', os.path.getsize("temp.p")/1e6)
    os.remove('temp.p')

print_size_of_model(model)
print_size_of_model(quantized_model)

Out:

Size (MB): 113.941574
Size (MB): 76.807204
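
The reduction is well short of 4x because only the nn.LSTM and nn.Linear weights were quantized; the nn.Embedding stays in fp32 and accounts for most of the file (33278 x 512 x 4 bytes, roughly 68 MB). A rough, optional breakdown of the original model's fp32 parameter sizes per top-level module makes this visible:

# Approximate fp32 parameter bytes per top-level module of the original model
for name, module in model.named_children():
    n_params = sum(p.numel() for p in module.parameters())
    print('{:10s} {:6.1f} MB'.format(name, n_params * 4 / 1e6))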

Second, we see faster inference time, with no difference in evaluation loss:

Note: we set the number of threads to one for a single-threaded comparison, since quantized models run single threaded.

torch.set_num_threads(1)

def time_model_evaluation(model, test_data):
    s = time.time()
    loss = evaluate(model, test_data)
    elapsed = time.time() - s
    print('''loss: {0:.3f}\nelapsed time (seconds): {1:.1f}'''.format(loss, elapsed))

time_model_evaluation(model, test_data)
time_model_evaluation(quantized_model, test_data)

Out:

loss: 5.167
elapsed time (seconds): 233.9
loss: 5.168
elapsed time (seconds): 164.9

Running this locally on a MacBook Pro, inference takes about 200 seconds without quantization and only about 100 seconds with it.

Conclusion

Dynamic quantization can be an easy way to reduce model size while only having a limited effect on accuracy.

Thanks for reading! As always, we welcome any feedback, so please create an issue here if you have any.
