A Large Collection of Deep-Learning NLP Model Implementations (Minimal Versions, <100 Lines Each)


This resource collects common deep-learning NLP models, drawing on existing TensorFlow and PyTorch implementations. Most of the models have been pared down to under 100 lines of code each (not counting comments and blank lines).


Starting from NNLM, the first neural language model in NLP, the collection works through classic models such as RNN, LSTM, TextCNN, and Word2Vec, then on to seq2seq, attention, bi-LSTM with attention, the Transformer (self-attention), and BERT, helping readers learn, implement, and train each model more easily.


1. Embedding Language Models

- 1-1. NNLM (Neural Network Language Model) - Predict Next Word

  - Paper - A Neural Probabilistic Language Model (2003)

  - Colab - NNLM_Tensor.ipynb, NNLM_Torch.ipynb
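The NNLM's forward pass concatenates the embeddings of the previous n-1 words and scores every vocabulary word as the next word. A minimal numpy sketch with illustrative sizes, not the notebooks' TensorFlow/PyTorch code:

```python
import numpy as np

# NNLM forward pass (Bengio et al., 2003) in numpy; all sizes are illustrative.
rng = np.random.default_rng(0)
V, m, n_ctx, h = 7, 4, 2, 8           # vocab size, embedding dim, context words, hidden units

C = rng.normal(size=(V, m))           # shared word-embedding table
H = rng.normal(size=(n_ctx * m, h))   # input -> hidden weights
d = np.zeros(h)                       # hidden bias
U = rng.normal(size=(h, V))           # hidden -> output weights
b = np.zeros(V)                       # output bias

def nnlm_logits(context_ids):
    """Score every vocabulary word as the next word after `context_ids`."""
    x = C[context_ids].reshape(-1)    # concatenate the n-1 context embeddings
    hidden = np.tanh(d + x @ H)
    return b + hidden @ U             # one logit per vocabulary word

logits = nnlm_logits([2, 5])
probs = np.exp(logits - logits.max())
probs /= probs.sum()                  # softmax over the vocabulary
```

Training then minimizes the cross-entropy of `probs` against the true next word.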

- 1-2. Word2Vec (Skip-gram) - Embed Words and Plot Them

  - Paper - Distributed Representations of Words and Phrases and their Compositionality (2013)

  - Colab - Word2Vec_Tensor(NCE_loss).ipynb, Word2Vec_Tensor(Softmax).ipynb, Word2Vec_Torch(Softmax).ipynb
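Skip-gram turns a corpus into (center, context) pairs and trains the model to predict a neighbour from the center word; the notebooks add an NCE or softmax loss on top. A plain-Python sketch of the pair extraction:

```python
def skipgram_pairs(tokens, window=1):
    """Yield (center, context) training pairs within a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:                      # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["i", "like", "nlp"], window=1)
# → [('i', 'like'), ('like', 'i'), ('like', 'nlp'), ('nlp', 'like')]
```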

- 1-3. FastText (Application Level) - Sentence Classification

  - Paper - Bag of Tricks for Efficient Text Classification (2016)

  - Colab - FastText.ipynb
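The fastText classifier averages the embeddings of the words plus their word n-grams (the paper's "bag of tricks"), then feeds that average to a linear classifier. A plain-Python sketch of just the n-gram feature step (the function name is illustrative):

```python
def word_ngrams(tokens, n=2):
    """Append word n-grams to the unigram features, as fastText does for -wordNgrams."""
    feats = list(tokens)                      # unigrams
    for i in range(len(tokens) - n + 1):
        feats.append(" ".join(tokens[i:i + n]))  # contiguous n-grams
    return feats

feats = word_ngrams(["this", "movie", "rocks"])
# → ['this', 'movie', 'rocks', 'this movie', 'movie rocks']
```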


2. CNN (Convolutional Neural Network)

- 2-1. TextCNN - Binary Sentiment Classification

  - Paper - Convolutional Neural Networks for Sentence Classification (2014)

  - Colab - TextCNN_Tensor.ipynb, TextCNN_Torch.ipynb
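TextCNN's core idea: run 1-D convolutions of several widths over the sentence's embedding matrix and max-pool each feature map over time, then classify the concatenated features. A numpy sketch with illustrative hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim, n_filters = 6, 5, 3
X = rng.normal(size=(seq_len, emb_dim))            # one embedded sentence

def conv_maxpool(X, width, n_filters, rng):
    """Convolve `n_filters` filters of `width` words, ReLU, max-over-time pool."""
    W = rng.normal(size=(n_filters, width, X.shape[1]))
    # every window of `width` consecutive word embeddings
    windows = np.stack([X[i:i + width] for i in range(X.shape[0] - width + 1)])
    fmap = np.einsum("twd,fwd->tf", windows, W)    # (positions, filters)
    return np.maximum(fmap, 0).max(axis=0)         # ReLU + max-over-time

# concatenate pooled features from filter widths 2, 3, 4 -> classifier input
features = np.concatenate([conv_maxpool(X, w, n_filters, rng) for w in (2, 3, 4)])
```

A linear layer over `features` then produces the binary sentiment logits.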

- 2-2. DCNN (Dynamic Convolutional Neural Network)


3. RNN (Recurrent Neural Network)

- 3-1. TextRNN - Predict the Next Step

  - Paper - Finding Structure in Time (1990)

  - Colab - TextRNN_Tensor.ipynb, TextRNN_Torch.ipynb
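The Elman-style recurrence behind TextRNN mixes the current input with the previous hidden state at every step. A numpy sketch with illustrative sizes (the notebooks use the frameworks' built-in RNN cells):

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, hid_dim = 4, 6
Wx = rng.normal(size=(in_dim, hid_dim))    # input -> hidden
Wh = rng.normal(size=(hid_dim, hid_dim))   # hidden -> hidden (the recurrence)
b = np.zeros(hid_dim)

def rnn_forward(xs):
    """Unroll the recurrence h_t = tanh(x_t Wx + h_{t-1} Wh + b) over time."""
    h = np.zeros(hid_dim)
    for x in xs:
        h = np.tanh(x @ Wx + h @ Wh + b)
    return h                               # final state summarises the sequence

h = rnn_forward(rng.normal(size=(5, in_dim)))
```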

- 3-2. TextLSTM - Autocomplete

  - Paper - Long Short-Term Memory (1997)

  - Colab - TextLSTM_Tensor.ipynb, TextLSTM_Torch.ipynb
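What distinguishes the LSTM from the plain RNN above is its gated cell state: forget, input, and output gates decide what the cell keeps, adds, and exposes each step. A numpy sketch of one cell (sizes illustrative; the common trick of packing all four gates into one matrix is used):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                        # input size == hidden size, for brevity
W = rng.normal(size=(2 * d, 4 * d)) * 0.1    # all four gates in one weight matrix
b = np.zeros(4 * d)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h, c):
    """One LSTM step: gate the cell state, then expose part of it as the new h."""
    z = np.concatenate([x, h]) @ W + b
    f, i, o, g = np.split(z, 4)
    f, i, o, g = sigmoid(f), sigmoid(i), sigmoid(o), np.tanh(g)
    c = f * c + i * g                        # forget old content, add new content
    h = o * np.tanh(c)                       # output gate controls what is exposed
    return h, c

h = c = np.zeros(d)
for x in rng.normal(size=(3, d)):            # run three time steps
    h, c = lstm_cell(x, h, c)
```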

- 3-3. Bi-LSTM - Predict the Next Word in a Long Sentence

  - Colab - Bi_LSTM_Tensor.ipynb, Bi_LSTM_Torch.ipynb


4. Attention Mechanism

- 4-1. Seq2Seq - Change Word

  - Paper - Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014)

  - Colab - Seq2Seq_Tensor.ipynb, Seq2Seq_Torch.ipynb

- 4-2. Seq2Seq with Attention - Translate

  - Paper - Neural Machine Translation by Jointly Learning to Align and Translate (2014)

  - Colab - Seq2Seq(Attention)_Tensor.ipynb, Seq2Seq(Attention)_Torch.ipynb
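Attention's key step: score the current decoder state against every encoder state, softmax the scores, and feed the weighted sum of encoder states to the decoder as a context vector. A numpy sketch; dot-product scoring is used here for brevity, whereas the 2014 paper scores with a small MLP:

```python
import numpy as np

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(5, 8))     # 5 source positions, hidden size 8
dec_state = rng.normal(size=(8,))        # current decoder hidden state

scores = enc_states @ dec_state          # one score per source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                 # softmax: attention distribution
context = weights @ enc_states           # weighted sum fed to the decoder
```

Plotting `weights` for each target word gives the familiar soft alignment matrix.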

- 4-3. Bi-LSTM with Attention - Binary Sentiment Classification

  - Colab - Bi_LSTM(Attention)_Tensor.ipynb, Bi_LSTM(Attention)_Torch.ipynb


5. Models Based on the Transformer

- 5-1. The Transformer - Translate

  - Paper - Attention Is All You Need (2017)

  - Colab - Transformer_Torch.ipynb, Transformer(Greedy_decoder)_Torch.ipynb
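The Transformer's building block is scaled dot-product self-attention, softmax(QK^T / sqrt(d_k)) V: every position attends to every other. A single-head, unmasked numpy sketch with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))                 # embedded input sequence
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv                        # queries, keys, values
scores = Q @ K.T / np.sqrt(d_k)                         # scaled pairwise similarities
A = np.exp(scores - scores.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)                      # row-wise softmax
out = A @ V                                             # each position mixes all values
```

The full model stacks this (multi-headed, with masks, residuals, and feed-forward layers), which the Torch notebook implements.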

- 5-2. BERT - Next Sentence Classification & Masked Token Prediction

  - Paper - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018)

  - Colab - BERT_Torch.ipynb
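BERT's masked-LM pretraining hides roughly 15% of the input tokens and trains the model to recover them. A simplified plain-Python sketch of the masking step; it always substitutes [MASK], omitting the paper's 80/10/10 mask/random/keep split, and the function name is illustrative:

```python
import random

def mask_tokens(tokens, ratio=0.15, seed=0):
    """Replace ~`ratio` of tokens with [MASK]; return the labels for the loss."""
    rng = random.Random(seed)
    n = max(1, int(len(tokens) * ratio))          # mask at least one token
    positions = rng.sample(range(len(tokens)), n)
    masked = list(tokens)
    for p in positions:
        masked[p] = "[MASK]"
    return masked, {p: tokens[p] for p in positions}  # position -> original token

masked, labels = mask_tokens(["the", "cat", "sat", "on", "the", "mat"])
```

The model is trained to predict each `labels[p]` from `masked`, alongside the next-sentence classification objective.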


Source: https://www.toutiao.com/a1664931447872520
