This resource catalogs the core deep-learning concepts relevant to natural language processing (NLP), together with the latest (2019) papers for each concept. It covers optimization algorithms (Adam, Adagrad, AMSGrad, mini-batch SGD, etc.), parameter initialization (Glorot initialization, He initialization), regularization (dropout, word dropout, patience, weight decay, etc.), normalization, loss functions, training methods, activation functions, and core architectures such as CNNs and RNNs.
The resource covers two aspects: (1) a taxonomy of core deep learning and NLP concepts; (2) the latest papers associated with each concept. Highly recommended.
Compiled from the web; original source: https://github.com/neulab/nn4nlp-concepts/blob/master/concepts.md
Download (version with paper links):
Link: https://pan.baidu.com/s/1lC8DiPJnyzbxtvns-HXr_w
Extraction code: yv6g
Parameter Optimization / Learning
Optimizers and Optimization Strategies
•Mini-batch SGD: optim-sgd
•Adam: optim-adam (implies optim-sgd)
•Adagrad: optim-adagrad (implies optim-sgd)
•Adadelta: optim-adadelta (implies optim-sgd)
•Adam with Specialized Transformer Learning Rate ("Noam" Schedule): optim-noam (implies optim-adam)
•SGD with Momentum: optim-momentum (implies optim-sgd)
•AMSGrad: optim-amsgrad (implies optim-sgd)
•Projection / Projected Gradient Descent: optim-projection (implies optim-sgd)
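To make the optimizer entries above concrete, here is a minimal pure-Python sketch of mini-batch SGD with momentum (optim-momentum, which implies optim-sgd). The function names and the toy quadratic objective are mine, purely for illustration; `grad_fn` stands in for a per-minibatch gradient.

```python
def sgd_momentum(grad_fn, w, lr=0.1, mu=0.9, steps=200):
    """SGD with momentum: accumulate an exponentially decayed
    gradient (the 'velocity') and step along it instead of the
    raw gradient, which damps oscillation on ill-conditioned losses."""
    v = 0.0
    for _ in range(steps):
        g = grad_fn(w)
        v = mu * v + g      # velocity update
        w = w - lr * v      # parameter update
    return w

# Toy example: minimize f(w) = (w - 3)^2, whose gradient is 2*(w - 3).
w_star = sgd_momentum(lambda w: 2 * (w - 3), w=0.0)
```

Setting `mu=0` recovers plain SGD; Adam and Adagrad replace the fixed `lr` with per-parameter adaptive step sizes.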
Parameter Initialization
•Glorot/Xavier Initialization: init-glorot
•He Initialization: init-he
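A minimal sketch of the two initializers above, assuming plain nested-list weight matrices (names are mine). Glorot/Xavier targets Var(W) = 2 / (fan_in + fan_out) to keep activation variance stable through tanh-like layers; He targets Var(W) = 2 / fan_in, which compensates for ReLU zeroing half its inputs.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng=random):
    """init-glorot: sample Uniform(-limit, limit) with
    limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

def he_normal(fan_in, fan_out, rng=random):
    """init-he: sample Normal(0, sqrt(2 / fan_in)), suited to ReLU."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]
```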
Regularization Strategies
•Dropout: reg-dropout
•Word Dropout: reg-worddropout (implies reg-dropout)
•Norm (L1/L2) Regularization: reg-norm
•Early Stopping: reg-stopping
•Patience: reg-patience (implies reg-stopping)
•Weight Decay: reg-decay
•Label Smoothing: reg-labelsmooth
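Two of the regularizers above sketched in pure Python (function names and the `<unk>` token choice are mine). Dropout (reg-dropout) zeroes individual units; word dropout (reg-worddropout) applies the same idea at the token level, a common regularizer for text models.

```python
import random

def dropout(x, p=0.5, train=True, rng=random):
    """Inverted dropout: zero each unit with probability p and rescale
    survivors by 1/(1-p), so no rescaling is needed at test time."""
    if not train or p == 0.0:
        return list(x)
    return [xi / (1 - p) if rng.random() >= p else 0.0 for xi in x]

def word_dropout(tokens, p=0.1, unk="<unk>", rng=random):
    """Word dropout: replace whole tokens with an unknown-word symbol
    with probability p."""
    return [unk if rng.random() < p else t for t in tokens]
```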
Normalization Strategies
•Layer Normalization: norm-layer
•Batch Normalization: norm-batch
•Gradient Clipping: norm-gradient
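Minimal sketches of two entries above (names are mine; the learned gain/bias of layer norm is omitted for brevity). Layer normalization standardizes across the feature dimension of a single example; gradient clipping rescales the whole gradient vector when its L2 norm exceeds a threshold, a common fix for exploding gradients in RNNs.

```python
import math

def layer_norm(x, eps=1e-5):
    """norm-layer: (x - mean) / sqrt(var + eps) over one example's features."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def clip_by_norm(grads, max_norm=5.0):
    """norm-gradient: scale the gradient down to max_norm if it is larger."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * max_norm / norm for g in grads]
```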
Loss Functions
•Canonical Correlation Analysis (CCA): loss-cca
•Singular Value Decomposition (SVD): loss-svd
•Margin-based Loss Functions: loss-margin
•Contrastive Loss: loss-cons
•Noise Contrastive Estimation (NCE): loss-nce (implies loss-cons)
•Triplet Loss: loss-triplet (implies loss-cons)
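The margin-based losses above can be sketched in a few lines (function names and arguments are mine; real triplet losses compute the distances from learned embeddings rather than taking them as inputs).

```python
def margin_loss(score_pos, score_neg, margin=1.0):
    """loss-margin: hinge loss that is zero once the correct item
    outscores the negative by at least `margin`."""
    return max(0.0, margin - (score_pos - score_neg))

def triplet_loss(d_anchor_pos, d_anchor_neg, margin=1.0):
    """loss-triplet on precomputed distances: push the anchor-positive
    distance below the anchor-negative distance by `margin`."""
    return max(0.0, d_anchor_pos - d_anchor_neg + margin)
```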
Training Methods
•Multi-task Learning (MTL): train-mtl
•Multi-lingual Learning (MLL): train-mll (implies train-mtl)
•Transfer Learning: train-transfer
•Active Learning: train-active
•Data Augmentation: train-augment
•Curriculum Learning: train-curriculum
•Parallel Training: train-parallel
Sequence Model Architectures
Activation Functions
•Hyperbolic Tangent (tanh): activ-tanh
•Rectified Linear Units (ReLU): activ-relu
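The two activations above, as one-liners (function names are mine):

```python
import math

def tanh(x):
    """activ-tanh: squashes inputs into (-1, 1); zero-centered,
    but saturates for large |x|."""
    return math.tanh(x)

def relu(x):
    """activ-relu: max(0, x); cheap to compute and non-saturating
    for positive inputs, which helps gradients flow in deep nets."""
    return max(0.0, x)
```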
Pooling Operations
•Max Pooling: pool-max
•Mean Pooling: pool-mean
•k-Max Pooling: pool-kmax
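The three pooling operations above, sketched over a sequence of feature vectors represented as lists (names are mine). k-max pooling keeps the k largest values per feature, preserving their original sequence order.

```python
def max_pool(vectors):
    """pool-max: element-wise max over the sequence dimension."""
    return [max(col) for col in zip(*vectors)]

def mean_pool(vectors):
    """pool-mean: element-wise average over the sequence dimension."""
    return [sum(col) / len(col) for col in zip(*vectors)]

def kmax_pool(vectors, k=2):
    """pool-kmax: per feature, keep the k largest values in order."""
    out = []
    for col in zip(*vectors):
        topk = sorted(range(len(col)), key=lambda i: col[i], reverse=True)[:k]
        out.append([col[i] for i in sorted(topk)])
    return out
```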
Recurrent Architectures
•Recurrent Neural Network (RNN): arch-rnn
•Bi-directional Recurrent Neural Network (Bi-RNN): arch-birnn (implies arch-rnn)
•Long Short-term Memory (LSTM): arch-lstm (implies arch-rnn)
•Bi-directional Long Short-term Memory (Bi-LSTM): arch-bilstm (implies arch-birnn, arch-lstm)
•Gated Recurrent Units (GRU): arch-gru (implies arch-rnn)
•Bi-directional Gated Recurrent Units (Bi-GRU): arch-bigru (implies arch-birnn, arch-gru)
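A minimal pure-Python sketch of the simplest entry above, the Elman RNN (arch-rnn); names and toy weights are mine. LSTMs and GRUs add learned gates to this recurrence, and the bi-directional variants run a second copy over the reversed sequence and concatenate the states.

```python
import math

def matvec(W, v):
    """Multiply a nested-list matrix by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def rnn_step(x, h, W_xh, W_hh, b):
    """One Elman RNN step: h' = tanh(W_xh x + W_hh h + b)."""
    pre = [a + c + bi
           for a, c, bi in zip(matvec(W_xh, x), matvec(W_hh, h), b)]
    return [math.tanh(p) for p in pre]

def run_rnn(xs, h0, W_xh, W_hh, b):
    """Unroll the recurrence over a sequence, returning the final state."""
    h = h0
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b)
    return h
```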
Other Sequential/Structured Architectures
•Bag-of-words, Bag-of-embeddings, Continuous Bag-of-words (BOW): arch-bow
•Convolutional Neural Networks (CNN): arch-cnn
•Attention: arch-att
•Self Attention: arch-selfatt (implies arch-att)
•Recursive Neural Network (RecNN): arch-recnn
•Tree-structured Long Short-term Memory (TreeLSTM): arch-treelstm (implies arch-recnn)
•Graph Neural Network (GNN): arch-gnn
•Graph Convolutional Neural Network (GCNN): arch-gcnn (implies arch-gnn)
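The attention entries above (arch-att, arch-selfatt) can be illustrated with scaled dot-product attention in pure Python (names are mine; learned query/key/value projections are omitted). Self-attention is the special case where the queries, keys, and values all come from the same sequence.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot_attention(query, keys, values):
    """Scaled dot-product attention: weight each value vector by
    softmax(query . key / sqrt(d)) and sum."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim)]
```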
Architectural Techniques
•Residual Connections (ResNet): arch-residual
•Gating Connections, Highway Connections: arch-gating
•Memory: arch-memo
•Copy Mechanism: arch-copy
•Bilinear, Biaffine Models: arch-bilinear
•Coverage Vectors/Penalties: arch-coverage
•Subword Units: arch-subword
•Energy-based, Globally-normalized Models: arch-energy
Standard Composite Architectures
•Transformer: arch-transformer (implies arch-selfatt, arch-residual, arch-layernorm, optim-noam)
Model Combination
•Ensembling: comb-ensemble
Search Algorithms
•Greedy Search: search-greedy
•Beam Search: search-beam
•A* Search: search-astar
•Viterbi Algorithm: search-viterbi
•Ancestral Sampling: search-sampling
•Gumbel Max: search-gumbel (implies search-sampling)
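A minimal sketch of beam search (search-beam) over a toy model whose next-token probabilities depend only on the previous token; all names and probabilities are mine. Greedy search (search-greedy) is the beam_size=1 case, and the toy table is chosen so beam search finds a higher-probability sequence than greedy would.

```python
import math

def beam_search(next_scores, start, beam_size=2, steps=3):
    """Keep the beam_size highest-scoring prefixes at each step.
    next_scores(prefix) returns (token, log_prob) continuations."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            for tok, lp in next_scores(prefix):
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

# Toy bigram "model": continuation probabilities keyed by the last token.
table = {"<s>": [("a", math.log(0.6)), ("b", math.log(0.4))],
         "a":   [("a", math.log(0.5)), ("b", math.log(0.5))],
         "b":   [("a", math.log(0.9)), ("b", math.log(0.1))]}
best_seq, best_score = beam_search(lambda p: table[p[-1]], "<s>")[0]
```

Here greedy search would commit to "a" first (p=0.6) and end with probability 0.15, while the beam keeps "b" alive and finds the 0.4 * 0.9 * 0.5 = 0.18 sequence.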
Prediction Tasks
•Text Classification (text -> label): task-textclass
•Text Pair Classification (two texts -> label): task-textpair
•Sequence Labeling (text -> one label per token): task-seqlab
•Extractive Summarization (text -> subset of text): task-extractive (implies task-seqlab)
•Span Labeling (text -> labels on spans): task-spanlab
•Language Modeling (predict probability of text): task-lm
•Conditioned Language Modeling (some input -> text): task-condlm (implies task-lm)
•Sequence-to-sequence Tasks (text -> text, including MT): task-seq2seq (implies task-condlm)
•Cloze-style Prediction, Masked Language Modeling (right and left context -> word): task-cloze
•Context Prediction (as in word2vec) (word -> right and left context): task-context
•Relation Prediction (text -> graph of relations between words, including dependency parsing): task-relation
•Tree Prediction (text -> tree, including syntactic and some semantic parsing): task-tree
•Graph Prediction (text -> graph not necessarily between nodes): task-graph
•Lexicon Induction/Embedding Alignment (text/embeddings -> bi- or multi-lingual lexicon): task-lexicon
•Word Alignment (parallel text -> alignment between words): task-alignment
Pre-trained Embedding Techniques
•word2vec: pre-word2vec (implies arch-cbow, task-cloze, task-context)
•fasttext: pre-fasttext (implies arch-cbow, arch-subword, task-cloze, task-context)
•GloVe: pre-glove
•Paragraph Vector (ParaVec): pre-paravec
•Skip-thought: pre-skipthought (implies arch-lstm, task-seq2seq)
•ELMo: pre-elmo (implies arch-bilstm, task-lm)
•BERT: pre-bert (implies arch-transformer, task-cloze, task-textpair)
•Universal Sentence Encoder (USE): pre-use (implies arch-transformer, task-seq2seq)
Structured Models / Algorithms
•Hidden Markov Models (HMM): struct-hmm
•Conditional Random Fields (CRF): struct-crf
•Context-free Grammar (CFG): struct-cfg
•Combinatory Categorial Grammar (CCG): struct-ccg
Training Methods for Non-differentiable Functions
•Complete Enumeration: nondif-enum
•Straight-through Estimator: nondif-straightthrough
•Gumbel Softmax: nondif-gumbelsoftmax
•Minimum Risk Training: nondif-minrisk
•REINFORCE: nondif-reinforce
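One entry above, Gumbel-Softmax (nondif-gumbelsoftmax), sketched in pure Python (names are mine; this is the forward pass only, with no autograd). It is a differentiable relaxation of drawing a one-hot sample from a categorical distribution: as the temperature tau approaches 0 the output approaches one-hot, while larger tau gives smoother vectors.

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Add Gumbel(0, 1) noise to the logits (inverse-CDF sampling,
    with u clamped away from 0), divide by the temperature tau, and
    apply a numerically stable softmax."""
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12)))
               for _ in logits]
    y = [(l + g) / tau for l, g in zip(logits, gumbels)]
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    s = sum(exps)
    return [e / s for e in exps]
```

Taking the argmax of the noisy logits instead of the softmax recovers the exact Gumbel-max sampling trick (search-gumbel) from the search section above.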
Adversarial Methods
•Generative Adversarial Networks (GAN): adv-gan
•Adversarial Feature Learning: adv-feat
•Adversarial Examples: adv-examp
•Adversarial Training: adv-train (implies adv-examp)
Latent Variable Models
•Variational Auto-encoder (VAE): latent-vae
•Topic Model: latent-topic
Meta-learning
•Meta-learning Initialization: meta-init
•Meta-learning Optimizers: meta-optim
•Meta-learning Loss functions: meta-loss
•Neural Architecture Search: meta-arch