[Paper Summary] Memory Networks

2020. 7. 12. 10:49 | Machine Learning/NLP-UGRP

2014

https://arxiv.org/abs/1410.3916

 


Authors: Jason Weston, Sumit Chopra, Antoine Bordes

 


[1. Introduction]

An RNN's memory (encoded by hidden states and weights) is typically too small, and is not compartmentalized enough to accurately remember facts from the past (knowledge is compressed into dense vectors).

For example, RNNs are known to have difficulty with memorization, such as the simple copying task of outputting the same input sequence they have just read.

 

In this work, we introduce a class of models called memory networks that attempt to rectify this problem.

The model is then trained to learn how to operate effectively with the memory component.

 

[2. Memory Networks]

m: an array of objects (an array of vectors or an array of strings) indexed by m_i

Four (potentially learned) components: I, G, O, and R

I: (input feature map) - converts the incoming input to the internal feature representation.

G: (generalization) - updates old memories given the new input. We call this generalization because there is an opportunity for the network to compress and generalize its memories at this stage for some intended future use.

O: (output feature map) - produces a new output (in the feature representation space), given the new input and the current memory state.

R: (response) - converts the output into the desired response format, e.g., a textual response or an action.
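
To make this division of labor concrete, the sketch below writes the four components as Python callables over a list-based memory of feature vectors. The concrete types (text in, NumPy vectors as the internal representation, a plain list for m) are assumptions made for illustration; the paper leaves each component free to be any learned model.

```python
# Illustrative signatures only: the paper does not fix these types, and it
# writes G per slot as G(m_i, I(x), m); here G returns the whole updated
# memory for simplicity.
from typing import Callable, List
import numpy as np

Memory = List[np.ndarray]  # m: an array of objects m_i (here: feature vectors)

IComponent = Callable[[str], np.ndarray]                 # raw input x -> I(x)
GComponent = Callable[[Memory, np.ndarray], Memory]      # (m, I(x))   -> updated m
OComponent = Callable[[np.ndarray, Memory], np.ndarray]  # (I(x), m)   -> output features o
RComponent = Callable[[np.ndarray], str]                 # o           -> response r (e.g., text)
```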

 

Input x: a word, sentence, image, or audio signal.

 

1. Convert x to an internal feature representation I(x).

2. Update memories m_i given the new input: m_i = G(m_i, I(x), m), ∀i.

3. Compute output features o given the new input and the memory: o = O(I(x), m).

4. Finally, decode output features o to give the final response: r = R(o).
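
Below is a minimal end-to-end sketch of this loop, with toy stand-in components: a bag-of-words I, an append-only G, a dot-product O that retrieves the best-matching earlier memory, and an R that echoes it back as words. The tiny vocabulary and example sentences are invented for illustration; none of these stand-ins are the paper's actual models.

```python
import numpy as np

class MemoryNetwork:
    """Sketch of the I -> G -> O -> R loop; components are pluggable callables."""

    def __init__(self, I, G, O, R):
        self.I, self.G, self.O, self.R = I, G, O, R
        self.m = []                            # long-term memory, persists across inputs

    def forward(self, x):
        features = self.I(x)                   # 1. internal feature representation I(x)
        self.m = self.G(self.m, features)      # 2. update memories given the new input
        o = self.O(features, self.m)           # 3. output features from I(x) and memory
        return self.R(o)                       # 4. decode o into the final response r

# Toy stand-in components (invented for illustration):
vocab = {"john": 0, "is": 1, "in": 2, "the": 3, "kitchen": 4, "where": 5}

def I(x):                                      # bag-of-words featurizer
    v = np.zeros(len(vocab))
    for w in x.lower().split():
        if w in vocab:
            v[vocab[w]] += 1
    return v

def G(m, features):                            # append-only memory update
    return m + [features]

def O(features, m):                            # best-matching earlier memory (dot product);
    candidates = m[:-1]                        # the last slot is the current input itself
    if not candidates:
        return np.zeros_like(features)
    scores = [features @ mi for mi in candidates]
    return candidates[int(np.argmax(scores))]

def R(o):                                      # echo the supporting memory as words
    return " ".join(w for w, i in vocab.items() if o[i] > 0)

net = MemoryNetwork(I, G, O, R)
net.forward("John is in the kitchen")          # stores the fact
print(net.forward("Where is John"))            # -> "john is in the kitchen"
```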

 

Memories are also stored at test time, but the model parameters of I, G, O, and R are not updated.

I, G, O, and R can use any existing ideas from the machine learning literature, e.g., your favorite models (SVMs, decision trees, etc.).

 

I component: standard pre-processing.

G component: The simplest form of G is to store I(x) in a “slot” in the memory: m_{H(x)} = I(x),

where H(·) is a function that selects the slot.

That is, G updates only the entry at index H(x) of m; all other parts of the memory remain untouched.
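
A minimal sketch of this simplest G, assuming a fixed array of slots and an H(·) that just picks the next free slot (wrapping around when full); the slot-selection rule and the toy vectors are invented stand-ins, not what the model actually learns.

```python
import numpy as np

class SlotMemory:
    """Sketch of the simplest G: m_{H(x)} = I(x), all other slots untouched."""

    def __init__(self, num_slots, dim):
        self.slots = np.zeros((num_slots, dim))  # m: fixed array of memory slots
        self.next_free = 0

    def H(self, features):
        """Choose the slot to write to (here: the next unused one, wrapping when full)."""
        slot = self.next_free
        self.next_free = (self.next_free + 1) % len(self.slots)
        return slot

    def G(self, features):
        """Write I(x) into slot H(x); every other slot remains untouched."""
        self.slots[self.H(features)] = features

# Usage: store two "facts" of dimension 4 (numbers invented for illustration).
mem = SlotMemory(num_slots=8, dim=4)
mem.G(np.array([1.0, 0.0, 0.0, 1.0]))
mem.G(np.array([0.0, 1.0, 1.0, 0.0]))
print(mem.slots[:2])
```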