These are the slides I used for a talk I gave recently in our reading group. The slides, particularly the *Attention* part, were based on one of Quoc Le's talks on the same topic. I couldn't come up with any better visuals than what he did.

It has been quite a while since the last time I looked at this topic; unfortunately, I had never managed to fully appreciate its beauty. Seq2Seq is one of those *simple-ideas-that-actually-work* in Deep Learning, which opened up a whole lot of possibilities and enabled much interesting work in the field.

A friend of mine did Variational Inference for his PhD, and he once said that Variational Inference is one of those *mathematically-beautiful-but-doesn't-work* things in Machine Learning.

Indeed, there is stuff like Variational and Bayesian inference, Sum-Product Networks, etc. that comes with beautiful mathematical frameworks but doesn't really *work* at scale, and stuff like Convolutional nets, GANs, etc. that is a bit slippery in its mathematical foundation, often empirically discovered, but works really well in practice.

So even though many people might not really like the idea of GANs, for example, given this "empirical tradition" in the Deep Learning literature, they are probably here to stay.


A slightly off-topic question: what kind of networks do they currently use to build chatbots?

As far as I know, the chatbots in Facebook Messenger or Amazon Lex don't use any fancy networks at all; they are just rule-based systems, combined with a reasonably good model for Entity Recognition.

Thank you. So they have a predefined set of template answers for each topic, and the problem is just to correctly recognize the question, right?
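A toy sketch of the rule-based pattern described in this exchange: recognize which topic the user's message belongs to, then return a canned reply for that topic. Naive keyword matching stands in here for the real intent/entity-recognition model; all intents, keywords, and replies below are made-up examples, not anything Messenger or Lex actually uses.

```python
# Hypothetical canned replies, one per topic, plus a fallback.
CANNED_REPLIES = {
    "greeting": "Hello! How can I help you?",
    "weather": "I can look up the weather for you.",
    "fallback": "Sorry, I didn't understand that.",
}

# Hypothetical keyword sets standing in for a trained intent classifier.
INTENT_KEYWORDS = {
    "greeting": {"hello", "hi", "hey"},
    "weather": {"weather", "rain", "sunny", "forecast"},
}


def recognize_intent(message: str) -> str:
    """Pick the intent whose keyword set overlaps the message the most."""
    tokens = set(message.lower().split())
    best_intent, best_overlap = "fallback", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent


def reply(message: str) -> str:
    """Map the recognized intent to its canned answer."""
    return CANNED_REPLIES[recognize_intent(message)]
```

In a real system the keyword lookup would be replaced by an intent classifier and an entity-recognition model (to pull out names, dates, places from the message), but the overall recognize-then-lookup structure is the same.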