IDL Spring 2024: Lecture 13

This is the thirteenth lecture of the 11-785 Introduction to Deep Learning course at CMU, in which we covered the following topics:

- Time-series data, in which the sequence of past inputs carries information about the current inference, require models that consider past values when making the current prediction
- Models that look only into a finite past are simply 1D-CNNs (a minimal sketch of such a finite-memory predictor appears after this list)
- To look into the infinite past, we need recurrence
- The simplest recurrence considers past outputs along with current inputs
- - This ensures that an input affects the output for the indefinite future.
- - These are NARX networks
- Models can also hold the memory of the past internally (rather than through the output)
- - Older "partially recurrent" models stored this memory in intermediate "memory" variables
- - - Jordan networks use a memory neuron that retains a running average of outputs
- - - Elman networks clone the hidden layer of the network as a memory unit (a sketch of the NARX, Jordan, and Elman recurrences is given after this list)
- "Fully recurrent" networks are state-space models, with a recurrent state.
- - The "state" may be arbitrarily complex
- - These networks are called "Recurrent Neural Networks" (or "RNNs")
- An RNN can be "unrolled" over time into a chain of identical "columns" of computation
- - This essentially forms a very deep shared-parameter network
- To train an RNN, we must compute the divergence between the sequence of outputs produced by the network and the desired output sequence
- - This is not necessarily the sum of the divergences at individual time instants.
- - We will nevertheless need the derivative of the divergence w.r.t. the output of the network at each time instant
- Backpropagation starts at the final output, and derivatives are propagated backward through time
- - At each time step, the loss derivatives for the output at that time are backpropagated and accumulated with the derivatives coming backward from later in the sequence
- - Derivative rules for shared-parameter networks apply: gradients for the shared weights accumulate across time steps (see the backpropagation-through-time sketch after this list)
- Recurrence can be extended to be bi-directional in cases where strictly sequential (left-to-right) processing is not required
- Bi-directional networks are built from "bi-directional blocks", which have two component subnets: one analyzes the input left to right (the forward net), and the other analyzes it right to left (the reverse net)
- - The outputs of the two components are concatenated to produce the output of the block
- - During training, the appropriate portions of the derivatives at the output of the bidirectional block are "sent" to the corresponding subnets
- - - Backpropagation is performed from the end of the sequence to the beginning for the forward net and from the beginning to the end for the reverse net
- - - The backpropagated derivatives at the inputs to the two subnets are added to form the backpropagated derivative at the input to the block (see the bidirectional sketch after this list)
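
The sketches below are not from the lecture itself; they are minimal NumPy illustrations of the ideas summarized above, with illustrative variable names and sizes. First, a finite-memory predictor: each output is a weighted sum of only the last K inputs, i.e. a causal 1D convolution, so nothing older than K steps can influence the current prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3                                    # how far into the past the model looks
w = rng.standard_normal(K)               # shared filter weights
x = rng.standard_normal(100)             # a scalar input time series

def finite_memory_predict(x, w):
    """Each output is a weighted sum of only the last K inputs (a causal 1D convolution)."""
    T, K = len(x), len(w)
    y = np.zeros(T)
    for t in range(T):
        past = x[max(0, t - K + 1): t + 1]   # at most the last K samples
        y[t] = past @ w[-len(past):]         # finite window: nothing older than K steps matters
    return y

y = finite_memory_predict(x, w)
# An input older than K steps can never affect y[t]; letting inputs influence
# outputs indefinitely far into the future is what motivates recurrence.
```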
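
Next, the three recurrence variants from the summary. The exact update equations (e.g. the running-average coefficient alpha in the Jordan memory) are assumptions made for illustration and may differ in detail from the lecture's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, d_out = 4, 8, 2                          # illustrative sizes
Wx = rng.standard_normal((d_h, d_in)) * 0.1         # input  -> hidden
Wh = rng.standard_normal((d_h, d_h)) * 0.1          # hidden -> hidden (Elman)
Wf = rng.standard_normal((d_h, d_out)) * 0.1        # output/memory -> hidden (NARX, Jordan)
Wy = rng.standard_normal((d_out, d_h)) * 0.1        # hidden -> output

def narx_step(x_t, y_prev):
    """NARX: the previous output is fed back alongside the current input."""
    h = np.tanh(Wx @ x_t + Wf @ y_prev)
    return Wy @ h                                   # this output is fed back at the next step

def jordan_step(x_t, m_prev, y_prev, alpha=0.9):
    """Jordan: a memory neuron keeps a running average of past outputs."""
    m_t = alpha * m_prev + (1 - alpha) * y_prev
    h = np.tanh(Wx @ x_t + Wf @ m_t)
    return Wy @ h, m_t

def elman_step(x_t, h_prev):
    """Elman: the previous hidden layer is cloned as the memory ('context') units."""
    h_t = np.tanh(Wx @ x_t + Wh @ h_prev)
    return Wy @ h_t, h_t

# Example: run the NARX recurrence over a short input sequence
x_seq = rng.standard_normal((5, d_in))
y = np.zeros(d_out)
for x_t in x_seq:
    y = narx_step(x_t, y)
```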
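
Third, a fully recurrent (state-space) network unrolled over time, with backpropagation through time. Purely for illustration, the total divergence is taken here as a per-step sum of squared errors; as noted above, the true sequence divergence need not decompose this way.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, d_out, T = 3, 5, 2, 10                   # illustrative sizes
Wx = rng.standard_normal((d_h, d_in)) * 0.1
Wh = rng.standard_normal((d_h, d_h)) * 0.1
Wy = rng.standard_normal((d_out, d_h)) * 0.1
b, c = np.zeros(d_h), np.zeros(d_out)

X = rng.standard_normal((T, d_in))                  # input sequence
D = rng.standard_normal((T, d_out))                 # desired output sequence

# Forward pass: unroll the same "column" of computation (shared parameters) over T steps
H = np.zeros((T + 1, d_h))                          # H[t] is the state entering step t; H[0] is the initial state
Y = np.zeros((T, d_out))
for t in range(T):
    H[t + 1] = np.tanh(Wx @ X[t] + Wh @ H[t] + b)   # recurrent state update
    Y[t] = Wy @ H[t + 1] + c                        # output at time t

loss = 0.5 * np.sum((Y - D) ** 2)                   # illustrative per-step-sum divergence

# Backward pass: backpropagation through time
dWx, dWh, dWy = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wy)
db, dc = np.zeros_like(b), np.zeros_like(c)
dh_future = np.zeros(d_h)                           # derivative flowing back from later time steps
for t in reversed(range(T)):
    dy = Y[t] - D[t]                                # derivative of the loss w.r.t. Y[t]
    dWy += np.outer(dy, H[t + 1]); dc += dy
    dh = Wy.T @ dy + dh_future                      # per-step derivative + derivative from the future
    da = dh * (1.0 - H[t + 1] ** 2)                 # back through the tanh
    dWx += np.outer(da, X[t]); dWh += np.outer(da, H[t]); db += da   # shared parameters accumulate over time
    dh_future = Wh.T @ da                           # send the derivative further back in time
```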
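
Finally, one bidirectional block: a forward subnet reads the input left to right, a reverse subnet reads it right to left, and their per-step outputs are concatenated. During training, the derivative at each block output splits into its two halves (one per subnet), backpropagation runs end-to-beginning for the forward net and beginning-to-end for the reverse net, and the input derivatives from the two subnets are added. The step functions below are simple tanh recurrences chosen only for illustration.

```python
import numpy as np

def bidirectional_block(X, forward_step, reverse_step, h0_f, h0_r):
    """Run one subnet left to right and one right to left, then concatenate their outputs."""
    T = len(X)
    out_f, out_r = [None] * T, [None] * T
    h_f, h_r = h0_f, h0_r
    for t in range(T):                               # forward net: left to right
        h_f = forward_step(X[t], h_f)
        out_f[t] = h_f
    for t in reversed(range(T)):                     # reverse net: right to left
        h_r = reverse_step(X[t], h_r)
        out_r[t] = h_r
    # Output of the block at each step: concatenation of the two components
    return [np.concatenate([out_f[t], out_r[t]]) for t in range(T)]

rng = np.random.default_rng(0)
d_in, d_h, T = 3, 4, 6                               # illustrative sizes
Wf, Uf = rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h))
Wr, Ur = rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h))
fwd = lambda x, h: np.tanh(Wf @ x + Uf @ h)
rev = lambda x, h: np.tanh(Wr @ x + Ur @ h)
X = rng.standard_normal((T, d_in))
Y = bidirectional_block(X, fwd, rev, np.zeros(d_h), np.zeros(d_h))   # each Y[t] has 2 * d_h entries
```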