Distributed TensorFlow (TensorFlow Dev Summit 2017)

TensorFlow gives you the flexibility to scale up to hundreds of GPUs, train models with a huge number of parameters, and customize every last detail of the training process. In this talk, Derek Murray gives you a bottom-up introduction to Distributed TensorFlow, showing all the tools available for harnessing this power.
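
The talk centers on the TF 1.x distributed runtime, so here is a minimal, hedged sketch of the pieces it walks through (ClusterSpec, Server, replica_device_setter, MonitoredTrainingSession); the host names, ports, and model are made up for illustration:

```python
import tensorflow as tf

# Hypothetical cluster: one parameter-server task and two worker tasks.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# Each process starts an in-process server for its own task.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# replica_device_setter pins variables to the ps job and ops to this worker.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device="/job:worker/task:0")):
    w = tf.get_variable("w", shape=[784, 10])
    # ... model, loss, and train_op would be built here ...

# MonitoredTrainingSession handles initialization and, on the chief, checkpointing.
with tf.train.MonitoredTrainingSession(master=server.target, is_chief=True) as sess:
    pass  # sess.run(train_op) in a training loop
```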

Comments

Contents
Objectives: 3:41
Intro: 4:00
DistBelief inspiration: 5:51
Replication: 7:55
In-graph replication: 8:21
Between-graph replication: 9:54
Variable placement: 11:17
Device placement summary: 14:39
Sessions and servers: 15:14
Fault tolerance: 18:51
High-level APIs: 25:08

AmilaManoj

22:10, how can the chief worker restore the failed PS tasks?
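
A hedged note on the fault-tolerance section around 18:51: as described there, the chief does not repair the PS process itself; once the PS task is restarted, the PS-hosted variables are restored from the last checkpoint. A minimal sketch, reusing the hypothetical server, task_index, and train_op from the setup above and a made-up checkpoint_dir:

```python
import tensorflow as tf

# Only the chief (task 0) writes checkpoints; after a PS failure, recovery
# re-initializes the PS-hosted variables from the latest checkpoint here.
with tf.train.MonitoredTrainingSession(
        master=server.target,
        is_chief=(task_index == 0),
        checkpoint_dir="/tmp/train_logs",
        save_checkpoint_secs=60) as sess:
    while not sess.should_stop():
        sess.run(train_op)
```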

xdxn

Nice talk. Any pointers to the presentation charts?

raghkripa

10:38
Is there any difference in the subgraph that each of these 2 tasks gets? Isn't it the same subgraph (output = ..., loss = ...) in both? Or how is it transformed into 2 (or maybe more) subgraphs?
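
A hedged sketch of the between-graph replication idea from 9:54: every worker runs the same client program, so the subgraph (output, loss, train_op) is built identically in each process; what differs is the worker_device string, and the variables behind the graph exist only once, on the ps tasks. FLAGS and cluster here are hypothetical:

```python
import tensorflow as tf

worker_device = "/job:worker/task:%d" % FLAGS.task_index  # differs per worker

with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device=worker_device)):
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    w = tf.get_variable("w", [784, 10])   # placed on a /job:ps task, shared
    b = tf.get_variable("b", [10])        # likewise shared across workers
    output = tf.matmul(x, w) + b          # placed on this worker
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=output))
    train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
```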

redfishleo

What if I have images as input data for between-graph training on multiple nodes? Do I need to put the image database on each of those workers? Please guide me.
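
Not covered in the talk, but one common answer: keep the images on storage every worker can reach (NFS, GCS, HDFS) and have each worker read a disjoint shard, rather than copying the database to every node. A hedged sketch using the tf.data API from later TF 1.x releases; the paths and FLAGS names are made up:

```python
import tensorflow as tf

files = tf.data.Dataset.list_files("/shared/images/train-*.tfrecord")
dataset = (files
           .shard(FLAGS.num_workers, FLAGS.task_index)  # disjoint files per worker
           .interleave(tf.data.TFRecordDataset, cycle_length=4)
           .shuffle(buffer_size=10000)
           .batch(32))
next_batch = dataset.make_one_shot_iterator().get_next()
```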

thesawatdatta

A question: does Distributed TensorFlow use a round-robin algorithm?
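
Per the variable-placement section around 11:17: tf.train.replica_device_setter places variables across the ps tasks round-robin by default, and a load-balancing strategy can be passed instead. A hedged sketch using the TF 1.x contrib names, with a hypothetical two-ps cluster:

```python
import tensorflow as tf

# Greedy placement by variable byte size instead of the default round-robin.
strategy = tf.contrib.training.GreedyLoadBalancingStrategy(
    2, tf.contrib.training.byte_size_load_fn)

with tf.device(tf.train.replica_device_setter(
        cluster=cluster,            # hypothetical cluster with 2 ps tasks
        ps_strategy=strategy)):     # omit ps_strategy to keep round-robin
    w = tf.get_variable("w", [1000, 256])
    b = tf.get_variable("b", [256])
```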

jugsma

A clear explanation of multi-node, multi-GPU training with TensorFlow. I recommend watching it together with Akiba-san's ChainerMN explanation.

ryonakamura

Awesome stuff :)

Wisdom of Mycroft Holmes(Mark Gattis)

utkarsh_dubey

Just saying, dist-keras uses Spark too... sorry, just wanted to leave that here.

chefboyrdee