Using the tf.data API to build input pipelines (TensorFlow Meets)

event: TensorFlow Dev Summit 2018; re_ty: Publish; product: TensorFlow - General; fullname: Laurence Moroney, Derek Murray;
Comments

Very thankful for `tf.data` and the awesome developer team at TensorFlow. Keep up the good work!

marshalhayes

Please make some tutorials on building input pipelines with tf.data, and on writing and parsing TFRecords. Writing and parsing TFRecords for classification data is okay, but when you have multiple pieces of data per image, as in object detection, it is really tough, and tutorials are lacking. A tutorial on writing, parsing, and building iterators for complex data like that would be awesome.
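
The closest I have pieced together so far is something like the sketch below (TF 1.x, with illustrative feature names such as image/object/bbox/xmin), where tf.VarLenFeature handles the varying number of boxes per image, but I am not sure it is the recommended way:

import tensorflow as tf

def make_example(jpeg_bytes, xmins, ymins, xmaxs, ymaxs, labels):
    # One image, many boxes: the variable-length lists share one Example.
    feature = {
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[jpeg_bytes])),
        'image/object/bbox/xmin': tf.train.Feature(
            float_list=tf.train.FloatList(value=xmins)),
        'image/object/bbox/ymin': tf.train.Feature(
            float_list=tf.train.FloatList(value=ymins)),
        'image/object/bbox/xmax': tf.train.Feature(
            float_list=tf.train.FloatList(value=xmaxs)),
        'image/object/bbox/ymax': tf.train.Feature(
            float_list=tf.train.FloatList(value=ymaxs)),
        'image/object/class/label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=labels)),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def parse_fn(serialized):
    # VarLenFeature copes with the unknown number of boxes per image.
    parsed = tf.parse_single_example(serialized, {
        'image/encoded': tf.FixedLenFeature([], tf.string),
        'image/object/bbox/xmin': tf.VarLenFeature(tf.float32),
        'image/object/bbox/ymin': tf.VarLenFeature(tf.float32),
        'image/object/bbox/xmax': tf.VarLenFeature(tf.float32),
        'image/object/bbox/ymax': tf.VarLenFeature(tf.float32),
        'image/object/class/label': tf.VarLenFeature(tf.int64),
    })
    image = tf.image.decode_jpeg(parsed['image/encoded'], channels=3)
    boxes = tf.stack(
        [tf.sparse_tensor_to_dense(parsed['image/object/bbox/' + k])
         for k in ('ymin', 'xmin', 'ymax', 'xmax')], axis=-1)
    labels = tf.sparse_tensor_to_dense(parsed['image/object/class/label'])
    return image, boxes, labels

dataset = tf.data.TFRecordDataset('train.tfrecord').map(parse_fn)

Each Example would then be serialized with SerializeToString() and written through tf.python_io.TFRecordWriter.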

kislaykunal

Thanks a lot. After watching your video I am trying to run a deep learning program that uses Estimators written in TensorFlow, but GPU utilization is stuck at 8% and I am finding it difficult to identify the issue. I used map_and_batch() and prefetch() in the input pipeline, but it does not help. How can I improve the GPU utilization? Any pointers are greatly appreciated.
Thank you
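
In case someone spots what is wrong, my pipeline currently looks roughly like this sketch (parse_fn stands in for my real parser, and the shard pattern is illustrative); as I understand it, the intent is to read shards in parallel, parse on several threads via the fused map_and_batch, and prefetch so the CPU prepares the next batch while the GPU trains:

import tensorflow as tf

def parse_fn(serialized):
    # Stand-in parser; the real one uses my full feature spec.
    return tf.parse_single_example(
        serialized, {'x': tf.FixedLenFeature([10], tf.float32)})

def input_fn():
    files = tf.data.Dataset.list_files('train-*.tfrecord')
    # Read several shards concurrently instead of one file at a time.
    dataset = files.apply(tf.contrib.data.parallel_interleave(
        tf.data.TFRecordDataset, cycle_length=4))
    dataset = dataset.shuffle(buffer_size=10000).repeat()
    # Fused map + batch, with parsing spread across parallel batches.
    dataset = dataset.apply(tf.contrib.data.map_and_batch(
        parse_fn, batch_size=128, num_parallel_batches=4))
    # Overlap CPU preprocessing with GPU training.
    return dataset.prefetch(buffer_size=1)

If utilization still stays low with this, I suppose the bottleneck could be elsewhere (a small model, a small batch size, or host-side Python work), which a profiler run should show.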

PavanKumar-jcqn

Consider renaming the video. The title can be misleading, as you are not talking about how to use tf.data but about the latest updates to it.

dalofeco

Thanks for the work you do; I appreciate it too.

Is it possible to iterate through a dataset as with a generator, without explicitly creating TFRecords files?
I found the process too long compared to the Keras branch, where we can feed data from Python generators directly to models for training.
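
The closest I have found so far is tf.data.Dataset.from_generator, which wraps a plain Python generator with no TFRecords step, e.g. (shapes and dtypes illustrative):

import numpy as np
import tensorflow as tf

def gen():
    # Any Python generator works; random (image, label) pairs here.
    for _ in range(1000):
        yield np.random.rand(28, 28).astype(np.float32), np.random.randint(10)

dataset = tf.data.Dataset.from_generator(
    gen,
    output_types=(tf.float32, tf.int64),
    output_shapes=((28, 28), ()))
dataset = dataset.batch(32).prefetch(1)

images, labels = dataset.make_one_shot_iterator().get_next()

One caveat I have read about: the generator runs in a single Python thread, so it can itself become the bottleneck compared to TFRecords.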

1- I think automating the creation of TFRecords files would be a plus, a bit like here (I plan to dig deeper into this later):

col = [c for c in train.columns if c not in ['id', 'target']]
# classifier_parse_example_spec expects FeatureColumns, not plain names
col = [tf.feature_column.numeric_column(c) for c in col]
train_files = tf.gfile.Glob('train.csv')  # the files would need to be TFRecords
batch_size = 128

# Input builders
def input_fn_train():  # Returns a tuple of features and labels.
    features = tf.contrib.learn.read_batch_features(
        file_pattern=train_files,
        batch_size=batch_size,
        # creates parsing configuration for tf.parse_example
        features=tf.feature_column.classifier_parse_example_spec(
            col,
            label_key='target',
            label_dtype=tf.string,
            weight_column='example-weight'),
        reader=tf.TFRecordReader)
    labels = features.pop('target')
    return features, labels
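
I would then presumably hand this to an estimator, something like this (DNNClassifier, the hidden_units, and the label vocabulary are illustrative guesses on my side):

estimator = tf.estimator.DNNClassifier(
    feature_columns=col,
    hidden_units=[128, 64],
    n_classes=2,
    label_vocabulary=['0', '1'],  # needed since the labels are tf.string
    weight_column='example-weight')
estimator.train(input_fn=input_fn_train)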

2- Will the new tf.data succeed in handling all the file types we use in real-life applications (CSV, TXT, JSON, BSON, images, etc.), so that we can avoid the long separate implementations we previously needed for each case?
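
From what I can tell so far, text formats and images are already covered without TFRecords, while JSON/BSON seem to still need from_generator or tf.py_func; e.g. (file names illustrative):

import tensorflow as tf

# CSV/TXT: read lines directly, no TFRecords needed.
csv_dataset = (tf.data.TextLineDataset('train.csv')
               .skip(1)  # skip the header row
               .map(lambda line: tuple(tf.decode_csv(
                   line, record_defaults=[[0.0], [0.0], [0]]))))

# Images: a dataset of file paths, decoded on the fly.
image_dataset = tf.data.Dataset.list_files('images/*.jpg').map(
    lambda path: tf.image.decode_jpeg(tf.read_file(path), channels=3))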

kkjc

Thanks for the work you do; I appreciate it too.


a- How does the new tf.data compare with the Keras branch of TensorFlow?
Indeed, using the Keras data handlers I can easily fetch the same data without TFRecords (using the NASNet model, for example), while with the old version of tf.data I needed several data transformations, TFRecords, and a long parser_fn function. Is it now possible to handle an image dataset, for example, without explicit generators and TFRecords? (A sketch of what I have in mind follows at the end of this comment.)

b- Will the new tf.data succeed in handling all the file types we use in real-life applications (CSV, TXT, JSON, BSON, images, etc.), so that we can avoid the long separate implementations we previously needed for each case?

Please give some guidelines; I will try them when I get a bit of free time.
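
For (a), from the docs it looks like something along these lines might work, though I have not tried it yet (the paths, labels, and input size are illustrative):

import tensorflow as tf

filenames = ['img0.jpg', 'img1.jpg']  # illustrative paths
labels = [0, 1]

def load_and_preprocess(path, label):
    image = tf.image.decode_jpeg(tf.read_file(path), channels=3)
    image = tf.image.resize_images(image, [331, 331])  # e.g. NASNet-Large size
    return image, label

dataset = (tf.data.Dataset.from_tensor_slices((filenames, labels))
           .map(load_and_preprocess, num_parallel_calls=4)
           .batch(32)
           .prefetch(1))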

kkjc

Two Irish lads, well I'll be damned.

seamusodualing