
This repository is a tutorial on training deep neural network models more efficiently. It focuses on two main frameworks, Keras and Tensorflow.


jiankaiwang/distributed_training


Model Training in Keras and Tensorflow

This repository is a tutorial on training deep neural network models more efficiently. It focuses on two main frameworks, Keras and Tensorflow. Parts of the content (scripts, documents, etc.) are adapted from the Tensorflow/Model repository and the Keras documentation. For the framework version each script targets, refer to the description at the top of that script.

Tensorflow officially released version 2 in 2019, introducing many new APIs and features such as tf.function. In this version, the Keras APIs were merged into the Tensorflow core and updated to run on it. However, Tensorflow version 1 is still being updated, so the version 1 docs and scripts remain in this repository.

The documentation for this repository:

The main references are listed below.

Content

Tensorflow 1: 2015 ~ Now

For each framework, we first introduce the basic model training process together with its implementation script. We then introduce two types of advanced model training: (1) on multiple graphics cards (GPUs) and (2) on multiple machines/containers, each with multiple GPUs.
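Both advanced setups usually rely on synchronous data parallelism: each device computes a gradient on its own slice of the batch, and the results are averaged before the weights are updated. The following is a conceptual sketch in plain Python, not one of the repository's scripts. The linear model, the helper names and the toy data are all assumptions made for illustration.

```python
def gradient(w, b, batch):
    """Mean-squared-error gradient of y = w*x + b over a batch of (x, y) pairs."""
    n = len(batch)
    dw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    db = sum(2 * (w * x + b - y) for x, y in batch) / n
    return dw, db


def split(batch, num_replicas):
    """Split a global batch into equally sized per-replica shards."""
    if len(batch) % num_replicas:
        raise ValueError("global batch must divide evenly across replicas")
    size = len(batch) // num_replicas
    return [batch[i * size:(i + 1) * size] for i in range(num_replicas)]


def data_parallel_gradient(w, b, batch, num_replicas):
    """Each replica computes a gradient on its shard; the results are averaged."""
    grads = [gradient(w, b, shard) for shard in split(batch, num_replicas)]
    dw = sum(g[0] for g in grads) / num_replicas
    db = sum(g[1] for g in grads) / num_replicas
    return dw, db


# Toy data (hypothetical): points on the line y = 2x + 1.
batch = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
single = gradient(0.0, 0.0, batch)
parallel = data_parallel_gradient(0.0, 0.0, batch, num_replicas=2)
```

With equally sized shards, the averaged per-replica gradient matches the gradient computed on the whole batch, which is why the global batch size is normally the per-device batch size multiplied by the number of devices.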

As an example, we use the CIFAR-10 image dataset and train a deep neural network model to classify its images (a typical classification problem).
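A basic single-device training flow for this task might look like the sketch below. It assumes Tensorflow 2 with the bundled tf.keras API. The network architecture, the hyperparameters and the function names (`build_model`, `train`, `load_cifar10`) are illustrative and not taken from the repository's scripts.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 10  # CIFAR-10 has ten classes


def build_model():
    """Small CNN for 32x32 RGB images; the layer sizes are illustrative."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES),  # raw logits, one per class
    ])


def train(model, images, labels, epochs=1, batch_size=64):
    """Compile and fit; images are floats in [0, 1], labels are integer class ids."""
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model.fit(images, labels, epochs=epochs,
                     batch_size=batch_size, verbose=0)


def load_cifar10():
    """Return the training split scaled to [0, 1]; downloads the data on first use."""
    (x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
    return x_train.astype("float32") / 255.0, y_train
```

Calling `train(build_model(), *load_cifar10())` runs one epoch on the real dataset. The functions are kept separate so the model can also be exercised on small random arrays without downloading anything.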

Keras

Tensorflow

Tensorflow 2: 2019 ~ Now

In Tensorflow 2, the Keras APIs refer to the tensorflow.keras (tf.keras) APIs bundled with Tensorflow, not the original Keras from https://keras.io/.

Single Worker with Multiple Accelerators
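As a minimal sketch of this setup, assuming `tf.distribute.MirroredStrategy` is the strategy in use (the repository's own notebooks may differ), the model is created inside the strategy scope and the batch size is scaled by the number of replicas. The tiny model and the random stand-in data are assumptions made to keep the example short.

```python
import numpy as np
import tensorflow as tf

PER_REPLICA_BATCH = 32  # illustrative value

# MirroredStrategy replicates the model on every visible GPU of one machine
# and runs with a single replica when no GPU is available.
strategy = tf.distribute.MirroredStrategy()
global_batch = PER_REPLICA_BATCH * strategy.num_replicas_in_sync

with strategy.scope():
    # Variables (model weights, optimizer state) are created inside the scope.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="sgd",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Random stand-in data with CIFAR-10 shapes; replace with the real dataset.
images = np.random.rand(global_batch * 2, 32, 32, 3).astype("float32")
labels = np.random.randint(0, 10, size=(global_batch * 2,))
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(global_batch)

# Keras splits each global batch across the replicas during fit().
history = model.fit(dataset, epochs=1, verbose=0)
```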

Multiple Workers
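Multi-worker training in Tensorflow 2 is typically configured through the `TF_CONFIG` environment variable, which describes the whole cluster and the role of the current process. The sketch below only builds that variable, using the standard library. The helper name `make_tf_config` and the host addresses are hypothetical.

```python
import json
import os


def make_tf_config(worker_hosts, task_index):
    """Return the TF_CONFIG JSON string for one worker in the cluster."""
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must refer to an entry in worker_hosts")
    return json.dumps({
        "cluster": {"worker": list(worker_hosts)},
        "task": {"type": "worker", "index": task_index},
    })


# Hypothetical addresses; every worker uses the same list and its own index.
WORKERS = ["10.0.0.1:12345", "10.0.0.2:12345"]
os.environ["TF_CONFIG"] = make_tf_config(WORKERS, task_index=0)
```

The variable has to be set before the strategy object is created. Each worker then creates `tf.distribute.MultiWorkerMirroredStrategy()` (available as `tf.distribute.experimental.MultiWorkerMirroredStrategy` in early Tensorflow 2 releases) and builds its model inside `strategy.scope()`, as in the single-worker case.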

Cloud TPU

  • A Training Flow using Tensorflow 2 APIs on the Cloud TPU: TF2_CloudTPU

Saving and Loading a Model

  • Saving and Loading a Model using a Distributed Strategy: ipynb
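The general pattern can be sketched as follows, assuming the high-level Keras API (`model.save` and `tf.keras.models.load_model`) and a `MirroredStrategy`. The linked notebook may use a different strategy or format. The tiny model and the temporary path are illustrative.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(2),
    ])
    model.compile(optimizer="sgd", loss="mse")

# Temporary, illustrative location; the file name is an assumption.
path = os.path.join(tempfile.mkdtemp(), "model.keras")
model.save(path)

# Loading inside the scope recreates the variables under the strategy,
# so the restored model can continue distributed training.
with strategy.scope():
    restored = tf.keras.models.load_model(path)

x = np.ones((3, 4), dtype="float32")
same = np.allclose(model.predict(x, verbose=0), restored.predict(x, verbose=0))
```

Loading the same file outside any scope gives a regular, non-distributed model, which is usually enough for inference.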
