Could you please clarify whether the train.sh file should produce a trained model from scratch, or just pre-process the data (if necessary)?
If it is meant to do the full training, the given configuration seems a bit low:
- 7 days
From what I've read, p2.xlarge instances are rather slow nowadays (btw, the best article I know of on GPUs for DL => http://timdettmers.com/2017/04/09/which-gpu-for-deep-learning/ ). 7 days on one is roughly equivalent to 1-1.5 days on a 1080 Ti or Titan. Given the dataset size, this looks low (I haven't started training yet, I just noticed this as a point that may be a problem).
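
To make the arithmetic behind that estimate explicit, here is a quick sketch. The only inputs are the 7-day limit and my rough 1-1.5 day figure. The resulting speedup range is just what those two numbers imply, not a measured benchmark, and the variable and function names are made up for illustration.

```python
# Time limit given for training on a p2.xlarge, in days (from the post above).
P2_BUDGET_DAYS = 7.0

# Rough estimate of the equivalent time on a 1080 Ti or Titan, in days.
# This is an assumption taken from the post, not a benchmark result.
EQUIV_DAYS_LOW, EQUIV_DAYS_HIGH = 1.0, 1.5


def implied_speedup(budget_days, equivalent_days):
    """Speedup the faster GPU would need for the two durations to match."""
    return budget_days / equivalent_days


speedup_min = implied_speedup(P2_BUDGET_DAYS, EQUIV_DAYS_HIGH)
speedup_max = implied_speedup(P2_BUDGET_DAYS, EQUIV_DAYS_LOW)

print(f"Implied speedup over p2.xlarge: {speedup_min:.1f}x to {speedup_max:.1f}x")
```

So the estimate assumes a 1080 Ti or Titan is somewhere around 5-7x faster than a p2.xlarge. If the real ratio for this workload is different, the equivalent time shifts accordingly.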