Tuesday, March 22, 2016

Instructions to create ImageNet 2012 data


Note: you need about 250GB of free disk space in total.

Step1: Create a working directory

mkdir /hdd/ImageNet
cd /hdd/ImageNet
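
Before downloading anything, it may be worth confirming that the drive really has the roughly 250GB mentioned in the note above. The following is a minimal sketch, not part of the original instructions: has_free_space is a hypothetical helper name, and it assumes a POSIX df and awk are available.

```shell
# Hypothetical helper (not part of Caffe or ImageNet tooling):
# succeeds if the filesystem holding DIR has at least REQUIRED_GB free.
has_free_space() {
    hfs_dir=$1
    hfs_required_gb=$2
    # df -P prints one POSIX-format line per filesystem;
    # with -k, column 4 is the available space in 1K blocks.
    hfs_avail_kb=$(df -Pk "$hfs_dir" | awk 'NR==2 {print $4}')
    hfs_required_kb=$((hfs_required_gb * 1024 * 1024))
    [ "$hfs_avail_kb" -ge "$hfs_required_kb" ]
}

# Example call, using the directory created in Step1:
#   has_free_space /hdd/ImageNet 250 || echo "less than 250GB free on this drive"
```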

Step2: Download ImageNet data

Step3: Decompress ImageNet data

To extract training data:
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
// The outer archive holds one tar per class; the loop above unpacks each into its own folder.
// Check that the decompression is complete: you should have 1,281,167 images in the train folder.
To extract validation data:
cd ../ && mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
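
The completeness check on the training set can be scripted rather than done by eye. The sketch below is one way to do it. The helper names count_jpegs and check_count are hypothetical, and it assumes the extracted images keep the .JPEG extension. The figure of 50,000 validation images is the standard ILSVRC2012 count and does not come from this post.

```shell
# Hypothetical helper: count the *.JPEG files anywhere under a directory.
count_jpegs() {
    # tr strips the padding that some wc implementations add.
    find "$1" -type f -name "*.JPEG" | wc -l | tr -d ' '
}

# Hypothetical helper: compare the count under DIR with EXPECTED.
# Prints a one-line report and returns non-zero on a mismatch.
check_count() {
    cc_actual=$(count_jpegs "$1")
    if [ "$cc_actual" -eq "$2" ]; then
        echo "OK: $1 has $cc_actual images"
    else
        echo "MISMATCH: $1 has $cc_actual images, expected $2"
        return 1
    fi
}

# Example calls, using the paths from the steps above:
#   check_count /hdd/ImageNet/train 1281167
#   check_count /hdd/ImageNet/val 50000
```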

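Step4: Preprocess ImageNet data

This step requires a built copy of Caffe (either OpenCL Caffe or the original Caffe in CPU_ONLY mode), because it uses some of the scripts that ship with Caffe. Run the following from the Caffe root directory:
cd data/ilsvrc2012
./get_ilsvrc.sh
cd ../../
// In the stock BVLC Caffe tree these are named data/ilsvrc12 and get_ilsvrc_aux.sh; use whichever names your checkout has.
vi examples/imagenet/create_imagenet.sh
Modify the following variables to point to your ImageNet data directories (keep the trailing slash, since the conversion tool prepends these paths directly to the relative image names):
TRAIN_DATA_ROOT=/hdd/ImageNet/train/
VAL_DATA_ROOT=/hdd/ImageNet/val/
Then set the resize flag to true:
RESIZE=true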
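
If you would rather not edit the file by hand, the same three changes can be made with sed. This is only a sketch: set_imagenet_paths is a hypothetical helper, and it assumes each variable is defined at the start of a line as NAME=value (as shown above) and that the paths contain no "|" or "&" characters.

```shell
# Hypothetical helper: set_imagenet_paths SCRIPT TRAIN_DIR VAL_DIR
# Rewrites the three variables in place and keeps a backup as SCRIPT.bak.
set_imagenet_paths() {
    sip_script=$1
    # "|" is used as the sed delimiter so the slashes in the paths need no escaping.
    sed -i.bak \
        -e "s|^TRAIN_DATA_ROOT=.*|TRAIN_DATA_ROOT=$2|" \
        -e "s|^VAL_DATA_ROOT=.*|VAL_DATA_ROOT=$3|" \
        -e "s|^RESIZE=.*|RESIZE=true|" \
        "$sip_script"
}

# Example call, run from the Caffe root directory. The trailing slashes are
# kept so the prefix joins cleanly with the relative image names:
#   set_imagenet_paths examples/imagenet/create_imagenet.sh /hdd/ImageNet/train/ /hdd/ImageNet/val/
```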
You are now ready to create the LMDB version of the ImageNet data needed for training:
./examples/imagenet/create_imagenet.sh
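
Once the script finishes, a quick sanity check is to confirm that each database directory contains a non-empty data.mdb file. This sketch is an addition to the original steps: lmdb_present is a hypothetical helper, and the directory names in the example assume the output locations used by the stock create_imagenet.sh, so adjust them if your copy of the script writes elsewhere.

```shell
# Hypothetical helper: succeeds if DIR contains a non-empty LMDB data file.
lmdb_present() {
    [ -s "$1/data.mdb" ]
}

# Example, run from the Caffe root directory (directory names are an assumption):
#   for db in examples/imagenet/ilsvrc12_train_lmdb examples/imagenet/ilsvrc12_val_lmdb; do
#       if lmdb_present "$db"; then du -sh "$db"; else echo "missing or empty: $db"; fi
#   done
```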
