These days I am constantly experimenting with models of different architectures and hyperparameters. Because I am working with lots of images, I realized that loading the training and validation images from disk every epoch takes quite a bit of time. Yes, I am using a solid-state drive, but it is still slow compared to RAM. To speed up training, I have been caching the images in RAM in my Python code, which definitely expedites things.
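Such an in-process cache can be as simple as memoizing the decode step. Here is a minimal sketch; `load_image` is a placeholder for whatever decoder you actually use (e.g. PIL or OpenCV):

```python
# Minimal in-process cache: decode each image once, then serve it from RAM.
_cache = {}

def cached_load(path, load_image):
    """Return the decoded image for `path`, decoding only on first access.

    `load_image` is a stand-in for your real decoder (e.g. PIL.Image.open).
    """
    if path not in _cache:
        _cache[path] = load_image(path)
    return _cache[path]
```

After the first epoch every image is served from memory, so subsequent epochs skip both the disk read and the decode.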
However, I see two main issues with this approach. The first is that the raw training files are usually JPEG images, which do not take up much disk space. But when I cache the image data in my Python code, I store each image as a numpy array of decoded bitmap data, which takes significantly more space.
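To see the scale of the expansion: a decoded uint8 RGB bitmap always costs height × width × 3 bytes, no matter how well the JPEG compressed. A quick check with numpy (the dimensions here are just an example):

```python
import numpy as np

# A decoded 1920x1080 RGB frame stored as a uint8 numpy array:
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
print(frame.nbytes)  # 6220800 bytes, about 5.9 MiB
```

The same photo as a JPEG is often well under 1 MiB, so a cache of decoded arrays can easily be several times larger than the dataset on disk.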
The second issue is that I have two GPUs in the server, which share the CPU and RAM. When I run a different model on each GPU but both train on the same dataset, I end up with two copies of the exact same data in memory.
So, I was looking for a solution, and I found one that is very easy to implement. On Linux, there is a way to create a RAM disk: a chunk of memory that the system treats as if it were a disk. I simply mount a RAM disk and have my programs read the data from that mount location.
Here is how to do it. The instructions below are based on this excellent article. First, create a folder to serve as the mount point for the RAM disk.
$ sudo mkdir /mnt/ramdisk
Next, mount a tmpfs RAM disk of the desired size. Note that `size` is an upper bound; memory is only consumed as files are actually written. For example, to allow up to 1 GiB:
$ sudo mount -t tmpfs -o size=1024m tmpfs /mnt/ramdisk
Finally, copy your training data into this folder and point your training code at the new location.
$ cp -r /your/training/data /mnt/ramdisk
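Pointing the training code at the new location can be as small a change as remapping the dataset root. A hypothetical helper, using the example paths from above (`cp -r` places the copy under `/mnt/ramdisk/data`):

```python
import os

DISK_ROOT = "/your/training/data"  # original location on disk (example path)
RAM_ROOT = "/mnt/ramdisk/data"     # where `cp -r` placed the copy

def to_ramdisk(path):
    """Map a file path under DISK_ROOT to its copy under RAM_ROOT."""
    rel = os.path.relpath(path, DISK_ROOT)
    return os.path.join(RAM_ROOT, rel)

print(to_ramdisk("/your/training/data/train/img_001.jpg"))
# /mnt/ramdisk/data/train/img_001.jpg
```

Nothing else in the training code has to change; reads from the mount go to RAM instead of the drive.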
**Something to keep in mind**
- This is RAM, so the contents are gone every time the system reboots; you will need to copy the data again after each boot
- Make sure you have enough free RAM for your dataset to fit
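The second point can be checked programmatically on Linux by comparing the dataset's size on disk against `MemAvailable` from `/proc/meminfo`. A sketch, reusing the example dataset path from above:

```python
import os

def tree_size(root):
    """Total size in bytes of all files under root (roughly what `du -sb` reports)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total

def mem_available():
    """MemAvailable from /proc/meminfo, in bytes (Linux only)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024  # value is reported in KiB
    raise RuntimeError("MemAvailable not found")

# Example (hypothetical path): only copy if the dataset actually fits.
# if tree_size("/your/training/data") < mem_available(): ...
```

Leave some headroom: the rest of the training process (and the other GPU's job) needs RAM too.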