Wednesday, July 11, 2018

Pytorch Implementation of BatchNorm

Batch Normalization is a really cool trick to speed up training of very deep and complex neural network. Although Pytorch has its own implementation of this in the backend, I wanted to implement it manually just to make sure that I understand this correctly. Below is my implementation on top of Pytorch's dcgan example (BN class starts at line 103)


Although this implementation is very crude, it seems to work well when tested with this example. To run this, type in
$ python main.py --cuda --dataset cifar10 --dataroot .

Friday, July 6, 2018

Speeding up Numpy with Parallel Processing

Numpy is a bit strange; by default it utilizes all cores available, but the its speed doesn't seem to improve with the number of available cores.
For example try running the code below:


You will probably have to quit it after running it for some time, because it is just TOOOOO slow. Here is my output from a computer equipped with AMD Ryzen 1700x:

$ python test_numpy.py
with 0 procs, elapsed time: 42.59s
with 1 procs, elapsed time: 43.05s
with 2 procs, elapsed time: 658.66s
^C


So, what is the problem? Although I am not sure of the details, it seems that numpy's default multiprocessing library is quite horrible. It is supposed to use all cores efficiently, it in fact creates bottleneck when there are lots of core.

Furthermore, when you want to carry out the same tasks multiple times in parallel, it makes it even worse, since all cores are already busy from a single task. Notice the time for pool.map with 2 procs has just jumped more than 10x!

BTW, I installed numpy from pip, i.e.,

$ pip install numpy

Maybe is would be better if I compile numpy manually and link better BLAS library, but that is just too painful.

Well, the good new is that there is in fact a very simple solution. Try this.
$ export OMP_NUM_THREADS=1
$ export NUMEXPR_NUM_THREADS=1

$ export MKL_NUM_THREADS=1
$ python test_numpy.py

with 0 procs, elapsed time: 26.91s
with 1 procs, elapsed time: 26.95s
with 2 procs, elapsed time: 13.53s
with 3 procs, elapsed time: 9.12s
with 4 procs, elapsed time: 6.82s
with 5 procs, elapsed time: 5.42s
with 6 procs, elapsed time: 4.61s
with 7 procs, elapsed time: 3.91s
with 8 procs, elapsed time: 4.52s
with 9 procs, elapsed time: 4.62s
with 10 procs, elapsed time: 3.42s
with 11 procs, elapsed time: 3.92s
with 12 procs, elapsed time: 3.62s
with 13 procs, elapsed time: 3.42s
with 14 procs, elapsed time: 3.32s
with 15 procs, elapsed time: 3.22s
with 16 procs, elapsed time: 3.12s


With the exact same task, I am seeing more than 13x speed up compared to the previous result!

Simple Tic Toc Alternative in Python

In Matlab, tic, toc functions provide very simple way to display time elapsed. We can create a similar mechanism in Python. Note that the source code below is only tested for Python3.



Very convinient!

Thursday, June 21, 2018

How to Build Audacity on Ubuntu 18.04 LTS

Audacity is a great alternative to Adobe Audition. Here is how to build Audacity on Ubuntu 18.04 LTS, since there is no binary.

First, clone the git repo
$ git clone https://github.com/audacity/audacity.git
$ cd audacity

Next, install necessary packages:
$ sudo apt-get install -y libwxbase3.0-dev libwxgtk3.0-dev zlib1g-dev libasound2-dev libgtk-3-dev

Now, you are ready to configure and compile!
$ ./configure
$ make -j4

That's it!

Saturday, June 9, 2018

Solution to Jupyter Error of Consistent Restart

While I was trying to run jupyter notebook, I noticed some weird error where the notebook keeps restarting with some error. 
    self.asyncio_loop.run_forever()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 340, in run_forever
    raise RuntimeError('Event loop is running.')
RuntimeError: Event loop is running.

After some time searching for solution on Google, here is what I found: I simply need to re-install some of the jupyter-related packages.
$ pip uninstall ipykernel ipython jupyter_client jupyter_core traitlets ipython_genutils -y
$ pip install ipykernel ipython jupyter_client jupyter_core traitlets ipython_genutils

Now, the jupyter notebook runs just fine. I suppose the problem was that I installed some packages that somehow interfered with the already-installed jupyter related packages above and caused the error. After removing and re-installing the jupyter-related packages, jupyter finally seems to run just fine!

Tuesday, May 29, 2018

Function Parameter Hints in Python3

Python variables are not type-specific. For example, you can assign different types of data to the same variable multiple times. For example,

a = 1; a = 'a'; a = 2.3

works fine on Python, but won't work in C/C++, because once the variable type is assigned, it cannot be changed. This makes it much easier to write code in Python, but at the same time it brings about some disadvantages.

For example, with strongly-typed variables, smart editor (i.e., editor with autocompletion feature) can easily guess what options to give out to user as the user is typing. On the other hand, if the variable is not strongly-typed, it is fairly difficult for the editor to figure out what to suggest.

Consider the following code. Try to manually type the code on your editor that has autocompletion feature. You will notice that while typing the definition for the function equal, your editor can't be much of a help in terms of auto-completion.

Now, Consider the revised code where type hints have been added. This time, your editor should be able to figure out what type of variables a and b are, and should provide accurate suggestions to you.
By the way, this awesome feature is only available in Python3; personally, however, this feature alone is enough for anyone to consider transition from Python 2 to Python 3!

Monday, May 28, 2018

Import CMAKE Project on NetBeans

In the previous post, I discussed how to import a CMAKE project for Eclipse, and it wasn't that easy. Today, I will discuss how to import a CMAKE project on NetBeans. Again, I will use cmake-exmaple as an example project.

Clone and download the CMAKE project on your computer.
$ git clone https://github.com/bast/cmake-example.git ~/cmake-example

First, open up NetBeans C++ IDE. If you are running it on Mac OS X, I recommend running it from Terminal
$ /Applications/NetBeans/NetBeans\ 8.2.app/Contents/MacOS/netbeans 

The reason is that if you simply run NetBeans from GUI, your environment variables might not sync.

Next, select File --> New Project --> C/C++ Project with Existing Sources --> Next. Choose Browse the folder which contains your CMAKE project, in this example ~/cmake-example folder. Select Custom under Select Configuration Mode and select Next. Check-box Pre-Build Step is Required. Specify Run in Folder as the project root directory, i.e., ~/cmake-example. Enter the following for Command
cmake -H. -Bbuild

Continue with remaining options and adjust what is necessary. Usually, the default setting should work.

Once the project has been imported, you will need to verify the settings. Right-click the project name cmake-example on the project pane on the left, and select Properties. Make sure in the Build --> Pre-Build option, you have the Command Line is set as cmake -H. -Bbuild and Pre-Build First box is checked.

In the Build --> Make option, the Working Directory should be set as build, since this is where the build will take place. Select Apply and Close

Press <F11> key to build the project. It should build successfully. Before we debug this, we have to make sure to set debug flag in the CMakeLists.txt file by setting CMAKE_CXX_FLAGS.

Open up CMakeLists.txt file in the project root directory, that is ~/cmake-example/CMakeLists.txt and not the file in the src directory. Modify line 35 to look as below:

# project version
set(VERSION_MAJOR 1)
set(VERSION_MINOR 0)
set(VERSION_PATCH 0)
set(CMAKE_CXX_FLAGS '-g')

Now, we are ready to run or debug. Re-build the project, and set breakpoint on src/main.cpp file. You should be able to debug it successfully.

Happy coding!

Tuesday, May 22, 2018

Import CMAKE Project to Eclipse CDT

In this post, I will discuss how to import a CMAKE project to Eclipse CDT. Upon Google search, I ran into this solution, but this did not work for me, so here is what I did instead. I am going to make use of cmake-example git repo project to demonstrate how, but you can easily do this for your own.

Open up Eclipse CDT and select File --> New --> C/C++ Project --> C++ Managed Build --> Next. Enter the project name, say cmake-example, and make sure the project type is Empty Project. Also, select the appropriate Toolchains; this will be Linux GCC or MacOSX GCC. Select Finish.

Go to the project root folder, and we will clone the git repository.
$ cd ~/Eclipse/workspace/cmake-example
$ git init
$ git remote add origin https://github.com/bast/cmake-example.git
$ git fetch
$ git checkout -t origin/master

Let's verify that the project compiles.
$ cmake -H. -BDebug
$ cd Debug
$ make -j4

Make sure that the project builds successfully. Also, note down your $PATH environment variable (highlighted in blue below) to be used later. Your variable may differ from mine.
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin

Now, on Eclipse right-click this project in the Project Explorer pane on the left and select Properties. In the C/C++ Build tab, uncheck both Use default build command and Generate Makefile automatically. Make sure the Build Directory as ${workspace_loc:/cmake-example}/Debug. This is the folder we created with CMAKE in the previous step and built the project. This folder contains CMAKE-generated Makefile, and we are simply asking Eclipse to execute make command in this particular folder.

This is the first step, where the Eclipse simply runs the make command of the Makefile generated from CMAKE. This setup is good if we are not going to edit CMAKE configs anymore. In reality, we probably will need to edit CMAKE configs.

Every time you modify CMakeLists.txt, you will need to re-create Makefile by running the CMAKE command again and again
$ cmake ..

Let's simply automate this build command with Eclipse. This is the second step of this post. Create build.sh in the project folder,
$ vim ~/Eclipse/workspace/cmake-example/build.sh

Simply write down the build commands that you would run from the Debug directory as follows:
cmake ..
make -j4

Now, in the Eclipse open up Project Properties window again. Expand C/C++ Build entry on the left and select Environment. Select Add button, and enter PATH for the Name field and your $PATH environment variable (noted in the previous step) for the Variable field. This is to make sure the Eclipse shell will be able to perform exactly what you can do with your own shell.

In the C/C++ Build tab, enter the following build command in place of make
sh ../build.sh

Select Apply and Close to close the properties window. The project should now successfully build, even if you have modified CMakeLists.txt files.

Lastly, you can modify the run command by selecting Run --> Run Configurations... and browse the executable for C/C++ Application:
~/Eclipse/workspace/cmake-example/Debug/bin/unit_tests

Happy hacking!

Install Ubuntu without a USB Stick

Disclaimer: I recommend that you experiment the method written in this post on a virtual machine first, because it can get quite tricky.

Let's say you want to wipe out your entire system and install Ubuntu. The easiest way is perhaps
1. download Ubuntu Live Image,
2. create a bootable USB stick, and
3. boot from the USB.

Well, if you were like me, who wipe out the entire system often, you will find it quite annoying to locate the USB, create the bootable stick, and so on. Furthermore, what if you don't have a USB stick in possession?

This post is to rescue you in such situations. You can simply download the image and boot from the image stored on your disk! Let's see how we can do this. Some of the references are here and here.

Here is the setup. First, you will need at least two partitions on your disk. One is Linux installation partition, and the other is to hold the iso image. Throughout the post, I am going to assume that your first partition is mounted as / and your second partition is mounted as /data.

You will need to download the Ubuntu Live image to the second partition, say
$ wget http://releases.ubuntu.com/18.04/ubuntu-18.04-desktop-amd64.iso -P /data

Now, you will have the iso image file saved as /data/ubuntu-18.04-desktop-amd64.iso. Make sure that you save the image in the partition other than where the Linux will be installed.

Next, you need to add a grub menu entry.
$ sudo vim /etc/grub.d/40_custom

Your file should look like below:
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.

menuentry "Ubuntu 18.04 LTS" {
set isofile="/ubuntu-18.04-desktop-amd64.iso"
loopback loop (hd0,2)$isofile
echo "Starting $isofile..."
linux (loop)/casper/vmlinuz boot=casper iso-scan/filename=${isofile} quiet splash
initrd (loop)/casper/initrd.lz
}

Let me go over the system partition scheme once more. The above file applies to the partition scheme where
partition 1: /dev/sda1 --> currently mounted as /; will install Linux on this partition
partition 2: /dev/sda2 --> currently mounted as /data; holds the iso image

Since you downloaded the iso image on the /data directory, this is the root of the second partition. Therefore, this is specified as (hd0,2) in the grub menu entry, corresponding to /dev/sda2; we must omit /data here because /data is just the mount-point in currently-running system and grub won't know anything about it. If you have different partition scheme from mine, you will need to edit the entry accordingly.

Finally, you will need to update grub
$ sudo update-grub

Let's reboot the system and see if we can indeed boot from the iso from the current disk.
$ sudo reboot

Make sure to press and hold <Shift> key while booting up, so that grub entry appears. Otherwise, it is likely that grub menu entry won't even appear.

If you have correctly followed till now, you should be able to boot from the iso image. You should even be able to install Ubuntu on partition 1 using the iso image saved in partition 2. However, you will notice that during the installation, it complains that /isodevice cannot be unmounted. You can resolve this issue by running the following in the terminal within the Live Image system (not your currently installed system):
$ sudo umount -l -r -f /isodevice

After running this command, you should be able to successfully wipe out partition 1 and install Ubuntu 18.04 fresh!

Thursday, April 26, 2018

Dimensionality Reduction and Scattered Data Visualization with MNIST

We live in 3D world, and we can only view scattered data in 1D, 2D, or 3D. Yet, we deal with data that have very large dimension. Consider MNIST dataset, which is considered to be a toy example in deep learning field, consists of 28 X 28 gray images; that is 784 dimensions.

How would this MNIST data look like in 2D or 3D after dimensionality reduction? Let's figure it out! I am going to write the code in Pytorch. I have to say, Pytorch is so much better than other deep learning libraries, such as Theano or Tensorflow. Of course, it is just my personal opinion, so let's not get into this argument here.

What I want to do is to take Pytorch's MNIST example found here, and make some modifications to reduce the data dimension to 2D and plot scattered data. This will be a very good example that shows how to do all the following in Pytorch:
1. Create a custom network
2. Create a custom layer
3. Transfer learning from an existing model
4. Save and load a model

Here is the plot I get from running the code below.


This code is tested on Pytorch 0.3.1.