Pen & Paper


09 Sep 2017

Calibre has been around for over a decade now. It has been my primary ebook reader since I first installed it on my Ubuntu machine nearly 10 years ago. Calibre is exceptional software, free of cost and free of ads; it makes me feel morally bound to make a donation to the author. Even after I got my Kindle Voyage, I still use Calibre regularly to manage my book collections, convert books between different formats, send books to devices and, of course, read. I never felt the need to explore the many other functions that Calibre’s plugins make possible. That was until last week, when I was looking for a digital version of a Chinese history book [國史大綱] for my grandchildren and attempting to build myself a small collection of Chinese literature.

Calibre has two powerful plugins that I had not used before but will likely need for my Chinese ebooks. One translates simplified Chinese into traditional Chinese and vice versa. The other removes DRM. The translation plugin was fast when I tested it (it took only a few seconds to convert an entire book). I haven’t finished proofreading yet but, so far, the translated text looks perfect. The DRM plugin strips the copyright protection, which has to be done before a purchased ebook can be loaded onto other devices. It has to be tested on the actual file, as DRM protection differs between file formats.

I had expected an easier search as [國史大綱] is not some obscure history book. It was first published nearly 80 years ago, in 1940, and has had several printings. However, my search experience was probably typical for anyone living overseas and trying to find and purchase a digital copy of a work of Chinese literature.

The problems were multiplied many times over because:

  1. My Kindle is tied to Amazon’s UK store, which offers only a very limited choice of Chinese books. [國史大綱] is unavailable on either Amazon UK or Amazon US. No hard copy. No Kindle copy.
  2. I set up a new account on Amazon’s Chinese store in the hope that I might be able to buy books there. No luck: the store is probably tracking users’ IP addresses, and it wouldn’t allow me to make any purchase as soon as it found out that my IP address did not originate from China.
  3. I could probably go to the trouble of using the Chinese store with a VPN service, a fake mainland address and gift cards for payment, but I wasn’t sure if changing my Kindle country from UK to China was irreversible. I don’t want to get stuck on the Chinese site as the bulk of my reading is in English. The risk is too high given the uncertainty. Not worth it.
  4. I tried sites in Hong Kong (超閱網) and Taiwan (PubU, Readmoo, 博客來, 臺灣商務) but soon realized that none of these sites would allow digital download. Their ebooks are only good for reading in their own online readers. This allows them to exercise tighter control over copyright but at the expense of limiting access to their ebooks. Without a digital download, I cannot read a book offline or in my preferred reader. I did install and try out their Android reading apps (超閱網 and Readmoo). The experience was awful because I didn’t have an Android tablet and reading books on a phone is torture.
  5. 國史大綱 was first published in 1940 and is most likely still copyrighted. Whether a book is still under copyright depends on the country you are in and on where and when the book was first published. Books in the public domain can be found on many websites. Gutenberg is one of them. It has a decent catalog of classic Chinese literature in various popular ebook formats. Recommended, although I couldn’t find what I was looking for there.
  6. Googling “國史大綱下載” or “國史大綱電子書” or various combinations of the two returned a few pages of links. Exploring and clicking on these links is inherently risky because of potential viruses. I wouldn’t recommend doing so on a Windows system, which is almost always the target of any potential virus attack. Out of desperation, I did take the chance and downloaded a few files, but from a VM on Debian and only after carefully eliminating some questionable links from the Google search results. I also used Calibre in the virtual machine to import the files and examined each of them to make sure they were safe.

Here are the files and you are welcome. Do let me know of any errors especially on the translated text and I’ll do some necessary editing:

Android App

11 Sep 2016

Day 1

There were a couple of minor issues to sort out before installing Android Studio on my LMDE 2.

  1. I have more than one version of OpenJDK installed. Android Studio requires version >= 1.8.

    Use `sudo update-alternatives --config java` to change the default JAVA version
  2. I had already defined JAVA_HOME in my .bashrc to point to the JRE when I installed the EC2 command line tool. Android Studio requires a JDK, so I renamed the original JAVA_HOME to JRE_HOME and redefined JAVA_HOME to point to the JDK directory instead of the JRE inside it. This was simply done by removing “/jre” at the end of the original string.

    My original “JAVA_HOME”


    has now become
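
As a rough sketch of the renaming described in item 2, the lines below derive both variables from one string using shell parameter expansion. The path is a hypothetical example of a typical OpenJDK location and will differ from machine to machine; `ORIGINAL_JAVA_HOME` is just an illustrative name.

```shell
# Hypothetical original value: substitute the path used on your own system.
ORIGINAL_JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/jre"

# Keep the old value under a new name for tools that only want a JRE.
JRE_HOME="$ORIGINAL_JAVA_HOME"

# Strip a trailing "/jre" so that JAVA_HOME points at the JDK directory.
JAVA_HOME="${ORIGINAL_JAVA_HOME%/jre}"

export JRE_HOME JAVA_HOME
echo "JRE_HOME=$JRE_HOME"
echo "JAVA_HOME=$JAVA_HOME"
```

In a real .bashrc only the two assignments and the `export` line would be needed.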


Android Studio booted up successfully after installation but with a couple of warning messages:

  1. MaxPermSize option is removed from java version 1.8 and is not supported

  2. OpenJDK shows intermittent performance and UI issues. We recommend using the Oracle JRE/JDK

Item 1 seems to have appeared only once, during the initial boot-up. It did not cause any problem. I assume it can be safely ignored or, alternatively, the option can be commented out in the configuration file studio64.vmoptions found inside the android-studio/bin folder.
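
Commenting the option out could look something like the sketch below. It assumes the vmoptions file accepts lines starting with `#` as comments, which is worth checking for your version; the function name and the sample option values are made up for illustration, and the demo runs on a throwaway file rather than the real one.

```shell
# Prefix any MaxPermSize line in a vmoptions-style file with "# ".
# Assumption: "#" starts a comment in this file format.
comment_out_maxpermsize() {
    src="$1"
    sed 's/^\(-XX:MaxPermSize=.*\)$/# \1/' "$src" > "$src.tmp" && mv "$src.tmp" "$src"
}

# Demo on a temporary file with made-up option values.
demo_dir="$(mktemp -d)"
demo_file="$demo_dir/studio64.vmoptions"
printf '%s\n' '-Xms128m' '-Xmx750m' '-XX:MaxPermSize=350m' > "$demo_file"

comment_out_maxpermsize "$demo_file"
cat "$demo_file"
```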

I did some quick research on the net about Item 2. OpenJDK doesn’t seem to create any major issues for people who continue to use it with Android Studio. Considering the fact that Google and Oracle are currently hotly engaged in a lawsuit regarding Android/Java, I have the feeling that Android Studio will eventually ditch Oracle’s JRE/JDK. I decided against installing the Oracle JDK unless some performance issues pop up later.

Finally, the default UI theme, IntelliJ, looks terrible on my machine. The navigation bar is almost unintelligible. Changing the theme to GTK+ under “File->Settings->Appearance & Behaviour->Appearance->UI Options” helps.


undefined method "getConverterImpl" (Jekyll)

14 Dec 2015

I use a “markdown” tag, {% markdown %}, to render my “about” page in this jekyll blog. The “about” page is written in markdown but is seldom changed, so it is unnecessary to run it through the jekyll engine every time a new post is made. The “markdown” tag is created using the following plugin (saved with a .rb extension in a _plugins sub-directory under the jekyll root).

# Jekyll tag to include Markdown text from the _includes directory,
# preprocessing it with Liquid.
# Usage: {% markdown <filename> %}
module Jekyll
  class MarkdownTag < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
      @text = text.strip
    end

    def render(context)
      # Read the named file from _includes
      tmpl = File.read File.join Dir.pwd, "_includes", @text
      site = context.registers[:site]
      converter = site.getConverterImpl(Jekyll::Converters::Markdown)
      # Run the text through Liquid first, then convert the Markdown to HTML
      tmpl = (Liquid::Template.parse tmpl).render site.site_payload
      html = converter.convert(tmpl)
    end
  end
end
Liquid::Template.register_tag('markdown', Jekyll::MarkdownTag)

Upgrading from jekyll 2.5.3 to jekyll 3.0.1 will produce an error pointing to an undefined method getConverterImpl. After some googling, I have found that the method getConverterImpl has been changed to find_converter_instance in jekyll 3. Changing the following line in the above script will remove the error.

##  converter = site.getConverterImpl(Jekyll::Converters::Markdown)
    converter = site.find_converter_instance(Jekyll::Converters::Markdown)
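
If the old method name appears in more than one plugin, the same one-line change can be applied with sed. This is only a sketch: `rename_converter_call` and the file name `markdown_tag.rb` are illustrative names, and the demo works on a temporary copy rather than a real `_plugins` folder.

```shell
# Replace the jekyll 2 method name with its jekyll 3 equivalent in one file.
rename_converter_call() {
    plugin="$1"
    sed 's/getConverterImpl/find_converter_instance/g' "$plugin" > "$plugin.new" \
        && mv "$plugin.new" "$plugin"
}

# Demo on a throwaway plugin file containing the offending line.
demo_plugins="$(mktemp -d)"
printf '%s\n' 'converter = site.getConverterImpl(Jekyll::Converters::Markdown)' \
    > "$demo_plugins/markdown_tag.rb"

rename_converter_call "$demo_plugins/markdown_tag.rb"
cat "$demo_plugins/markdown_tag.rb"
```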

Github Project Pages Using Jekyll

15 Oct 2015


Github provides free hosting of a static website for each user and for each project. These websites are generated using Jekyll, which processes all files uploaded to the master branch of a specially named user repo or to any project repo with a gh-pages branch.

This short guide will walk through the steps in setting up of a local “Jekyll” installation so that the website can be designed and its content reviewed before committing to Github.



  1. In configuring Jekyll, it is important to bear in mind that Github’s user page is served from the root of the site while a project page is served from a sub-directory named after the repo. The parameter baseurl in *_config.yml* should be set accordingly:
    • an empty string "" for a user page and
    • "/NAME_OF_REPO" for a project page
  2. To preview the project website locally, it is necessary to temporarily override the baseurl setting using `jekyll serve --baseurl ""` so that the pages can be found at “localhost:4000”.
  3. The Jekyll engine at Github supports only a selected set of plugins and may not be the same version as the local installation. To avoid possible conflicts, put `gem 'github-pages'` in a Gemfile in the repo’s root directory and then run `bundle install`.
  4. Put _site into .gitignore. The folder is generated by the local Jekyll installation only; Github re-builds the site itself after every new upload or commit.
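
Put together, the settings in the list above amount to three small files. The sketch below writes them into a temporary directory standing in for the repo root; `NAME_OF_REPO` is a placeholder, and the `bundle` and `jekyll` commands are left as comments because they need a working Ruby installation.

```shell
site_dir="$(mktemp -d)"    # stand-in for the root of a project repo
repo_name="NAME_OF_REPO"   # placeholder: use the real repo name

# Gemfile pinning the local setup to the gems Github Pages uses.
cat > "$site_dir/Gemfile" <<'EOF'
source "https://rubygems.org"
gem "github-pages"
EOF

# Keep the locally generated site out of version control.
printf '%s\n' '_site' >> "$site_dir/.gitignore"

# Project pages are served from a sub-directory, so set baseurl to match.
printf 'baseurl: "/%s"\n' "$repo_name" >> "$site_dir/_config.yml"

# Then, inside the repo (not run here):
#   bundle install
#   bundle exec jekyll serve --baseurl ""

cat "$site_dir/Gemfile" "$site_dir/.gitignore" "$site_dir/_config.yml"
```

For a user page the last `printf` would write `baseurl: ""` instead.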

Creating Project Pages

  1. Clone an existing repo
    • $ git clone
  2. Create and switch to gh-pages branch
    • $ cd repo
    • $ git checkout --orphan gh-pages
  3. Remove all existing files
    • $ git rm -rf .
  4. Scaffold new Jekyll site
    • $ jekyll new .
  5. Make changes and create content
  6. Publish to Github
    • $ git add .
    • $ git commit -m "new content added"
    • $ git push origin gh-pages

I have created a set of slides for this post using slidify.

updating R and its installed packages

09 Oct 2015

I have been using a script to automatically reinstate all the installed packages following an R upgrade. This is my preferred method to avoid manually re-installing the packages one by one. The script is simple and works reliably under Windows or Linux. An alternative is to preserve the existing “library” folder and copy it over the newly created “library” folder after the upgrade is finished. Under Windows, this should not be a problem as, by default, there is only one “library”. In Linux, however, the “library” is kept in several locations, which may easily become another source of headaches. .libPaths() will display the library locations.

There are two simple scripts involved to automate the package re-installation process. Firstly, source the following script before the upgrade to keep track of all the installed packages. The records will be written down in a file packageList.Rdata.

package.list <- installed.packages()[,"Package"]
save(package.list, file="packageList.Rdata")

Secondly, source the following script after the upgrade to load the records. These are compared with what is already present in the newly installed library. The missing packages will be reinstalled.

load("packageList.Rdata")
for (d in setdiff(package.list, installed.packages()[,"Package"])) {
    install.packages(d)
}

Debian Live (Persistence)

04 Jun 2015

Burning a live CD image onto a USB stick and configuring it for persistent storage is not a complicated task. I had Chromixium done in less than 10 minutes using unetbootin. But it turned out to be not as straightforward for Debian (Jessie). Debian doesn’t recommend using unetbootin, and the fact that all the top search results returned by Google for ‘Debian live usb persistent’ were outdated instructions clearly didn’t help.

There are conflicting instructions between Debian Live Manual 1.x and 3.x.

  1. Most results returned by Google are based on 1.x, which suggests labelling the persistent partition live-rw. The correct label for the persistent partition is persistence.
  2. The proper boot parameter to use is persistence and NOT persistent as many have suggested.
  3. Last but not least, the persistence partition will not be recognized by Debian unless it has a persistence.conf file in its root directory.
  4. For full persistence, run `echo "/ union" >> /path/to/persistence-partition/persistence.conf`.
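
The four points above can be sketched as follows. The device name `/dev/sdX2` and the mount point are placeholders, and the commands that need root and a real USB stick are shown only as comments; the runnable part writes persistence.conf into a temporary directory standing in for the mounted partition.

```shell
# Write the full-persistence rule into the root of a mounted partition.
write_persistence_conf() {
    mount_point="$1"
    echo "/ union" > "$mount_point/persistence.conf"
}

# On a real stick (not run here; /dev/sdX2 is a placeholder):
#   mkfs.ext4 -L persistence /dev/sdX2    # the label must be "persistence"
#   mount /dev/sdX2 /mnt
#   write_persistence_conf /mnt
#   umount /mnt
# and boot the live system with the "persistence" parameter added.

# Demo against a temporary directory instead of a partition.
demo_mount="$(mktemp -d)"
write_persistence_conf "$demo_mount"
cat "$demo_mount/persistence.conf"
```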

cmus - a cli audio player

22 May 2015

The use of online music streaming services has almost eliminated my need for a software music player. But when I want to play a few of my favorites in better sound quality than an online music service can offer, I always opt for cmus, a tiny player that browses through my nearly 300GB digital music collection with ease, handles flac and can be easily configured to do scrobbling.

Installation takes no more than a simple apt-get install cmus command, as it is available in the main repository. It should work out of the box in most cases but, for systems using ALSA, the default settings have to be changed to get sound. Press 7 within cmus, find the following variables and change them as below:

> dsp.alsa.device       `default`    
>    `Master`    
> mixer.alsa.device     `default`    
> output_plugin         `alsa`

I use cmusfm for scrobbling. Installation is done by cloning the repo and building it.

git clone

Upon initialization with cmusfm init, it will produce an error about not being able to write to its config file, .config/cmus/cmusfm.conf. I just created one manually using touch cmusfm.conf and reran the cmusfm init command. Finally, cmus must be told to use cmusfm. This is done by going to the cmus settings tab once more. Press 7, then find and change the following variable:

> status_display_program    `cmusfm`    
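
The workaround for the cmusfm init error can be written as a small helper. This is a sketch only: `prepare_cmusfm_conf` is a made-up name, and the demo uses a temporary directory in place of the real `~/.config`.

```shell
# Create an empty cmusfm.conf (and its folder) under a given config base.
prepare_cmusfm_conf() {
    config_base="$1"
    mkdir -p "$config_base/cmus" && touch "$config_base/cmus/cmusfm.conf"
}

# Demo against a temporary directory.
demo_config="$(mktemp -d)"
prepare_cmusfm_conf "$demo_config"
ls -l "$demo_config/cmus"

# For real use (not run here):
#   prepare_cmusfm_conf "$HOME/.config"
#   cmusfm init
```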


04 May 2015

Chromixium is a Linux distro that recently came out of beta. I tried it out over the holiday weekend and was impressed. The distro has the appearance of Chrome OS but is in fact powered by Ubuntu Trusty with the Openbox window manager. Combining the simplicity of a Chromebook GUI and the power of Ubuntu, it is a promising niche player among the many Linux distros that target desktop users.

I installed it on a USB stick (live CD + a persistent partition) using UNetBootin. The entire installation process took about 10 minutes and I had a fully functional computer on a stick that I can carry anywhere.

I also installed s3ql to make use of Amazon S3 storage in case there are files that need a more permanent place for storage than a USB stick can provide. s3ql is available in the official Ubuntu repository. It basically turns my Chromixium on a 16GB USB stick into a computer with unlimited storage by mounting an S3 bucket in the local file system.

Rmarkdown for Scientific Papers

14 Mar 2015

  library(knitcitations); library(bibtex); cleanbib()
  cite_options(citation_format = "pandoc", check.entries=FALSE)
  write.bibtex(c(citation("bibtex"), citation("knitr")[1], citation("knitcitations"), 
    citation("xtable"), citation("RefManageR"), citation("rmarkdown")), file="init.bib")
  bib <- read.bibtex("init.bib")
  bib.1 <- read.bibtex("citavi.bib")


This document was created based on a .Rmd template from a blog post by (Keil 2017). The R package “knitr” (Xie 2015) is used to convert the Rmd file to html, pdf or word format. “knitr” has its strengths in reproducible research but it is not designed to produce citations for a scientific paper. The .Rmd template makes use of an xml file for formatting citation styles, “knitcitations” (Boettiger 2017) and “bibtex” (Francois 2017) for generating citations from DOI lookup or bibtex entries.


The template gives examples of writing mathematical equations using $\LaTeX$, formatting R outputs using either knitr::kable (Xie 2015) or “xtable” (Dahl 2016), producing graphic plots using the base “plot” function and generating citations using automatic DOI lookup or “bibtex” (Francois 2017) with knitcitations::citep and knitcitations::citet (Boettiger 2017).


Rmarkdown (Allaire et al. 2017) has full documentation for its syntax. Statistical analysis with plots and tables can be easily created in a .Rmd file by embedding and running “R code chunks” while math equations are produced using $\TeX$ or $\LaTeX$. Rmarkdown (v2) has built-in support for citation as it is based on Pandoc, but it does not have automatic DOI lookup and is better suited to work in conjunction with a citation manager from which bibliography files can be generated and exported for its use.


Inline equations are enclosed by $ with no space following or preceding. A separate paragraph for equations is enclosed by $$ following/preceding with a single space. Sharelatex has detailed documentation for creating mathematical expressions.

The binomial coefficient is defined as

$$ \binom{n}{k} = \frac{n!}{k!(n-k)!} $$

These are all Greek α, β, θ0, ε2, η, λ2, μ, τ, σ

In least squares prediction models, we estimate $\beta_0, \beta_1, \beta_2, \dots, \beta_p$ by minimizing the residual sum of squares (RSS)

$$ RSS=\sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p}\beta_{j}x_{ij}\Big)^2 $$


library(ISLR)    # provides the Wage data set
library(knitr)   # provides kable()
fit <- lm(wage ~ poly(age, 4), data=Wage)
kable(summary(fit)$coef, digits=2, caption="This is a 4th degree polynomial. Coef output knitr::kable")

Table: This is a 4th degree polynomial. Coef output knitr::kable

                 Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)        111.70         0.73    153.28       0.00
poly(age, 4)1      447.07        39.91     11.20       0.00
poly(age, 4)2     -478.32        39.91    -11.98       0.00
poly(age, 4)3      125.52        39.91      3.14       0.00
poly(age, 4)4      -77.91        39.91     -1.95       0.05


## fit<-lm(Wage$wage~poly(Wage$age,4),data=Wage)
agelims <- range(Wage$age)                        # range of observed ages
age.grid <- seq(from=agelims[1], to=agelims[2])
preds <- predict(fit, newdata=list(age=age.grid), se=TRUE)
# fitted values plus and minus two standard errors
se.bands <- cbind(preds$fit+2*preds$se.fit, preds$fit-2*preds$se.fit)
## par(mfrow=c(1,2), mar=c(4.5,4.5,1,1),oma=c(0,0,4,0))
plot(Wage$age, Wage$wage, xlim=agelims, cex=.5, col="darkgrey", xlab="age", ylab="wage")
title("Degree-4 Polynomial", outer=F)
lines(age.grid, preds$fit, lwd=2, col="blue")
matlines(age.grid, se.bands, lwd=1, col="blue", lty=3)

Fig. 1 - Degree-4 Polynomial. Relationship between Wage and Age (data(Wage) in ISLR). The dotted lines are 95% confidence intervals.


In “The Elements of Statistical Learning”, Hastie, Tibshirani, and Friedman (2009) explain with practical examples the application of ridge regression/lasso. The book covers some advanced material in data mining, inference and prediction. For a less technical treatment of the same subjects, “An Introduction to Statistical Learning” (James et al. 2013) should be a good start.

The two types of citation above are respectively generated by

citet(bib.1[["Hastie.2009"]]) and
citep("DOI 10.1007/978-1-4614-7138-7")

citet and citep may refer to either a DOI or a bibtex entry. citet(bib.1[["Hastie.2009"]]) generates Hastie, Tibshirani, and Friedman (2009), where “bib.1” is an R object created by bib.1<-read.bibtex("name of bibliography file") and “Hastie.2009” is the bibtex entry ID.


Plots, tables, math equations and citations are indispensable elements of any scientific paper. The Rmd template is a quick and convenient way to produce them. The finished Rmd file can then be “knitted” to html, pdf or word format in RStudio for submission. To publish this in a jekyll blog, what I did was to knit it to html and include the html file in a post.

Finally, the reference list below is produced by using “bibtex” to write out all citations made in the paper: write.bibtex(file="references.bib"). references.bib and a style file are declared in the front matter.


Allaire, JJ, Joe Cheng, Yihui Xie, Jonathan McPherson, Winston Chang, Jeff Allen, Hadley Wickham, Aron Atkins, Rob Hyndman, and Ruben Arslan. 2017. Rmarkdown: Dynamic Documents for R.

Boettiger, Carl. 2017. Knitcitations: Citations for ’Knitr’ Markdown Files.

Dahl, David B. 2016. Xtable: Export Tables to LaTeX or HTML.

Francois, Romain. 2017. Bibtex: Bibtex Parser.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning. Springer Series in Statistics. Dordrecht: Springer.

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer New York. doi:10.1007/978-1-4614-7138-7.

Keil, Petr. 2017. “Simple Template for Scientific Manuscripts in R Markdown.” Petr Keil.

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman; Hall/CRC.

Getting Data with R

08 Jan 2015

Reading url

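# Create the data directory if it does not already exist
if (!file.exists("testdata")) {
    dir.create("testdata")
}
fileUrl <- ""    # URL of the file to download
download.file(fileUrl, destfile="./cam.csv", method="curl")
list.files("./")
dateDownloaded <- date()    # record when the file was downloaded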

Reading Flat Files

    file, header, sep, row.names, nrows
    quote="", na.strings, nrows, skip

Reading Excel Files

read.xlsx("./cam.xlsx", sheetIndex=1, header=TRUE)
    colIndex <- 2:3
    rowIndex <- 1:5
Notes for installing package “xlsx”
  1. may need to reconfigure “java” if installation fails
  2. if more than one version of java runtime, use sudo update-alternatives --config java to choose the default version
  3. reconfigure R to use the default version sudo R CMD javareconf

Reading XML

doc <- xmlTreeParse(fileUrl, useInternal=TRUE)
rootNode <-xmlRoot(doc)

rootNode[[1]]          #Double [] to retrieve item of a list

Using xmlSApply to extract

xmlSApply(rootNode, xmlValue)


/node   # Top level node
//node  # at any level
node[@attr-name="bob"] # node with attribute name='bob'

xpathSApply(rootNode, "//name", xmlValue)
xpathSApply(rootNode, "//price", xmlValue)

fileUrl <- ""
doc <- htmlTreeParse(fileUrl, useInternal=TRUE)   # html instead of xml
scores <- xpathSApply(doc, "//li[@class='score']", xmlValue)
teams <- xpathSApply(doc, "//li[@class='team-name']", xmlValue)
  1. Extracting data from XML
  2. Short Intro to XML Pkg

Reading JSON Files

jsonData <- fromJSON("")

myjson <- toJSON(iris, pretty=TRUE)

iris2 <- fromJSON(myjson)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
  1. R-blogger jsonlite

Reading mySQL


dbfile <-dbConnect(MySQL(), user="username", host="localhost")
dbData <-dbGetQuery(dbfile, "show databases;")


Leek, J; Peng, R & Caffo, B (2015). “Getting and Cleaning Data” [Lecture Slides]. Retrieved from

eircom F1000 Modem

14 Dec 2014

The ZyXEL F1000 modem that comes with eircom eFibre offers a wide range of functionality. An average user will probably not use any of the advanced features except USB share. The user guide explains all the modem’s features sufficiently well, with the exception of USB share, the one feature that almost everyone may find useful. There is no detail on how to access a USB drive once it is plugged into the modem. The USB light on the modem goes green but the drive does not show up in the file explorer. Googling didn’t help either.

The USB share is a network drive. For Windows users, it can be accessed by mapping it as such. It can be reconnected every time Windows starts if this is one’s preference. To map a network drive, go to “Computer” in the explorer window and click on the option “Map Network Drive”


Choose a “Drive Letter” and point it to \\<modem-IP>\usbkey\, where <modem-IP> should be the modem’s IP address. Windows will prompt for a username and password (these should be the same username and password that are used to access the F1000 modem’s user interface). Enter the credentials and that’s all there is to it.

For Linux, the address takes the form smb://admin@<modem-IP>, with <modem-IP> again being the modem’s IP address. Obviously, you will need smbclient to do that, or you may wish to give “gigolo” a try. GNOME users may need gvfs-smb to enable access to Samba shares in the file explorer.
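
To make the two address forms concrete, here is a small sketch that builds both from the same pieces. The IP address is a hypothetical example (check your own modem’s address), and I am assuming the share is called usbkey on Linux as it is in the Windows path.

```shell
modem_ip="192.168.1.254"   # hypothetical: replace with your modem's IP
share="usbkey"             # share name as used in the Windows path
user="admin"

# Address for Linux file managers.
smb_url="smb://$user@$modem_ip/$share"

# UNC path for the Windows "Map Network Drive" dialog.
unc_path="\\\\$modem_ip\\$share"

printf '%s\n' "$smb_url" "$unc_path"

# With smbclient installed, an interactive session would be (not run here):
#   smbclient "//$modem_ip/$share" -U "$user"
```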

Making Your Browser to Trust a Self-signed SSL Cert

29 Nov 2014

Enabling https connection on LAMP can be easily done with 2 commands:

  1. sudo a2enmod ssl
  2. sudo a2ensite default-ssl

Restart the apache server sudo service apache2 restart and we will have a secure connection.

However, whether we use openssl to create a self-signed certificate or use the default “snakeoil” certificate, we will get a browser warning about an untrusted SSL certificate when we visit our site. The browser will only trust an SSL cert that is signed by a recognized CA. Since “we” are not recognized as a trusted issuer, the self-signed SSL certificate that we have created is deemed untrustworthy (despite the fact that we are the owner of the server and we know we can trust our own server). To get rid of the browser warning, we can either pay to get an SSL certificate from a recognized CA or do the following to get the browser to trust our self-signed SSL certificate. The main tool is openssl. It does not matter whether we perform the steps on the host or on a local computer. What is important is to know where to put the “key” and the “cert” after they are created. For Windows users, there is a similar tool on IIS that can be used to create a self-signed cert but, in order to follow the steps below, it may just be easier to ssh to the host and use openssl.

  1. Create a root key openssl genrsa -out root.key 2048
    (this is the main key that will be used to create all trusted certs.)
  2. Create root cert openssl req -x509 -new -nodes -key root.key -days 1800 -out root.pem
    (answer the prompts so that the information can be embedded in your certificate.)
  3. Create a host key for your apache server openssl genrsa -out apache.key 2048
  4. Create a certificate signing request (csr) for your host certificate
    openssl req -new -key apache.key -out apache.csr
    (use the domain name as the “Common Name”.)
  5. Sign the csr using the root.key
    openssl x509 -req -in apache.csr -CA root.pem -CAkey root.key -CAcreateserial -out apache.crt -days 1500
    (-days here should be equal to or less than that of the root cert.)
  6. Repeat steps 3-5 to generate an additional key (apacheX.key), csr (apacheX.csr) and crt (apacheX.crt) if there are other servers that need an SSL cert.
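
Steps 1 to 5 can be run non-interactively as in the sketch below. The `-subj` values (“Example Local Root CA” and “example.test”) are placeholders that replace the interactive prompts, everything happens in a temporary directory, and the final `openssl verify` is an extra check that the host cert really chains back to the root cert. If openssl is not installed the sketch simply reports that it was skipped.

```shell
workdir="$(mktemp -d)"
verify_status="skipped"

if command -v openssl >/dev/null 2>&1; then
    (
        cd "$workdir" || exit 1
        # Steps 1-2: root key and self-signed root cert
        openssl genrsa -out root.key 2048
        openssl req -x509 -new -nodes -key root.key -days 1800 \
            -out root.pem -subj "/CN=Example Local Root CA"
        # Steps 3-4: host key and certificate signing request
        openssl genrsa -out apache.key 2048
        openssl req -new -key apache.key -out apache.csr -subj "/CN=example.test"
        # Step 5: sign the csr with the root key
        openssl x509 -req -in apache.csr -CA root.pem -CAkey root.key \
            -CAcreateserial -out apache.crt -days 1500
    ) >/dev/null 2>&1

    # Extra check: does apache.crt verify against root.pem?
    if openssl verify -CAfile "$workdir/root.pem" "$workdir/apache.crt" >/dev/null 2>&1; then
        verify_status="OK"
    else
        verify_status="FAILED"
    fi
fi

echo "verification: $verify_status"
```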

Now all the necessary keys and certs have been generated. All we need is to put them in the right place.

  1. Import root.pem to the browser. For Chrome, look in Settings -> Advanced Settings -> HTTPS/SSL -> Manage Certificates -> Authorities -> Import. For Firefox, look in Preferences -> Advanced -> View Certificates -> Authorities -> Import.
  2. Any devices accessing the apache server should import the same root.pem.
  3. On the apache server, create a folder sudo mkdir /etc/ssl/localcerts
    and copy the key and cert to the newly created folder sudo cp apache.* /etc/ssl/localcerts/
  4. Make the apache.* less open sudo chmod 600 /etc/ssl/localcerts/apache.*
  5. Enable ssl sudo a2enmod ssl
  6. Enable the default-ssl virtual host sudo a2ensite default-ssl
  7. Edit /etc/apache2/sites-available/default-ssl.conf. Change the settings of SSLCertificateFile and SSLCertificateKeyFile to point to “apache.crt” and “apache.key”. In our case, it should point to /etc/ssl/localcerts/apache.crt and /etc/ssl/localcerts/apache.key respectively.
  8. Restart apache sudo service apache2 restart
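
The edit in step 7 can also be scripted. The sketch below is one possible way to do it with sed; `point_vhost_at_cert` is a made-up helper name, and the demo runs on a temporary file that mimics the two relevant lines of default-ssl.conf rather than on the real file.

```shell
# Rewrite the SSLCertificateFile / SSLCertificateKeyFile lines of a vhost file.
point_vhost_at_cert() {
    conf="$1"; cert="$2"; key="$3"
    sed -e "s|^\([[:space:]]*SSLCertificateFile\)[[:space:]].*|\1 $cert|" \
        -e "s|^\([[:space:]]*SSLCertificateKeyFile\)[[:space:]].*|\1 $key|" \
        "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Demo file with the default "snakeoil" settings.
demo_conf="$(mktemp -d)/default-ssl.conf"
printf '%s\n' \
    '    SSLEngine on' \
    '    SSLCertificateFile      /etc/ssl/certs/ssl-cert-snakeoil.pem' \
    '    SSLCertificateKeyFile   /etc/ssl/private/ssl-cert-snakeoil.key' \
    > "$demo_conf"

point_vhost_at_cert "$demo_conf" \
    /etc/ssl/localcerts/apache.crt /etc/ssl/localcerts/apache.key
cat "$demo_conf"
```

On the real server the function would have to be run with sudo against /etc/apache2/sites-available/default-ssl.conf, followed by the restart in step 8.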


19 Jul 2014

It was pure luck that I came across Shiny-server two years ago when I was looking for an easy way to deploy a note-taking application to a class of about 20 students. I was asking half of the class to use the note-taking application and the other half to take their lecture notes by hand in order to collect some statistics for subsequent analysis. Shiny-server 0.1, still in beta, had just been released and I knew immediately after reading its introduction that it fit perfectly well with what I intended to do for my dissertation. Simply put, R is for the desktop, and Shiny-server will not only broadcast what we do in R to the whole world but also enable interaction.

Shiny-server is under active development by RStudio. I recently revisited its github repo and it is now at version 1.2. I didn’t encounter any problems when I installed the beta version in an Ubuntu instance on Azure. The many revisions in the interim have no doubt made an already great app better, although I haven’t looked into the details of what improvements they have made. One obvious change is that installation of Shiny-server no longer requires npm (and therefore, no node.js). This simplifies the installation process, and means one less reason not to check it out if you ever need to make a presentation of anything involving numbers. It will also make an ideal teaching tool from high school math to post-graduate advanced statistics.

Shiny-server does not yet support the Windows platform. For a Windows user, a Linux VM will be the best place to learn and play with Shiny-server. The installation will be simplest in an Ubuntu VM because a Shiny-server binary is available (though I must say that I hate ‘Unity’, Ubuntu’s default desktop). For Linux distros other than Ubuntu 12.04+ or CentOS (RedHat), installation will require compilation from source code. It is a little trickier; for example, in Arch Linux it is necessary to tweak the python environment variable before compilation. If you are new to Linux and only want a VM for testing Shiny-server, use Ubuntu (replace ‘Unity’ with ‘Mate’ if you don’t mind a little extra work to make the desktop much more usable).

Fonts - Infinality

08 Jul 2014

Font rendering was perhaps one of the bigger issues faced by Linux users in the past; the sharp and crisp fonts generated by Windows’ ClearType are proprietary stuff. Linux users had to tweak their fontconfig to get something close to what Windows can produce. Importing/borrowing the fonts from a Windows OS helps but won’t exactly do the job, as proper hinting and anti-aliasing depend on font sizes and screen resolution.

Most Linux distros nowadays produce decent screen fonts by default. There is really no need to do any tweaking anymore. But if you have to stare at the screen most of the day for work or for fun, you may want to check out the infinality font patch. The result will be nothing short of breathtaking. Guaranteed. And you don’t need to take pre- and post-installation screenshots to notice the difference.

To install infinality in Arch Linux, follow the detailed instructions in the Arch Wiki. Just remember to add and sign the developer’s keyID.

To install infinality in LMDE or Debian (x86_64 only), follow the instructions detailed in this forum post. You may need to run sudo apt-get install build-essential devscripts fakeroot if these are not already on your system.

I use ‘Noto Sans’ and ‘Noto Serif’ in my Chrome browser and ‘Inconsolata’ in my terminal console and editor. These are freely available from Google Fonts and they work very well with infinality.

Virtual Machines

07 Jul 2014

Spinning up a virtual machine (VM) on a PC can be easily done without much computing knowledge. Hyper-V is bundled with Windows 8 (x86_64); VMware Player and VirtualBox are free. For users of older Windows, there is always Virtual PC 2007, although it is a little outdated. As a VM is completely isolated from the host on which it is installed, it is ideal for testing/developing software. It eliminates the risk of messing up the existing operating system. Students are often asked to install 3rd party trial software during the course of their study, whether it’s for learning how to use the software or for a couple of exercises. This should preferably be done in a VM, as the simple process of installing/uninstalling a program can mess up the OS. I remember that one of my classmates was unable to boot into her Windows laptop after she had installed the LAMP/Moodle bundle required by one of the courses we took.

I used a Linux VM in Azure to distribute my research artefact. If you want to check out Azure or EC2 to build a VM in the cloud but are not particularly comfortable with the command line interface, a VM on a local PC is a perfect first step in the learning process. Bear in mind that creating a private cloud with a dozen VMs on a local machine is rather painless and costs nothing, while Azure and EC2 charge by the hour.

I use VirtualBox running on Mint LMDE. While I have not completely ditched Windows (I still need it for playing games or Netflix), my experience with Mint LMDE is so good that I have made it my default boot. Now I run Windows as a VM inside VirtualBox if I want games or Netflix, and there is no need to boot into Windows on the physical disk at all. (There is but one gotcha in running Netflix in a Windows VM inside VirtualBox - DON’T emulate more than one cpu for the guest machine).

Most Linux distros should run equally well as a VMware or VirtualBox guest. There is no practical difference as far as performance and ease of use are concerned. VMware Workstation used to be free, and now only VMware Player is available free for non-commercial users. VMware Player’s functionality is limited in comparison with VirtualBox or Hyper-V. As for Hyper-V, one main drawback I have found is that it does not sync to the host display. The only way to get to full screen is through remote desktop, which seems a clumsy way to achieve what VMware or VirtualBox can do with an extension.

EC2 or S3

21 Apr 2014

For a blog like ‘Pen & Paper’ that doesn’t generate much traffic and where speed isn’t too big a consideration, S3, Amazon’s Simple Storage Service, may provide a more sensible web platform as far as cost is concerned. I use EC2 for playing with shiny and learning app development, and I can easily turn it off to minimize recurring cost. Keeping it on 24/7 just for hosting a blog isn’t cost-effective. I wasn’t considering S3 when I was moving my Wordpress blog because I thought I could not git push to S3, but I have now found s3_website, which enables s3_website push with more than a few configuration options.

To use my private sub-domain in EC2, I would simply create an A Record pointing to the EC2 instance’s public IP. This is not feasible for S3, as the domain aliases of S3 buckets are all managed by Amazon’s Route 53. To use a private sub-domain for S3 buckets, a Route 53 subscription seems to be the only option.

Amazon has detailed documentation for setting up Route 53, but it took me some time to figure out how to keep the root domain as is and move only the sub-domain to Route 53. To do so, I had to leave the current DNS server for the root domain unchanged and create, in that root zone, separate NS records for the sub-domain using the 4 DNS server names shown as the ‘Delegation Set’ in the Hosted Zone Details in Route 53. After this, wait. I didn’t see any changes until several hours later.
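
As a rough sketch of the delegation described above, the records added at the existing DNS host amount to one NS record per Route 53 name server. The sub-domain and the name server names below are placeholders (assumptions), not a real Delegation Set, and the output uses generic BIND-style zone-file syntax; a registrar’s web form may present the same fields differently.

```shell
#!/bin/sh
# Print BIND-style NS records delegating a sub-domain to a set of name servers.
# All names here are hypothetical placeholders, not values from a real hosted zone.
delegation_records() {
    subdomain=$1
    shift
    for ns in "$@"; do
        # trailing dots mark fully qualified names in zone-file syntax
        printf '%s.\tIN\tNS\t%s.\n' "$subdomain" "$ns"
    done
}

delegation_records blog.example.com \
    ns-1.example-dns.org ns-2.example-dns.net \
    ns-3.example-dns.com ns-4.example-dns.co.uk
```

Once the change has propagated, dig NS blog.example.com +short (with the real sub-domain) should list the same four servers.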

From Wordpress to Jekyll

19 Apr 2014

I just took down my Wordpress blog and gave it a facelift with Jekyll & Bootstrap. Jekyll is efficient & flexible as a blogging platform - it uses markdown and doesn’t require a database; Bootstrap is an awesome web designing tool. I also took the opportunity to move from a shared host to EC2.

Jekyll can be installed on EC2 or on a local machine. I chose the latter option as I wanted to use Git for deployment. The idea is to simply git push ec2 as soon as I finish writing this and it should appear as a blog post. No need to ssh or sftp.

Here are the steps to set up Jekyll and Git for deployment to EC2:

On EC2

  1. set up a bare repo. This is where I will git push my posts or where my collaborators (if any) will git pull.

    $ mkdir ~/myblog.git && cd ~/myblog.git
    $ git init --bare --shared
  2. set up a directory in my $HOME and use it for the Git Work Tree. How this directory is symlinked to the server root will decide what URL people will use to access the ‘Jekyll SITE’ (see Back to EC2).

    $ mkdir ~/ABC
    $ cat > hooks/post-receive <<'EOF'
    #!/bin/sh
    # check out the pushed files into the work tree
    GIT_WORK_TREE=$HOME/ABC
    export GIT_WORK_TREE
    git checkout -f
    EOF
    $ chmod a+x hooks/post-receive
    $ mv hooks/post-update.sample hooks/post-update
    $ chmod a+x hooks/post-update

    That’s almost it for the EC2 config, except for symlinking the Jekyll files.

On Local Machine

  1. Set up Jekyll and init it as a git repo.

    $ jekyll new ec2_blog
    $ cd ec2_blog
    $ git init
    $ git add --all
    $ git commit -a -m "getting ready for first push"
    $ git remote add ec2 ssh://( EIP)/home/(

    First push:

    $ git push ec2 +master:refs/heads/master

    All subsequent pushes:

    $ git push ec2 
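
The whole push-to-deploy flow above can be rehearsed on a single machine before touching EC2. The following is only a sketch under assumptions: a scratch directory stands in for the EC2 home directory, the ‘ec2’ remote is a local path rather than an ssh URL, and the page content is a placeholder.

```shell
#!/bin/sh
# Local dry run of the push-to-deploy flow: a bare repo with a post-receive
# hook that checks the pushed files out into a work tree.
# DEMO, the local "ec2" remote path and the file contents are placeholders (assumptions).
set -e
DEMO=$(mktemp -d)
mkdir "$DEMO/ABC"                       # stands in for ~/ABC on EC2

# "On EC2": bare repo plus hook
git init --quiet --bare "$DEMO/myblog.git"
git --git-dir="$DEMO/myblog.git" symbolic-ref HEAD refs/heads/master
mkdir -p "$DEMO/myblog.git/hooks"
cat > "$DEMO/myblog.git/hooks/post-receive" <<EOF
#!/bin/sh
GIT_WORK_TREE="$DEMO/ABC"
export GIT_WORK_TREE
git checkout -f
EOF
chmod a+x "$DEMO/myblog.git/hooks/post-receive"

# "On Local Machine": a repo with one commit, pushed to the bare repo
git init --quiet "$DEMO/ec2_blog"
cd "$DEMO/ec2_blog"
git symbolic-ref HEAD refs/heads/master
echo '<h1>hello</h1>' > index.html
git add --all
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "getting ready for first push"
git remote add ec2 "$DEMO/myblog.git"
git push --quiet ec2 +master:refs/heads/master

ls "$DEMO/ABC"                          # the pushed files, checked out by the hook
```

If the work tree stays empty after the push, the usual suspects are a hook that is not executable or a GIT_WORK_TREE path that does not exist.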

On Domain Name Registrar

To use the root domain for the EC2 instance, point its A Record to the EC2 instance’s public IP (use an EIP, if possible, for a more permanent setup). To use a sub-domain, set it up and point its A Record to the EC2 instance’s public IP.

Back to EC2

All Jekyll files will appear in the directory ~/ABC after the first push. Now consider the sym linking:

  1. If I wish to access the Jekyll blog at the root of the domain, symlink everything in the ~/ABC/_site directory to the DocumentRoot (typically /var/www)

    $ cd /var/www
    $ sudo ln -s ~/ABC/_site/index.html ./
    $ sudo ln -s ~/ABC/_site/css ./
  2. If I wish to access the Jekyll blog under a ‘/blog’ path, symlink the _site directory to /var/www/blog

    $ sudo ln -s ~/ABC/_site /var/www/blog


    Jekyll, by default, will render its pages from root. To use this setup, it is necessary to set the baseurl variable in _config.yml and to add {{ site.baseurl }} to all links referring to the root. This will likely apply to all files in the _layouts and _includes directories, which may be pointing to stylesheets (css) and/or script files (js):

    Add the following line to _config.yml on the local machine
    baseurl: /blog

    This is, however, intended for the EC2 instance. To preview Jekyll on the local machine, I have to override this with the --baseurl switch and reset it to an empty string.

    $ jekyll serve --baseurl ""
  3. To reach the Jekyll blog using a sub-domain (my current setup), I’d symlink _site to /var/www/blog, same as in item 2 above. But instead of introducing the baseurl variable, I’d add a virtual host using the sub-domain as the ServerName and ‘/var/www/blog’ as the DocumentRoot. Open the default.conf file in /etc/apache2/sites-available/, make the following changes and save it as a new file vhost_jekyll.conf.

    DocumentRoot /var/www/blog
    <Directory /var/www/blog>
    ... keep everything here as is
    $ sudo ln -s /etc/apache2/sites-available/vhost_jekyll.conf /etc/apache2/sites-enabled/vhost_jekyll.conf
    $ sudo /etc/init.d/apache2 restart
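
The symlink layouts from the items above can be tried out in a scratch directory first, without sudo. This is a sketch only: the scratch paths are stand-ins (assumptions) for ~/ABC and the DocumentRoot, and the loop is simply one way of linking everything in _site in a single pass.

```shell
#!/bin/sh
# Rehearse the symlink layouts in a scratch directory instead of /var/www.
# SCRATCH/ABC and SCRATCH/www are stand-ins (assumptions) for ~/ABC and the DocumentRoot.
set -e
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/ABC/_site/css" "$SCRATCH/www"
echo '<h1>hello</h1>' > "$SCRATCH/ABC/_site/index.html"

# Layout 1: link each item of _site into the document root
for item in "$SCRATCH"/ABC/_site/*; do
    ln -s "$item" "$SCRATCH/www/"
done

# Layout 2 (also used with the virtual host): link the whole _site directory as "blog"
ln -s "$SCRATCH/ABC/_site" "$SCRATCH/www/blog"

ls -l "$SCRATCH/www"                    # shows where each link points
```

Because the links point at _site rather than copying it, whatever a later push places in _site is served without touching /var/www again.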

Blogging with R

16 May 2013

I used knitr, shiny server and R markdown to build my research artefact, a cloud-based note-taking application. Now that the dissertation is out of the way, I’m able to focus on completing the development of my note-taking application in the coming months.

Yihui has a few words of wisdom for all “brave professors” - students should be submitting their papers or assignments in R + knitr instead of boring Word documents. So true! Yes, R + knitr is exciting, fast and flexible. Everything I’m doing in this blog post is done in simple text, including the elegant Scatterplot Matrix below:

Scatterplot Matrix

The matrix summarises the association observed among the test variables, i.e., number of words recorded in lecture notes (Word), number of keywords captured (Keyword) and quiz results (Score).

[Figure: scatterplot matrix of Word, Keyword and Score]

One of the most interesting findings in my study was that the number of words recorded in lecture notes was negatively correlated with the test scores in the “Pen and Paper” group, but the same pair of variables had a positive correlation in the “Computer Application” group. Technology has completely reversed the relationship between “number of words in lecture notes” and “test scores”. Below is the correlation matrix with P-values (again, all done in text with exactly two lines of code):

Correlation Matrix & P-values

# P-value - Pen and Paper
##          Word Keyword Score
## Word     1.00    0.25 -0.60
## Keyword  0.25    1.00  0.13
## Score   -0.60    0.13  1.00
## n= 9 
## P
##         Word   Keyword Score 
## Word           0.5176  0.0854
## Keyword 0.5176         0.7311
## Score   0.0854 0.7311
# P-value - Computer App
##         Word Keyword Score
## Word    1.00    0.54  0.55
## Keyword 0.54    1.00  0.50
## Score   0.55    0.50  1.00
## n= 14 
## P
##         Word   Keyword Score 
## Word           0.0486  0.0415
## Keyword 0.0486         0.0676
## Score   0.0415 0.0676
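
For readers unfamiliar with the statistic, each entry in the upper half of the tables above is a correlation coefficient between two variables. The sketch below computes one in awk, assuming the usual Pearson formula (the post does not say which correlation type was used). The numbers fed in are made-up illustration data, not the study’s data, and pearson is a hypothetical helper name.

```shell
#!/bin/sh
# Pearson correlation of two whitespace-separated columns, computed with awk.
# The numbers passed in below are made-up illustration data, not the study's data.
pearson() {
    awk '{ n++; sx += $1; sy += $2; sxx += $1*$1; syy += $2*$2; sxy += $1*$2 }
         END {
             cov = n*sxy - sx*sy                       # n^2 times the covariance
             den = sqrt((n*sxx - sx*sx) * (n*syy - sy*sy))
             printf "%.2f\n", cov / den
         }'
}

# hypothetical (words, score) pairs where more words tend to go with lower scores
printf '%s\n' '100 9' '200 7' '300 8' '400 4' | pearson
```

A value near -1 or +1 indicates a strong linear relationship; with samples as small as n = 9, the accompanying P-value matters as much as the coefficient itself.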


08 Feb 2013

Got my 9-month free pass to Microsoft’s Azure Cloud. Wasted the entire week messing with virtual machines in the cloud.

Microsoft Azure offers a 3-month free pass to the general public, but that trial period is too short for the research. To get the 9-month free pass (over $1,500 in value), join the Azure Imagine Cup competition by taking a qualifying quiz. Azure is ideal for building web-based apps for mobile phones and/or Windows 8 tablets. Even if the research artefact is not web-based, it is easier to deploy/test the software using a cloud service. It is not very practical to ask students who participate in a research study to install a piece of test software on the college’s desktop computers or on their own computing devices (be it a mobile phone, a tablet or a laptop).