[Update on SkyMap JHub] Go from querying >400,000 RNAseq profiles to analysis without coding

We have made significant updates to the JupyterHub since we last posted. QUICKSTART: CLICK HERE to use our JupyterHub to go from reprocessed omic data to publication figures in < 2 minutes. We now offer an FTP site for bulk download of the reprocessed SkyMap data. We now offer a graphical user interface so that even …

Design rationale for SkyMap JupyterHub: How can a Jupyter notebook extract the expression levels or allelic read counts from > 400,000 sequencing runs in seconds?

This blog post covers some of the rationale behind the design of SkyMap, a project that makes >400,000 sequencing runs accessible to everyone. It may be useful to you when you design your next Big Data application in bioinformatics. I have listed some of the problems that I faced and …

Computer vs. Human: From a computer nerd who went into bioinformatics

A lot of people ask me how I went from computer science to bioinformatics. Actually, the two fields aren’t that different. Computers store long-term data on a disk as 0s and 1s, while cells store long-term data in DNA as A, C, G, and T. Computers store transient data in the cache or RAM, while …

Getting results from Big Data without the Big Infrastructure problem: Cloud + Docker + Kubernetes

Imagine what would happen if every day at lunchtime, you had to consciously coordinate all of the steps in digestion: breaking down the food in your stomach, pushing the food through your intestines, and telling yourself to stop feeling hungry after eating. You would spend the entire day coordinating your digestive system! Eating is …

The PhD versus Online Dating

A Ph.D. is very much like a marriage with your advisor, as suggested by this PHD Comics post. After all, the term “Ph.D.” stands for Doctor of Philosophy, where the word “philosophy” is composed of the Greek roots philo- (“love”) and -sophia (“wisdom”). So maybe there are some skills transferable from your love life to …

Buying computing infrastructure vs adopting the Cloud

A recurring question in data-intensive workplaces is which computing infrastructure to use. In the past four years as a bioinformatics Ph.D. student, I have both received and offered advice, solicited and unsolicited, about computing infrastructure, drawing on my prior experience in a high-performance computing lab and my current expertise in data analytics. This blog post …