Hi! I'm Tommy Tang

Tutorial made for you: from fastq to differentially expressed genes

Published about 2 months ago • 1 min read

Hi Bioinformatics lovers,

How is your weekend going? We had a leaking toilet, so it is a little hectic with three kids. Anyway, you should expect a newsletter from me every Saturday morning.

Today's issue is a little delayed.

For those who want to learn bulk RNAseq analysis, check out these two tutorials:

  1. How to preprocess GEO bulk RNAseq data with salmon https://divingintogeneticsandgenomics.com/post/how-to-preprocess-geo-bulk-rnaseq-data-with-salmon/
  2. Downstream of bulk RNAseq: read in salmon output using tximport and then DESeq2 https://divingintogeneticsandgenomics.com/post/downstream-of-bulk-rnaseq-read-in-salmon-output-using-tximport-and-then-deseq2/

Unpopular opinion in bioinformatics from me: knowing the tools out there is more crucial than starting from scratch. Most problems you face have been tackled before. Don’t reinvent the wheel; save time by using existing tools.

Moreover, the established packages will typically have fewer errors than a quick and dirty implementation. However, do not blindly trust any package either.

For a comprehensive guide on staying up-to-date with the latest advancements in this field, check out my video: "6 Proven Strategies to Keep Pace with Evolving Trends in Bioinformatics".

Other resources from this week:

  1. Bioinformaticians are impatient, and Conda is notoriously slow. Then I found mamba https://github.com/mamba-org/mamba, and then pixi https://prefix.dev/blog/pixi_a_fast_conda_alternative: blazing fast cross-platform package management for teams by the creators of the mamba package manager.
  2. Snakemake new feature: Updating existing output files https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#updating-existing-output-files
  3. Understanding the Path Variable in Unix Systems
  4. Align multiple ggplot2 plots by axis
  5. "Variant scores calculated by GATK did not clearly distinguish true positives from false positives in the vast majority of cases, implying that hard-filtering with GATK could be challenging."
  6. DeepSomatic is an extension of deep learning-based variant caller DeepVariant
  7. ExpoSeq: simplified analysis of high-throughput sequencing data from antibody discovery campaigns | Bioinformatics Advances | Oxford Academic
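
If the PATH variable from item 3 feels abstract, here is a minimal Python sketch of the idea: the shell walks the directories listed in PATH from left to right and runs the first matching executable it finds. The function name find_on_path is hypothetical, made up for this illustration, and the sketch skips empty PATH entries (which a POSIX shell would treat as the current directory), so take it as a rough model and not an exact copy of what your shell does.

```python
import os


def find_on_path(command, path=None):
    """Return the first executable called `command` found on `path`, else None.

    `path` defaults to the PATH environment variable. This is a simplified
    model of shell lookup (hypothetical helper, not a real shell API).
    """
    if path is None:
        path = os.environ.get("PATH", "")
    # PATH is one string of directories joined by os.pathsep (":" on Unix).
    for directory in path.split(os.pathsep):
        if not directory:
            # Simplification: ignore empty entries instead of using the cwd.
            continue
        candidate = os.path.join(directory, command)
        # The first executable regular file wins; later directories are ignored.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Prints the full path of the tool the shell would likely run, or None.
    print(find_on_path("salmon"))
```

This is why the order of directories in PATH matters: if two environments both provide the same tool, the one whose directory comes first is the one you get. In real scripts, Python's built-in shutil.which does this lookup for you.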

Happy Learning!

Tommy

Let's connect on Twitter and LinkedIn!

I am a computational biologist with six years of wet lab experience and over ten years of computational experience. I will help you learn computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter! https://github.com/crazyhottommy/getting-started-with-genomics-tools-and-resources