Tagged: rna-seq
60 posts found
What FastQC Reports Actually Tell You (And What Beginners Miss)
A senior bioinformatician walks through the FastQC sections that beginners miss, with screenshots and decisions to make at each step.
From Wet Lab to Dry Lab: A Realistic Map of What to Learn First
A practical skill sequence for wet-lab biologists learning RNA-seq analysis: what to prioritise, what to safely skip, and what to outsource while you build.
Why Most Published GO Analyses Are Statistically Wrong
A 2022 PLOS Computational Biology study found 43% of GO enrichment analyses skip multiple-testing correction. Here is what that means and how to do it right.
Self-Service RNA-Seq For Labs Without A Bioinformatician
If your lab sequences more than it analyzes, here is what self-service RNA-seq looks like, what is safe to automate, and where you still need a human.
STAR vs Salmon vs HISAT2: When To Use Each (With Working Code)
STAR, Salmon, and HISAT2 each have a distinct use case. A practical comparison with working commands, real runtime and memory numbers, and DEG concordance data.
How To Submit RNA-Seq Results That Reviewers Cannot Reject
Reviewers reject RNA-seq papers for predictable reasons: missing FDR correction, version-less methods, inaccessible data. A checklist that prevents it.
Salmon From FASTQ to Counts: A Complete Pseudoalignment Tutorial
A complete Salmon tutorial with decoy-aware indexing, quantification flags explained, tximport into R, DESeq2 integration, and QC checks at every step.
How To Write an RNA-Seq Methods Section Reviewers Accept
A reviewer-proof RNA-seq methods section is shorter than you think but far more specific. Templates, required elements, and what reviewers always flag missing.
The Reproducibility Crisis in Bulk RNA-Seq: What Actually Breaks
Half of published RNA-seq pipelines fail when someone else tries to run them. A practitioner view of what breaks and how to build for reproducibility.
Differential Expression in Python with PyDESeq2: A Tutorial
PyDESeq2 brings DESeq2 statistics to Python. A complete tutorial covering model fitting, validation against R DESeq2, volcano plots, and enrichment export.
Why Reproducibility Should Not Be Optional in RNA-Seq Pipelines
Run snapshots, version pinning, and locked parameters should be the default, not a feature. A practitioner case for reproducibility-first RNA-seq platforms.
Bulk RNA-Seq for Bacteria: Operons and Why nf-core Breaks
Most bulk RNA-seq pipelines fail silently on bacterial data. Here is what changes for operons, GTF feature mismatches, and DE analysis in prokaryotes.
Publication-Ready RNA-Seq Plots in ggplot2: Volcano, Heatmap, PCA
Reviewer-ready RNA-seq plots in R: volcano with gene labels, z-score heatmap with annotation bars, PCA with variance explained, and journal export settings.
ORA vs GSEA: A Side-by-Side Tutorial in R with clusterProfiler
ORA and GSEA answer different questions. A working clusterProfiler tutorial with FDR correction, proper backgrounds, and side-by-side result interpretation.
The One-Bioinformatician Problem: Stop Being The Bottleneck
If you are the only bioinformatician serving multiple PIs, you are the bottleneck. Here is how to scale with templates, self-service, and clear handoffs.
Why Your DESeq2 Log2 Fold Change Cutoff Of Zero Is Wrong
Filtering DEGs at log2FC greater than zero returns half your genome. How to choose a defensible cutoff, apply lfcShrink, and avoid the GO-term explosion.
Nextflow vs No-Code Platforms: The Right Tool For Your Lab
Nextflow is powerful and steep. No-code platforms are fast and constrained. A clear decision framework for which fits your lab today, and when to use both.
GTF and GFF Files: Why They Hurt and How To Tame Them
GTF and GFF files from the same database often disagree, prokaryotic files lack exon features, and AGAT fixes some problems and breaks others. A practical field guide.
Industrial Bioinformatics Is Still In Its Infancy
Most commercial bioinformatics runs on academic instincts. A senior practitioner view on what industry needs and the engineering practices that close the gap.
Your First Nextflow Pipeline for RNA-Seq (Without Losing Your Mind)
A minimal Nextflow DSL2 RNA-seq pipeline in under 80 lines: three processes, channel wiring, Docker config, and how to read the execution report and DAG output.
Reducing GO Term Redundancy: simplify, rrvgo, and What Works
After enrichment you get hundreds of overlapping GO terms. A tutorial on clusterProfiler simplify, rrvgo, REVIGO, and a custom uniqueness-score fallback.
Pathway Enrichment Analysis: GSEA and ORA in R and Python
Tutorial for pathway enrichment analysis: GSEA with clusterProfiler and fgsea in R, ORA with enrichGO, and the Python equivalent using gseapy prerank and enrichr. Covers MSigDB Hallmark, KEGG, and GO sets.
Why Deterministic Pipelines Beat AI-Generated Ones for RNA-Seq
AI bioinformatics pipelines feel fast until you check the outputs. Here is when to trust AI, when to verify it, and when to use a deterministic platform.
fastp vs Trimmomatic vs BBDuk: A Benchmark on RNA-Seq Reads
A side-by-side benchmark of fastp, Trimmomatic, and BBDuk on paired-end RNA-seq data: speed, post-trim quality, mapping rate, and downstream DEG impact.
From Count Matrix to Volcano Plot: A DESeq2 Walkthrough in R
A complete DESeq2 tutorial in R: loading counts, building the design formula, running DE, applying lfcShrink, generating a volcano plot, and exporting results.
DESeq2 Contrasts: Multiple Conditions and Multi-Factor Designs
Three conditions, paired designs, two-factor experiments, and time courses: how to build the design formula, specify contrasts, and avoid common mistakes.
RNA-Seq Plots: Volcano, MA, and Heatmap in R and Python
Tutorial for publication-ready RNA-seq visualization: volcano plots with ggplot2 and ggrepel, MA plots, and DEG heatmaps with pheatmap and seaborn. Includes 300 dpi export for journals.
Bulk RNA-Seq Deconvolution: CIBERSORTx and MuSiC Tutorial
Estimate cell type proportions from bulk RNA-seq using CIBERSORTx and MuSiC. Reference selection, batch correction, validation, and result interpretation.
Bulk RNA-Seq Is Not Dead: When To Use It Over scRNA-Seq
Single-cell RNA-seq dominates conferences but bulk RNA-seq remains the right tool for most experiments. A decision framework for choosing your modality.
Detecting and Correcting Batch Effects in Bulk RNA-Seq
A tutorial on spotting batch effects with PCA, modeling them in DESeq2, and when to reach for ComBat-Seq, RUVSeq, or sva instead. Real code, real plots.
What The 2025-2026 Bioinformatics Hiring Shift Means For Your Workflow
Entry-level pipeline jobs are vanishing and AI-skilled senior roles are rising. What the 2025-2026 hiring shift signals about structuring RNA-seq work.
How to Run Differential Expression in Python with PyDESeq2
Complete PyDESeq2 tutorial: build a count matrix from Salmon output, fit a DeseqDataSet, run Wald tests, apply apeglm shrinkage, and export DEG results in Python. No R required.
How to Run DESeq2 in R: From Salmon Counts to DEG Results
Complete DESeq2 tutorial in R: import Salmon quant.sf files with tximeta, build a DESeqDataSet, run the Wald test, apply apeglm shrinkage, and export a ranked DEG table.
How to Build a Counts Matrix from featureCounts and Salmon in Python
Python tutorial: parse featureCounts output, aggregate Salmon quant.sf files, build a tx2gene map from a GTF, round estimated counts, and save a DESeq2-ready integer count matrix with pandas.
How to Run STAR Alignment for Bulk RNA-Seq (Step-by-Step)
Complete STAR alignment tutorial: download genome and GTF, build a genome index with the right sjdbOverhang, run paired-end alignment, generate GeneCounts, and load counts into R for DESeq2.
How to Build a Salmon Index and Quantify Bulk RNA-Seq Reads
Step-by-step Salmon tutorial: download GENCODE references, build a decoy-aware index, run salmon quant with gcBias and seqBias on all samples, and verify mapping rates before DESeq2.
How to Run FASTQ Quality Control with FastQC, fastp, and MultiQC
Full pipeline tutorial for bulk RNA-seq QC: run FastQC on raw reads, trim adapters with fastp, rerun QC, and aggregate reports with MultiQC. Includes parallel processing and how to read results.
How to Download RNA-Seq Data from GEO and SRA Using sra-tools and pysradb
Step-by-step tutorial for downloading bulk RNA-seq FASTQ files from GEO and SRA. Covers prefetch, fasterq-dump, pysradb metadata extraction, batch downloads, and fixes for common errors.
How to Make Volcano Plots and MA Plots in R: ggplot2 and EnhancedVolcano
Step-by-step R tutorial for publication-quality volcano plots and MA plots from DESeq2 results. Covers ggplot2 from scratch, ggrepel gene labeling, EnhancedVolcano, and plot interpretation.
PCA and Clustering for RNA-Seq QC in Python: Spot Outliers Before DESeq2
Python tutorial: normalize RNA-seq counts, run PCA with scikit-learn, plot interactively with plotly, build a sample distance heatmap, and detect outliers before differential expression.
Differential Expression Analysis in Python with PyDESeq2: A Complete Tutorial
Run DESeq2 differential expression analysis entirely in Python using PyDESeq2. Learn DeseqDataSet, DeseqStats, apeglm shrinkage, multi-factor designs, and pandas result filtering.
How to Run DESeq2: A Complete Walkthrough from Count Matrix to Results
Step-by-step DESeq2 tutorial in R: build a DESeqDataSet, understand size factors and dispersion, run DESeq(), interpret results columns, apply lfcShrink with apeglm, and filter DEGs.
How to Set Up a Bulk RNA-Seq Analysis Environment on Ubuntu and macOS
Step-by-step guide to installing Miniforge, conda, bioconda, R 4.4, and DESeq2 for bulk RNA-seq analysis. Reproducible environments, version pinning, and fixes for common install errors.
How to Quantify RNA-Seq Reads with Salmon: Index, Quant, and Import to R
Step-by-step Salmon RNA-seq tutorial: build a decoy-aware index, run salmon quant on paired-end reads, understand quant.sf output, and import into DESeq2 with tximport.
Importing Salmon Output into R: tximeta, tximport, and DESeq2 Setup
Complete R tutorial for importing Salmon quant.sf files with tximeta and tximport. Build a tx2gene table, fix ID mismatch errors, and set up a DESeqDataSet for multi-factor designs.
Why Cell Line RNA-Seq Experiments Fail: Passage, Mycoplasma, and Culture Batch Effects
Passage number drift, undetected mycoplasma, serum lot changes, and pseudoreplication silently corrupt cell line RNA-seq. Here is what each problem looks like and how to prevent it.
STAR vs HISAT2 vs Salmon: Which Aligner Should You Use?
STAR does full genome alignment. HISAT2 uses less memory. Salmon skips alignment entirely. Here is what each approach actually means for your RNA-seq results and when each one is the right call.
What Is GSEA and Why Does It Beat a Simple DEG List
Gene Set Enrichment Analysis finds coordinated pathway signals that gene-by-gene testing misses. Here is how the algorithm works, what the output means, and how to run it with fgsea and clusterProfiler in R.
What Actually Happens to Your RNA Sample Before It Becomes Data
From tissue extraction to FASTQ file: a clear breakdown of RNA-seq library prep, sequencing chemistry, and what goes wrong at each step.
When to Use edgeR vs DESeq2 vs limma-voom
DESeq2, edgeR, and limma-voom all test for differential expression but use different statistical models, different normalization, and different assumptions. Here is when each one wins.
Understanding Your QC Report: What FastQC and MultiQC Are Telling You
A module-by-module guide to reading FastQC and MultiQC output for RNA-seq data: what each plot means, which failures matter, and which you can safely ignore.
How DESeq2 Actually Works (Without the Math Overload)
The negative binomial model, size factors, dispersion shrinkage, and what each output column really means: a clear explanation of DESeq2 for working researchers.
Batch Effects: The Silent Killer of RNA-Seq Studies
What batch effects are, how they arise, how to detect them with PCA, and when to use ComBat-seq vs limma removeBatchEffect vs a design covariate to correct them.
What Is a Count Matrix and Why Does It Matter
Raw counts, TPM, FPKM, and DESeq2 normalized values all represent gene expression differently. Here is what each one is, why the differences matter, and which to use for each downstream task.
Experimental Design Mistakes That Kill Your Differential Expression Analysis
Replicates, confounders, paired designs, and pseudoreplication: the experimental design decisions that determine whether your DESeq2 results are trustworthy before you touch the data.
Why Your Choice of Reference Genome Changes Your Results
GENCODE, Ensembl, UCSC, and RefSeq annotate the same genome differently. Here is how annotation choice affects RNA-seq alignment, quantification, and which genes appear significant.
Trimming Adapters with Trimmomatic and fastp: A Side-by-Side Walkthrough
When adapter trimming helps, when it hurts, and how to run Trimmomatic and fastp on RNA-seq data with the parameter choices that actually matter.
How to Run FastQC and MultiQC on Raw RNA-Seq Reads
A hands-on guide to automating RNA-seq QC across dozens of samples using FastQC and MultiQC, with bash and Python scripts for parsing and flagging failures.
Raw Reads to Counts: The Bulk RNA-Seq Pipeline Explained
A practical breakdown of every computational step in bulk RNA-seq: from FASTQ quality control through trimming, alignment, and quantification to your final count matrix.
Batch Effects Will Ruin Your RNA-Seq Results
Batch effects silently corrupt bulk RNA-seq data. Learn how to detect them, why they happen, and which correction methods actually work.