Year 2019

From EvoBioLabatHunter

TD Project

  • Credit: Christopher Panlasigui
  • Brief computational/statistical steps:
  1. Quantile normalization between all replicates (with log2 transformation)
  2. Linear model among the 3 groups
  3. Select top significant genes
  4. Map gene names
  5. Cluster and show by interactive heatmaps in R
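As a rough illustration of step 1, quantile normalization can be sketched in base R on toy data. This is not the preprocessCore implementation (which handles ties differently and, in its robust variant, can down-weight or drop outlying arrays); the function name `quantile_normalize` and the toy matrix are hypothetical.

```r
# Minimal quantile normalization sketch (base R only; toy data, not the project's counts)
quantile_normalize <- function(mat) {
  # Rank the values within each column; ties.method = "first" keeps ranks unique
  ranks <- apply(mat, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted values across columns
  ref <- rowMeans(apply(mat, 2, sort))
  # Replace each value by the reference value at its rank
  out <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(mat)
  out
}

set.seed(1)
toy <- matrix(rpois(20, lambda = 50) + 1, nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("rep", 1:4)))
toy.norm <- quantile_normalize(log2(toy))

# After normalization every column holds the same set of values
apply(toy.norm, 2, sort)
```

The point of the sketch is only that all replicates end up sharing one distribution, so differences between columns reflect rank changes rather than overall intensity shifts.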
# Quantile normalization
library(preprocessCore)
w2.mat <- as.matrix(w2[, 3:17])  # convert the data columns (3 to 17) to a matrix
w2.mat.norm <- normalize.quantiles.robust(w2.mat, copy = TRUE, use.log2 = TRUE)
rownames(w2.mat.norm) <- w2$Geneid

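# linear model
library(nlme)  # assumed source of lmList(); lme4 provides a function of the same name
# w2.norm.melt: long-format table with one row per gene and replicate
# (columns log.counts, group, geneID); its construction is not shown here
fits <- lmList(log.counts ~ group | geneID, data=w2.norm.melt)
# overall F-test p-value for each per-gene fit
lm.sum <- lapply(fits, function(x){
  out <- summary(x)
  fstat <- out$fstatistic
  pf(fstat[1], fstat[2], fstat[3], lower.tail=FALSE)
})
p.df <- data.frame(gene=names(fits), p.val=as.numeric(lm.sum))
# assumes the rows of p.df are in the same gene order as w2
w2.out <- cbind(w2[,1:2], w2.mat.norm, p.df[,1:2])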

# heatmap (td.mat: matrix of the selected genes; its construction is not shown here)
library(heatmaply)
heatmaply(td.mat, scale = "none", cexRow = 0.50, colors = colorspace::diverge_hsv(16), branches_lwd = 0.3)
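Steps 2 and 3 (per-gene linear model, then selecting the most significant genes) can also be sketched in base R on simulated data. Everything here is hypothetical: the group labels, the assumed layout of 3 groups with 5 replicates each, the planted effect in `gene1`, and the helper `gene_pval`. The cutoff 1e-5 is the one quoted in the results.

```r
# Sketch of steps 2-3 in base R on simulated data (hypothetical names throughout)
set.seed(42)
n.genes <- 50
group <- factor(rep(c("WT", "mutA", "mutB"), each = 5))  # assumed: 3 groups x 5 replicates
mat <- matrix(rnorm(n.genes * 15, mean = 8), nrow = n.genes,
              dimnames = list(paste0("gene", seq_len(n.genes)), NULL))
mat[1, group == "mutB"] <- mat[1, group == "mutB"] + 10  # plant one strongly shifted gene

# Overall F-test p-value of the model y ~ group for a single gene
gene_pval <- function(y, g) {
  fstat <- summary(lm(y ~ g))$fstatistic
  unname(pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE))
}

# One p-value per row (gene); names are taken from the row names
p.val <- apply(mat, 1, gene_pval, g = group)

# Keep genes below the significance cutoff
top.genes <- names(p.val)[p.val < 1e-5]
```

Fitting one `lm()` per row gives the same kind of overall F-test p-value as the `lmList()` approach, without requiring the long-format table.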
  • Results
  1. Heatmap 1: genes with p < 1e-5, shown as fold change over the wild-type mean
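The fold change over the wild-type mean could be computed along the following lines. This is a sketch on toy data; whether `td.mat` was built this way is an assumption, and the sample layout and all object names (`norm.mat`, `wt.mean`, `fc.mat`) are hypothetical.

```r
# Sketch: log2 fold change of every sample over the wild-type mean (toy data)
set.seed(7)
group <- rep(c("WT", "mutA", "mutB"), each = 5)  # assumed sample layout
norm.mat <- matrix(rnorm(4 * 15, mean = 8), nrow = 4,
                   dimnames = list(paste0("gene", 1:4), paste0(group, "_", 1:5)))

# Per-gene mean of the wild-type columns (values assumed to be on the log2 scale)
wt.mean <- rowMeans(norm.mat[, group == "WT", drop = FALSE])

# On the log2 scale a fold change is a difference; subtracting a vector of
# length nrow(norm.mat) from the matrix subtracts it row-wise
fc.mat <- norm.mat - wt.mean
```

By construction the wild-type columns of `fc.mat` average to zero for each gene, so a diverging color palette centered at zero reads directly as up- or down-regulation relative to wild type.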