1 DESeq2 without removing unwanted variation
<- DESeqDataSetFromMatrix(countData = counts_table,
dds colData = design_df,
design = ~ condition)
<- rowSums(counts(dds)) >= 10
keep <- dds[keep,]
dds <- DESeq(dds) dds
1.1 MA-plots
The threshold used for a dot to be colored in blue in the MA-plots is: p-value adjusted < 0.05.
Almost no Differentially Expressed Genes.
2 Accounting for unwanted variation computed by RUVs in DESeq2 design formula
# merging design_df with phenoData from RUVs, which contains the factors of unwanted variation
pData(ses3)$sample <- rownames(pData(ses3))
<- merge(design_df, pData(ses3)[, c("sample", grep("W_", colnames(pData(ses3)), value = TRUE))], by = "sample", all = FALSE)
design_df rownames(design_df) <- design_df$sample
<- counts_table[, rownames(design_df)] counts_table
# creating DESeqDataset
<- DESeqDataSetFromMatrix(countData = counts_table,
dds_RUVs colData = design_df,
design = ~ W_1 + W_2 + W_3 + condition)
# pre-filtering low count genes
<- rowSums(counts(dds_RUVs)) >= 10
keep <- dds_RUVs[keep,]
dds_RUVs # the standard differential expression analysis steps are wrapped into a single function, DESeq
<- DESeq(dds_RUVs)
dds_RUVs # DESeq results for all comparisons
<- get_results_all_comp(dds_RUVs) res_all_RUVs
2.1 MA-plots
2.1.1 FIGURE
2.2 Gene counts for DEGs of interest
2.3 Ribosomal protein gene (RPG) counts
Gene names of mouse ribosomal proteins (Rpl and Rps) are from RPG database.
To avoid the possible effect of higher injection load on RPG counts, I remove the active samples with the highest number of Btgh molecules from next boxplot:
I test the difference in the distributions shown with last boxplots, using Wilcoxon test:
- pval for active vs dead is 0.3304004
- pval for active vs non-inj is 0.007175
- pval for dead vs non-inj is 0.0616803
So, even if the active vs dead comparison is not significant under threshold pval<0.05, the active vs ctrl is significant while the dead vs ctrl is not.
3 Test for possible slight developmental delay already starting at this stage and causing ribosomal genes downregulation
From the analysis of published dataset GSE66582 (Wu et al., 2016), spanning preimplantation stages from the oocyte to the blastocyst, in OGlcNAc_early_embryo_Formichetti2023/embryo_public_data_GSE66582_GSE76505_reanalysis/src/Rmd/DE_analysis.Rmd I created a list of genes which are >2x up-regulated from the late 2-cell to the 4-cell stage.
There is a slightly lower expression in Btgh-injected embryos. I test the difference in the distributions using Wilcoxon test, after removing C24:
- pval for active vs dead is 0.3877319
- pval for active vs non-inj is 0.0353359
- pval for dead vs non-inj is 0.2982525
4 RPG vs 2- to 4-cell upregulated genes
sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=it_IT.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=it_IT.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=it_IT.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] dplyr_1.0.10 RUVSeq_1.28.0
## [3] edgeR_3.36.0 limma_3.50.3
## [5] EDASeq_2.28.0 ShortRead_1.52.0
## [7] GenomicAlignments_1.30.0 Rsamtools_2.10.0
## [9] Biostrings_2.62.0 XVector_0.34.0
## [11] BiocParallel_1.28.3 vsn_3.62.0
## [13] ggrepel_0.9.2 pheatmap_1.0.12
## [15] gridExtra_2.3 ggpubr_0.5.0
## [17] DESeq2_1.34.0 SummarizedExperiment_1.24.0
## [19] Biobase_2.54.0 MatrixGenerics_1.6.0
## [21] matrixStats_0.62.0 GenomicRanges_1.46.1
## [23] GenomeInfoDb_1.30.1 IRanges_2.28.0
## [25] S4Vectors_0.32.4 BiocGenerics_0.40.0
## [27] data.table_1.14.6 reshape2_1.4.4
## [29] ggplot2_3.4.0
##
## loaded via a namespace (and not attached):
## [1] backports_1.4.1 aroma.light_3.24.0 BiocFileCache_2.2.1
## [4] plyr_1.8.8 splines_4.1.3 digest_0.6.30
## [7] invgamma_1.1 htmltools_0.5.3 SQUAREM_2021.1
## [10] fansi_1.0.3 magrittr_2.0.3 memoise_2.0.1
## [13] annotate_1.72.0 R.utils_2.12.2 prettyunits_1.1.1
## [16] jpeg_0.1-9 colorspace_2.0-3 blob_1.2.3
## [19] rappdirs_0.3.3 xfun_0.35 crayon_1.5.2
## [22] RCurl_1.98-1.9 jsonlite_1.8.3 genefilter_1.76.0
## [25] survival_3.2-13 glue_1.6.2 gtable_0.3.1
## [28] zlibbioc_1.40.0 DelayedArray_0.20.0 car_3.1-1
## [31] abind_1.4-5 scales_1.2.1 DBI_1.1.3
## [34] rstatix_0.7.1 Rcpp_1.0.9 xtable_1.8-4
## [37] progress_1.2.2 bit_4.0.5 preprocessCore_1.56.0
## [40] truncnorm_1.0-8 httr_1.4.4 RColorBrewer_1.1-3
## [43] ellipsis_0.3.2 farver_2.1.1 pkgconfig_2.0.3
## [46] XML_3.99-0.12 R.methodsS3_1.8.2 sass_0.4.2
## [49] dbplyr_2.2.1 deldir_1.0-6 locfit_1.5-9.6
## [52] utf8_1.2.2 labeling_0.4.2 tidyselect_1.2.0
## [55] rlang_1.0.6 AnnotationDbi_1.56.2 munsell_0.5.0
## [58] tools_4.1.3 cachem_1.0.6 cli_3.4.1
## [61] generics_0.1.3 RSQLite_2.2.18 broom_1.0.1
## [64] evaluate_0.18 stringr_1.4.1 fastmap_1.1.0
## [67] yaml_2.3.6 knitr_1.40 bit64_4.0.5
## [70] purrr_0.3.5 KEGGREST_1.34.0 R.oo_1.25.0
## [73] xml2_1.3.3 biomaRt_2.50.3 compiler_4.1.3
## [76] rstudioapi_0.14 filelock_1.0.2 curl_4.3.3
## [79] png_0.1-7 affyio_1.64.0 ggsignif_0.6.4
## [82] tibble_3.1.8 geneplotter_1.72.0 bslib_0.4.1
## [85] stringi_1.7.8 highr_0.9 GenomicFeatures_1.46.5
## [88] lattice_0.20-45 Matrix_1.5-3 vctrs_0.5.1
## [91] pillar_1.8.1 lifecycle_1.0.3 BiocManager_1.30.22
## [94] jquerylib_0.1.4 cowplot_1.1.1 irlba_2.3.5.1
## [97] bitops_1.0-7 rtracklayer_1.54.0 R6_2.5.1
## [100] BiocIO_1.4.0 latticeExtra_0.6-30 affy_1.72.0
## [103] hwriter_1.3.2.1 codetools_0.2-18 MASS_7.3-55
## [106] assertthat_0.2.1 rjson_0.2.21 withr_2.5.0
## [109] GenomeInfoDbData_1.2.7 parallel_4.1.3 hms_1.1.2
## [112] grid_4.1.3 prettydoc_0.4.1 tidyr_1.2.1
## [115] rmarkdown_2.18 ashr_2.2-54 carData_3.0-5
## [118] mixsqp_0.3-48 interp_1.1-3 restfulr_0.0.15