1 DESeq2 without removing unwanted variation

dds <- DESeqDataSetFromMatrix(countData = counts_table,
                              colData = design_df,
                              design = ~ condition)
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
dds <- DESeq(dds)

1.1 MA-plots

The threshold used for a dot to be colored in blue in the MA-plots is: p-value adjusted < 0.05.

Almost no Differentially Expressed Genes.

2 Accounting for unwanted variation computed by RUVs in DESeq2 design formula

# merging design_df with phenoData from RUVs, which contains the factors of unwanted variation
pData(ses3)$sample <- rownames(pData(ses3))
design_df <- merge(design_df, pData(ses3)[, c("sample", grep("W_", colnames(pData(ses3)), value = TRUE))], by = "sample", all = FALSE)
rownames(design_df) <- design_df$sample
counts_table <- counts_table[, rownames(design_df)]
# creating DESeqDataset
dds_RUVs <- DESeqDataSetFromMatrix(countData = counts_table,
                                    colData = design_df,
                                    design = ~ W_1 + W_2 + W_3 + condition)
# pre-filtering low count genes
keep <- rowSums(counts(dds_RUVs)) >= 10
dds_RUVs <- dds_RUVs[keep,]
# the standard differential expression analysis steps are wrapped into a single function, DESeq
dds_RUVs <- DESeq(dds_RUVs)
# DESeq results for all comparisons
res_all_RUVs <- get_results_all_comp(dds_RUVs)

2.1 MA-plots

2.1.1 FIGURE

2.2 Gene counts for DEGs of interest

2.3 Ribosomal protein gene (RPG) counts

Gene names of mouse ribosomal proteins (Rpl and Rps) are from RPG database.

To avoid the possible effect of higher injection load on RPG counts, I remove the active samples with the highest number of Btgh molecules from next boxplot:

I test the difference in the distributions shown with last boxplots, using Wilcoxon test:

  • pval for active vs dead is 0.3304004
  • pval for active vs non-inj is 0.007175
  • pval for dead vs non-inj is 0.0616803

So, even if the active vs dead comparison is not significant under threshold pval<0.05, the active vs ctrl is significant while the dead vs ctrl is not.

3 Test for possible slight developmental delay already starting at this stage and causing ribosomal genes downregulation

From the analysis of published dataset GSE66582 (Wu et al., 2016), spanning preimplantation stages from the oocyte to the blastocyst, in OGlcNAc_early_embryo_Formichetti2023/embryo_public_data_GSE66582_GSE76505_reanalysis/src/Rmd/DE_analysis.Rmd I created a list of genes which are >2x up-regulated from the late 2-cell to the 4-cell stage.

There is a slightly lower expression in Btgh-injected embryos. I test the difference in the distributions using Wilcoxon test, after removing C24:

4 RPG vs 2- to 4-cell upregulated genes

sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=it_IT.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] dplyr_1.0.10                RUVSeq_1.28.0              
##  [3] edgeR_3.36.0                limma_3.50.3               
##  [5] EDASeq_2.28.0               ShortRead_1.52.0           
##  [7] GenomicAlignments_1.30.0    Rsamtools_2.10.0           
##  [9] Biostrings_2.62.0           XVector_0.34.0             
## [11] BiocParallel_1.28.3         vsn_3.62.0                 
## [13] ggrepel_0.9.2               pheatmap_1.0.12            
## [15] gridExtra_2.3               ggpubr_0.5.0               
## [17] DESeq2_1.34.0               SummarizedExperiment_1.24.0
## [19] Biobase_2.54.0              MatrixGenerics_1.6.0       
## [21] matrixStats_0.62.0          GenomicRanges_1.46.1       
## [23] GenomeInfoDb_1.30.1         IRanges_2.28.0             
## [25] S4Vectors_0.32.4            BiocGenerics_0.40.0        
## [27] data.table_1.14.6           reshape2_1.4.4             
## [29] ggplot2_3.4.0              
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.4.1        aroma.light_3.24.0     BiocFileCache_2.2.1   
##   [4] plyr_1.8.8             splines_4.1.3          digest_0.6.30         
##   [7] invgamma_1.1           htmltools_0.5.3        SQUAREM_2021.1        
##  [10] fansi_1.0.3            magrittr_2.0.3         memoise_2.0.1         
##  [13] annotate_1.72.0        R.utils_2.12.2         prettyunits_1.1.1     
##  [16] jpeg_0.1-9             colorspace_2.0-3       blob_1.2.3            
##  [19] rappdirs_0.3.3         xfun_0.35              crayon_1.5.2          
##  [22] RCurl_1.98-1.9         jsonlite_1.8.3         genefilter_1.76.0     
##  [25] survival_3.2-13        glue_1.6.2             gtable_0.3.1          
##  [28] zlibbioc_1.40.0        DelayedArray_0.20.0    car_3.1-1             
##  [31] abind_1.4-5            scales_1.2.1           DBI_1.1.3             
##  [34] rstatix_0.7.1          Rcpp_1.0.9             xtable_1.8-4          
##  [37] progress_1.2.2         bit_4.0.5              preprocessCore_1.56.0 
##  [40] truncnorm_1.0-8        httr_1.4.4             RColorBrewer_1.1-3    
##  [43] ellipsis_0.3.2         farver_2.1.1           pkgconfig_2.0.3       
##  [46] XML_3.99-0.12          R.methodsS3_1.8.2      sass_0.4.2            
##  [49] dbplyr_2.2.1           deldir_1.0-6           locfit_1.5-9.6        
##  [52] utf8_1.2.2             labeling_0.4.2         tidyselect_1.2.0      
##  [55] rlang_1.0.6            AnnotationDbi_1.56.2   munsell_0.5.0         
##  [58] tools_4.1.3            cachem_1.0.6           cli_3.4.1             
##  [61] generics_0.1.3         RSQLite_2.2.18         broom_1.0.1           
##  [64] evaluate_0.18          stringr_1.4.1          fastmap_1.1.0         
##  [67] yaml_2.3.6             knitr_1.40             bit64_4.0.5           
##  [70] purrr_0.3.5            KEGGREST_1.34.0        R.oo_1.25.0           
##  [73] xml2_1.3.3             biomaRt_2.50.3         compiler_4.1.3        
##  [76] rstudioapi_0.14        filelock_1.0.2         curl_4.3.3            
##  [79] png_0.1-7              affyio_1.64.0          ggsignif_0.6.4        
##  [82] tibble_3.1.8           geneplotter_1.72.0     bslib_0.4.1           
##  [85] stringi_1.7.8          highr_0.9              GenomicFeatures_1.46.5
##  [88] lattice_0.20-45        Matrix_1.5-3           vctrs_0.5.1           
##  [91] pillar_1.8.1           lifecycle_1.0.3        BiocManager_1.30.22   
##  [94] jquerylib_0.1.4        cowplot_1.1.1          irlba_2.3.5.1         
##  [97] bitops_1.0-7           rtracklayer_1.54.0     R6_2.5.1              
## [100] BiocIO_1.4.0           latticeExtra_0.6-30    affy_1.72.0           
## [103] hwriter_1.3.2.1        codetools_0.2-18       MASS_7.3-55           
## [106] assertthat_0.2.1       rjson_0.2.21           withr_2.5.0           
## [109] GenomeInfoDbData_1.2.7 parallel_4.1.3         hms_1.1.2             
## [112] grid_4.1.3             prettydoc_0.4.1        tidyr_1.2.1           
## [115] rmarkdown_2.18         ashr_2.2-54            carData_3.0-5         
## [118] mixsqp_0.3-48          interp_1.1-3           restfulr_0.0.15