# Quickstart Tutorial
This tutorial provides a step-by-step guide to running the 3t-seq pipeline for the first time. We will use a small subset of a mouse lung dataset (GSE130735) to demonstrate the "Happy Path" of the pipeline.
## 1. Environment Setup
3t-seq uses Pixi for automated environment management.
### a. Install Pixi
If you haven't installed Pixi yet, run the following command (for Linux/macOS):
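A sketch of the install step; the one-line installer below is the command documented at pixi.sh at the time of writing — check that page for the current version:

```shell
# Download and run the official Pixi installer (Linux/macOS)
curl -fsSL https://pixi.sh/install.sh | bash

# Restart your shell (or source your shell rc file), then verify:
pixi --version
```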
Or follow the instructions on pixi.sh.
### b. Clone and Initialize
- Clone the repository:
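For example (the repository URL below is a placeholder — substitute the actual location of the 3t-seq repository):

```shell
# Clone the pipeline repository and enter it.
# NOTE: replace <repository-url> with the actual 3t-seq repository URL.
git clone <repository-url> 3t-seq
cd 3t-seq
```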
- Initialize Git LFS and pull test data:
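This step can look like the following (standard Git LFS commands: `git lfs install` configures the hooks once per machine, `git lfs pull` fetches the binary test assets):

```shell
# Set up Git LFS hooks for your user (once per machine)
git lfs install

# Download the LFS-tracked test assets (FASTQ, FASTA, etc.)
git lfs pull
```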
!!! tip "EMBL Cluster Users"
    On the EMBL cluster, you can load Git LFS with `module load git-lfs`.

!!! important
    Testing the pipeline requires Git Large File Storage (LFS) assets. The command above ensures that `git-lfs` is installed in your local environment and that all binary assets (FASTQ, FASTA, etc.) are downloaded.
- Install all dependencies:
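With Pixi available, dependency installation is a single command run from the repository root (`pixi install` resolves and installs everything pinned in the project's Pixi manifest and lock file):

```shell
# From the repository root: create the environment and
# install all pinned dependencies
pixi install
```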
## 2. A 5-Minute Run (Test Data)
We provide a built-in integration test dataset and configuration so you can verify your installation immediately.
### Run the Pipeline
Execute the following command to run the pipeline on the mouse lung subset. This uses the `laptop` profile, which is optimized for local execution.
```shell
pixi run snakemake \
    --profile .tests/integration/profiles/laptop \
    --configfile .tests/integration/configs/local-references.yaml
```
!!! tip
    This run will use reference files located in the `.tests/integration/references/` directory.
## 3. Exploring the Results
Once the run completes, you can find the outputs in the results/ directory as specified in the configuration.
- **Quality Control:** open `results/qc/multiqc/multiqc_report.html` for an overview of the run metrics.
- **Quantification Tables:** gene, TE, and tRNA counts are available in `results/analysis/tables/`.
- **Alignments:** sorted BAM files can be found in `results/alignments/`.
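A quick way to sanity-check the outputs from the shell (paths as listed above; `xdg-open` is Linux-specific — on macOS use `open` instead):

```shell
# List the generated count tables and sorted alignments
ls results/analysis/tables/
ls results/alignments/

# Open the MultiQC report in your default browser (Linux)
xdg-open results/qc/multiqc/multiqc_report.html
```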
## Next Steps
Now that you've successfully run the pipeline on test data, move to the next sections to learn how to prepare your own data and configure the pipeline for your experiments:
- **Preparing Data & Samples:** how to organize your FASTQ files and write a Sample Sheet.
- **Advanced Profiles:** using the `--profile` flag to manage resources and configurations.
- **Running & Reporting:** scaling up to HPC clusters and generating detailed reports.