Installation
Via pip
Requirements
The following dependencies need to be installed in order to run synphage on your system.
- Python 3.11
- A Python package manager such as pip or uv
- Blast+ >= 2.12.0
Install Python and Blast+ using your package manager of choice, or by downloading an installer appropriate for your system from python.org and from the NCBI respectively.
The Python package manager pip is installed by default with Python. However, you may need to upgrade pip to the latest version:
pip install --upgrade pip
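The requirements above can be checked from a terminal before installing synphage. The following is a minimal sketch, not part of synphage: the helper name `version_ge` is hypothetical, it relies on the `-V` (version sort) option of GNU `sort`, and it assumes that `blastn -version` prints the version on its first line in the form `blastn: 2.12.0+`.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on `sort -V` (GNU coreutils), which orders version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report the Python version found on PATH.
if command -v python3 >/dev/null 2>&1; then
    py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    echo "Python $py_version found; the requirements list Python 3.11"
else
    echo "python3 not found on PATH"
fi

# Compare the installed Blast+ version with the documented minimum.
# Assumption: the first output line looks like "blastn: 2.12.0+".
if command -v blastn >/dev/null 2>&1; then
    blast_version=$(blastn -version | head -n 1 | sed 's/^blastn: *//; s/+$//')
    if version_ge "$blast_version" 2.12.0; then
        echo "Blast+ $blast_version: OK"
    else
        echo "Blast+ $blast_version is older than 2.12.0"
    fi
else
    echo "blastn not found on PATH"
fi
```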
Install synphage
synphage is available as a Python package and can be installed with the Python package manager pip from a terminal window.
# Latest
pip install synphage
# Latest (alternative: invoke pip through the Python interpreter)
python -m pip install synphage
Step-by-step installation of synphage in the Windows Subsystem for Linux (WSL):
# Install the dependencies needed to build Python
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev
# Get the install package for python
wget https://www.python.org/ftp/python/3.11.9/Python-3.11.9.tgz
# Unpack the tarball file
tar -zxvf Python-3.11.9.tgz
# Build Python
cd Python-3.11.9/
./configure --enable-optimizations # (video: 2:39-3:22)
make -j 2 # (video: 3:27-7:44)
sudo make install # (video: 8:05-8:25)
# Test the Python installation
python3.11 -V
# Python installed
cd ..
# Install dependencies
sudo apt install libcairo2-dev pkg-config python3-dev
# Create project folder
mkdir -p ~/synphage_home
cd ~/synphage_home
# Create python environment
python3.11 -m venv .venv
source ./.venv/bin/activate
# Install synphage
pip install synphage
# Install the Blast+ dependency
sudo apt install ncbi-blast+
# Run synphage
mkdir -p dagster_home
DAGSTER_HOME=$PWD/dagster_home dagster dev -h 0.0.0.0 -p 3000 -m synphage
Run synphage
synphage uses the following environment variables:
- INPUT_DIR: path to the folder containing the user's GenBank files. If not set, this path defaults to the temp folder. It can also be changed at run time.
- OUTPUT_DIR: path to the folder where the data generated during the run is stored. If not set, this path defaults to the temp folder.
- EMAIL (optional): for connecting to the NCBI database.
- API_KEY (optional): for connecting to the NCBI database and downloading files.
- DAGSTER_HOME (optional): for storing metadata generated during previous runs of the pipeline.
Optional env
EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored. DAGSTER_HOME is only needed to keep track of previous runs and generated metadata. Leaving it unset does not affect data storage.
Setting your env
These variables can be set with a .env file located in your working directory (Dagster will automatically load them from the .env file when initialising the pipeline):
INPUT_DIR=path/to/my/data/
OUTPUT_DIR=path/to/synphage/data
EMAIL=user.email@email.com
API_KEY=UserApiKey
or they can be passed in the terminal before starting synphage:
export INPUT_DIR=<path_to_data_folder>
export OUTPUT_DIR=<path_to_synphage_folder>
export EMAIL=user.email@email.com
export API_KEY=UserApiKey
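As a minimal sketch of the .env approach, the commands below write a .env file in a scratch working directory and load it into the current shell to inspect the values. The scratch directory, the `genbank` and `data` folder names and the placeholder values are illustrative assumptions. Loading the file by hand with `set -a` is only a way to check its contents, since Dagster is described above as reading the file itself.

```shell
# Work in a scratch directory so that no existing .env file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p genbank data

# Write the .env file: one KEY=value pair per line, without the `export` keyword.
cat > .env <<EOF
INPUT_DIR=$workdir/genbank
OUTPUT_DIR=$workdir/data
EMAIL=user.email@email.com
API_KEY=UserApiKey
EOF

# Optional check: load the file into the current shell.
# `set -a` marks every variable assigned while it is active for export.
set -a
. ./.env
set +a

echo "INPUT_DIR=$INPUT_DIR"
echo "OUTPUT_DIR=$OUTPUT_DIR"
```

Starting synphage from the directory that contains the .env file keeps the configuration next to the project instead of in the shell history.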
Data Input and Output
The input data are the GenBank files located in INPUT_DIR. However, a path to another data location can be passed at run time to load data from a different directory.
Warning
- Only a single path can be configured per loading job run.
- The use of special characters in file names might cause errors downstream.
GenBank file extensions
.gb and .gbk are both valid extensions for GenBank files.
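The two notes above (avoid special characters, use a .gb or .gbk extension) can be checked before a run. This is a minimal sketch: the function name is hypothetical, and because the text does not define which characters count as special, the sketch assumes that only ASCII letters, digits, dots, underscores and hyphens are safe.

```shell
# Hypothetical helper: succeeds when a file name looks safe to load.
# Assumption: "safe" means ASCII letters, digits, '.', '_' and '-' only,
# with a .gb or .gbk extension.
is_safe_genbank_name() {
    name=$(basename "$1")
    case "$name" in
        *[!A-Za-z0-9._-]*) return 1 ;;   # contains a character outside the safe set
        ?*.gb|?*.gbk)      return 0 ;;   # accepted GenBank extensions
        *)                 return 1 ;;   # any other extension
    esac
}

# Usage: report the GenBank files that may need renaming before loading.
for f in "${INPUT_DIR:-.}"/*.gb*; do
    [ -f "$f" ] || continue
    is_safe_genbank_name "$f" || echo "check file name: $f"
done
```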
All output data are located in the OUTPUT_DIR set by the user.
This directory can be reused in future runs if the user needs to process additional sequences or simply generate additional synteny diagrams.
Warning
- If no output directory is set, the data folder defaults to the temporary folder. Be aware that the naming convention for the temporary folder (temp/, tmp/, ...) depends on your system.
Tip
The current data directory can be checked in the config panel of the jobs.
Start synphage via the Dagster web-based interface
To start synphage, run the following command:
dagster dev -h 0.0.0.0 -p 3000 -m synphage
Tip
As synphage uses dagster-webserver, the -h and -p flags are required to visualise the pipeline in your browser:
-h : Host to use for the Dagster webserver
-p : Port to use for the Dagster webserver
To access the webserver, follow the link displayed in your terminal or copy/paste it into your web browser. In this example:
http://0.0.0.0:3000
Dagster running from the terminal and link to the webserver
Stop synphage
After completing your work, you can close the web browser and stop the process running in the terminal with Ctrl+C.
Dagster shutting down
Via synphage docker image
Requirements
The following dependency needs to be installed in order to run the synphage Docker image on your system.
Docker or Docker Desktop
- Install Docker Desktop from the executable.
- Check the full Docker documentation for your system (Linux, Mac or Windows).
Info
When installing docker from the website, the right version should automatically be selected for your computer.

Pull synphage image
- Open the Docker Desktop app and go to Images.
- Go to the search bar and search for synphage.

Note
The latest image is selected automatically (advised).
- Pull the image: select Pull and wait for the download to complete.
- The synphage Docker image is now installed.
Note
Your Dashboard might look a bit different depending on the Docker Desktop version and your OS.
# Pull the image from docker hub
docker pull vestalisvirginis/synphage:<tag>
# Check the list of installed Docker Images
docker image ls
Replace <tag> with the latest image tag.
Run synphage container
- Start the container
- Open the drop-down menu Optional settings:
- Set the host port to 3000
Tip
Setting the port is required to run synphage as it uses a web interface.
3000 is given as an example; any other available port can be used.
Warning
Make sure that the port is available and not already in use (by another running container, for example).
- Set the Volumes
- Data Output
All output data are located in the /data directory of the container.
The output data can be copied from the /data folder after the run, or they can be stored in a Docker Volume that can be mounted to a new Docker container and reused in subsequent runs if the user needs to process additional sequences or simply generate additional synteny diagrams.
Create a Docker Volume for your data
Mount your volume to the docker data volume when starting your container
Download the data from the container to your computer
- Dagster home
Metadata generated during successive runs of the pipeline are stored in the /dagster directory.
Setting a DAGSTER_HOME Volume is only necessary to keep track of previous runs and generated metadata. Data storage is not affected if it is not set.
Danger
All data are deleted when the container is removed. If no Volume is mounted to the /data directory and the user does not save the data, the data will be lost.
- Set the environment variables (optional)
synphage uses the following environment variables:
- EMAIL (optional): for connecting to the NCBI database.
- API_KEY (optional): for connecting to the NCBI database and downloading files.
- DAGSTER_HOME (optional): for storing metadata generated during previous runs of the pipeline.

Info
EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored.
- Press the Run button
Your container is now running.
Import local GenBank files (optional)
/user_files is the directory that receives the user's GenBank files.
To use locally stored GenBank files, import them or drag and drop them (depending on your system) into the /user_files directory.
Warning
- The use of special characters in file names might cause errors downstream.
Note
.gb and .gbk are both valid extensions for GenBank files.
Connect to the web interface
To connect to the web interface, select the link to the port or copy this link into your web browser.
Stop and remove your container
After completing your work, you can close the web browser and stop the container. It is good practice to remove the container after stopping it.
Stop the container
Remove the container
synphage uses the following environment variables:
- EMAIL (optional): for connecting to the NCBI database.
- API_KEY (optional): for connecting to the NCBI database and downloading files.
Info
EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored.
Tip
These variables can be passed in the terminal before starting synphage:
export EMAIL=user.email@email.com
export API_KEY=UserApiKey
Start the container
To start the container, run the following command:
docker run -d --rm --name my_phage_box -p 3000:3000 vestalisvirginis/synphage:<tag>
Image version
The <tag> corresponds to the <tag> of the downloaded image.
Tip
As synphage uses dagster-webserver, the -p flag is required to visualise the pipeline in your browser:
-p : [host_port:container_port]
The container_port is fixed to 3000.
To access the webserver, copy/paste the link into your web browser. In this example:
http://0.0.0.0:3000
Tip
- It is good practice to name your containers so that they are easy to find: --name
- It is also good practice to remove the container at the end of the run. When the --rm flag is passed, the container is automatically removed after being stopped.
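Variables exported in the host shell are not visible inside a container on their own; docker run forwards a host variable when it is named with the -e flag. The sketch below assembles the docker run command described above and adds -e EMAIL and -e API_KEY only when they are set. The function name and the HOST_PORT variable are illustrative assumptions, and the sketch only prints the command so that it can be reviewed before being run.

```shell
# Hypothetical helper: print the docker run command for a given image tag.
# HOST_PORT is an illustrative variable; the container port stays 3000.
synphage_run_cmd() {
    tag=$1
    if [ -z "$tag" ]; then
        echo "usage: synphage_run_cmd <tag>" >&2
        return 1
    fi
    cmd="docker run -d --rm --name my_phage_box -p ${HOST_PORT:-3000}:3000"
    # `-e NAME` without a value forwards the host's value of NAME.
    if [ -n "${EMAIL:-}" ]; then cmd="$cmd -e EMAIL"; fi
    if [ -n "${API_KEY:-}" ]; then cmd="$cmd -e API_KEY"; fi
    echo "$cmd vestalisvirginis/synphage:$tag"
}

# Usage: replace <tag> with the tag of the image pulled earlier.
synphage_run_cmd "<tag>"
```

Printing the command first makes it easy to confirm the port mapping and the forwarded variables before a container is started.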
Set the Volumes
- Data Output
All output data are located in the /data directory of the container.
The output data can be copied from the /data folder after the run, or they can be stored in a Docker Volume that can be mounted to a new Docker container and reused in subsequent runs if the user needs to process additional sequences or simply generate additional synteny diagrams.
# Create volume synphage_data
docker volume create synphage_data
# Mount the volume to the /data directory in the container
docker run -d --rm --name my_phage_box -v synphage_data:/data -p 3000:3000 vestalisvirginis/synphage:<tag>
# Copy the data from the container to your computer
docker cp container_id:/data/. your/local/data_directory/
- Dagster home
Metadata generated during successive runs of the pipeline are stored in the /dagster directory.
Setting a DAGSTER_HOME Volume is only necessary to keep track of previous runs and generated metadata. Data storage is not affected if it is not set.
# Create the volumes synphage_data and dagster_home
docker volume create synphage_data
docker volume create dagster_home
# Mount the volumes to the /data and /dagster directories in the container
docker run -d --rm --name my_phage_box -v synphage_data:/data -v dagster_home:/dagster -p 3000:3000 vestalisvirginis/synphage:<tag>
Danger
All data are deleted when the container is removed. If no Volume is mounted to the /data directory and the user does not save the data, the data will be lost.
Warning
Volume names must be unique. You cannot set two volumes with the same name.
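Since a volume name can only be used once, it can help to check whether a volume already exists before creating it. The sketch below uses `docker volume inspect`, which fails when the named volume does not exist. The function name `ensure_volume` and the DOCKER variable (which lets the docker binary be replaced, for example by a stub) are illustrative assumptions.

```shell
# DOCKER can be overridden to point at another binary or a stub (illustrative).
DOCKER=${DOCKER:-docker}

# Hypothetical helper: create the named volume only if it does not exist yet.
ensure_volume() {
    if "$DOCKER" volume inspect "$1" >/dev/null 2>&1; then
        echo "volume $1 already exists, reusing it"
    else
        "$DOCKER" volume create "$1" >/dev/null && echo "volume $1 created"
    fi
}

# Usage, with the volume names from the commands above:
# ensure_volume synphage_data
# ensure_volume dagster_home
```

Reusing an existing volume is what allows data from earlier runs to be mounted into a new container.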
Import local GenBank files (optional)
/user_files is the directory that receives the user's GenBank files.
To use locally stored GenBank files, copy them into the /user_files directory.
# docker cp accepts a single source path, so the files are copied one by one
for f in path_to_my_gb_files/*.gb*; do docker cp "$f" container_id:/user_files/; done
Tip
Start the container first and then copy the files into it.
Warning
- The use of special characters in file names might cause errors downstream.
GenBank file extensions
.gb and .gbk are both valid extensions for GenBank files.
Connect to the web interface
To connect to the web interface, select the link to the port or copy this link into your web browser:
http://0.0.0.0:3000
Dagster running at the start of the docker container
Stop synphage
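After completing your work, you can close the web browser and stop the container. If the container runs attached to the terminal, stop it with Ctrl+C. If it was started in detached mode (-d), as in the command above, stop it by name:
docker stop my_phage_box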
Dagster shutting down and the docker container is stopped and removed automatically