
Installation

Via pip

Requirements

The following dependencies need to be installed in order to run synphage on your system.

  • Python 3.11
  • A Python package manager such as pip or uv

  • Blast+ >= 2.12.0

Install Python and Blast+ using your package manager of choice, or by downloading an installer appropriate for your system from python.org and from the NCBI respectively.
The Python package manager pip is installed by default with Python; however, you may need to upgrade pip to the latest version:

pip install --upgrade pip
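
Before installing synphage, it can help to confirm that the requirements listed above are reachable from your terminal. The following is a minimal sketch, not part of synphage: `check_tool` is a hypothetical helper, and the executable names `python3.11` and `blastn` (one of the Blast+ programs) are assumptions that may differ on your system.

```shell
# Report whether an executable can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Executable names are assumptions; adapt them to your installation.
check_tool python3.11
check_tool pip
check_tool blastn
```

A tool reported as missing either is not installed or is not on the PATH of the current shell.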

Install synphage

synphage is available as a Python package and can be installed with the Python package manager pip from a terminal window.

# Latest
pip install synphage
This will automatically install compatible versions of all Python dependencies.

# Latest, invoking pip through the Python interpreter
python -m pip install synphage

Step-by-step installation of synphage in the Windows Subsystem for Linux (WSL):

# Install the dependencies required to build Python
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev

# Get the install package for python
wget https://www.python.org/ftp/python/3.11.9/Python-3.11.9.tgz

# Unpack the tarball file
tar -zxvf Python-3.11.9.tgz

# Build Python
cd Python-3.11.9/
./configure --enable-optimizations
make -j 2
sudo make install

# Test the Python installation
python3.11 -V

# Python is installed; return to the parent directory
cd ..

# Install dependencies
sudo apt install libcairo2-dev pkg-config python3-dev

# Create project folder
mkdir -p ~/synphage_home
cd ~/synphage_home

# Create python environment
python3.11 -m venv .venv
source ./.venv/bin/activate

# Install synphage
pip install synphage

# Install the Blast+ dependency
sudo apt install ncbi-blast+

# Run synphage
mkdir -p dagster_home
DAGSTER_HOME=$PWD/dagster_home dagster dev -h 0.0.0.0 -p 3000 -m synphage

Run synphage

  1. Environment variables

    synphage uses the following environment variables:
    - INPUT_DIR: path to the folder containing the user's GenBank files. If not set, this path defaults to the temp folder. This path can also be modified at run time.
    - OUTPUT_DIR: path to the folder where the data generated during the run will be stored. If not set, this path defaults to the temp folder.
    - EMAIL (optional): for connecting to the NCBI database.
    - API_KEY (optional): for connecting to the NCBI database and downloading files.
    - DAGSTER_HOME (optional): for storing the metadata generated during previous runs of the pipeline.

    Optional env
    • EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored.
    • DAGSTER_HOME is only necessary to keep track of the previous runs and generated metadata. Data storage is not affected if it is not set.
    Setting your env

    These variables can be set in a .env file located in your working directory (Dagster automatically loads them from the .env file when initialising the pipeline), or exported in the terminal before starting synphage. The first block below shows the content of a .env file, the second the equivalent export commands:

    INPUT_DIR=path/to/my/data/
    OUTPUT_DIR=path/to/synphage/data
    EMAIL=user.email@email.com
    API_KEY=UserApiKey
    
    export INPUT_DIR=<path_to_data_folder>
    export OUTPUT_DIR=<path_to_synphage_folder>
    export EMAIL=user.email@email.com
    export API_KEY=UserApiKey
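
As an illustration of the .env approach, the sketch below creates a working directory, writes a .env file into it, and loads the file into the current shell to inspect the values. The directory names `genbank` and `synphage_data` are hypothetical, and a temporary directory stands in for your real project folder; loading the file by hand is only for checking, since Dagster reads the .env file itself.

```shell
# Hypothetical project layout in a temporary directory; use your own folder.
workdir="$(mktemp -d)"
mkdir -p "$workdir/genbank" "$workdir/synphage_data"

# Write the variables to a .env file in the working directory.
cat > "$workdir/.env" <<EOF
INPUT_DIR=$workdir/genbank
OUTPUT_DIR=$workdir/synphage_data
EOF

# Optional check: export everything defined in the file to the current shell.
set -a
. "$workdir/.env"
set +a
echo "INPUT_DIR=$INPUT_DIR"
echo "OUTPUT_DIR=$OUTPUT_DIR"
```

Using absolute paths in the file avoids surprises if synphage is later started from a different directory.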
    
  2. Data Input and Output

    1. Data Input

      The input data are the GenBank files located in INPUT_DIR. However, a path to another data location can be passed at run time to load data from a different directory.

      Warning
      • Only a single path can be configured per loading job run.
      • The use of special characters in file names might cause errors downstream.
      GenBank file extensions

      .gb and .gbk are both valid extensions for GenBank files.
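
To spot problematic file names before loading them, a check along these lines may help. It is a sketch only: `check_genbank_names` is a hypothetical helper, and the set of "safe" characters (letters, digits, dot, underscore, hyphen) is an assumption, since the exact characters that cause errors are not listed here.

```shell
# List the .gb and .gbk files of a folder and flag names containing
# characters outside an assumed safe set: A-Z a-z 0-9 . _ -
check_genbank_names() {
    dir="$1"
    for f in "$dir"/*.gb "$dir"/*.gbk; do
        [ -e "$f" ] || continue   # skip patterns that matched nothing
        name="$(basename "$f")"
        case "$name" in
            *[!A-Za-z0-9._-]*) echo "rename: $name" ;;
            *) echo "ok: $name" ;;
        esac
    done
}

# Check INPUT_DIR if it is set, otherwise the current directory.
check_genbank_names "${INPUT_DIR:-.}"
```

Files reported with `rename:` contain a space or another character outside the assumed safe set.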

    2. Data Output

      All output data are located in the OUTPUT_DIR set by the user.
      This directory can be reused in future runs if the user needs to process additional sequences or simply generate additional synteny diagrams.

      Warning
      • If no output directory is set, the data folder will be the temporary folder by default. Be aware that the naming convention for the temporary folder (temp/, tmp/, ...) depends on your system.
      Tip

      The current data directory can be checked in the config panel of the jobs.

  3. Start synphage via dagster web-based interface

    To start synphage run the following command:

    dagster dev -h 0.0.0.0 -p 3000 -m synphage
    

    Tip

    As synphage uses dagster-webserver, -h and -p flags are required to visualise the pipeline in your browser:
    -h : Host to use for the Dagster webserver
    -p : Port to use for the Dagster webserver

    To access the webserver, follow the link displayed in your terminal or copy/paste it in your web-browser. In this example:

    http://0.0.0.0:3000
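
The start command can be wrapped in a small function that lets you choose the port. This is a sketch under stated assumptions: `start_synphage` is a hypothetical helper, the default location of DAGSTER_HOME mirrors the WSL example above, and the printed URL uses localhost because a server bound to 0.0.0.0 is also reachable at that address from the same machine.

```shell
# Start the Dagster web interface for synphage on a chosen port (default 3000).
start_synphage() {
    port="${1:-3000}"
    if ! command -v dagster >/dev/null 2>&1; then
        echo "dagster not found: activate the environment where synphage is installed" >&2
        return 1
    fi
    # Optional: keep run metadata in ./dagster_home unless DAGSTER_HOME is already set.
    export DAGSTER_HOME="${DAGSTER_HOME:-$PWD/dagster_home}"
    mkdir -p "$DAGSTER_HOME"
    echo "Open http://localhost:$port in your browser"
    dagster dev -h 0.0.0.0 -p "$port" -m synphage
}
```

Calling `start_synphage 3001` would start the same pipeline on port 3001, which can be useful when port 3000 is already taken.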
    

    Start dagster
    Dagster running from the terminal and link to the webserver

  4. Stop synphage

    After completing your work, you can close the web-browser and stop the process running in the terminal with Ctrl+C .

    Stop dagster
    Dagster shutting down

Via synphage docker image

Requirements

The following dependency needs to be installed in order to run the synphage Docker image on your system.

  • Docker or Docker Desktop
Info

When installing docker from the website, the right version should automatically be selected for your computer. Docker-Desktop download

Pull synphage image

  1. Open the docker desktop app and go to Images.
    Images

  2. Go to the search bar and search for synphage.
    Search synphage image in DockerHub

    Note

    The latest image is selected automatically (recommended).

  3. Pull the image.
    Select Pull and wait for the download to complete.

  4. The synphage docker image is installed.
    Installed image

Note

Your Dashboard might look a bit different depending on the Docker Desktop version and your OS.

# Pull the image from docker hub
docker pull vestalisvirginis/synphage:<tag>

# Check the list of installed Docker Images
docker image ls
Replace <tag> with the latest image tag.

Run synphage container

  1. Start the container
    Start container

  2. Open the drop-down menu Optional settings:
    Optional settings pop-up window

  3. Set the host port to 3000

    Tip

    Setting the port is required to run synphage as it uses a web-interface.
    3000 is given as example, any other available port can be used.

    Port

    Warning

    Make sure that the port is available and not already in use (by another running container for example).

  4. Set the Volumes

    1. Data Output
      All output data are located in the /data directory of the container.
      The output data can be copied from the /data folder after the run, or they can be stored in a Docker Volume that can be mounted to a new Docker Container and reused in subsequent runs if the user needs to process additional sequences or simply generate additional synteny diagrams.

      Create volumes
      Create a Docker Volume for your data
      Volume Output Data
      Mount your volume to the docker data volume when starting your container

      Save data generated in the container
      Download the data from the container to your computer

    2. Dagster home
      Metadata generated during the successive runs of the pipeline are stored in the /dagster directory.
      Setting a DAGSTER_HOME Volume is only necessary to keep track of the previous runs and generated metadata. Data storage is not affected if it is not set.

    Danger

    All data are deleted when the container is removed. If no Volume is mounted to the /data directory and the user does not save the data, the data will be lost.

  5. Set the environment variables (optional)
    synphage uses the following environment variables:

    • EMAIL (optional): for connecting to the NCBI database.
    • API_KEY (optional): for connecting to the NCBI database and downloading files.
    • DAGSTER_HOME (optional): for storing the metadata generated during previous runs of the pipeline.

    Environment variables

    Info
    • EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored.
  6. Press the Run button
    Your container is now running.

  7. Import local GenBank files (optional)
    /user_files is the directory that receives the user's GenBank files.
    To use locally stored GenBank files, import them or drag and drop them (depending on your system) into the /user_files directory.

    Drag and drop genbank files

    Warning
    • The use of special characters in file names might cause errors downstream.
    Note

    .gb and .gbk are both valid extensions for GenBank files.

  8. Connect to the web interface
    To connect to the web interface, select the link to the port or copy this link into your web browser.
    Open the link to the web-interface

  9. Stop and remove your container
    After completing your work, you can close the web browser and stop the container. It is good practice to remove the container once it is stopped.

    Stop container
    Stop the container
    Remove container
    Remove the container

  1. Environment variables

    synphage uses the following environment variables:
    - EMAIL (optional): for connecting to the NCBI database.
    - API_KEY (optional): for connecting to the NCBI database and downloading files.

    Info

    EMAIL and API_KEY are only required for connecting to the NCBI database and downloading GenBank files. If the user only works with local data, these two variables can be ignored.

    Tip

    These variables can be passed in the terminal before starting to run synphage:

    export EMAIL=user.email@email.com
    export API_KEY=UserApiKey
    
  2. Start the container

    To start the container, run the following command:

    docker run -d --rm --name my_phage_box -p 3000:3000 vestalisvirginis/synphage:<tag>
    

    Image version

    The <tag> corresponds to the <tag> of the downloaded image.

    Tip
    • As synphage uses dagster-webserver, -p flag is required to visualise the pipeline in your browser:
      -p : [host_port:container_port]
      The container_port is fixed to 3000.

    • To access the webserver, open the link for the chosen host port in your web browser. In this example:

      http://0.0.0.0:3000
      

    Tip
    • It is good practice to name your containers to find them easily: --name
    • It is also good practice to remove the container at the end of the run. By passing the --rm flag, the container will be automatically removed after being stopped.
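
To make the `-p host_port:container_port` mapping concrete, the sketch below assembles the run command for a given image tag and host port and prints it instead of executing it. `synphage_run_cmd` is a hypothetical helper. Forwarding EMAIL and API_KEY with `-e` is an assumption about how the exported variables reach the container: `docker run -e VAR` passes the value of VAR from the current shell, but whether the image reads them this way is not stated here.

```shell
# Print the docker run command for a given image tag and host port.
# The container port stays fixed at 3000; only the host port changes.
synphage_run_cmd() {
    tag="$1"
    host_port="${2:-3000}"
    cmd="docker run -d --rm --name my_phage_box -p ${host_port}:3000"
    # Forward the optional NCBI variables only when they are set (assumption).
    if [ -n "${EMAIL:-}" ]; then cmd="$cmd -e EMAIL"; fi
    if [ -n "${API_KEY:-}" ]; then cmd="$cmd -e API_KEY"; fi
    echo "$cmd vestalisvirginis/synphage:${tag}"
}

# Example: map host port 8080 to the container port 3000.
synphage_run_cmd "<tag>" 8080
```

With host port 8080, the web interface would be reached on port 8080 of the host rather than 3000. Printing the command first makes it easy to review before running it.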
  3. Set the Volumes

    1. Data Output
      All output data are located in the /data directory of the container.
      The output data can be copied from the /data folder after the run, or they can be stored in a Docker Volume that can be mounted to a new Docker Container and reused in subsequent runs if the user needs to process additional sequences or simply generate additional synteny diagrams.

      # Create volume synphage_data
      docker volume create synphage_data
      
      # Mount the volume to the /data directory in the container
      docker run -d --rm --name my_phage_box -v synphage_data:/data -p 3000:3000 vestalisvirginis/synphage:<tag>
      
      # Copy the content of /data from the container to your computer
      docker cp container_id:/data/. your/local/data_directory/
      
    2. Dagster home
      Metadata generated during the successive runs of the pipeline are stored in the /dagster directory.
      Setting a DAGSTER_HOME Volume is only necessary to keep track of the previous runs and generated metadata. Data storage is not affected if it is not set.

      # Create the volumes synphage_data and dagster_home
      docker volume create synphage_data
      docker volume create dagster_home
      
      # Mount the volumes to the /data and /dagster directories in the container
      docker run -d --rm --name my_phage_box -v synphage_data:/data -v dagster_home:/dagster -p 3000:3000 vestalisvirginis/synphage:<tag>
      

    Danger

    All data are deleted when the container is removed. If no Volume is mounted to the /data directory and the user does not save the data, the data will be lost.

    Warning

    Volume names must be unique. You cannot create two volumes with the same name.

  4. Import local GenBank files (optional)
    /user_files is the directory that receives the user's GenBank files.
    To use locally stored GenBank files, copy them into the /user_files directory.

    docker cp path_to_my_gb_files/*.gb* container_id:/user_files
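
Because `docker cp` accepts a single source path per call, a pattern that matches several files may need to be expanded one file at a time. The loop below is a minimal sketch of that idea. `copy_genbank_files` is a hypothetical helper, and the `DOCKER` variable is a hypothetical hook that lets you substitute `echo` for a dry run.

```shell
# Copy every .gb and .gbk file of a folder into /user_files of a container,
# one docker cp call per file.
copy_genbank_files() {
    src_dir="$1"
    container="$2"
    for f in "$src_dir"/*.gb "$src_dir"/*.gbk; do
        [ -e "$f" ] || continue   # skip patterns that matched nothing
        # Set DOCKER=echo to print the commands instead of running them.
        ${DOCKER:-docker} cp "$f" "$container:/user_files/"
    done
}
```

For example, `copy_genbank_files path_to_my_gb_files container_id` copies the files, while `DOCKER=echo copy_genbank_files path_to_my_gb_files container_id` only prints the arguments that would be passed to docker.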
    

    Tip

    Start the container first, then copy the files into it.

    Warning
    • The use of special characters in file names might cause errors downstream.
    GenBank file extensions

    .gb and .gbk are both valid extensions for GenBank files.

  5. Connect to the web interface
    To connect to the web interface, select the link to the port or copy this link into your web browser.

    http://0.0.0.0:3000
    

    Start dagster
    Dagster running at the start of the docker container

  6. Stop synphage

    After completing your work, you can close the web browser and stop the container, for example with docker stop my_phage_box (or with Ctrl+C if the container runs in the foreground of your terminal).

    Stop dagster
    Dagster shutting down and the docker container is stopped and removed automatically