computers:bioinfo_server_configuration
Differences
This shows you the differences between two versions of the page.
| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| computers:bioinfo_server_configuration [2023/01/11 11:05] – [HTSeq] hychang | computers:bioinfo_server_configuration [2026/01/22 14:57] (current) – hychang | ||
|---|---|---|---|
| Line 103: | Line 103: | ||
| chkuo@koa[~]$ sudo pip3 install seqmagick | chkuo@koa[~]$ sudo pip3 install seqmagick | ||
| Successfully installed seqmagick-0.7.0 | Successfully installed seqmagick-0.7.0 | ||
| + | |||
| + | # install panda package on 2024/07/29 | ||
| + | hychang@koa[~]$ sudo pip3 install pandas | ||
| + | Successfully installed pandas-1.3.5 pytz-2024.1 | ||
| + | hychang@koa[~]$ sudo pip3 -V | ||
| + | pip 22.2.2 from / | ||
| + | |||
| </ | </ | ||
| Line 364: | Line 371: | ||
| ===== PAML ===== | ===== PAML ===== | ||
| * Phylogenetic Analysis by Maximum Likelihood (PAML) | * Phylogenetic Analysis by Maximum Likelihood (PAML) | ||
| - | * [[http:// | + | * [[http:// |
| <code bash> | <code bash> | ||
| - | chkuo@koa[/ | + | # 2023/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| </ | </ | ||
| Line 482: | Line 489: | ||
| chkuo@koa[/ | chkuo@koa[/ | ||
| # install mrbayes | # install mrbayes | ||
| - | chkuo@koa[/ | + | # 2024/ |
| - | chkuo@koa[/ | + | # 3.2.6 -> 3.2.7 |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| + | chkuo@koa[/ | ||
| + | chkuo@koa[/ | ||
| + | Version: | ||
| </ | </ | ||
| Line 663: | Line 673: | ||
| ===== bbtools ===== | ===== bbtools ===== | ||
| * BBMap short-read aligner and a bunch of tools (shell scripts) | * BBMap short-read aligner and a bunch of tools (shell scripts) | ||
| - | * [[https:// | + | * [[https:// |
| <code bash> | <code bash> | ||
| Line 800: | Line 810: | ||
| * [[https:// | * [[https:// | ||
| * HTSeq: High-throughput sequence analysis in Python | * HTSeq: High-throughput sequence analysis in Python | ||
| - | * Because using pip install | + | * Using pip install failed, so using apt install.[[https:// |
| <code bash> | <code bash> | ||
| - | hychang@koa: | + | hychang@koa: |
| - | [sudo] password for hychang: | + | # [sudo] password for hychang: |
| - | Reading package lists... Done | + | # Reading package lists... Done |
| - | Building dependency tree | + | # Building dependency tree |
| - | Reading state information... Done | + | # Reading state information... Done |
| - | E: Unable to locate package python3-htseq | + | # E: Unable to locate package python3-htseq |
| - | hychang@koa: | + | hychang@koa: |
| - | 2.0.2 | + | # 2.0.2 |
| + | hychang@koa: | ||
| + | # htseq-count: | ||
| </ | </ | ||
| Line 854: | Line 866: | ||
| </ | </ | ||
| ===== SPAdes ===== | ===== SPAdes ===== | ||
| - | * http://cab.spbu.ru/software/spades/ | + | * [[https://github.com/ablab/spades]] |
| <code bash> | <code bash> | ||
| Line 909: | Line 921: | ||
| ===== Velvet ===== | ===== Velvet ===== | ||
| - | | + | * [[https://github.com/dzerbino/velvet]] |
| - | | + | |
| <code bash> | <code bash> | ||
| Line 954: | Line 965: | ||
| <code bash> | <code bash> | ||
| - | # updated | + | # 2023/09/07; v0.4.9b -> v0.5.0; chkuo |
| + | # 2020/04/29; v0.4.8b -> v0.4.9b; chkuo | ||
| chkuo@koa[/ | chkuo@koa[/ | ||
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| - | Unicycler v0.4.9b | + | Unicycler v0.5.0 |
| </ | </ | ||
| Line 1222: | Line 1234: | ||
| chkuo@koa[~]$ checkm | chkuo@koa[~]$ checkm | ||
| </ | </ | ||
| + | |||
| + | |||
| + | ===== CheckM2 ===== | ||
| + | * Rapid assessment of genome bin quality using machine learning. | ||
| + | * [[https:// | ||
| + | * Before installation. CheckM2 requires Python >3.7 (3.8 preferred), but Ubuntu 18.04 usually comes with Python 3.6 (koa have 3.6 and 3.7). Installing it globally isn’t recommended because its heavy dependencies (e.g., TensorFlow, scikit-learn) can interfere with system Python. The official docs also recommend installing CheckM2 via Conda. I’ll set up an isolated, shared Conda environment so everyone can run CheckM2 without affecting the system or needing a personal Conda installation. | ||
| + | * As the official doc mentioned: The easiest way to install is using Conda in a new environment. However, conda can be very slow when processing requirements for the environment. A much faster and better way to install CheckM2 is to install using mamba and creating a new environment. | ||
| + | == Usage == | ||
| + | * Please follow the official website, **Usage** section: [[https:// | ||
| + | * Note: the output folder should be an empty folder. | ||
| + | |||
| + | <code bash> | ||
| + | # 2025/01/07; hychang | ||
| + | |||
| + | # Use Mamba to create the Shared CheckM2 Environment | ||
| + | (base) hychang@koa[~]$ sudo mkdir -p / | ||
| + | (base) hychang@koa[~]$ sudo ~/ | ||
| + | |||
| + | # activate CheckM2 Environment and check | ||
| + | (base) hychang@koa[~]$ conda activate / | ||
| + | (/ | ||
| + | ...::: CheckM2 v1.1.0 :::... | ||
| + | |||
| + | General usage: | ||
| + | predict | ||
| + | |||
| + | checkm2 predict --threads 30 --input < | ||
| + | testrun | ||
| + | |||
| + | checkm2 testrun --threads 10 | ||
| + | database | ||
| + | |||
| + | Use checkm2 < | ||
| + | |||
| + | # Set up the shared Database. Download database. | ||
| + | (/ | ||
| + | (/ | ||
| + | # Failed because the hard-coded 15-second limit, so download manually by wget. | ||
| + | (/ | ||
| + | (/ | ||
| + | # Extract the database | ||
| + | (/ | ||
| + | |||
| + | # Set the Permanent Path | ||
| + | # Create a small script that will run every time any user logs in. This ensures CheckM2 always knows where its " | ||
| + | (/ | ||
| + | |||
| + | # Paste this exact line into the file and save: | ||
| + | export CHECKM2DB="/ | ||
| + | |||
| + | # apply the change to current session | ||
| + | (/ | ||
| + | # Verify with the Test Run | ||
| + | (/ | ||
| + | # Several error need to fix | ||
| + | |||
| + | # 1. Since this enviroment is using Python 3.8 on Ubuntu 18.04, we need to ensure it has the exact version of CheckM2 that matches the models it expects. | ||
| + | # First need to confirm that we are under (/ | ||
| + | (/ | ||
| + | |||
| + | # Erify the fix | ||
| + | (/ | ||
| + | # There were error | ||
| + | |||
| + | # 2. Protobuf version is too new for the TensorFlow version currently in the environment. | ||
| + | # Fix the Package Conflict: downgrade the protobuf package to a version that TensorFlow 2.x understands. | ||
| + | (/ | ||
| + | # Run the Test Again | ||
| + | (/ | ||
| + | # AttributeError: | ||
| + | |||
| + | # 3. Fix the NumPy Version: In NumPy 1.24, typeDict was officially removed, but older versions of TensorFlow still look for it. To fix this, we need to downgrade NumPy to a stable version (like 1.23.5) that still supports this attribute. | ||
| + | (/ | ||
| + | # Final Permissions Cleanup | ||
| + | (/ | ||
| + | # Verify the Fix | ||
| + | (/ | ||
| + | # Due to it took a long time, terminated the run. Use more thread in next test. | ||
| + | # There is a note on CheckM2 website in "Test run" section: | ||
| + | # It is highly recommended to do a testrun with CheckM2 after installation and database download to ensure everything works successfully. You can test that the CheckM2 # installation was successful using checkm2 testrun. This command should complete in < 5 mins on an average desktop computer. | ||
| + | |||
| + | (/ | ||
| + | # Error. CheckM2 finally found the database file! However, it failed the " | ||
| + | |||
| + | # 4. Delete the " | ||
| + | (/ | ||
| + | -rwxr-xr-x 1 root root 1735095710 Jan 7 18:20 / | ||
| + | (/ | ||
| + | (/ | ||
| + | --2026-01-08 10: | ||
| + | Resolving zenodo.org (zenodo.org)... 188.185.48.75, | ||
| + | Connecting to zenodo.org (zenodo.org)|188.185.48.75|: | ||
| + | HTTP request sent, awaiting response... 200 OK | ||
| + | Length: 1735095710 (1.6G) [application/ | ||
| + | Saving to: ‘checkm2_database.tar.gz’ | ||
| + | |||
| + | checkm2_database.tar.gz | ||
| + | |||
| + | 2026-01-08 10:44:51 (656 KB/s) - ‘checkm2_database.tar.gz’ saved [1735095710/ | ||
| + | # Still the same size 1735095710 bytes, not 1862590464 bytes. | ||
| + | # Extract | ||
| + | (/ | ||
| + | (/ | ||
| + | uniref100.KO.1.dmnd | ||
| + | (/ | ||
| + | (/ | ||
| + | / | ||
| + | (/ | ||
| + | # Error | ||
| + | |||
| + | # 6. The version of CheckM2 you have installed (likely 1.1.0) is hard-coded to expect an older version of the database, but Zenodo is now serving a newer one. | ||
| + | # The " | ||
| + | # (1) Open the file where the checksum logic lives: | ||
| + | (/ | ||
| + | # (2) Find the validation function: find this block: checksum_version_validate_DIAMOND | ||
| + | # (3) Force it to always pass: Change the function so it just says return True immediately. It look like this: | ||
| + | |||
| + | def checksum_version_validate_DIAMOND(self, | ||
| + | ''' | ||
| + | |||
| + | # modify by hychang, to skip " | ||
| + | return True # Add this line here | ||
| + | # (You can leave the rest of the code below it, it will be ignored) | ||
| + | |||
| + | version_hashes = os.path.join(DefaultValues.VERSION_PATH, | ||
| + | ' | ||
| + | ... | ||
| + | |||
| + | # Test | ||
| + | (/ | ||
| + | # Error again | ||
| + | |||
| + | # 7. The Fix: Downgrade Pandas | ||
| + | (/ | ||
| + | # Verify the fix | ||
| + | (/ | ||
| + | 2026-01-08 11: | ||
| + | 2026-01-08 11: | ||
| + | 2026-01-08 11: | ||
| + | To enable the following instructions: | ||
| + | 2026-01-08 11: | ||
| + | [01/08/2026 11:58:20 AM] INFO: Test run: Running quality prediction workflow on test genomes with 8 threads. | ||
| + | [01/08/2026 11:58:20 AM] INFO: Running checksum on test genomes. | ||
| + | [01/08/2026 11:58:20 AM] INFO: Checksum successful. | ||
| + | [01/08/2026 11:58:24 AM] INFO: Calling genes in 3 bins with 8 threads: | ||
| + | Finished processing 3 of 3 (100.00%) bins. | ||
| + | [01/08/2026 11:59:09 AM] INFO: Calculating metadata for 3 bins with 8 threads: | ||
| + | Finished processing 3 of 3 (100.00%) bin metadata. | ||
| + | [01/08/2026 11:59:10 AM] INFO: Annotating input genomes with DIAMOND using 8 threads | ||
| + | [01/08/2026 12:01:38 PM] INFO: Processing DIAMOND output | ||
| + | [01/08/2026 12:01:38 PM] INFO: Predicting completeness and contamination using ML models. | ||
| + | [01/08/2026 12:01:52 PM] INFO: Parsing all results and constructing final output table. | ||
| + | [01/08/2026 12:01:52 PM] INFO: CheckM2 finished successfully. | ||
| + | [01/08/2026 12:01:52 PM] INFO: Test run successful! See README for details. Results: | ||
| + | | ||
| + | TEST1 100.00 | ||
| + | TEST2 | ||
| + | TEST3 | ||
| + | |||
| + | # Successed | ||
| + | |||
| + | # Final step: Create the Wrapper Script | ||
| + | # This script acts as a shortcut. When a user types checkm2, it automatically points to the correct Conda environment and the database file we fixed. | ||
| + | (/ | ||
| + | # Paste this exactly into the editor: | ||
| + | |||
| + | #!/bin/bash | ||
| + | # CheckM2 Global Wrapper | ||
| + | |||
| + | # 1. Add the environment' | ||
| + | export PATH="/ | ||
| + | |||
| + | # 2. Set the database path | ||
| + | export CHECKM2DB="/ | ||
| + | |||
| + | # 3. Run CheckM2 | ||
| + | / | ||
| + | |||
| + | # END this script | ||
| + | |||
| + | # Executable and Verification | ||
| + | (/ | ||
| + | # leave the conda enviroment (/ | ||
| + | (/ | ||
| + | # As a normal user, call CheckM2 | ||
| + | hychang@koa[~]$ checkm2 | ||
| + | 2026-01-22 14: | ||
| + | 2026-01-22 14: | ||
| + | 2026-01-22 14: | ||
| + | To enable the following instructions: | ||
| + | 2026-01-22 14: | ||
| + | ____ _ | ||
| + | / ___| |__ | ||
| + | | | | '_ \ / _ \/ __| |/ / |\/| | __) | | ||
| + | | |___| | | | __/ (__| <| | | |/ __/ | ||
| + | | ||
| + | |||
| + | ...::: CheckM2 v1.0.1 :::... | ||
| + | |||
| + | General usage: | ||
| + | predict | ||
| + | testrun | ||
| + | database | ||
| + | |||
| + | Use checkm2 < | ||
| + | |||
| + | </ | ||
| + | |||
| Line 1393: | Line 1613: | ||
| ===== SignalP ===== | ===== SignalP ===== | ||
| * Predicts the presence and location of signal peptide cleavage sites in amino acid sequences. | * Predicts the presence and location of signal peptide cleavage sites in amino acid sequences. | ||
| - | * Server: | + | * [[https://services.healthtech.dtu.dk/ |
| - | * Download: [[http:// | + | |
| + | |||
| + | |||
| + | <code bash> | ||
| + | # 2024/02/07 | ||
| + | # fast | ||
| + | chkuo@koa[/ | ||
| + | chkuo@koa[/ | ||
| + | chkuo@koa[/ | ||
| + | # slow | ||
| + | chkuo@koa[/ | ||
| + | chkuo@koa[/ | ||
| + | chkuo@koa[/ | ||
| + | |||
| + | chkuo@koa[/ | ||
| + | / | ||
| + | chkuo@koa[/ | ||
| + | SignalP 6.0 Signal peptide prediction tool 6.0h | ||
| + | </ | ||
| <code bash> | <code bash> | ||
| Line 1480: | Line 1717: | ||
| <code bash> | <code bash> | ||
| - | chkuo@koa[/ | + | # 2024/ |
| - | chkuo@koa[/ | + | # note: v1.34 has some error message; use v1.33 for now |
| - | Archive: | + | chkuo@koa[/ |
| - | inflating: fastANI | + | chkuo@koa[/ |
| - | | + | Archive: |
| - | inflating: __MACOSX/ | + | inflating: fastANI |
| - | chkuo@koa[/ | + | chkuo@koa[/ |
| </ | </ | ||
| + | |||
| + | ===== parallel ===== | ||
| + | * GNU parallel is a shell tool for executing jobs in parallel using one or more computers. | ||
| + | * http:// | ||
| + | |||
| + | <code bash> | ||
| + | # 2024/07/29 | ||
| + | hychang@koa[~]$ sudo apt install parallel | ||
| + | hychang@koa[~]$ parallel --version | ||
| + | GNU parallel 20161222 | ||
| + | Copyright (C) 2007, | ||
| + | Ole Tange and Free Software Foundation, Inc. | ||
| + | License GPLv3+: GNU GPL version 3 or later < | ||
| + | This is free software: you are free to change and redistribute it. | ||
| + | GNU parallel comes with no warranty. | ||
| + | |||
| + | Web site: http:// | ||
| + | |||
| + | When using programs that use GNU Parallel to process data for publication | ||
| + | please cite as described in ' | ||
| + | |||
| + | </ | ||
| + | |||
| + | |||
| + | ===== HH-suite ===== | ||
| + | * The HH-suite is an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs). | ||
| + | * https:// | ||
| + | |||
| + | <code bash> | ||
| + | # 2024/07/29 | ||
| + | # note: | ||
| + | hychang@koa[~]$ sudo apt install hhsuite | ||
| + | hychang@koa[~]$ hhsearch | ||
| + | |||
| + | HHsearch 3.0.0 (15-03-2015) | ||
| + | Search a database of HMMs with a query alignment or query HMM | ||
| + | (C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser | ||
| + | Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951-960 (2005). | ||
| + | |||
| + | Usage: hhsearch -i query -d database [options] | ||
| + | ... | ||
| + | |||
| + | </ | ||
computers/bioinfo_server_configuration.1673406335.txt.gz · Last modified: by hychang