ncbi-genome-download, again

This post is restricted to Amember readers.

vContact2 installation

Install vContact2.

It runs with Python 3.7.3.

It did not run with Python 3.11.

Run it with the following option:

--c1-bin /path/to/cluster_one-1.0.jar
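As a small guard against the version problem noted above, the sketch below checks the interpreter version before launching. The helper name `is_py37` is made up for illustration, and I am assuming the executable is called `vcontact2`; only `--c1-bin` comes from this note, and the other options are left as a placeholder.

```shell
# Hypothetical helper: succeeds only if a `python --version` string reports 3.7.x,
# the only version this note confirmed to work.
is_py37() {
  case "$1" in
    "Python 3.7."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not executed here): refuse to start outside a Python 3.7 environment.
# if is_py37 "$(python --version 2>&1)"; then
#   vcontact2 <your other options> --c1-bin /path/to/cluster_one-1.0.jar
# else
#   echo "Activate a Python 3.7 environment first" >&2
# fi
```

Other 3.x versions may well work too; 3.7.3 and 3.11 are simply the two that were tried here.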

Installing phangorn and ape in R

As a basic first step, install the package together with its dependencies:

install.packages('phangorn', dependencies = TRUE)

This failed with:

libgfortran.so.4: cannot open shared object file: No such file or directory

So I tried:

sudo apt install gfortran-7

After that, running install.packages again succeeded:

install.packages('phangorn', dependencies = TRUE)

Installing ape failed with:

/usr/bin/ld: cannot find -llapack

/usr/bin/ld: cannot find -lblas

This was resolved by running:

sudo apt-get install libblas-dev liblapack-dev

sudo apt-get install libgsl0-dev

I do not really understand what I am doing here...
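As far as I can tell, these errors appear because R builds the packages from source: the compiled code needs the Fortran runtime when it is loaded, and the BLAS/LAPACK development libraries when it is linked. The lookup below only records the fixes that worked in this note. `suggest_pkg` is a made-up helper name, and the package names are the Ubuntu ones used above, so they may differ on other distributions or releases.

```shell
# Hypothetical lookup: map an error message seen during install.packages()
# to the apt package that fixed it in this note.
suggest_pkg() {
  case "$1" in
    *"libgfortran.so.4"*)     echo "gfortran-7" ;;
    *"cannot find -llapack"*) echo "liblapack-dev" ;;
    *"cannot find -lblas"*)   echo "libblas-dev" ;;
    *) return 1 ;;            # unknown message: no suggestion
  esac
}

# Usage (not executed here):
# sudo apt-get install "$(suggest_pkg '/usr/bin/ld: cannot find -lblas')"
```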

MinionQC

Open R and run:

install.packages(c("data.table",
                   "futile.logger",
                   "ggplot2",
                   "optparse",
                   "plyr",
                   "readr",
                   "reshape2",
                   "scales",
                   "viridis",
                   "yaml"))

mkdir /home/bin/MinionQC

cd /home/bin/MinionQC

wget https://raw.githubusercontent.com/roblanf/minion_qc/master/MinIONQC.R -O MinIONQC.R

chmod 755 ./MinIONQC.R

To evaluate a single run:

./MinIONQC.R -i path/to/sequencing_summary.txt -o ./test -p 32 -s TRUE

sequencing_summary.txt is in the MinION output directory.

Multiple sequencing_summary.txt files can also be evaluated in a single run.

The output format defaults to png, but it can be changed to jpeg and others.
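A rough sketch of the multi-run case with a non-default format. It assumes, from my reading of the MinIONQC README, that `-i` also accepts a parent directory containing several sequencing_summary.txt files and that `-f` selects the plot format; please check both against `./MinIONQC.R --help`. The wrapper name `run_minionqc_multi` and the `MINIONQC` variable are made up for illustration.

```shell
# Script to call; defaults to the copy downloaded above. Override for a dry run.
MINIONQC="${MINIONQC:-./MinIONQC.R}"

# Hypothetical wrapper:
#   $1 = parent directory holding several runs (assumed to be accepted by -i)
#   $2 = output directory
#   $3 = plot format, png if omitted (assumed to be set with -f)
run_minionqc_multi() {
  "$MINIONQC" -i "$1" -o "$2" -p 32 -s TRUE -f "${3:-png}"
}

# Usage (not executed here):
# run_minionqc_multi /path/to/all_runs ./qc_all jpeg
```

Setting `MINIONQC=echo` prints the arguments instead of running anything, which is handy for checking paths first.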

Reference: GitHub - roblanf/minion_qc: Quality control for MinION sequencing data (github.com)

Trycycler

This post is restricted to Amember readers.

DBGWAS

Installation is a bit of a hassle.

Several R packages, such as ape, have to be installed beforehand.

Once past that, running it is easy.

./DBGWAS -strains input.txt -nb-cores 32 -newick tree.ph -pt-db prokka_output.faa

The -pt-db option accepts multiple faa files. List every file by its full path, separated by commas.
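Typing a long comma-separated list by hand is error-prone, so here is a minimal sketch that builds it from a directory. `faa_list` is a hypothetical helper name and the directory path in the usage line is a placeholder.

```shell
# Hypothetical helper: print every *.faa file in directory $1 as a single
# comma-separated list of absolute paths. The body runs in a subshell,
# so the cd does not affect the caller.
faa_list() (
  cd "$1" || exit 1
  dir=$(pwd)
  for f in *.faa; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matches
    printf '%s/%s\n' "$dir" "$f"
  done | paste -sd, -
)

# Usage (not executed here):
# ./DBGWAS -strains input.txt -nb-cores 32 -newick tree.ph -pt-db "$(faa_list /path/to/faa_dir)"
```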

input.txt is a 3-column tab-delimited file, but watch the file format. If the file was created in Excel, it probably needs to be converted with nkf.
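My guess is that the problem is mainly Windows line endings (and possibly the encoding). With nkf, `nkf -Lu -w` converts line endings to LF and the text to UTF-8. Where nkf is not available, the sketch below covers only the line-ending part and adds a column check; both function names are made up for illustration.

```shell
# Remove carriage returns (Windows CRLF line endings) from stdin.
to_unix() {
  tr -d '\r'
}

# Succeed only if every non-empty line on stdin has exactly 3 tab-separated fields.
has_three_columns() {
  awk -F '\t' 'NF > 0 && NF != 3 { bad = 1 } END { exit bad + 0 }'
}

# Usage (not executed here):
# to_unix < input_from_excel.txt > input.txt
# has_three_columns < input.txt || echo "input.txt is not 3-column TSV" >&2
```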

Bulk download of RefSeq and other records from NCBI

This post is restricted to Amember readers.

checkM2

Start the checkM2 environment with conda activate.

checkm2 predict --threads 64 --input ./contigs/*.fasta --output-directory ./checkm2/
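A possible follow-up step, sketched under assumptions: I believe the run writes a tab-separated `quality_report.tsv` into the output directory with `Name`, `Completeness` and `Contamination` columns, but check the header of your own file. The function name `good_genomes` and the 90/5 thresholds are purely illustrative.

```shell
# Hypothetical filter: print the names of genomes whose Completeness is >= $2
# and whose Contamination is <= $3, reading a tab-separated report $1.
# Columns are located by header name, not by position.
good_genomes() {
  awk -F '\t' -v minc="$2" -v maxc="$3" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $(col["Completeness"]) + 0 >= minc + 0 &&
    $(col["Contamination"]) + 0 <= maxc + 0 { print $(col["Name"]) }
  ' "$1"
}

# Usage (not executed here; file name is an assumption):
# good_genomes ./checkm2/quality_report.tsv 90 5
```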

Downloading sequences from NCBI by accession number

ncbi-acc-download can be used for this.

It is easy to install with conda.

Commands

For protein sequences, add the option -m protein.

ncbi-acc-download -m protein VSM46567.1

For nucleotide sequences:

ncbi-acc-download -m nucleotide acc_no.

See ncbi_acc_download.sh.
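For many accessions, a loop over a list file is convenient. This is only a sketch and not the ncbi_acc_download.sh mentioned above; the function name `download_accessions`, the `ACC_DL` variable and the list file name are hypothetical.

```shell
# Downloader command; override with echo for a dry run.
ACC_DL="${ACC_DL:-ncbi-acc-download}"

# Hypothetical helper:
#   $1 = molecule type (protein or nucleotide)
#   $2 = text file with one accession per line
# Blank lines and lines starting with # are skipped.
download_accessions() {
  while IFS= read -r acc || [ -n "$acc" ]; do
    case "$acc" in
      ''|'#'*) continue ;;
    esac
    # stdin is redirected so the downloader cannot consume the list
    "$ACC_DL" -m "$1" "$acc" < /dev/null
  done < "$2"
}

# Usage (not executed here):
# download_accessions protein accessions.txt
```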

Local Interproscan

I could not get past the proxy, so I ended up relying on Docker...

docker run --rm -v /home/user/bin/interpro/interproscan-5.51-85.0/data:/opt/interproscan/data -v /home/user/bin/interpro/input:/input -v /home/user/bin/interpro/output:/output -v /home/user/bin/interpro/temp:/temp interpro/interproscan:5.51-85.0 ./interproscan.sh --input /input/test_go.faa --disable-precalc --formats tsv --outfile /output/my_input_gt3.tsv --tempdir /temp --cpu 64 --goterms --iprlookup

The output looks like this:

The TSV format presents the match data in columns as follows:

Protein accession (e.g. P51587)
Sequence MD5 digest (e.g. 14086411a2cdf1c4cba63020e1622579)
Sequence length (e.g. 3418)
Analysis (e.g. Pfam / PRINTS / Gene3D)
Signature accession (e.g. PF09103 / G3DSA:2.40.50.140)
Signature description (e.g. BRCA2 repeat profile)
Start location
Stop location
Score - is the e-value (or score) of the match reported by member database method (e.g. 3.1E-52)
Status - is the status of the match (T: true)
Date - is the date of the run
InterPro annotations - accession (e.g. IPR002093)
InterPro annotations - description (e.g. BRCA2 repeat)
(GO annotations (e.g. GO:0005515) - optional column; only displayed if --goterms option is switched on)
(Pathways annotations (e.g. REACT_71) - optional column; only displayed if --pathways option is switched on)

If a value is missing in a column, for example, the match has no InterPro annotation, a ‘-’ is displayed.
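Given the column layout above, pulling out the GO terms per protein is a one-liner with awk. The sketch assumes that GO terms sit in column 14 separated by `|`, which is what I see when --goterms is on; the function name `protein_go_pairs` is made up.

```shell
# Print unique "protein<TAB>GO term" pairs from an InterProScan TSV file ($1).
# Column 1 = protein accession; column 14 = GO annotations
# (assumed '|'-separated, '-' when there is none).
protein_go_pairs() {
  awk -F '\t' '
    NF >= 14 && $14 != "-" && $14 != "" {
      n = split($14, terms, "[|]")
      for (i = 1; i <= n; i++) print $1 "\t" terms[i]
    }
  ' "$1" | sort -u
}

# Usage (not executed here; path follows the docker command above):
# protein_go_pairs /home/user/bin/interpro/output/my_input_gt3.tsv
```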


Starting today, I am doing genetic analysis.

A memo of my own analyses.
