<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://www.appliedbioinformatics.com.au/Edwards/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Agnieszka</id>
		<title>Applied Bioinformatics Group - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://www.appliedbioinformatics.com.au/Edwards/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Agnieszka"/>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php/Special:Contributions/Agnieszka"/>
		<updated>2026-04-19T06:45:14Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.28.0</generator>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=937</id>
		<title>BOLPANGENOME</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=937"/>
				<updated>2016-12-01T23:24:36Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page provides access to Brassica oleracea pangenome data.&lt;br /&gt;
&lt;br /&gt;
Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fasta.gz&lt;br /&gt;
&lt;br /&gt;
Pangenome annotation can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.genes.13062016.gff3.gz&lt;br /&gt;
&lt;br /&gt;
Placement of pangenome contigs along the TO1000 chromosomes can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.placement.13062016.csv.gz&lt;br /&gt;
&lt;br /&gt;
SNPs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.tab.gz&lt;br /&gt;
&lt;br /&gt;
SNPs in vcf format can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.vcf.gz&lt;br /&gt;
&lt;br /&gt;
PAVs in vcf format can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.pav.13062016.vcf.gz&lt;br /&gt;
&lt;br /&gt;
List of TE related genes can be downloaded from:  http://appliedbioinformatics.com.au/download/BOLEPan.TE.realated.genes.13062016.gz&lt;br /&gt;
&lt;br /&gt;
The GBrowse is available from:  http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/BolePan&lt;br /&gt;
&lt;br /&gt;
The BlastGBrowse is available from:  http://appliedbioinformatics.com.au/gbrowseblast&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=883</id>
		<title>BOLPANGENOME</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=883"/>
				<updated>2016-08-08T01:04:32Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page provides access to Brassica oleracea pangenome data.&lt;br /&gt;
&lt;br /&gt;
Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fasta.gz&lt;br /&gt;
&lt;br /&gt;
Pangenome annotation can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.genes.13062016.gff3.gz&lt;br /&gt;
&lt;br /&gt;
Placement of pangenome contigs along the TO1000 chromosomes can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.placement.13062016.csv.gz&lt;br /&gt;
&lt;br /&gt;
SNPs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.tab.gz&lt;br /&gt;
&lt;br /&gt;
List of TE related genes can be downloaded from:  http://appliedbioinformatics.com.au/download/BOLEPan.TE.realated.genes.13062016.gz&lt;br /&gt;
&lt;br /&gt;
The GBrowse is available from:  http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/BolePan&lt;br /&gt;
&lt;br /&gt;
The BlastGBrowse is available from:  http://appliedbioinformatics.com.au/gbrowseblast&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=871</id>
		<title>BOLPANGENOME</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=871"/>
				<updated>2016-06-13T08:52:54Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page provides access to Brassica oleracea pangenome data.&lt;br /&gt;
&lt;br /&gt;
Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fasta.gz&lt;br /&gt;
&lt;br /&gt;
Pangenome annotation can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.genes.13062016.gff3.gz&lt;br /&gt;
&lt;br /&gt;
Placement of pangenome contigs along the TO1000 chromosomes can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.placement.13062016.csv.gz&lt;br /&gt;
&lt;br /&gt;
SNPs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.tab.gz&lt;br /&gt;
&lt;br /&gt;
List of TE related genes can be downloaded from:  http://appliedbioinformatics.com.au/download/BOLEPan.TE.realated.genes.13062016.gz&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=870</id>
		<title>BOLPANGENOME</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=870"/>
				<updated>2016-06-13T08:48:38Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page provides access to Brassica oleracea pangenome data.&lt;br /&gt;
&lt;br /&gt;
Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fasta.gz&lt;br /&gt;
&lt;br /&gt;
Pangenome annotation can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.genes.13062016.gff3.gz&lt;br /&gt;
&lt;br /&gt;
Placement of pangenome contigs along the TO1000 chromosomes can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.placement.13062016.csv.gz&lt;br /&gt;
&lt;br /&gt;
SNPs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.tab.gz&lt;br /&gt;
&lt;br /&gt;
List of TE related genes can be downloaded from:  http://appliedbioinformatics.com.au/download/BOLEPan.TE.realated.genes.13062016.gz&lt;br /&gt;
&lt;br /&gt;
Contigs are available for BLAST under:&lt;br /&gt;
&lt;br /&gt;
http://appliedbioinformatics.com.au/gbrowseblast/cgi-bin/index.pl&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=869</id>
		<title>BOLPANGENOME</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=BOLPANGENOME&amp;diff=869"/>
				<updated>2016-06-13T08:46:18Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: Created page with &amp;quot;This page provides access to Brassica oleracea pangenome data.  Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fa...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page provides access to Brassica oleracea pangenome data.&lt;br /&gt;
&lt;br /&gt;
Pangenome contigs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.13062016.fasta.gz&lt;br /&gt;
&lt;br /&gt;
Pangenome annotation can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.genes.13062016.gff3.gz&lt;br /&gt;
&lt;br /&gt;
Placement of pangenome contigs along the TO1000 chromosomes can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.contigs.placement.13062016.csv.gz&lt;br /&gt;
&lt;br /&gt;
SNPs can be downloaded from: http://appliedbioinformatics.com.au/download/BOLEPan.snps.13062016.tab.gz&lt;br /&gt;
&lt;br /&gt;
Contigs are available for BLAST under:&lt;br /&gt;
&lt;br /&gt;
http://appliedbioinformatics.com.au/gbrowseblast/cgi-bin/index.pl&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=667</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=667"/>
				<updated>2014-08-18T04:20:26Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in a directory of your choice, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
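The lib directory naming rule above can be sketched in Python. This is only an illustration: the helper names lib_dir_for and missing_picard_jars are hypothetical and are not part of SGSSynteny, and the picard version defaults to the 1.89 release mentioned above.&lt;br /&gt;

```python
from pathlib import Path


def lib_dir_for(jar_path):
    """Return the lib directory expected next to the given .jar file."""
    jar = Path(jar_path)
    # Path.stem drops only the final ".jar", so version dots are preserved.
    return jar.parent / (jar.stem + "_lib")


def missing_picard_jars(jar_path, picard_version="1.89"):
    """List the picard-tools jars not yet present in the lib directory."""
    lib_dir = lib_dir_for(jar_path)
    wanted = [
        "picard-" + picard_version + ".jar",
        "sam-" + picard_version + ".jar",
    ]
    return [name for name in wanted if not (lib_dir / name).is_file()]


# Hypothetical location; adjust to wherever the .jar file was unpacked.
print(lib_dir_for("my_synteny/SGSSynteny.v0.1.jar"))
print(missing_picard_jars("my_synteny/SGSSynteny.v0.1.jar"))
```

An empty list from missing_picard_jars would suggest that both jars are in place.&lt;br /&gt;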
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation; has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
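For scripted runs, the key=value command line can be assembled and checked before launching Java. The following is a minimal sketch: build_sgssynteny_command is a hypothetical helper, not part of SGSSynteny, and it only enforces the trailing / or \ rule stated above.&lt;br /&gt;

```python
def build_sgssynteny_command(jar, bam_path, bam_files, gff_file, out_dir,
                             heap="16g", **options):
    """Assemble the java command line as a list of arguments."""
    for label, path in (("bamPath", bam_path), ("outDirPath", out_dir)):
        if not path.endswith(("/", "\\")):
            raise ValueError(label + " has to end with / or \\")
    command = [
        "java", "-Xmx" + heap, "-jar", jar,
        "bamPath=" + bam_path,
        "bamFileList=" + ",".join(bam_files),
        "gffFile=" + gff_file,
        "outDirPath=" + out_dir,
    ]
    # Optional settings such as DBepsilon=30 are passed through as key=value.
    command += [key + "=" + str(value) for key, value in options.items()]
    return command


# Paths are the illustrative ones from the sample command above.
cmd = build_sgssynteny_command(
    "SGSSynteny.jar",
    "/home/my_bams/",
    ["my_bam.sorted.bam"],
    "/home/references/Bdistachyon_192_gene_exons.gff3",
    "/home/results/",
    DBepsilon=30,
    DBmin=25,
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run if the command should be executed from Python.&lt;br /&gt;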
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.txt - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=666</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=666"/>
				<updated>2014-08-18T02:16:36Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in a directory of your choice, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation; has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=647</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=647"/>
				<updated>2014-07-10T05:39:41Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in a directory of your choice, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on Linux: cd ./my_geneloss, then mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation; has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to your bam file/files, has to end with / or \, e.g. bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list, only file names; the .bam and corresponding .bai files have to be in the directory provided in bamPath, e.g. bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, e.g. gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, e.g. outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis, use all, for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, additional info included in output files [regular format] &lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files), files come in two formats basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
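As an illustration of the basic format, the sketch below reads comma separated .excov lines into dictionaries keyed by the column names listed above (spelling kept exactly as listed). It assumes that a header line, if present, starts with the word chromosome. The helper read_excov and the sample record are made up for this example and are not real SGSGeneLoss output.&lt;br /&gt;

```python
import csv
import io

# Column names as listed for the basic format (spelling kept as in the manual).
BASIC_COLUMNS = [
    "chromosome", "ID", "is_lost", "start_position", "end_postion",
    "frac_exons_covered", "frac_gene_covered", "ave_cov_depth_exons",
    "cov_cat", "ave_cove_depth_gene",
]


def read_excov(handle):
    """Yield one dict per gene from a basic-format .excov file."""
    for row in csv.reader(handle):
        if not row or row[0] == BASIC_COLUMNS[0]:
            continue  # skip blank lines and a header line, if one is present
        yield dict(zip(BASIC_COLUMNS, row))


# Made-up record for demonstration only; real values will differ.
sample = io.StringIO("Chr1,gene0001,false,100,900,0.5,0.6,3.2,10,2.9\n")
genes = list(read_excov(sample))
print(genes[0]["ID"], genes[0]["frac_exons_covered"])
```

For a real file, pass an open file handle (for example for Chr1.excov) instead of the in-memory sample.&lt;br /&gt;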
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /'''&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
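A chromosome order file of this shape can also be generated rather than typed by hand. This sketch assumes that consecutive numbers in the desired drawing order are acceptable (the example above happens to reuse each chromosome's own number). write_chromosome_order is a hypothetical helper, and the chromosome names are examples only.&lt;br /&gt;

```python
import csv
import io


def write_chromosome_order(names, handle):
    """Write the chrs,no table, numbering chromosomes in the order given."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["chrs", "no"])
    for number, name in enumerate(names, start=1):
        writer.writerow([name, number])


# Names that would not sort sensibly in ASCII order get explicit positions.
buffer = io.StringIO()
write_chromosome_order(["Chr1", "Chr2", "ChrC", "ChrM"], buffer)
print(buffer.getvalue())
```

To produce the actual file, open chrs_order.csv for writing and pass that handle instead of the in-memory buffer.&lt;br /&gt;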
&lt;br /&gt;
graph_circles.R takes five arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. Output path '''ending with /'''&lt;br /&gt;
&lt;br /&gt;
5. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv /home/results/graphs/ out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=641</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=641"/>
				<updated>2014-06-23T01:47:21Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in a directory of your choice, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on Linux: cd ./my_geneloss, then mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
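The lib directory naming rule above can be sketched as a small shell helper. This is only an illustration: the function name lib_dir_for_jar is hypothetical and is not part of SGSGeneLoss.&lt;br /&gt;

```shell
# Hypothetical helper (not shipped with SGSGeneLoss): derive the lib
# directory name from a jar file name by stripping the .jar extension
# and appending _lib.
lib_dir_for_jar() {
    jar_name=$(basename "$1")
    printf '%s_lib\n' "${jar_name%.jar}"
}

# Example: prints SGSGeneLoss.v0.1_lib
lib_dir_for_jar ./my_geneloss/SGSGeneLoss.v0.1.jar
```

The printed name is the directory to create next to the .jar file before copying the picard jars into it.&lt;br /&gt;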
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your bam file/files, has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list of file names only; the .bam files and the corresponding .bai files have to be in the directory provided in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider a position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost when calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, additional info is included in output files [regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
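&lt;br /&gt;
Because bamPath and outDirPath have to end with / or \, it may help to normalise paths before building the command. A minimal sketch follows; the helper name ensure_trailing_slash is hypothetical and is not part of SGSGeneLoss.&lt;br /&gt;

```shell
# Hypothetical helper: print the path unchanged if it already ends with
# / or \, otherwise print it with / appended.
ensure_trailing_slash() {
    case "$1" in
        */|*\\) printf '%s\n' "$1" ;;
        *)      printf '%s/\n' "$1" ;;
    esac
}

# Example: prints /home/uqagnieszka/results/
ensure_trailing_slash /home/uqagnieszka/results
```

The result can then be passed on the command line, for example as outDirPath="$(ensure_trailing_slash /home/uqagnieszka/results)".&lt;br /&gt;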
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
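&lt;br /&gt;
As an illustration of working with graph.csv, the sketch below counts lost genes per chromosome. It assumes the file has no header line and that the first comma separated column is the chromosome name, as in the column list above; the helper name count_lost_per_chr is hypothetical.&lt;br /&gt;

```shell
# Hypothetical helper: read graph.csv rows (chr,id,start,end) from standard
# input and print chromosome,count pairs sorted by chromosome name.
# Assumes there is no header line in the input.
count_lost_per_chr() {
    awk -F, 'NF { count[$1]++ } END { for (c in count) print c "," count[c] }' | sort
}

# Example with three made-up rows standing in for a real graph.csv
printf 'Chr1,g1,10,50\nChr2,g2,5,90\nChr1,g3,100,200\n' | count_lost_per_chr
```

With a real run the input would be piped from the file instead, for example with cat graph.csv.&lt;br /&gt;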
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /'''&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes five arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. Output path '''ending with /'''&lt;br /&gt;
&lt;br /&gt;
5. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv /home/results/graphs/ out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=640</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=640"/>
				<updated>2014-06-23T01:45:18Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in the .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=639</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=639"/>
				<updated>2014-06-23T01:40:07Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in the .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=638</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=638"/>
				<updated>2014-06-23T01:39:44Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in the .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=637</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=637"/>
				<updated>2014-06-23T01:22:49Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on Linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in the .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=636</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=636"/>
				<updated>2014-06-23T01:22:05Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
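&lt;br /&gt;
The naming rule for the lib directory can be sketched in shell as follows. This is a minimal illustration that assumes the example directory ./my_synteny and the jar name SGSSynteny.v0.1.jar; adjust both to your own setup.&lt;br /&gt;
&lt;br /&gt;
```shell
# Sketch only: directory and file names follow the example in the list above.
install_dir=./my_synteny
jar=SGSSynteny.v0.1.jar
lib_dir="${jar%.jar}_lib"    # strip the .jar extension, then append _lib

mkdir -p "$install_dir/$lib_dir"
echo "$lib_dir"              # prints SGSSynteny.v0.1_lib
# picard-1.89.jar and sam-1.89.jar are then copied into "$install_dir/$lib_dir"
```
&lt;br /&gt;
Deriving the directory name from the jar name keeps the two in step if the jar is renamed.&lt;br /&gt;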
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least gene and exon features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all the directory paths you supply (bamPath, outDirPath) end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized per chromosome using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where the .clusters files produced by SGSSynteny.jar (Chr1.clusters, Chr2.clusters etc.) can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=635</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=635"/>
				<updated>2014-06-23T01:20:53Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least gene and exon features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all the directory paths you supply (bamPath, outDirPath) end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx16g -jar SGSSynteny.jar bamPath=/home/my_bams/ gffFile=/home/references/Bdistachyon_192_gene_exons.gff3 outDirPath=/home/results/ chromosomeList=Bd1,Bd2,Bd3,Bd4,Bd5  bamFileList=my_bam.sorted.bam  DBepsilon=30 DBmin=25 expectCov=500 minCovVer=2.0 minFracHor=0.4&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized per chromosome using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where the .clusters files produced by SGSSynteny.jar (Chr1.clusters, Chr2.clusters etc.) can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=634</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=634"/>
				<updated>2014-06-23T00:53:16Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss, then mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to your bam file/files, has to end with / or \ bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list, only file names, bam and corresponding .bai files have to be in a directory provided in bamPath bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of gff3 file gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \ outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis, use all, for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, additional info included in output files [regular format] &lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all the directory paths you supply (bamPath, outDirPath) end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ \&lt;br /&gt;
  chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 \&lt;br /&gt;
  outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
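&lt;br /&gt;
When many .bam files are involved, the bamFileList value (file names only, comma separated, no spaces) can be assembled from the directory contents. The following is a minimal sketch; the temporary directory and the empty placeholder files are assumptions that exist only to make the example self-contained.&lt;br /&gt;
&lt;br /&gt;
```shell
# Placeholder directory and empty files, used only so the sketch runs on its own.
bam_dir=$(mktemp -d)/
touch "${bam_dir}arabidopsis.sorted.bam" "${bam_dir}arabidopsis2.sorted.bam"

# List the .bam file names (without the directory) and join them with commas.
bam_list=$(cd "$bam_dir"; ls *.bam | paste -sd, -)

# These two values can be passed to SGSGeneLoss.jar as they are.
echo "bamPath=$bam_dir bamFileList=$bam_list"
```
&lt;br /&gt;
The order of names in the list follows the sort order of ls, which depends on the locale.&lt;br /&gt;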
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files), files come in two formats basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
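&lt;br /&gt;
Because the .excov files are plain comma separated text, they can be filtered with standard tools. The sketch below lists genes whose frac_exons_covered value (column 6 of the basic format) is at or below a chosen cutoff. The rows are made up for illustration and are not real SGSGeneLoss output; the values shown for is_lost and cov_cat, and the absence of a header line, are assumptions.&lt;br /&gt;
&lt;br /&gt;
```shell
# Made-up rows in the basic .excov column order; not real SGSGeneLoss output.
excov=$(mktemp)
printf '%s\n' \
  'Chr1,geneA,true,100,900,0.00,0.00,0.0,0,0.0' \
  'Chr1,geneB,false,1500,2600,0.85,0.80,12.4,10,11.9' \
  'Chr1,geneC,false,3000,4100,0.04,0.10,1.2,0,1.5' > "$excov"

# Print the ID (column 2) of every gene whose frac_exons_covered (column 6)
# does not exceed the cutoff. Assumes the file has no header line.
low_cov=$(awk -F, -v cutoff=0.05 'cutoff + 0 >= $6 + 0 { print $2 }' "$excov")
echo "$low_cov"
```
&lt;br /&gt;
With the rows above this prints geneA and geneC, one per line.&lt;br /&gt;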
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /'''&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
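&lt;br /&gt;
For chromosome names that carry their number, the order file above can be generated rather than typed by hand. The following is a minimal sketch, assuming that chrs.csv has the chr,start,end,len layout with no header line; the rows are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# Made-up chrs.csv rows in the chr,start,end,len layout; assumes no header line.
chrs=$(mktemp)
printf '%s\n' 'chr1,1,3000,3000' 'chr2,1,2500,2500' 'chr10,1,1800,1800' > "$chrs"

# Write the header, then each chromosome name followed by the digits found in it.
order=$(mktemp)
awk -F, 'BEGIN { print "chrs,no" }
         { n = $1; gsub(/[^0-9]/, "", n); print $1 "," n }' "$chrs" > "$order"
cat "$order"
```
&lt;br /&gt;
Names without digits (for example chrX) get an empty number with this approach and have to be filled in by hand.&lt;br /&gt;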
&lt;br /&gt;
graph_circles.R takes five arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. Output path '''ending with /'''&lt;br /&gt;
&lt;br /&gt;
5. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv /home/results/graphs/ out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=633</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=633"/>
				<updated>2014-06-23T00:14:47Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on linux: cd ./my_synteny, then mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without the .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least gene and exon features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes, use `all` for all the chromosomes in .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all the directory paths you supply (bamPath, outDirPath) end with / or \ &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized per chromosome using an R script.&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where the .clusters files produced by SGSSynteny.jar (Chr1.clusters, Chr2.clusters etc.) can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .clusters files are located&lt;br /&gt;
&lt;br /&gt;
2. lower limit of the Y axis&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /''' &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results 0.4 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=632</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=632"/>
				<updated>2014-06-23T00:11:29Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss, then mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to your bam file/files, has to end with / or \ bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list, only file names, bam and corresponding .bai files have to be in a directory provided in bamPath bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of gff3 file gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \ outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma-separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
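&lt;br /&gt;
When many samples are processed, it may help to assemble the command line programmatically. The following Python sketch only builds the argument list from the options documented above; the function name build_geneloss_command and its parameters are hypothetical, and nothing is executed. It adds the trailing path separator where it is missing and joins the .bam names without spaces.&lt;br /&gt;

```python
import os


def build_geneloss_command(jar, bam_dir, bam_files, gff_file, out_dir,
                           extended=False, **optional):
    """Return the argument list for one SGSGeneLoss run; nothing is executed."""
    def with_separator(path):
        # bamPath and outDirPath have to end with a path separator.
        return path if path.endswith(("/", "\\")) else path + "/"

    cmd = [
        "java", "-Xmx4g", "-jar", jar,
        "bamPath=" + with_separator(bam_dir),
        # File names only, comma separated, no spaces.
        "bamFileList=" + ",".join(os.path.basename(name.strip()) for name in bam_files),
        "gffFile=" + gff_file,
        "outDirPath=" + with_separator(out_dir),
    ]
    # Optional settings are passed as key=value, as in the sample commands.
    cmd += [f"{key}={value}" for key, value in optional.items()]
    if extended:
        cmd.append("extendedFmt")
    return cmd


command = build_geneloss_command(
    "SGSGeneLoss.jar",
    "/home/uqagnieszka/bams",
    ["arabidopsis.sorted.bam", " arabidopsis2.sorted.bam"],
    "/home/gff_files/Athaliana_167_gene_exons.gff3",
    "/home/uqagnieszka/results",
    extended=True,
    chromosomeList="Chr1,Chr2",
    minCov=2,
)
print(" ".join(command))
```

The resulting list could be passed to subprocess.run(command) from the directory that contains the .jar file.&lt;br /&gt;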
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_position,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cov_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
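&lt;br /&gt;
As an illustration of the basic .excov layout, the following Python sketch maps each comma-separated row to named fields. It assumes that the file has no header row and that the columns appear in the order listed above; the function name read_basic_excov, the column labels and the sample row are made up for this example and are not real output.&lt;br /&gt;

```python
import csv
import io

# Column labels in the order given for the basic format (assumed, not read from the file).
BASIC_COLUMNS = [
    "chromosome", "ID", "is_lost", "start_position", "end_position",
    "frac_exons_covered", "frac_gene_covered", "ave_cov_depth_exons",
    "cov_cat", "ave_cov_depth_gene",
]


def read_basic_excov(handle):
    """Yield one dict per gene from a basic-format .excov file."""
    for row in csv.reader(handle):
        if row:  # skip empty lines
            yield dict(zip(BASIC_COLUMNS, (value.strip() for value in row)))


# Invented example row, for illustration only.
sample = io.StringIO("Chr1,gene0001,false,100,900,0.75,0.6,3.2,0,2.9\n")
genes = list(read_basic_excov(sample))
print(genes[0]["ID"], float(genes[0]["frac_exons_covered"]))
```

For a real file, the io.StringIO object would be replaced by open("Chr1.excov", newline="").&lt;br /&gt;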
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located  &lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
&lt;br /&gt;
3. output path '''ending with /'''&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0 /home/uqagnieszka/graphs/&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
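&lt;br /&gt;
The chromosome order file can also be generated rather than written by hand. The Python sketch below sorts names so that digit runs compare as numbers (chr2 before chr10) and numbers the result consecutively; the function names natural_key and order_lines are hypothetical, and consecutive numbering is an assumption about what suits your data.&lt;br /&gt;

```python
import re


def natural_key(name):
    # Split "chr10" into ["chr", 10, ""] so digit runs compare as numbers.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_lines(names):
    """Return the lines of a chromosome order file, header first."""
    lines = ["chrs,no"]
    for number, name in enumerate(sorted(names, key=natural_key), start=1):
        lines.append(f"{name},{number}")
    return lines


print("\n".join(order_lines(["chr10", "chr2", "chr1"])))
```

Writing the returned lines to a file such as chrs_order.csv gives the second argument expected by graph_circles.R.&lt;br /&gt;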
&lt;br /&gt;
graph_circles.R takes five arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. Output path '''ending with /'''&lt;br /&gt;
&lt;br /&gt;
5. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv /home/uqagnieszka/graphs/ out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=631</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=631"/>
				<updated>2014-06-18T03:11:53Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma-separated list&lt;br /&gt;
** Gff3 file with the reference genome annotation; it has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory containing your .bam file(s); has to end with / or \, for example bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma-separated list (no spaces) of file names only; the .bam files and their corresponding .bai files have to be in the directory given in bamPath, for example bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory; has to end with / or \, for example outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma-separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_position,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cov_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes two arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located  &lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=630</id>
		<title>SGSSynteny</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSSynteny&amp;diff=630"/>
				<updated>2014-06-18T03:06:43Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: Created page with &amp;quot;== What does SGSSynteny depend on? == SGSGeneLoss depends on the following: * [http://www.java.com/en/ Java 1.6] or higher * [http://www.r-project.org/ R/3.1.0] * [http://sourcef...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSSynteny depend on? ==&lt;br /&gt;
SGSSynteny depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSSynteny.v0.1.tar.gz SGSSynteny.v0.1.tar.gz] should contain&lt;br /&gt;
*** two main programs: SGSSynteny.v0.1.jar, graph_synteny.v0.1.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** folder with source code&lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSSynteny.v0.1.jar and graph_synteny.v0.1.R are referred to as SGSSynteny.jar and graph_synteny.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSSynteny.v0.1.jar and graph_synteny.v0.1.R&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSSynteny.tar.gz&lt;br /&gt;
* Unpack SGSSynteny.tar.gz and place SGSSynteny.jar and all the R scripts in chosen directory/directories, for example ./my_synteny&lt;br /&gt;
* Move into ./my_synteny and create the SGSSynteny_lib directory (on linux: cd ./my_synteny; mkdir SGSSynteny_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSSynteny.v0.1.jar the lib directory is SGSSynteny.v0.1_lib&lt;br /&gt;
** The lib directory has to be in '''the same folder as the .jar file'''&lt;br /&gt;
* Download picard-tools (SGSSynteny.jar was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_synteny/SGSSynteny_lib&lt;br /&gt;
* Now you are ready to run SGSSynteny&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSSynteny.v0.1.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma-separated list&lt;br /&gt;
** Gff3 file with the reference genome annotation; it has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .cluster files&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSSynteny.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to bam file, only folder path, do not specify bam file names here, folder has to contain both .bam and .bai files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
bamFileList - comma separated list of all the bam files to be used&lt;br /&gt;
&lt;br /&gt;
gffFile - path to .gff3 file, including file name; has to contain at least genes and exons features&lt;br /&gt;
&lt;br /&gt;
outDirPath - directory for the output files; has to end with “/” or “\”&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
expectCov - expected coverage [null]&lt;br /&gt;
 &lt;br /&gt;
minFracHor - minimum horizontal coverage required to consider genes as syntenic  [0.3]&lt;br /&gt;
&lt;br /&gt;
minCovVer - minimum coverage depth required to consider genes as syntenic [2.0]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma-separated list of chromosomes; use `all` for all the chromosomes in the .bam file [all]&lt;br /&gt;
&lt;br /&gt;
DBepsilon - Eps value for DBSCAN (radius) [26]&lt;br /&gt;
&lt;br /&gt;
DBmin - minPts value for DBSCAN (min cluster size) [24]&lt;br /&gt;
&lt;br /&gt;
genesOrExons - use whole genes or exons for coverage calculations [exons]&lt;br /&gt;
&lt;br /&gt;
mergeDistance - distance (no of genes) separating clusters for them to be merged [30]&lt;br /&gt;
&lt;br /&gt;
esimateMinCovVer - estimate min coverage depth used for clustering based on x points with highest coverage depth, esimateMinCovVer=0.45 – use 45% of points with highest coverage [null]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSSynteny.jar help&lt;br /&gt;
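&lt;br /&gt;
Because the folder given in bamPath has to contain both the .bam and the .bai files, a quick check before starting a run can save time. The Python sketch below is one possible pre-flight check; the function name missing_indexes is hypothetical, and accepting either sample.bam.bai or sample.bai as the index name is an assumption, since the manual does not state which naming the program expects.&lt;br /&gt;

```python
import os


def missing_indexes(bam_dir, bam_files):
    """Return the .bam names that have no index file next to them in bam_dir."""
    missing = []
    for name in bam_files:
        stem = name[:-len(".bam")] if name.endswith(".bam") else name
        # Assumption: either naming convention for the index is acceptable.
        candidates = [name + ".bai", stem + ".bai"]
        if not any(os.path.exists(os.path.join(bam_dir, c)) for c in candidates):
            missing.append(name)
    return missing


# Example call with placeholder names; the result depends on what is on disk.
print(missing_indexes("/home/my_bams/", ["bam1.bam", "bam2.bam"]))
```

An empty list means every listed .bam file has an index in the same folder.&lt;br /&gt;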
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.cluster files - files with results for each chromosome (files use chromosome names as in .bam files)&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using an R script.&lt;br /&gt;
&lt;br /&gt;
One way of visualization is available:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_synteny.R&lt;br /&gt;
*.clusters files (either basic or extended) with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc.&lt;br /&gt;
*directory (location) where files with results from SGSSynteny.jar: Chr1.clusters, Chr2.clusters etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_synteny.R takes one argument:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .clusters files are located &lt;br /&gt;
  &lt;br /&gt;
 Rscript --vanilla graph_synteny.R /home/uqagnieszka/results&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=629</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=629"/>
				<updated>2014-06-18T03:05:41Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
** The lib directory has to be in the same folder as the .jar file&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma-separated list&lt;br /&gt;
** Gff3 file with the reference genome annotation; it has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory containing your .bam file(s); has to end with / or \, for example bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma-separated list (no spaces) of file names only; the .bam files and their corresponding .bai files have to be in the directory given in bamPath, for example bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory; has to end with / or \, for example outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma-separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_position,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cov_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes two arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located  &lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=628</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=628"/>
				<updated>2014-06-18T02:05:26Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma-separated list&lt;br /&gt;
** Gff3 file with the reference genome annotation; it has to contain gene, mRNA and exon features&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory containing your .bam file(s); has to end with / or \, for example bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma-separated list (no spaces) of file names only; the .bam files and their corresponding .bai files have to be in the directory given in bamPath, for example bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory; has to end with / or \, for example outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimum coverage for a position to be considered covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost when calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into the directory where SGSGeneLoss.jar is located&lt;br /&gt;
* Please make sure that the directory paths you supply (bamPath, outDirPath) end with / or \&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
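&lt;br /&gt;
The rules for the required options can be sketched in Python as a small wrapper that assembles the command before it is run. This is an illustration under assumptions, not part of SGSGeneLoss: build_command is a hypothetical helper, and the -Xmx4g setting is simply taken from the sample commands above.&lt;br /&gt;

```python
import os


def build_command(jar, bam_path, bam_files, gff_file, out_dir, **options):
    """Assemble the argument list for one SGSGeneLoss run."""
    # Both directory options have to end with a path separator.
    for label, path in (("bamPath", bam_path), ("outDirPath", out_dir)):
        if not path.endswith(("/", "\\")):
            raise ValueError(f"{label} has to end with / or \\: {path}")
    # bamFileList takes file names only; a space would split the argument.
    for name in bam_files:
        if os.path.basename(name) != name or " " in name:
            raise ValueError(f"bamFileList takes bare file names without spaces: {name!r}")
    args = ["java", "-Xmx4g", "-jar", jar,
            f"bamPath={bam_path}",
            "bamFileList=" + ",".join(bam_files),
            f"gffFile={gff_file}",
            f"outDirPath={out_dir}"]
    # Optional settings: key=value pairs, or a bare flag such as extendedFmt.
    for key, value in options.items():
        args.append(key if value is True else f"{key}={value}")
    return args
```

The returned list can be passed to subprocess.run(cmd, check=True); because each option is a separate list element, no shell quoting is needed.&lt;br /&gt;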
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - results for each chromosome (file names use the chromosome names from the .bam files); two formats are available: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
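&lt;br /&gt;
As a rough sketch of reading the basic .excov format in Python (an illustration, not part of SGSGeneLoss): the column names are taken from the list above, while the helper names, the header-row handling and the rule used to select genes are assumptions. The is_lost column written by SGSGeneLoss remains the authoritative call.&lt;br /&gt;

```python
import csv
import io

# Column names of the basic .excov format, copied from the list above.
EXCOV_COLUMNS = ["chromosome", "ID", "is_lost", "start_position", "end_postion",
                 "frac_exons_covered", "frac_gene_covered", "ave_cov_depth_exons",
                 "cov_cat", "ave_cove_depth_gene"]


def read_excov(handle):
    """Yield one dict per gene from a basic-format .excov file."""
    for row in csv.reader(handle):
        if not row or row[0] == EXCOV_COLUMNS[0]:
            continue  # skip blank lines and a header row, if the file has one
        # zip stops at the shorter sequence, so extra extended-format columns are ignored.
        yield dict(zip(EXCOV_COLUMNS, row))


def genes_at_or_below(handle, cutoff=0.0):
    """Return IDs of genes whose exon coverage fraction does not exceed cutoff.

    Assumption: the fraction of exons covered is the value compared with the cutoff.
    """
    return [gene["ID"] for gene in read_excov(handle)
            if cutoff >= float(gene["frac_exons_covered"])]


# Invented rows for illustration only; real files come from SGSGeneLoss.jar.
sample = io.StringIO(
    "Chr1,geneA,true,100,900,0.0,0.0,0.0,0,0.0\n"
    "Chr1,geneB,false,1500,2400,0.8,0.7,12.5,10,11.0\n"
)
print(genes_at_or_below(sample, cutoff=0.0))
```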
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes two arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.csv from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
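&lt;br /&gt;
A chromosome order file of this shape could be generated with a natural sort, as in the Python sketch below. This is an illustration only: natural_key and order_file_lines are hypothetical helpers, and numbering the chromosomes consecutively from 1 is an assumption; edit the numbers by hand if you need a specific order.&lt;br /&gt;

```python
import re


def natural_key(name):
    """Split a name into text and integer parts so that chr2 sorts before chr10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_file_lines(chromosomes):
    """Return the lines of an order file: a chrs,no header, then name,number rows."""
    ordered = sorted(chromosomes, key=natural_key)
    return ["chrs,no"] + [f"{name},{number}"
                          for number, name in enumerate(ordered, start=1)]


print("\n".join(order_file_lines(["chr10", "chr2", "chr1"])))
```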
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=627</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=627"/>
				<updated>2014-06-18T01:57:45Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in a directory (or directories) of your choice, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on Linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your .bam file(s), has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list without spaces, file names only; the .bam and corresponding .bai files have to be in the directory given in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimum coverage for a position to be considered covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost when calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into the directory where SGSGeneLoss.jar is located&lt;br /&gt;
* Please make sure that the directory paths you supply (bamPath, outDirPath) end with / or \&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - results for each chromosome (file names use the chromosome names from the .bam files); two formats are available: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes two arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.txt from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=625</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=625"/>
				<updated>2014-06-18T00:45:03Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in a directory (or directories) of your choice, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on Linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your .bam file(s), has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list without spaces, file names only; the .bam and corresponding .bai files have to be in the directory given in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimum coverage for a position to be considered covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost when calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into the directory where SGSGeneLoss.jar is located&lt;br /&gt;
* Please make sure that the directory paths you supply (bamPath, outDirPath) end with / or \&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - results for each chromosome (file names use the chromosome names from the .bam files); two formats are available: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_chromosomes.R&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes two arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.txt from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=619</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=619"/>
				<updated>2014-06-16T01:52:41Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in a directory (or directories) of your choice, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on Linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence, multiple .bam files can be provided  as comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately - .excov&lt;br /&gt;
** File with overall stats - stats.csv&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.csv (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.csv (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your .bam file(s), has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list without spaces, file names only; the .bam and corresponding .bai files have to be in the directory given in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimum coverage for a position to be considered covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis; use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider a gene as lost when calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, with additional info included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into the directory where SGSGeneLoss.jar is located&lt;br /&gt;
* Please make sure that the directory paths you supply (bamPath, outDirPath) end with / or \&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/&lt;br /&gt;
 chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3&lt;br /&gt;
 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - results for each chromosome (file names use the chromosome names from the .bam files); two formats are available: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.csv - file with summary information about all genes&lt;br /&gt;
*chrs.csv - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.csv - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*scripts graph_chromosomes.R, graph_main.R in the same directory&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
*file listing all the result files for which you want graphs drawn, one per line - for example graph_list.txt file which looks like this:&lt;br /&gt;
 Chr1.excov&lt;br /&gt;
 Chr2.excov&lt;br /&gt;
 Chr3.excov&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of directory where .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. file listing all the result files for which you want graphs drawn&lt;br /&gt;
&lt;br /&gt;
3. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results /home/uqagnieszka/results/graph_list.txt 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.txt from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning numeric order to chromosomes (this is done because some chromosomes have complicated names and sorting in ASCII order does not always work) - file should look like this, chromosome names will be replaced by corresponding numbers&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=618</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=618"/>
				<updated>2014-06-16T01:45:22Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.v0.1.tar.gz SGSGeneLoss.v0.1.tar.gz] should contain&lt;br /&gt;
*** three main programs: SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
*** readme file&lt;br /&gt;
*** folder with source code &lt;br /&gt;
&lt;br /&gt;
From now on in this manual SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R are referred to as SGSGeneLoss.jar, graph_chromosomes.R, graph_circles.R&lt;br /&gt;
&lt;br /&gt;
To run the programs you have to use full names SGSGeneLoss.v0.1.jar, graph_chromosomes.v0.1.R, graph_circles.v0.1.R&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
** The name of the lib directory is the name of the .jar file without .jar extension + _lib, so if you are using SGSGeneLoss.v0.1.jar the lib directory is: SGSGeneLoss.v0.1_lib&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.txt (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.txt (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your .bam file/files, has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list (no spaces) of file names only; the .bam files and their corresponding .bai files have to be in the directory provided in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis, use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, additional info is included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
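The rules above (directory paths end with a separator, comma separated lists contain no spaces) can be checked before a run. The following Python sketch is not part of SGSGeneLoss; the helper name build_command and its parameters are hypothetical, and only the key=value options listed in this manual are used.&lt;br /&gt;

```python
def build_command(jar, bam_path, bam_files, gff_file, out_dir,
                  options=None, flags=(), java_mem="4g"):
    """Return the argument list for one SGSGeneLoss run (hypothetical helper)."""
    def with_slash(path):
        # Directory arguments have to end with / or \ according to the manual.
        return path if path.endswith(("/", "\\")) else path + "/"

    args = [
        "java", "-Xmx" + java_mem, "-jar", jar,
        "bamPath=" + with_slash(bam_path),
        # Join without spaces so the list stays a single shell argument.
        "bamFileList=" + ",".join(name.strip() for name in bam_files),
        "gffFile=" + gff_file,
        "outDirPath=" + with_slash(out_dir),
    ]
    for key, value in (options or {}).items():
        # Lists such as chromosomeList or covCats become comma separated values.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        args.append(key + "=" + str(value))
    args.extend(flags)  # bare switches such as extendedFmt
    return args


cmd = build_command(
    "SGSGeneLoss.jar",
    "/home/uqagnieszka/bams",
    ["arabidopsis.sorted.bam", "arabidopsis2.sorted.bam"],
    "/home/gff_files/Athaliana_167_gene_exons.gff3",
    "/home/uqagnieszka/results",
    options={"chromosomeList": ["Chr1", "Chr2"], "minCov": 2,
             "lostCutoff": 0.05, "covCats": [0, 2, 5, 10, 20]},
    flags=["extendedFmt"],
)
print(" ".join(cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run as it is.&lt;br /&gt;
&lt;br /&gt;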
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.txt - file with summary information about all genes&lt;br /&gt;
*chrs.txt - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.txt - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
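As an illustration of the basic format, a minimal Python sketch for reading .excov records is shown here. It assumes the ten columns appear in the order listed above and are spelled exactly as listed; whether the files start with a header line is not stated in this manual, so a header is skipped if present. The gene IDs and values in the sample text are made up.&lt;br /&gt;

```python
import csv
import io

# Column names copied verbatim from the basic format description.
COLUMNS = [
    "chromosome", "ID", "is_lost", "start_position", "end_postion",
    "frac_exons_covered", "frac_gene_covered", "ave_cov_depth_exons",
    "cov_cat", "ave_cove_depth_gene",
]


def read_excov(handle):
    """Yield one dict per gene from a basic-format .excov stream."""
    for row in csv.reader(handle):
        # Skip empty lines and a header line, if the file has one (assumption).
        if not row or row[0] == COLUMNS[0]:
            continue
        yield dict(zip(COLUMNS, row))


# Made-up sample records, for illustration only.
sample = (
    "Chr1,geneA,false,100,900,1.0,1.0,12.5,10,11.8\n"
    "Chr1,geneB,true,1500,2300,0.0,0.0,0.0,0,0.0\n"
)
rows = list(read_excov(io.StringIO(sample)))
for row in rows:
    print(row["ID"], row["is_lost"], float(row["frac_exons_covered"]))
```

All values are kept as text, so convert them (as done with float above) only where a number is needed.&lt;br /&gt;
&lt;br /&gt;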
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*scripts graph_chromosomes.R, graph_main.R in the same directory&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
*file listing all the result files for which you want graphs drawn, one per line - for example graph_list.txt file which looks like this:&lt;br /&gt;
 Chr1.excov&lt;br /&gt;
 Chr2.excov&lt;br /&gt;
 Chr3.excov&lt;br /&gt;
&lt;br /&gt;
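The list file can be written by hand or generated. The Python sketch below is one possible way to generate it; the helper names excov_list and write_graph_list are hypothetical, and the demo uses empty placeholder files in a temporary directory.&lt;br /&gt;

```python
import os
import tempfile


def excov_list(results_dir):
    """Return the sorted names of the .excov files found in results_dir."""
    return sorted(name for name in os.listdir(results_dir)
                  if name.endswith(".excov"))


def write_graph_list(results_dir, list_name="graph_list.txt"):
    """Write one .excov file name per line and return the names written."""
    names = excov_list(results_dir)
    with open(os.path.join(results_dir, list_name), "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names


# Demo on a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("Chr2.excov", "Chr1.excov", "stats.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_names = write_graph_list(demo_dir)
    with open(os.path.join(demo_dir, "graph_list.txt")) as handle:
        demo_text = handle.read()
print(demo_names)
```

Note that plain sorting puts Chr10.excov before Chr2.excov; edit the list by hand if the order of the entries matters to you.&lt;br /&gt;
&lt;br /&gt;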
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. file listing all the result files for which you want graphs drawn&lt;br /&gt;
&lt;br /&gt;
3. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results /home/uqagnieszka/results/graph_list.txt 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.txt from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning a numeric order to chromosomes (needed because some chromosomes have complicated names and sorting in ASCII order does not always work); chromosome names will be replaced by the corresponding numbers. The file should look like this:&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
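For genomes with many chromosomes the order file can be generated. The Python sketch below sorts names so that embedded numbers compare as numbers (chr2 before chr10) and numbers the chromosomes consecutively. Consecutive numbering is an assumption; the example above uses each chromosome's own number instead. The function names are hypothetical.&lt;br /&gt;

```python
import re


def natural_key(name):
    # Split into digit and non-digit runs so "chr10" sorts after "chr2".
    # re.split with a capture group alternates text, digits, text, ...
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_file_text(chromosomes):
    """Return the content of a chromosome order file with a chrs,no header."""
    lines = ["chrs,no"]
    ordered = sorted(chromosomes, key=natural_key)
    for number, name in enumerate(ordered, start=1):
        lines.append(name + "," + str(number))
    return "\n".join(lines) + "\n"


print(order_file_text(["chr10", "chr2", "chr1"]), end="")
```

The chromosome names passed in should match the names used in the chrs file produced by SGSGeneLoss.jar.&lt;br /&gt;
&lt;br /&gt;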
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.csv from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.csv from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.csv,graph2.csv,graph3.csv, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.csv chrs_order.csv graph1.csv,graph2.csv,graph3.csv out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	<entry>
		<id>https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=617</id>
		<title>SGSGeneLoss</title>
		<link rel="alternate" type="text/html" href="https://www.appliedbioinformatics.com.au/Edwards/index.php?title=SGSGeneLoss&amp;diff=617"/>
				<updated>2014-06-16T01:30:49Z</updated>
		
		<summary type="html">&lt;p&gt;Agnieszka: Created page with &amp;quot;== What does SGSGeneLoss depend on? == SGSGeneLoss depends on the following: * [http://www.java.com/en/ Java 1.6] or higher * [http://www.r-project.org/ R/3.1.0] * [http://source...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== What does SGSGeneLoss depend on? ==&lt;br /&gt;
SGSGeneLoss depends on the following:&lt;br /&gt;
* [http://www.java.com/en/ Java 1.6] or higher&lt;br /&gt;
* [http://www.r-project.org/ R/3.1.0]&lt;br /&gt;
* [http://sourceforge.net/projects/picard/files/picard-tools/ picard-tools]&lt;br /&gt;
* [http://ggplot2.org/ ggplot2]&lt;br /&gt;
* [http://www.bioconductor.org/packages/release/bioc/html/ggbio.html ggbio]&lt;br /&gt;
&lt;br /&gt;
== Download ==&lt;br /&gt;
* Latest Version 0.1 (29/04/2014):&lt;br /&gt;
** [http://appliedbioinformatics.com.au/download/SGSGeneLoss.tar.gz SGSGeneLoss.tar.gz] should contain&lt;br /&gt;
*** four main programs: SGSGeneLoss.jar, graph_chromosomes.R, graph_main.R, graph_circles.R&lt;br /&gt;
*** readme file &lt;br /&gt;
*** sample_results folder with results for sample data&lt;br /&gt;
&lt;br /&gt;
== How to install? ==&lt;br /&gt;
* Download SGSGeneLoss.tar.gz&lt;br /&gt;
* Unpack SGSGeneLoss.tar.gz and place SGSGeneLoss.jar and all the R scripts in chosen directory/directories, for example ./my_geneloss&lt;br /&gt;
* Move into ./my_geneloss and create the SGSGeneLoss_lib directory (on linux: cd ./my_geneloss; mkdir SGSGeneLoss_lib)&lt;br /&gt;
* Download picard-tools (SGSGeneLoss was tested with picard-tools 1.89)&lt;br /&gt;
* Place picard-1.89.jar and sam-1.89.jar in ./my_geneloss/SGSGeneLoss_lib&lt;br /&gt;
* Now you are ready to run SGSGeneLoss&lt;br /&gt;
&lt;br /&gt;
== Input and output files for SGSGeneLoss.jar ==&lt;br /&gt;
* Input files:&lt;br /&gt;
** Sorted, indexed .bam file with sequencing reads mapped to the reference genome sequence; multiple .bam files can be provided as a comma separated list&lt;br /&gt;
** Gff3 file with reference genome annotation, has to contain gene, mRNA and exon fields&lt;br /&gt;
* Output files&lt;br /&gt;
** Result files for each chromosome separately&lt;br /&gt;
** File with overall stats - stats.txt&lt;br /&gt;
** File with summary for all the chromosomes used - chrs.txt (this file is used by one of the R scripts)&lt;br /&gt;
** File with list of genes lost for all the chromosomes - graph.txt (this file is used by one of the R scripts)&lt;br /&gt;
&lt;br /&gt;
== Command line options for SGSGeneLoss.jar==&lt;br /&gt;
&lt;br /&gt;
Required:&lt;br /&gt;
&lt;br /&gt;
bamPath - path to the directory with your .bam file/files, has to end with / or \, for example: bamPath=/home/my_bams/&lt;br /&gt;
&lt;br /&gt;
bamFileList - a single .bam file or a comma separated list (no spaces) of file names only; the .bam files and their corresponding .bai files have to be in the directory provided in bamPath, for example: bamFileList=bam1.bam,bam2.bam&lt;br /&gt;
&lt;br /&gt;
gffFile - location of the gff3 file, for example: gffFile=/home/my_gffs/annot.gff3&lt;br /&gt;
&lt;br /&gt;
outDirPath - location of the output directory, has to end with / or \, for example: outDirPath=/home/my_results/&lt;br /&gt;
&lt;br /&gt;
Optional:&lt;br /&gt;
&lt;br /&gt;
minCov - minimal coverage threshold to consider position covered [minCov=1]&lt;br /&gt;
&lt;br /&gt;
chromosomeList - comma separated list of chromosomes to be used for analysis, use all for all chromosomes [chromosomeList=all]&lt;br /&gt;
&lt;br /&gt;
lostCutoff - coverage cutoff to consider gene as lost for calculating stats [lostCutoff=0.0]&lt;br /&gt;
&lt;br /&gt;
covCats - coverage categories for visualization [covCats=0,10,20,30,40,70]&lt;br /&gt;
&lt;br /&gt;
extendedFmt - use extended format, additional info is included in output files [default: regular format]&lt;br /&gt;
 &lt;br /&gt;
To see help run: java -jar SGSGeneLoss.jar help&lt;br /&gt;
&lt;br /&gt;
== Sample command == &lt;br /&gt;
* Move into directory where SGSGeneLoss.jar is&lt;br /&gt;
* Please make sure that all your supplied paths end with / or \ &lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=all&lt;br /&gt;
&lt;br /&gt;
 java -Xmx4g -jar SGSGeneLoss.jar bamPath=/home/uqagnieszka/bams/ bamFileList=arabidopsis.sorted.bam,arabidopsis2.sorted.bam gffFile=/home/gff_files/Athaliana_167_gene_exons.gff3 outDirPath=/home/uqagnieszka/results/ chromosomeList=Chr1,Chr2 minCov=2 lostCutoff=0.05 covCats=0,2,5,10,20 extendedFmt&lt;br /&gt;
&lt;br /&gt;
== Output files format ==&lt;br /&gt;
&lt;br /&gt;
All the output files are comma separated text files.&lt;br /&gt;
*.excov files - files with results for each chromosome (files use chromosome names as in .bam files); files come in two formats: basic (default) or extended (extendedFmt)&lt;br /&gt;
**basic format: chromosome,ID,is_lost,start_position,end_postion,frac_exons_covered,frac_gene_covered,ave_cov_depth_exons,cov_cat,ave_cove_depth_gene&lt;br /&gt;
**extended format: contains additional columns with information about each of the exons&lt;br /&gt;
*stats.txt - file with summary information about all genes&lt;br /&gt;
*chrs.txt - file with summary information about chromosomes&lt;br /&gt;
**chr,start,end,len&lt;br /&gt;
*graph.txt - file with list of genes lost as determined by lostCutoff&lt;br /&gt;
**chr,id,start,end&lt;br /&gt;
&lt;br /&gt;
==Plotting results==&lt;br /&gt;
&lt;br /&gt;
Results are visualized using R scripts.&lt;br /&gt;
&lt;br /&gt;
Two ways of visualization are possible:&lt;br /&gt;
*results per chromosome&lt;br /&gt;
*results for all chromosomes as a circular graph&lt;br /&gt;
&lt;br /&gt;
'''Results per chromosome:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*scripts graph_chromosomes.R, graph_main.R in the same directory&lt;br /&gt;
*.excov files (either basic or extended) with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc.&lt;br /&gt;
*directory (location) where files with results from SGSGeneLoss.jar: Chr1.excov, Chr2.excov etc. can be found&lt;br /&gt;
*file listing all the result files for which you want graphs drawn, one per line - for example graph_list.txt file which looks like this:&lt;br /&gt;
 Chr1.excov&lt;br /&gt;
 Chr2.excov&lt;br /&gt;
 Chr3.excov&lt;br /&gt;
&lt;br /&gt;
graph_chromosomes.R takes three arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. location of the directory where the .excov files are located&lt;br /&gt;
&lt;br /&gt;
2. file listing all the result files for which you want graphs drawn&lt;br /&gt;
&lt;br /&gt;
3. gene loss cutoff&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_chromosomes.R /home/uqagnieszka/results /home/uqagnieszka/results/graph_list.txt 0.0&lt;br /&gt;
&lt;br /&gt;
'''Summary results for all chromosomes, possibly multiple samples:'''&lt;br /&gt;
&lt;br /&gt;
What you need:&lt;br /&gt;
*script graph_circles.R&lt;br /&gt;
*graph.txt from SGSGeneLoss.jar run&lt;br /&gt;
*chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
*file assigning a numeric order to chromosomes (needed because some chromosomes have complicated names and sorting in ASCII order does not always work); chromosome names will be replaced by the corresponding numbers. The file should look like this:&lt;br /&gt;
 chrs,no&lt;br /&gt;
 chr1,1&lt;br /&gt;
 chr2,2&lt;br /&gt;
 chr10,10&lt;br /&gt;
&lt;br /&gt;
graph_circles.R takes four arguments in this order:&lt;br /&gt;
&lt;br /&gt;
1. file with chromosome info - chrs.txt from SGSGeneLoss.jar run&lt;br /&gt;
&lt;br /&gt;
2. file with chromosome order &lt;br /&gt;
&lt;br /&gt;
3. file with genes lost - graph.txt from SGSGeneLoss.jar run; it can be a comma separated list of multiple files (for example multiple samples). Circles will be drawn in the following order:&lt;br /&gt;
&lt;br /&gt;
first file in the list is the innermost circle, so if you have graph1.txt,graph2.txt,graph3.txt, order of circles will reflect order of files, starting from the inside&lt;br /&gt;
&lt;br /&gt;
4. output file&lt;br /&gt;
 &lt;br /&gt;
 Rscript --vanilla graph_circles.R chrs.txt chrs_order.txt graph1.txt,graph2.txt,graph3.txt out.png&lt;br /&gt;
&lt;br /&gt;
== FAQ ==&lt;br /&gt;
* If memory consumption is a problem please consider increasing -Xmx or splitting your .bam files &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Back to [[Main_Page]]&lt;/div&gt;</summary>
		<author><name>Agnieszka</name></author>	</entry>

	</feed>