myths, thoughts and facts about genes and genomes of vertebrates (especially jawless fishes and cartilaginous fishes)
Saturday, December 28, 2013
Japanese lamprey gene set now available at aLeaves
As announced in the change log, I have included the predicted gene set of the Japanese lamprey (Lethenteron japonicum) in database #6 of aLeaves, the homolog collection tool I maintain.
Thursday, September 12, 2013
Sea Lamprey Consortium Gene Set at aLeaves
The other day I updated aLeaves, the web server I maintain, to include the latest information on protein-coding genes from public genome databases.
Database #6 at aLeaves now includes the Ensembl gene set and the consortium gene set for the sea lamprey (Petromyzon marinus) genome. It also contains the set of predicted proteins for the elephant shark genome, although the genome assembly for this species is highly incomplete.
Labels:
aLeaves,
chondrichthyes,
cyclostome
Saturday, July 6, 2013
Tools for NGS analysis - fastq file processing
As I have recently been working hands-on (although very slowly) with output from the HiSeq1500 in our facility, I needed to find, test and validate some tools that handle fastq files for various purposes. Below I list some of them for those who are starting, or will start, this sort of work.
For various kinds of filtering/trimming
seqtk - a fixed version that retains full sequence names (or 'comments') is here
[see a post at BioStar]
fastx-tools (of many tools therein, I am using fastx_trimmer and fastq_quality_filter)
prinseq (of many options, I use 'trim_left/right' and 'derep')
condetri - ... I could not get this working in the way I wanted
For merging overlapping paired-end reads
cope
See this external blog post (from Nov. 2012) for more info
For removing adaptor sequences etc.
tagdust
cutadapt - Can't this tool accept multiple adapter sequences in a multifasta file?
For retrieving paired reads after read filtering
cmpfastq_pe
For removing 'duplicates'
filterPCRdupl (I will not use this any more because 'prinseq -derep 4' does exactly the same thing much faster)
For validating the tools' functions
fastqc [ also, a tutorial movie available at YouTube ]
fastx_quality_stats
prinseq -stats_all
There should be more useful tools that I did not list here. Please first google with some key words and look into the 'Bioinformatics' forum at SEQanswers to get the latest info. The Wiki page there also provides a list of tools.
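To illustrate the 'retrieving paired reads after read filtering' step above, here is a minimal Python sketch of the general idea. It is not how cmpfastq_pe itself is implemented. It assumes 4-line fastq records and read names whose mates differ only by a trailing /1 or /2 or by the comment after the first space; the function names (read_fastq, read_id, repair) are my own and hypothetical. It holds the whole reverse file in memory, so it is meant for understanding and for checking small test files rather than for full HiSeq runs.

```python
import io


def read_fastq(handle):
    """Yield (name, seq, qual) tuples from a 4-line-per-record fastq stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records or at the end
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def read_id(name):
    """Drop the comment and a trailing /1 or /2 so that mates share one key."""
    key = name.split()[0]
    if key.endswith("/1") or key.endswith("/2"):
        key = key[:-2]
    return key


def repair(fwd_handle, rev_handle):
    """Return (pairs, fwd_singles, rev_singles), keeping forward-file order."""
    # Assumption: the reverse file fits in memory.
    rev = {read_id(n): (n, s, q) for n, s, q in read_fastq(rev_handle)}
    pairs, fwd_singles = [], []
    for rec in read_fastq(fwd_handle):
        mate = rev.pop(read_id(rec[0]), None)
        if mate is None:
            fwd_singles.append(rec)
        else:
            pairs.append((rec, mate))
    # Whatever is left in the dictionary had no forward mate.
    return pairs, fwd_singles, list(rev.values())


# Tiny made-up example: r1 lost its reverse mate, r3 lost its forward mate.
fwd = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGGG\n+\nIIII\n")
rev = io.StringIO("@r2/2\nCCCC\n+\nIIII\n@r3/2\nTTTT\n+\nIIII\n")
pairs, fwd_singles, rev_singles = repair(fwd, rev)
for first, second in pairs:
    print(first[0], second[0])
```

With real files, open the two filtered fastq files with open() instead of io.StringIO and write the three returned lists to separate output files.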
Labels:
bioinformatics,
fastq,
ngs,
tools
Friday, March 1, 2013
Tuesday, January 29, 2013
Complicated! - lamprey genome resource availability
I have been asked by many people around me about publicly available resources for the sea lamprey genome. I have just written down some facts and information on an independent new page inside this blog, titled 'Lamprey genome guide'. The situation is somewhat complicated, I see.
I will try to keep this page updated for the lamprey researchers' convenience.
I believe this third-party guide does not interfere with any database function or with the research activity of other researchers. But please let me know if there is any problem.
Labels:
gene prediction,
genome,
lamprey
Saturday, January 26, 2013
'aLeaves' launched !
My team has launched a new tool, 'aLeaves', which allows researchers to collect sequences that are homologous to a query (with NCBI Blast). The motivation for launching it is visualized in the image below. I hope it leads to wider use of molecular phylogenetics and a better understanding of how sequences evolve and how gene families have diversified. I would appreciate any feedback from users of this tool.
Labels:
aLeaves