List of Baseline Services

This document summarises the existing Web services that we intend to harness in the bioinformatics theme of the OpenKnowledge project. For most of them we provide a link to the WSDL interface, which serves as the connection point between LCC interaction models and the service itself (via constraints in the LCC models). This list will be adapted as OpenKnowledge proceeds. For more information on the list and its maintenance, contact Xueping Quan (xquan@inf.ed.ac.uk).

Category

Web Service

Description

Literature and Ontologies

 

Whatizit (EBI)

A text-processing system for performing text-mining tasks. (WSDL file not available)

Ontology Lookup (EBI)

OLS provides a web service interface to query multiple ontologies from a single location with a unified output format.

CitExplorer (EBI)

A number of web services are available to retrieve data from the Citation database.

GO (DDBJ)

Gene Ontology (GO) provides a controlled vocabulary to describe gene and gene product attributes in any organism.

 

 

 

Sequence Signature Analysis

InterProScan (EBI)

InterProScan is a protein domain identifier: a tool that combines the different protein signature recognition methods native to the InterPro member databases into one resource, with lookup of the corresponding InterPro and GO annotation.

SeqVISTA (zhiping)

A new module of integrated computational tools for studying transcriptional regulation. (n/a)

Promoser (zhiping)

Human, Mouse and Rat promoter extraction service

RepeatMasker (zhiping)

Screens DNA sequences for interspersed repeats and low-complexity DNA sequences. (n/a)

POSSUM (zhiping)

Detects cis-elements in DNA sequences.

Clover (zhiping)

A program for identifying functional sites in DNA sequences

MotifScanner (zhiping)

A program that can be used to screen DNA sequences with precompiled motif models

GLAM (zhiping)

A program for discovering functional motifs shared by a set of nucleotide sequences

MotifSampler (zhiping)

A Gibbs-sampling-based motif-finding algorithm.

Cluster-Buster (zhiping)

Finds dense clusters of motifs in DNA sequences.

PML (DDBJ)

Assembles and analyzes SNP data from distributed heterogeneous databases in XML documents described by PML.

SPS (DDBJ)

A splicing profile similarity (SPS) index, which measures relative exon length discrepancy

 

 

 

Pair-wise Alignment

Emboss (EBI)

Uses the Water (Smith-Waterman, local) and Needle (Needleman-Wunsch, global) methods for pair-wise alignment.

 

 

 

Multiple Alignment

Muscle (EBI)

Multiple Sequence Comparison by Log-Expectation, a multiple sequence alignment program.

ClustalW (EBI)

A progressive global multiple alignment tool for DNA and protein sequences.

TCoffee (EBI)

A progressive multiple sequence alignment program.

KAlign (EBI)

A fast and accurate multiple sequence alignment algorithm.

Mafft (EBI)

A high-speed multiple sequence alignment program.

 

 

 

Homology Searching

Fasta (EBI)

Used for a fast protein comparison or a fast nucleotide comparison.

WU-Blast (EBI)

Used to compare a novel sequence with those contained in nucleotide and protein databases.

NCBI Blast (EBI)

Used to compare a novel sequence with those contained in nucleotide and protein databases.

PSI-Blast (EBI)

Position-Specific Iterative BLAST (PSI-BLAST) and Pattern-Hit Initiated BLAST (PHI-BLAST).

MPsrch (EBI)

A biological sequence comparison tool that implements the Smith-Waterman algorithm.

ScanPS (EBI)

A program for comparing a protein sequence to a database of protein sequences.

BLAST (DDBJ)

(blastdemo)

Used to compare a novel sequence with those contained in nucleotide and protein databases.

FASTA (DDBJ)

Used for a fast protein comparison or a fast nucleotide comparison.

 

 

 

Data Retrieval

XEMBL (EBI)

A service to display data from the EMBL Nucleotide Sequence Database in the XML formats BSML and AGAVE.

Dbfetch (EBI)

Allows you to retrieve entries from various up-to-date biological databases.

MSD API (EBI)

Access to data and tools from the Macromolecular Structures Database.

ChEBI (EBI)

Allows you to retrieve entries from the ChEBI database.

Integr8 (EBI)

Access to a subset of the data available from the Integr8 Web Portal.

Entrez Utilities (NCBI)

(Lite version of E-Utilities (no EFetch);

EFetch Web Service;

EFetch Web Service for PubMed and nlmCatalog only.)

A federated search engine that allows users to search many discrete health sciences databases maintained by the National Center for Biotechnology Information, using a single query string and user interface.

It includes the EGQuery, EFetch, EInfo, ELink, ESearch, ESpell, and ESummary utilities.

 

SRS (DDBJ)

Sequence retrieval system

ARSA (DDBJ)

All-round Retrieval of Sequence and Annotation in the DDBJ database

GetEntry (DDBJ)

Data retrieval by accession number etc.

Txsearch (DDBJ)

Executes a Taxonomy Database search for a specified taxonomy name

VecScreen (DDBJ)

DDBJ Vector Screening System

 

 

 

Genome Analysis

GIB (DDBJ)

Genome Information Broker: searches selected genomes at once

GTOP (DDBJ)

A database consisting of data analyses of proteins identified by various genome projects. It uses sequence homology analyses and makes extensive use of information on three-dimensional structures.

Ensembl (DDBJ)

Ensembl Genome Browser produces and maintains automatic annotation on selected eukaryotic genomes

MAPPING (DDBJ)

Maps a sequence to the mouse genome

 

 

 

Structural Comparison

DaliLite (EBI)

A program for pair-wise structure comparison.

SSM (EBI)

A service for comparing protein structures in 3D. (no WSDL file)

GASH (PDBJ)

A robust method for aligning protein structures by maximizing the number of equivalent residues.

 

 

 

Structure Modelling

Maxsprout (EBI)

A fast database algorithm for generating protein backbone and side chain co-ordinates from a C(alpha) trace.

 

 

 

Microarray

BioMart (EBI)

n/a

GeneCruiser (MIT)

GeneCruiser: a web service for the annotation of microarray data

SemBiosphere (Yale)

A Semantic Web Approach to Recommending Microarray Clustering Services (n/a)

 

 

 

Database

DDBJ

DNA Data Bank in Japan

PDBJ

Protein Data Bank Japan

BIND

Biomolecular Interaction Network Database

Pfam

A collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families

OMIM (DDBJ)

This database is a catalog of human genes and genetic disorders.

RefSeq (DDBJ)

The NCBI Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms.

NCBI Genome Annotation (DDBJ)

Assembled and annotated genomes available via GenBank, EMBL and DDBJ.

 

 

 

"omic" data interpreter

GoMiner

GoMiner: a resource for biological interpretation of genomic and proteomic data

KEGG

Kyoto Encyclopedia of Genes and Genomes

 

 

 

Systems Biology Language

BASIS

Biology of Ageing e-Science Integration and Simulation System

 

 

 

Integration

BioMOBY

Integrates distributed heterogeneous bioinformatics web services

myGrid

Data and application integration, including resource discovery, workflow enactment and distributed query processing

PathPort (VBI)

Gateway to Distributed Data Management and Computing: The PathPort/ToolBus Framework
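
Since the WSDL interface is what connects an LCC interaction model to a service, it can be useful to keep the table above as a small machine-readable catalogue and select only the entries that can actually be bound. The Python sketch below is illustrative only: the `Service` record, the `has_wsdl` flag, and the `bindable` and `by_category` helpers are hypothetical names, not part of OpenKnowledge or LCC. The four rows are copied from the table; `has_wsdl=False` mirrors an explicit "no WSDL" note, while `has_wsdl=True` is an assumption for rows the table does not flag.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    """One row of the baseline-services table (hypothetical record type)."""
    category: str
    name: str
    provider: str
    description: str
    has_wsdl: bool  # False where the table explicitly says no WSDL file exists


# A few rows taken from the table. has_wsdl=True is an assumption for
# rows that carry no "WSDL not available" note.
CATALOGUE: List[Service] = [
    Service("Literature and Ontologies", "Whatizit", "EBI",
            "Text-mining tasks on text", False),
    Service("Literature and Ontologies", "Ontology Lookup", "EBI",
            "Query multiple ontologies from a single location", True),
    Service("Pair-wise Alignment", "Emboss", "EBI",
            "Water and Needle pair-wise alignment", True),
    Service("Structural Comparison", "SSM", "EBI",
            "Compare protein structures in 3D", False),
]


def bindable(services: List[Service]) -> List[Service]:
    """Return the services that expose a WSDL interface."""
    return [s for s in services if s.has_wsdl]


def by_category(services: List[Service]) -> Dict[str, List[str]]:
    """Group service names under their table category, preserving order."""
    grouped: Dict[str, List[str]] = {}
    for s in services:
        grouped.setdefault(s.category, []).append(s.name)
    return grouped


if __name__ == "__main__":
    for s in bindable(CATALOGUE):
        print(f"{s.category}: {s.name} ({s.provider})")
```

Keeping the WSDL flag on each record means entries such as Whatizit and SSM stay in the catalogue for reference but are filtered out before any attempt to bind them to a constraint.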