Profiles of genomics organisations – Ambry Genetics

There are hundreds of companies and institutes devoted to the study of the genetic code (genome), which are making tremendous strides in understanding the mechanisms of life (including individual people) at the most fundamental of levels.  I’ve listed the ones that I can find at http://www.grouthbio.com – hopefully helping future biologists and computer scientists see what they might do as a genome scientist, young researchers find jobs, start-up companies find customers, collaborators and investors, and the rest of us learn what’s going on in this fascinating field, or at least find hope in our rapidly advancing understanding of cancer and other complex diseases.

Now I’m beginning to profile them as I read what’s written on their websites, categorizing them by what they hope to do or provide.   I’m starting with Ambry Genetics.

Ambry Genetics – http://www.ambrygen.com :

Ambry is a full-service genomics service provider, serving both the medical (as a CLIA-certified diagnostics laboratory) and research communities.  Theirs is one of the most comprehensive services in the industry, since they systematically research the available technologies and adopt them as needed to fill any holes in their capabilities.   As they’ve been in business for over 13 years, they have tested, and at some point adopted, most of the major DNA sequencing and genotyping platforms.   This makes them a comprehensive expert resource.

Currently, they rely heavily on Illumina platforms for whole genome sequencing (determining the sequence of all the nuclear DNA in an organism), whole exome sequencing (studying just the DNA that codes for genes), and multiplexed analyses of specific collections of genes.  They complement their Illumina technologies with Fluidigm, Agilent, Ion Torrent, Roche and other platforms as needed.  Therefore they can look globally at an entire genome sequence, the exome, or a targeted list of genes, or develop assays for a specific mutation or set of mutations.  They also have the capacity to look at how and when genes are expressed (or used) by an organism at a specific developmental stage or in response to a specific environmental stimulus.

Since these laboratory tests generate a vast amount of data, Ambry maintains a bioinformatics group using Ingenuity Systems’ platforms for biochemical pathway analysis and variant detection.  They also use and develop specific algorithms to analyze the quality of their sequence data, and to help determine which (of potentially many) mutations are worth a second look when a doctor and patient are on a diagnostic odyssey to find the cause, prognosis, and treatment of a rare or complex and poorly understood malady.  Ambry also maintains genetic counseling resources for clinicians and their patients.

They have specific clinical resources for exome sequencing as well as test panels specific to certain types of cancer, autism, cardiovascular diseases, Marfan syndrome and other disorders.

In short, Ambry endeavors to be a full-service expert resource for clinical genetic testing and counseling, as well as a research service to academic and industry labs which cannot maintain their own comprehensive genomics program.

Three Bioinformatics Tools that Any Scientist Can Learn Today

Hello and thanks for stopping by!  The first entry of what I hope will be many is an essay by Mark N. Ziats, PhD.  He is the founder and president of the consulting firm Creative Bioinformatics (www.creativebioinformatics.com).  His publication record (http://www.ncbi.nlm.nih.gov/pubmed/?term=ziats+mn%5Bauthor%5D) demonstrates his first-hand experience in coping with the flood of structural and functional genomics data that threatens to overwhelm the modern neuroscientist.  His essay below is intended to share some useful tools he has learned about in his work.  Please scroll down to the bottom of this post for more on Mark.  So, without further rambling, on to his essay:

Three Bioinformatics Tools that Any Scientist Can Learn Today

Bioinformatics is scary to the uninitiated. Fortunately, there are many bioinformatics tools available for biologists with no computational expertise that can provide important, insightful analysis to genomics datasets. These tools are designed to be point-and-click applications that take no more computer skills to operate than are required by Microsoft Word or PowerPoint.

I personally taught myself how to use each of the following three tools in less than one day.  With a starter dataset to work with, and a few hours spent learning each of these programs, anyone can expertly annotate genomics data to provide novel insight into cellular pathways and network interactions among the genes of interest.

 

1) DAVID Gene Ontology (http://david.abcc.ncifcrf.gov/)

The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a gene set enrichment analysis tool provided to the research community for free by an intramural laboratory at the National Institute of Allergy and Infectious Diseases, NIH.  DAVID runs as a web interface: gene lists can be copied and pasted in or uploaded as files directly on the website.

DAVID takes as an input lists of genes in many formats, such as the official Entrez gene symbol, UniProt ID, or even unique gene identifiers from some of the most commonly used microarray platforms.  The software then runs a functional ontology enrichment analysis on the inputted list of genes, to assess the list for over-representation of particular cellular or biological pathways (termed Gene Ontology (GO) categories).

As an output, DAVID provides lists of pathways enriched among the input dataset with corresponding significance values, including values corrected for multiple testing.  While the default settings should be more than sufficient for most users, DAVID allows users to set parameters such as the stringency of significance, the type of significance tests to include, and the databases/pathways to assess.
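For readers curious about what an enrichment analysis with multiple-testing correction involves, here is a minimal sketch in Python.  It is not DAVID's own code or method: it assumes a plain one-sided hypergeometric test and the Benjamini-Hochberg correction, and the gene names, category names, and background size are made up purely for illustration.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(overlap >= k) when n genes are drawn from a background of N,
    of which K belong to the category (one-sided hypergeometric test)."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def enrich(gene_list, categories, background_size):
    """Test each category for over-representation in gene_list.
    Returns (category, p-value, adjusted p-value) sorted by p-value."""
    genes = set(gene_list)
    names = sorted(categories)
    pvals = []
    for name in names:
        members = categories[name]
        overlap = len(genes & members)
        pvals.append(hypergeom_pvalue(overlap, len(genes), len(members), background_size))
    adjusted = benjamini_hochberg(pvals)
    return sorted(zip(names, pvals, adjusted), key=lambda row: row[1])

# Hypothetical toy input: placeholder gene and category names, not real annotations.
toy_categories = {
    "category_1": {"geneA", "geneB", "geneC", "geneD"},
    "category_2": {"geneE", "geneF", "geneG", "geneH"},
}
toy_gene_list = ["geneA", "geneB", "geneC", "geneE"]
results = enrich(toy_gene_list, toy_categories, background_size=100)

for name, p, p_adj in results:
    print(f"{name}\tp={p:.3g}\tadjusted p={p_adj:.3g}")
```

A category shared by three of the four input genes ranks above one shared by a single gene, which is the same kind of ranked, corrected list that an enrichment tool reports.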

Gene ontology enrichment analysis should be the first step in the annotation of any genomics dataset, and DAVID provides users with a simple interface to do so for free.  Results can be downloaded in .txt format and then opened with Microsoft Excel, or even just copy-and-pasted from the browser interface.

Estimated time to learn: 2 hours

 

2) Cytoscape (www.cytoscape.org)

Cytoscape is another free tool for the research community that allows for the visualization of network interactions among genes/proteins and provides the ability to quantify network properties in order to compare them to one another.

Cytoscape was originally developed at the Institute for Systems Biology in Seattle, WA, and is now managed by a multi-member consortium of research organizations.  Unlike the web-based program DAVID, Cytoscape requires users to download and install software on their local computer.

Cytoscape is an open-source software tool, meaning that others can access the code the program is built upon and therefore can develop additional tools that integrate within Cytoscape (called plugins).  These can be incredibly useful in addition to the basic features of Cytoscape, and there are hundreds available with a myriad of functions (all free of charge).

Cytoscape can import users’ files or can directly access archived datasets or known gene-gene (or protein-protein) interaction databases.  Cytoscape then creates interaction networks based on the underlying data (for example correlations in gene expression levels, or known protein-protein interactions).  These networks can be further analyzed by quantifying their graph theory properties using built-in analysis tools, or more sophisticated plugins.
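To make "graph theory properties" a little more concrete, here is a small Python sketch of the general idea.  It does not use Cytoscape or its API: it assumes a network is built by linking genes whose expression profiles have an absolute Pearson correlation above a threshold, then computes two common properties (node degree and local clustering coefficient).  The gene names, expression values, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def build_network(expression, threshold=0.9):
    """Link two genes when |correlation| of their profiles meets the threshold."""
    adjacency = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            adjacency[g1].add(g2)
            adjacency[g2].add(g1)
    return adjacency

def degree(adjacency):
    """Number of direct neighbours of each gene."""
    return {gene: len(neighbours) for gene, neighbours in adjacency.items()}

def clustering_coefficient(adjacency, gene):
    """Fraction of a gene's neighbour pairs that are themselves linked."""
    neighbours = adjacency[gene]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
    return 2 * links / (k * (k - 1))

# Hypothetical expression profiles across four samples (illustrative values only).
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.1, 2.1, 2.9, 4.2],
    "geneD": [4.0, 1.0, 3.0, 2.0],
}
network = build_network(toy_expression, threshold=0.9)
degrees = degree(network)
print(degrees)
print(clustering_coefficient(network, "geneA"))
```

In this toy case three genes rise together across the samples and end up fully connected, while the fourth remains isolated.  Numbers like these are what let one network be compared with another.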

Unlike DAVID, Cytoscape provides not only quantifiable analysis outputs but also publication-ready graphics.  While an analysis may assess many networks for different properties, one high-quality graphic of a representative network makes for a nice figure in genomics manuscripts.

Cytoscape can initially seem intimidating, but after spending a day getting comfortable with its workflow any scientist should be able to create networks, analyze their properties, and generate publication-ready figures quickly.  Furthermore, the Cytoscape community is large and supportive, so there are many forums and publications to help beginners get acquainted and start analyzing their data.

Suggested reading:  Cline MS, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366-82.

Estimated time to learn: 8 hours

 

3) Ingenuity Pathways Analysis (www.ingenuity.com/products/ipa)

For those of you at research institutions, check to see if your department or institution has a site license for Ingenuity Pathways Analysis (IPA).  This software runs as a hybrid of the DAVID and Cytoscape models (i.e. the analysis is carried out on IPA’s servers, but users access it through a Java-based interface that is downloaded to their local machines), and it also functions as something of a hybrid between those two tools.

IPA takes gene lists as input, similar to DAVID, and as output provides both functional enrichment analysis lists (again similar to DAVID) and graphical pathway networks similar to Cytoscape. However, IPA differs from these two programs in a number of ways.

First, IPA is somewhat easier to learn than Cytoscape, but at the cost of providing less information about the ‘networks.’  Whereas Cytoscape allows for the statistical assessment of network properties based on graph theory, IPA represents biological pathways graphically as networks of interacting genes and focuses solely on the biological nature of the pathways, not their graph-theoretical properties.

Second, unlike DAVID, which assesses functional enrichment in known Gene Ontology (GO) categories, IPA assesses enrichment in pathways drawn from its proprietary ‘knowledge base.’ This knowledge base is built upon manually curated descriptions of gene-gene (or protein-protein) interactions from the literature (as is GO), but without the open access to the underlying annotations that GO allows.

The output of IPA is both functional enrichment lists with statistical significance, and gene interaction pathways.  While the functional enrichment is similar to DAVID’s, the pathway graphics produced by IPA are more ‘gene focused’ than those in Cytoscape, where the emphasis is more on the network as a whole.  Therefore, IPA is often a good resource for biologists interested in specific molecular/cellular pathways based on known biological interactions, as opposed to the more theoretical approach to network analysis provided by Cytoscape.

Note: requires site license to access after initial free trial period

Estimated time to learn: 4 hours

 

Summary

DAVID, Cytoscape, and IPA represent three of the core bioinformatics tools for assessing genomics data that can be learned by any biologist with no computer skills, and can be learned today.  These three tools provide complementary insight into the functional ontologies, pathways, and network properties of underlying gene lists, with varying degrees of user competency needed and different types of outputs generated.  Used together, these three tools could easily produce enough annotation of a gene expression dataset to fill the results section of a high-quality manuscript.  The best part is that any scientist can quickly and easily learn these tools, and he or she could have all the analysis complete by the end of the week.

 

 

 

About the author:

Mark N. Ziats, PhD is the founder and current president of Creative Bioinformatics Consultants, LLC (www.creativebioinformatics.com), a fee-for-service bioinformatics firm specializing in custom data analysis for academic laboratories, non-profit research institutes, and industry.  Creative Bioinformatics Consultants functions like a team of ‘temporary post-docs,’ providing customers with specific data analysis at their direction, so that they can evaluate their specific scientific hypotheses without resorting to pipeline processing of data. Contact him via email at mark@creativebioinformatics.com.

Welcome!

Hello and thanks for visiting!  As I have been following new developments in biology and medicine while listing related companies and institutions on my website (www.grouthbio.com), I have come to know many folks who have a lot to say on the subject.  I hope to provide a forum for them to share their knowledge on genetics, genomics, bioinformatics, and medicine (particularly personalized/precision medicine).  My target audience is fairly broad, so information relevant to biologists, clinicians, or the interested general public is welcome.  If you would like to post something here, please call me at (805) 223-5831 or email me at geoffhrouth@gmail.com.

Thank you!

Cheers,

Geoff

 

Geoffrey Routh, Ph.D.