Datasources available on the platform

  • Allen Institute - 4,060 datasets

    The Allen Institute for Brain Science published a vast collection of gene expression and RNA-Seq data from 6 different human brains.

  • ArrayExpress - 28,200 datasets

    ArrayExpress is an archive of functional genomics data provided by the European Bioinformatics Institute (EMBL-EBI).

  • Corpasome - 10 datasets

    The Corpasome is the first crowd-sourced analysis of a family's genomes, using a number of different direct-to-consumer methods.

  • dbGaP - 915 datasets

    dbGaP archives and distributes the results of studies that have investigated the interaction of genotype and phenotype.

  • ENCODE - 9,360 datasets

    The Encyclopedia of DNA Elements (ENCODE) project is building a comprehensive parts list of functional elements in the human genome.

  • Estonian Biocentre - 20 datasets

    The Estonian Biocentre (EBC) was established in 1986 to promote research and technological development (RTD) of gene and cell technologies in Estonia.

  • ExAC - 1 dataset

    The Exome Aggregation Consortium (ExAC) aims to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects.

  • GenomeAsia100K - 1,060 datasets

    The GenomeAsia100K Project aims to generate genomic information for Asian populations by sequencing 100,000 individuals from 12 South Asian countries.

  • Genome of the Netherlands - 1 dataset

    The Genome of the Netherlands aims to provide an ultra-sharp genetic group portrait of the Dutch population.

  • Genomes Unzipped - 16 datasets

    Genomes Unzipped provides results from commercial genetic tests of volunteers, making the raw data publicly available for others to download, analyse and reuse.

  • GEO - 1,770 datasets

    Gene Expression Omnibus (GEO) is a public functional genomics data repository for array- and sequence-based gene expression profiles.

  • Genome in a Bottle - 52 datasets

    Genome in a Bottle (GIAB) is a public-private-academic consortium working to enable the translation of whole human genome sequencing into clinical practice.

  • GigaDB - 56 datasets

    GigaDB contains 255 discoverable, trackable, and citable datasets that have been assigned DOIs and are available for public download and use.

  • Gene Expression Atlas - 1,060 datasets

    The Gene Expression Atlas (GXA) provides differential expression data from both microarray and RNA-seq experiments.

  • Horizon Discovery - 22 datasets

    Horizon Discovery is a CRO offering over 170 carefully curated, well-characterized patient-derived xenograft (PDX) tumor models.

  • Integrative Japanese Genome Variation Project - 3 datasets

    The Integrative Japanese Genome Variation Project (iJGVD) provides data on genomic variations obtained by whole-genome sequencing of Japanese individuals.

  • InSilico DB - 6,930 datasets

    InSilico DB is a curated database with bioinformatics analysis tools.

  • Chinese Kadoorie Biobank - 1 dataset

    The Chinese Kadoorie Biobank provides a collection of control data from Chinese individuals.

  • Korean Bioinformation Center - 2 datasets

    A reference genome project using subjects from Korea.

  • MAGIC Consortium - 7 datasets

    The MAGIC Consortium investigates the genetic factors in insulin secretion and sensitivity using genome-wide association studies (GWAS).

  • EBI Metagenomics - 92 datasets

    The EBI Metagenomics service is an automated pipeline for the analysis and archiving of metagenomic data.

  • Mike Lin - 1 dataset

    The personal genome of Mike Lin. The dataset contains WGS data from a blood sample.

  • Jackson Laboratory - 416 datasets

    The Jackson Laboratory's Mouse Tumor Biology (MTB) Database provides information about patient-derived xenograft (PDX) mouse models.

  • Harvard Personal Genome Project - 1,080 datasets

    The Personal Genome Project hosts publicly shared genomic and health data from thousands of participants across the United States under a CC0 license.

  • Repositive - 12 datasets

    This collection contains data registered by users to the Repositive platform.

  • Singapore Genome Variation Project - 1 dataset

    The Singapore Genome Variation Project aims to characterize the genetic variation of each of the three ethnic groups in Singapore – Chinese, Malays and Indians.

  • Simons Foundation - 271 datasets

    The Simons Foundation division of Life Sciences provides complete genome sequences from more than one hundred diverse human populations.

  • NCBI SRA - 454,000 datasets

    The NCBI Sequence Read Archive (SRA) stores raw sequencing data and alignment information from high-throughput sequencing platforms.

  • Steven Keating - 3 datasets

    Steven Keating publicised his personal genome and microbiome before, during and after chemotherapy.

  • THL Biobank - 7 datasets

    Finland's National Institute for Health and Welfare's (THL) Biobank hosts a remarkable collection of population and disease-specific samples for research purposes.

  • 1000 Genomes Project - 3,510 datasets

    The 1000 Genomes Project ran from 2008 to 2015 and created the largest public catalogue of human variation and genotype data.

  • TXCCR - 3,010 datasets

    The Texas Cancer Cell Repository (TXCCR) contains the short tandem repeat (STR) profiles for cell lines in the COG Cell Line and Xenograft Repository.

  • University of California Santa Cruz - 1 dataset

    UCSC provides access to the human reference genome and annotation data for download and analysis using its Genome Browser.

  • Cancer Methylome System - 232 datasets

    The Cancer Methylome System provides methylomes for breast and endometrial cancer and corresponding control data.

  • Xpressomics - 6,720 datasets

    Xpressomics is a commercial collection of expert-curated and analyzed raw differential expression data.

Search through 1,000,000+ human genomic datasets