You can submit data in two ways: as results summary statistics (calculated and formatted according to the analysis plan) or as individual-level data.
We prefer you submit individual-level data because they can be used beyond the few analyses that are described in the analysis plan.
Results summary statistics
Information on how to upload results summary statistics is given in the analysis plan, in the section “Results upload instructions”.
If you are not from the US:
If you are from the US:
All researchers can apply for access to the initiative's data deposited on EGA and AnVIL via their respective DAC. For data at the EGA, the DAC is composed of the PIs of the studies that have deposited the data, and will facilitate access to the full data pool. For data on the AnVIL, access will be managed by a DAC at the NIH.
All researchers are required to follow the code of conduct outlined in https://www.covid19hg.org/about/.
On the results page, we make available the meta-analysis summary statistics for the combined studies, with and without UK Biobank. However, to access the study-specific summary statistics you will need to contact each study PI separately.
The EGA is working with the ELIXIR network to establish the EGA Federation network, which will enable data to be deposited within national jurisdictions. We expect to launch the first nodes in mid-to-late 2020. In the meantime, we suggest you contact your country's ELIXIR head of node to find out about the current status for your country.
Both EGA and AnVIL recommend using open standards and formats that are maintained by the Global Alliance for Genomics and Health (GA4GH), published in the GA4GH Genomic Data Toolkit. For genome sequencing data this includes FASTQ, BAM, CRAM, and VCF. All array-based technologies are accepted, which may include the raw data, intensity and analysis files, and there are no restrictions on data formats accepted.
The EGA is managed by EMBL-EBI and the Centre for Genomic Regulation, Barcelona (CRG). At EMBL, data protection is enacted by the Internal Policy 68 on general data protection (IP 68). IP 68 resembles the GDPR, but is adapted to the intergovernmental nature of EMBL and to the need to enable free scientific research across national borders. CRG is subject to the GDPR and implements it fully. The EGA GDPR notices can be found here.
Clinical data should be included as part of the study submission. We suggest formatting the data following the initiative’s data dictionary (tab FREEZE_1). Not all the variables listed in the data dictionary are required. If you want to submit variables that are not listed in the data dictionary, please contact email@example.com.
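As an illustration of checking a clinical data file against the data dictionary before submission, here is a minimal Python sketch. It assumes the clinical data are stored as a tab-separated file with a header row, which the text above does not specify. The variable names in `DICTIONARY_VARS` are hypothetical placeholders, not the actual contents of the FREEZE_1 tab, and the helper names `read_header` and `unlisted_columns` are invented for this example.

```python
import csv
import io
from typing import Iterable, List, Set

# Hypothetical placeholder names; replace with the variables
# actually listed in the data dictionary (tab FREEZE_1).
DICTIONARY_VARS: Set[str] = {"sample_id", "age", "sex"}


def read_header(tsv_text: str) -> List[str]:
    """Return the column names from the first row of tab-separated text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return next(reader, [])


def unlisted_columns(header: Iterable[str], dictionary_vars: Set[str]) -> List[str]:
    """Return the columns that do not appear in the data dictionary."""
    return [column for column in header if column not in dictionary_vars]


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = "sample_id\tage\tsmoking_status\nS1\t54\tnever\n"
    extra = unlisted_columns(read_header(example), DICTIONARY_VARS)
    if extra:
        print("Columns not in the data dictionary:", ", ".join(extra))
```

Any column reported by such a check would be one to raise with the contact address given above before submitting.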
Yes, this is entirely possible. We suggest creating one dataset for submission per batch of 500 samples.
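The batching suggested above could be sketched in Python as follows. This is only an illustration: the function name `batch_samples` and the sample ID format are assumptions made for the example, and the sketch says nothing about how the datasets are actually created or uploaded.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 500  # batch size suggested in the text above


def batch_samples(
    sample_ids: Sequence[str], batch_size: int = BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` sample IDs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(sample_ids), batch_size):
        yield list(sample_ids[start:start + batch_size])


if __name__ == "__main__":
    # Hypothetical sample IDs, for illustration only.
    ids = [f"SAMPLE_{i:05d}" for i in range(1, 1201)]
    for number, batch in enumerate(batch_samples(ids), start=1):
        print(f"dataset {number}: {len(batch)} samples")
```

With 1,200 samples this yields two full batches of 500 and a final batch of 200, so the last dataset is simply smaller than the others.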