1. Introduction

This section gives a brief introduction to PGIDB. It describes what information PGIDB provides and how to cite PGIDB.

1.1 About PGIDB

PGIDB is a user-friendly platform designed to collect globally available genome sequencing data to construct a high-quality pig reference panel. In the current version, the reference panel integrates 22993 samples, covering a total of 63 breeds (9 outgroups).

Services provided
  • Discover the diversity of pig breeds in our Library.
  • Uncover genetic potential through our Imputation scenarios.
  • Search for breed-specific variants and potentially deleterious variants in the "Search" module.
  • Dive deep into breed-specific haplotype blocks in our Haplotype section.
  • Download breed-specific VCF files from our resources.
  • Submit data links to our database in the "Submit" module.

1.2 How to cite PGIDB

Kaili Zhang, Jiete Liang, Yuhua Fu, Jinyu Chu, Liangliang Fu, Yongfei Wang, Wangjiao Li, You Zhou, Jinhua Li, Xiaoxiao Yin, Haiyan Wang, Xiaolei Liu, Chunyan Mou, Chonglong Wang, Heng Wang, Xinxing Dong, Dawei Yan, Mei Yu, Shuhong Zhao, Xinyun Li, Yunlong Ma, AGIDB: a versatile database for genotype imputation and variant decoding across species, Nucleic Acids Research, Volume 52, Issue D1, 5 January 2024, Pages D835–D849, https://doi.org/10.1093/nar/gkad913

2. Tutorial

This part provides the operation guide for each functional module of PGIDB. Important steps are marked with red boxes and serial numbers.

2.1 Library

This module is a pig library that systematically introduces pig genome information, breeds, and published pig databases. Breeds are grouped by classification. Clicking the name of a breed shows more about its appearance and characteristics.

2.2 Genotype Imputation

This module provides a free online imputation service, including four imputation scenarios:

1. Imputation from Chip to High-coverage sequencing.

2. Imputation from Low-coverage to High-coverage sequencing.

3. Imputation from Chip to Chip, to realize map conversion between chips.

4. Imputation of the specified map.

Four imputation scenarios

2.2.1 Chip-to-sequence

This module imputes chip data up to sequencing data using Beagle 5.4. The target data set can be uploaded in VCF or compressed VCF format, and the imputed or phased file is downloaded automatically through the browser.

Imputation

Imputation steps: submit the target file to be imputed (file size must be less than 50 MB); enter the genomic region of interest; select the default reference panel or a subset of the reference panel; specify the phasing and imputation requirements; click the "Submit" button. Users can also click the "example" button to try the imputation function with the example file.

The imputation result file is downloaded directly to the user's local machine.

The reference panel defaults to "AGIDB", that is, all pig samples provided by the database. Users can also select a subset of the reference panel by breed.
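The upload constraints described above (VCF or compressed VCF, smaller than 50 MB) can be checked locally before submitting. The following is a minimal sketch, not part of PGIDB: the function names are hypothetical, "50 MB" is assumed to mean 50 * 1024 * 1024 bytes, and the "chrom:start-end" region syntax is an assumption, since the exact format expected by the web form is not stated here.

```python
import os
import re

# Assumption: "50 MB" is taken as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024
ALLOWED_SUFFIXES = (".vcf", ".vcf.gz")

# Hypothetical region syntax "chrom:start-end"; the PGIDB form may differ.
REGION_RE = re.compile(r"^(\w+):(\d+)-(\d+)$")


def check_target_file(path):
    """Return a list of problems found with a target file (empty if none)."""
    problems = []
    if not path.lower().endswith(ALLOWED_SUFFIXES):
        problems.append("file is not .vcf or .vcf.gz")
    if os.path.getsize(path) >= MAX_UPLOAD_BYTES:
        problems.append("file is 50 MB or larger")
    return problems


def parse_region(text):
    """Split 'chrom:start-end' into (chrom, start, end), or raise ValueError."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("region start is greater than end")
    return chrom, start, end
```

Calling `check_target_file("target.vcf.gz")` on an existing file returns an empty list when both checks pass, so the file can be submitted as is.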

2.2.2 Low-coverage to High-coverage sequence

This module uses GLIMPSE2 to impute low-depth sequencing data up to high-depth sequencing. Target files can be uploaded in two forms (VCF/BCF or BAM/CRAM), and the result file is downloaded automatically through the browser.

Scenario 1: The target file is in VCF/BCF format.

Note: Users can click "example.vcf.gz" to download the example VCF file provided for each species.

The imputation result file is downloaded directly to the user's local machine.

Scenario 2: The target file is in BAM/CRAM format.

This module accepts BAM/CRAM files from a single sample or from multiple samples in one upload. When uploading files from multiple samples, a BAM/CRAM list file must be submitted at the same time. For the format of the list file, see "List example": one file per line, with an optional second column (space separated) giving the sample name; otherwise the name of the file is used.
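The list-file layout described above (one file per line, optional space-separated sample name in a second column) can be generated with a short script. This is a minimal sketch; the function name `write_bam_list` and the example file names are hypothetical, and the "List example" on the website remains the authoritative format.

```python
def write_bam_list(entries, list_path):
    """Write a BAM/CRAM list file.

    entries: iterable of (file_name, sample_name) pairs; sample_name may be
    None, in which case only the file name is written on that line.
    """
    lines = []
    for file_name, sample_name in entries:
        # Columns are space separated, so spaces inside a field would be ambiguous.
        if " " in file_name or (sample_name and " " in sample_name):
            raise ValueError(f"spaces are not allowed: {file_name!r}, {sample_name!r}")
        if sample_name:
            lines.append(f"{file_name} {sample_name}")
        else:
            lines.append(file_name)
    with open(list_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines
```

For example, `write_bam_list([("s1.bam", "pig_A"), ("s2.cram", None)], "bam.list")` writes two lines: `s1.bam pig_A` and `s2.cram`.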

The imputation guide is shown in the figure below:

The imputation result file is downloaded directly to the user's local machine.

2.2.3 Chip to Chip

This module converts genotype data between different chip versions. By selecting the chip version of interest, users obtain a VCF file on the map of that chip. We also provide a Venn diagram of the markers shared between versions for reference.

Step 1: Upload a VCF file to be converted or click the "example" button to use an example file
Step 2: Choose a chip version
Step 3: Enter a genomic region of interest
Step 4: Enter the effective population size
Step 5: Click the "Submit" button to perform chip version conversion
Step 6: Download the converted file
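To check locally how much an uploaded file and a converted file overlap, the marker positions in two VCF files can be compared in the same spirit as the Venn diagram mentioned above. This is a minimal sketch with hypothetical function names; it assumes markers are matched by chromosome and position only, which may not be how PGIDB builds its own diagram.

```python
def read_sites(vcf_lines):
    """Collect (chrom, pos) pairs from the data lines of an uncompressed VCF."""
    sites = set()
    for line in vcf_lines:
        # Skip blank lines and header lines (those starting with '#').
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        sites.add((chrom, int(pos)))
    return sites


def venn_counts(sites_a, sites_b):
    """Return the three region sizes of a two-set Venn diagram."""
    return {
        "only_a": len(sites_a - sites_b),
        "shared": len(sites_a & sites_b),
        "only_b": len(sites_b - sites_a),
    }
```

With two open text files, `venn_counts(read_sites(file_a), read_sites(file_b))` gives the number of markers unique to each file and the number they share.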

2.2.4 To Map

This module provides an imputation function for a specified map. By submitting a Map file, the user can obtain the VCF file corresponding to the map.

Step 1: Upload a Map file or click the "example" button to use an example file
Step 2: Upload a VCF file or click the "example" button to use an example file
Step 3: Enter a genomic region of interest
Step 4: Enter the effective population size
Step 5: Click the "Submit" button to get the imputation result of the specified map
Step 6: Download

2.3 Breed-specific Variant

This module provides a breed-specific variant search. Users can query breed-specific variants by breed and genomic region. Search results are displayed in a comprehensive table.

Step 1: Breed-specific Variant Search
Step 2: Breed-specific Variant
Step 3: Allele Frequency Distribution

2.4 Deleterious Variant

This module provides a deleterious variant search. Users can query deleterious variants by breed, prediction software, and score. Search results are displayed in a comprehensive table.

Step 1: Deleterious Variant Search
Step 2: Deleterious Variant
Step 3: Deleterious Variant Score

2.5 Haplotype Block

This module provides a haplotype block search. Users can query haplotype blocks by breed and genomic region. Search results are displayed in a comprehensive table.

Step 1: Haplotype Block Search
Step 2: Haplotype and frequency

Click on a BLOCK ID in the second column to display its frequency data.

2.6 Download

In the "Download" module, we provide a compressed VCF file for each breed and list the size of each file.
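After downloading, a compressed VCF file can be inspected without unpacking it. The sketch below counts variant records per chromosome using only the Python standard library; the function name is hypothetical, and it assumes the file is gzip-compatible (which includes bgzip-compressed VCF files).

```python
import gzip
from collections import Counter


def count_records_per_chrom(vcf_gz_path):
    """Count VCF data lines per chromosome in a gzip-compressed VCF file."""
    counts = Counter()
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            # Header lines start with '#'; everything else is a variant record.
            if line.startswith("#") or not line.strip():
                continue
            chrom = line.split("\t", 1)[0]
            counts[chrom] += 1
    return counts
```

The returned `Counter` maps each chromosome name to its number of records, which is a quick sanity check that a download completed and covers the expected chromosomes.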

2.7 Submit

In the "Submit" module, users can submit data links to our database.

2.8 About AGIDB

Animal Genotype Imputation Database

Click "AGIDB" on the "Home" page to jump to the AGIDB website.

3. FAQs

Q1. How do I use PGIDB?

A1: You can quickly learn about the core functions of PGIDB through the Home page, and learn about each functional module of PGIDB in more detail through the Tutorial on the Help page.

Q2. How should I contact you if I find a bug or have a suggestion about the database?

A2: You can reach us at the email addresses listed under "Contact". You are welcome to contact us anytime.

Q3. What is the future plan of PGIDB?

A3: PGIDB will continue to collect more sample data to increase genetic diversity, enrich the imputation scenarios and applications in the database, and overcome some technical limitations of the imputation platform to deliver more practical and efficient imputation services.

Q4. How do I cite PGIDB?

A4: Kaili Zhang, Jiete Liang, Yuhua Fu, Jinyu Chu, Liangliang Fu, Yongfei Wang, Wangjiao Li, You Zhou, Jinhua Li, Xiaoxiao Yin, Haiyan Wang, Xiaolei Liu, Chunyan Mou, Chonglong Wang, Heng Wang, Xinxing Dong, Dawei Yan, Mei Yu, Shuhong Zhao, Xinyun Li, Yunlong Ma, AGIDB: a versatile database for genotype imputation and variant decoding across species, Nucleic Acids Research, Volume 52, Issue D1, 5 January 2024, Pages D835–D849, https://doi.org/10.1093/nar/gkad913

4. Updates

  • 2024/1/5
  • The manuscript was officially published online.
  • 2023/10/27
  • The manuscript was online. Links to the article
  • 2023/5/4
  • PGIDB was released to the Internet environment.
  • 2023/3/9
  • Update Beagle to the latest version, and update the low-depth sequencing imputation software GLIMPSE1 to GLIMPSE2.
  • 2022/11/7
  • Improve the functions of each module and provide VCF files for download.
  • 2022/9/23
  • Complete the SNP Search Module.
  • 2022/9/15
  • Complete the imputation module.
  • 2022/8/16
  • The haplotype analysis results of each breed were integrated into PGIDB in the form of haplotype modules to realize the retrieval and visualization of haplotype blocks.
  • 2022/7/1
  • Customized SNP retrieval and imputation module for pig, designed pig imputation platform (PGIDB).
  • 2021/11/23
  • Complete data cleaning and phasing, and determine the haplotype analysis scheme.
  • 2021/10/20
  • Collection, processing, and genotyping of pig whole-genome sequencing (WGS) data.

5. Contact

Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, 430070, PR China

Email:

The mailing lists are in no particular order, sorted by first and last name.

Yunlong Ma (Yunlong.Ma@mail.hzau.edu.cn)

Kaili Zhang (kelly1153793935@163.com)