While merging the vcf output from our sumstat merger, the following error occurs:
The REF prefixes differ: G vs A (1,1)
Failed to merge alleles at 1:896680 in /mnt/mfs/statgen/snuc_pseudo_bulk/eight_tissue_analysis/8test/ALL_Ast_End_Exc_Inh_Mic_OPC_Oli.1/ALL.log2cpm.bed.chr1.norminal.cis_long_table.vcf.gz
This is caused by:
1 896680 chr1:896680_A_G A G . PASS GENE=LINC01409 STAT:SE:P:TSS_D:AF:MA_SAMPLES:MA_COUNT -0.033299796:0.0425892:0.4348487535797959:117933:0.34819278:249:289
1 896680 chr1:896680_G_T G T . PASS GENE=LINC01409 STAT:SE:P:TSS_D:AF:MA_SAMPLES:MA_COUNT -0.12872803:0.12619513:0.3084487106650284:117933:0.024096385:20:20
1 896680 chr1:896680_A_G A G . PASS GENE=LINC01128 STAT:SE:P:TSS_D:AF:MA_SAMPLES:MA_COUNT -0.0072335657:0.040994305:0.8600473508955363:71542:0.34819278:249:289
1 896680 chr1:896680_G_T G T . PASS GENE=LINC01128 STAT:SE:P:TSS_D:AF:MA_SAMPLES:MA_COUNT -0.019534744:0.121549845:0.8724180132737893:71542:0.024096385:20:20
where we have two SNPs: chr1:896680_A_G and chr1:896680_G_T.
This can potentially be fixed by +fixref from bcftools, but it is unclear what will happen to our data in the FORMAT field.
Alternatively, this issue forces us to produce a high-quality TARGET file with only one REF per position and to use that as our template.
The sumstat merger is otherwise error-free.
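To see how widespread the conflict is before picking a fix, the positions that carry more than one REF allele can be listed directly from the records. The following is only a minimal sketch: `find_multi_ref_positions` is a hypothetical helper, not part of the sumstat merger, and it assumes whitespace-delimited records with CHROM, POS, ID and REF in the first four columns, as in the lines quoted above.

```python
from collections import defaultdict


def find_multi_ref_positions(vcf_lines):
    """Return {(chrom, pos): sorted REF alleles} for positions with more than one REF.

    Hypothetical helper for illustration; header lines (starting with '#')
    and blank lines are skipped.
    """
    refs = defaultdict(set)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Only the first four columns are needed: CHROM, POS, ID, REF.
        chrom, pos, _variant_id, ref = line.split(None, 4)[:4]
        refs[(chrom, int(pos))].add(ref)
    return {key: sorted(alleles) for key, alleles in refs.items() if len(alleles) > 1}


# Records abbreviated from the issue text (FORMAT and sample columns dropped).
records = [
    "1 896680 chr1:896680_A_G A G . PASS GENE=LINC01409",
    "1 896680 chr1:896680_G_T G T . PASS GENE=LINC01409",
    "1 896680 chr1:896680_A_G A G . PASS GENE=LINC01128",
    "1 896680 chr1:896680_G_T G T . PASS GENE=LINC01128",
]

conflicts = find_multi_ref_positions(records)
print(conflicts)
```

Running the same function over the full set of records gives the list of positions that would trip the merge.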
Fortunately, at the moment, all instances of multiple REF alleles follow the pattern shown above, i.e. the SNPs at the position share at least one base.
This issue may come from the following setup in our vcf_qc module.
# when incorrect or missing REF allele is encountered: warn (w), no left normalization is done.
bcftools norm -d exact -N --check-ref w -f ${reference_genome} -Oz --threads ${numThreads} |\
The problem should disappear once we have a good target file that has only one REF per position.
Two things need to be done:
Fix vcf_qc so that this issue disappears: for future users.
This should be simple enough based on the following bcftools norm option; testing is pending.
-c, --check-ref e|w|x|s
what to do when incorrect or missing REF allele is encountered: exit (e), warn (w), exclude (x), or set/fix (s) bad sites. The w option can be combined with x and s. Note that s can swap alleles and will update genotypes (GT) and AC counts, but will not attempt to fix PL or other fields. Also note, and this cannot be stressed enough, that s will NOT fix strand issues in your VCF, do NOT use it for that purpose!!! (Instead see http://samtools.github.io/bcftools/howtos/plugin.af-dist.html and <http://samtools.github.io/bcftools/howtos/plugin.fixref.html>.)
Create the correct sumstat reference: for us, because we don't want to spend a couple of days redoing all the genotype processing.
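One way to picture the second task is as a filter that keeps only the records whose REF agrees with a trusted per-position base, which is roughly the effect the exclude (x) mode described above would have. This is a rough sketch under stated assumptions, not the actual plan: `split_by_reference` and the `reference_base` lookup are hypothetical, the lookup would in practice come from the reference genome, and the base chosen for 1:896680 below is a placeholder because the thread does not say which allele is the true reference.

```python
def split_by_reference(vcf_lines, reference_base):
    """Split records into (kept, dropped) by comparing REF to a trusted base.

    Hypothetical helper. `reference_base` maps (chrom, pos) to the expected
    single-base REF. Records at positions missing from the lookup are kept
    unchanged; header and blank lines are ignored.
    """
    kept, dropped = [], []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _variant_id, ref = line.split(None, 4)[:4]
        expected = reference_base.get((chrom, int(pos)))
        if expected is None or ref == expected:
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped


# Placeholder lookup: "G" is assumed purely for illustration.
reference_base = {("1", 896680): "G"}

records = [
    "1 896680 chr1:896680_A_G A G . PASS GENE=LINC01409",
    "1 896680 chr1:896680_G_T G T . PASS GENE=LINC01409",
]

kept, dropped = split_by_reference(records, reference_base)
print(len(kept), "kept;", len(dropped), "dropped")
```

Dropping records loses their summary statistics, so whether mismatching records should be excluded or instead have their alleles swapped is a separate decision that this sketch does not settle.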