Description
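Hello, my genome is very large: each chromosome is over 1 Gb long, so when I run Centromics the blastn step always ends with a core dump. I therefore split the chromosomes into pieces before running, but because there are so many contigs, the run was interrupted at the circos step. Is the file centomics.candidate_peaks.bed the final result? If so, how do I use this file to locate the centromeres? Thank you!
Part of this file is shown below: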
#chrom start end data_from peak_value sum_value
1000 80000 1930000 TR-CL1 0.0 0.0
1000 80000 1930000 TR-CL2 0.0 0.0
1000 80000 1930000 TR-CL3 0.0 0.0
1000 80000 1930000 TR-CL4 0.0 0.0
1023 560000 2120000 TR-CL1 0.0 0.0
1023 560000 2120000 TR-CL2 0.0 0.0
1023 560000 2120000 TR-CL3 0.0 0.0
1023 560000 2120000 TR-CL4 0.0 0.0
10291 0 30000 TR-CL1 0.0 0.0
10291 0 30000 TR-CL2 0.0 0.0
10291 0 30000 TR-CL3 0.0 0.0
10291 0 30000 TR-CL4 0.0 0.0
1038 330000 370000 TR-CL1 0.0 0.0
1038 330000 370000 TR-CL2 0.0 0.0
1038 330000 370000 TR-CL4 0.0 0.0
1076 40000 1020000 TR-CL6 6818.0 30136.0
1076 40000 1430000 TR-CL1 0.0 0.0
1076 40000 1430000 TR-CL2 0.0 0.0
1076 40000 1430000 TR-CL3 0.0 0.0
1076 40000 1430000 TR-CL4 0.0 0.0
1144 1630000 4980000 TR-CL1 0.0 0.0
1144 1630000 4980000 TR-CL2 0.0 0.0
1144 1630000 4980000 TR-CL3 0.0 0.0
1144 1630000 4980000 TR-CL4 0.0 0.0
1153 590000 6450000 TR-CL1 0.0 0.0
Is the interval 1076:40000-1020000 here a potential centromere?
Activity
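zhangrengang commented on Jul 18, 2023
Yes. You can take a look at the TR-CL6 sequence. You can also convert coordinates such as 1076 40000 1020000 into chromosome coordinates and then plot them with circos (replace the circos data files from the original run, then rerun circos).
SC-Duan commented on Jul 18, 2023
Hello, after converting the coordinates, many of the intervals with a peak value are not contiguous. How should the boundaries be defined in this case? For example: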
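The coordinate conversion mentioned above could be sketched roughly as follows. This assumes each chunk name maps to a parent chromosome and the offset at which the chunk starts on that chromosome; the `CHUNK_OFFSETS` table and its values are hypothetical and would have to come from however the chromosomes were split. Output is tab-separated.

```python
# Hypothetical chunk table: chunk name -> (chromosome, offset of the chunk start).
# The offset below is invented for illustration; use the values from your own split.
CHUNK_OFFSETS = {
    "1076": ("1", 950_000_000),
}

def chunk_to_chrom(line, offsets):
    """Convert one candidate-peaks line from chunk to chromosome coordinates."""
    fields = line.split()
    chunk, start, end = fields[0], int(fields[1]), int(fields[2])
    chrom, offset = offsets[chunk]
    # Shift both ends by the chunk offset; keep the remaining columns unchanged.
    return "\t".join([chrom, str(start + offset), str(end + offset)] + fields[3:])

def convert_bed(lines, offsets):
    """Convert all data lines; header and blank lines are passed through."""
    out = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            out.append(line.rstrip("\n"))
        else:
            out.append(chunk_to_chrom(line, offsets))
    return out

example = [
    "#chrom start end data_from peak_value sum_value",
    "1076 40000 1020000 TR-CL6 6818.0 30136.0",
]
for row in convert_bed(example, CHUNK_OFFSETS):
    print(row)
```

A chunk that is missing from the table raises a `KeyError`, which is probably preferable to silently writing unconverted coordinates.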
#chrom start end data_from peak_value sum_value
1 309953951 314813951 TR-CL1 0.0 0.0
1 309953951 314813951 TR-CL2 0.0 0.0
1 309953951 314813951 TR-CL3 0.0 0.0
1 309953951 314813951 TR-CL4 0.0 0.0
1 388651275 393421275 TR-CL1 0.0 0.0
1 388651275 393421275 TR-CL2 0.0 0.0
1 388651275 393421275 TR-CL3 0.0 0.0
1 388651275 393421275 TR-CL4 0.0 0.0
1 500614965 504334965 TR-CL1 0.0 0.0
1 500614965 504334965 TR-CL2 0.0 0.0
1 500614965 504334965 TR-CL3 0.0 0.0
1 500614965 504334965 TR-CL4 0.0 0.0
1 640461008 642391008 TR-CL1 0.0 0.0
1 640461008 642391008 TR-CL2 0.0 0.0
1 640461008 642391008 TR-CL3 10022.0 281630.0
1 640461008 642391008 TR-CL4 0.0 0.0
1 640461008 642391008 TR-CL6 0.0 0.0
1 950891082 951021082 TR-CL1 0.0 0.0
1 950891082 951021082 TR-CL2 0.0 0.0
1 950891082 951021082 TR-CL3 0.0 0.0
1 950891082 951021082 TR-CL4 0.0 0.0
1 950901082 951011082 TR-CL6 190814.0 1245689.0
1 951160717 951450717 TR-CL1 0.0 0.0
1 951160717 951450717 TR-CL2 0.0 0.0
1 951160717 951450717 TR-CL3 0.0 0.0
1 951160717 951450717 TR-CL4 0.0 0.0
1 951160717 951450717 TR-CL6 192670.0 2703144.0
1 971478525 971548525 TR-CL1 0.0 0.0
1 971478525 971548525 TR-CL2 0.0 0.0
1 971478525 971548525 TR-CL3 0.0 0.0
1 971478525 971548525 TR-CL4 0.0 0.0
1 971488525 971548525 TR-CL6 141074.0 427004.0
1 971549039 971979039 TR-CL6 194289.0 2811413.0
1 971549039 972009039 TR-CL1 0.0 0.0
1 971549039 972009039 TR-CL2 0.0 0.0
1 971549039 972009039 TR-CL3 0.0 0.0
1 971549039 972009039 TR-CL4 0.0 0.0
1 972007597 972357597 TR-CL1 0.0 0.0
1 972007597 972357597 TR-CL2 0.0 0.0
1 972007597 972357597 TR-CL3 0.0 0.0
1 972007597 972357597 TR-CL4 0.0 0.0
1 972007597 972357597 TR-CL6 190695.0 2640124.0
1 972353085 975623085 TR-CL1 0.0 0.0
1 972353085 975623085 TR-CL2 0.0 0.0
1 972353085 975623085 TR-CL3 0.0 0.0
1 972353085 975623085 TR-CL4 0.0 0.0
1 972353085 975623085 TR-CL6 191107.0 23607978.0
1 975620783 980240783 TR-CL1 0.0 0.0
1 975620783 980240783 TR-CL2 0.0 0.0
1 975620783 980240783 TR-CL3 0.0 0.0
1 975620783 980240783 TR-CL4 0.0 0.0
1 975620783 980240783 TR-CL6 194455.0 36322240.0
1 980262685 981932685 TR-CL1 0.0 0.0
1 980262685 981932685 TR-CL2 0.0 0.0
1 980262685 981932685 TR-CL3 0.0 0.0
1 980262685 981932685 TR-CL4 0.0 0.0
1 980312685 981932685 TR-CL6 190616.0 10884563.0
1 981935811 982165811 TR-CL1 0.0 0.0
1 981935811 982165811 TR-CL2 0.0 0.0
1 981935811 982165811 TR-CL3 0.0 0.0
1 981935811 982165811 TR-CL4 0.0 0.0
1 981935811 982165811 TR-CL6 188429.0 1239067.0
1 982163375 982633375 TR-CL6 190661.0 3090747.0
1 982163375 982643375 TR-CL1 0.0 0.0
1 982163375 982643375 TR-CL2 0.0 0.0
1 982163375 982643375 TR-CL3 0.0 0.0
1 982163375 982643375 TR-CL4 0.0 0.0
1 982638799 983158799 TR-CL1 0.0 0.0
1 982638799 983158799 TR-CL2 0.0 0.0
1 982638799 983158799 TR-CL3 0.0 0.0
1 982638799 983158799 TR-CL4 0.0 0.0
1 982638799 983158799 TR-CL6 194106.0 3914810.0
1 983179745 983439745 TR-CL6 190706.0 2125908.0
1 983179745 983449745 TR-CL1 0.0 0.0
1 983179745 983449745 TR-CL2 0.0 0.0
1 983179745 983449745 TR-CL3 0.0 0.0
1 983179745 983449745 TR-CL4 0.0 0.0
1 983500215 983640215 TR-CL6 189177.0 1204249.0
1 983500215 983650215 TR-CL1 0.0 0.0
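I checked the whole file, and almost every position with a nonzero peak value belongs to CL6. These peak-value intervals all lie in repeat-rich, gene-poor regions spanning roughly 10-50 Mb. But how should I use these peak values to determine the complete centromere boundaries? Thank you!
zhangrengang commented on Jul 18, 2023
What causes the discontinuity? I think taking the interval from the minimum to the maximum coordinate should be fine.
SC-Duan commented on Jul 18, 2023
OK, I will take the interval from the minimum to the maximum. Thank you!
Blosers commented on Mar 16, 2024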
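One way to take the minimum-to-maximum interval is sketched below. It is only an illustration, not part of Centromics: the function name `peak_span` is hypothetical, and it assumes the whitespace-separated column layout shown in this thread. Only rows of the chosen cluster with a peak value above zero are used.

```python
def peak_span(lines, cluster):
    """Return {chrom: (min_start, max_end)} over rows of `cluster` with peak_value > 0."""
    spans = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip the header and blank lines
        chrom, start, end, source, peak = line.split()[:5]
        if source != cluster or float(peak) <= 0:
            continue  # ignore other clusters and zero-valued rows
        start, end = int(start), int(end)
        lo, hi = spans.get(chrom, (start, end))
        spans[chrom] = (min(lo, start), max(hi, end))
    return spans

# A few rows copied from the converted table above.
rows = [
    "1 640461008 642391008 TR-CL3 10022.0 281630.0",
    "1 950901082 951011082 TR-CL6 190814.0 1245689.0",
    "1 975620783 980240783 TR-CL6 194455.0 36322240.0",
    "1 983500215 983640215 TR-CL6 189177.0 1204249.0",
    "1 983500215 983650215 TR-CL1 0.0 0.0",
]
print(peak_span(rows, "TR-CL6"))
```

Restricting the span to a single cluster keeps an isolated peak from another cluster (such as the TR-CL3 row near 640 Mb) from stretching the interval. An isolated outlier within the same cluster would still widen it, so the result is worth checking against the circos plot.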
Traceback (most recent call last):
  File "/home/master_dingyuanhao/miniconda3/envs/hsc/bin/centromics", line 33, in <module>
    sys.exit(load_entry_point('Centromics==0.3', 'console_scripts', 'centromics')())
  File "/home/master_dingyuanhao/miniconda3/envs/hsc/bin/centromics", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/home/master_dingyuanhao/miniconda3/envs/hsc/lib/python3.10/importlib/metadata/__init__.py", line 171, in load
    module = import_module(match.group('module'))
  File "/home/master_dingyuanhao/miniconda3/envs/hsc/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/master_dingyuanhao/miniconda3/envs/hsc/lib/python3.10/site-packages/Centromics-0.3-py3.10.egg/Centromics/pipe.py", line 7, in <module>
    from REPcluster.Mcl import MclGroup
ModuleNotFoundError: No module named 'REPcluster'
Hello, how should I resolve this error?
zhangrengang commented on Mar 16, 2024
This is because REPcluster was not installed correctly. Did you pass the --recurse-submodules option when running git clone? Did the installation report that REPcluster was installed successfully?
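A quick way to check whether the module is visible to the interpreter in the active environment is sketched below. It uses only the standard library; the helper name `module_available` is made up for this example, and the module names are the ones that appear in the traceback above.

```python
import importlib.util

def module_available(name):
    """Return True if a top-level module `name` can be located by this interpreter."""
    return importlib.util.find_spec(name) is not None

# Module names taken from the traceback: Centromics imports REPcluster at startup.
for name in ("REPcluster", "Centromics"):
    status = "found" if module_available(name) else "MISSING"
    print(f"{name}: {status}")
```

If REPcluster is reported as missing, the submodule was most likely not fetched or not installed into this environment.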
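Blosers commented on Mar 17, 2024
zhangrengang commented on Mar 17, 2024
You need to interpret this together with the circos plot: regions that look consistent across all chromosomes can be assumed to be centromeres, and you can then look up the corresponding coordinates in the bed file.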