Cmap data format #3

Open
TahaAslani opened this issue Mar 22, 2019 · 4 comments

Comments

@TahaAslani

Hi,

I am trying to create a cmap file from other DNA reads (not optical mapping) so that I can blast it against optical mapping data. I can fill the first five columns of the cmap file. I have a few questions and I would appreciate it if you could help me.

I don't know how to fill columns like StdDev, Coverage, Occurrence. Do you have a file specification document that explains the cmap file format?

Moreover, what is the meaning of -1? Does it mean that we don't know the value? What values do you suggest for cells in which the data is missing?

Also, does OMBlast accept cmap version 0.1, or must the input be 0.2 or above?
Thank you in advance for your help. I look forward to your answers.

@aldenleung
Collaborator

Hi. Please refer to the specification from the following: https://bionanogenomics.com/wp-content/uploads/2017/03/30039-CMAP-File-Format-Specification-Sheet.pdf

Please let me know if you still have any questions about the cmap file.
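For readers who want to inspect the columns of an existing CMAP file before filling in their own, a minimal Python sketch follows. It assumes the file is tab-separated, that header and comment lines start with `#`, and that one line starting with `#h` names the columns. The function name `read_cmap` is hypothetical, and the sample rows (including the values in StdDev, Coverage, and Occurrence) are placeholders for illustration only, not recommended values; the specification linked above remains the authority on what each column means.

```python
import io

def read_cmap(handle):
    """Parse CMAP-like text into a list of dicts keyed by column name.

    Assumption: columns are tab-separated and named by a '#h' header line.
    """
    columns = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#h"):
            # Drop the '#h' marker and keep the column names.
            columns = line[2:].split()
            continue
        if line.startswith("#"):
            continue  # other header/comment lines
        if columns is None:
            raise ValueError("no '#h' header line found before data rows")
        rows.append(dict(zip(columns, line.split("\t"))))
    return rows

# Hypothetical two-site map; all values are placeholders.
sample = (
    "# CMAP File Version:\t0.1\n"
    "#h CMapId\tContigLength\tNumSites\tSiteID\tLabelChannel"
    "\tPosition\tStdDev\tCoverage\tOccurrence\n"
    "1\t5000.0\t2\t1\t1\t1200.0\t1.0\t1\t1\n"
    "1\t5000.0\t2\t2\t1\t3400.0\t1.0\t1\t1\n"
)
rows = read_cmap(io.StringIO(sample))
print(rows[0]["Position"], rows[1]["SiteID"])
```

Values are kept as strings so that nothing is silently reinterpreted; convert them to numbers only once the column meanings have been checked against the specification.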

As CMAP files contain a lot of information, you may consider using other file formats for your inputs:
http://opticalmapping.info/tutorials/file-formats/

Alternatively, I would suggest using OMTools (https://github.com/TF-Chan-Lab/OMTools) to create virtual optical maps from your DNA reads and your expected labeling sites. For example, you can use

java -jar OMTools.jar FastaToOM --fastain DNAread.fa --refmapout virtual_map.cmap --enzyme BspQI
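To make the idea of a virtual map concrete, here is a minimal Python sketch of in-silico labeling: it scans a sequence for an enzyme's recognition motif on both strands and reports the 1-based start coordinate of each hit. This is not OMTools' implementation. `find_label_sites` is a hypothetical helper, the motif GCTCTTC is assumed here as the BspQI recognition sequence, and reporting the motif start rather than the exact nick position is a simplification.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_label_sites(seq, motif="GCTCTTC"):
    """Return sorted 1-based start positions of the motif on either strand.

    Assumption: GCTCTTC is the BspQI recognition sequence; the motif start
    is used as the label position, which ignores the exact nick offset.
    """
    seq = seq.upper()
    motifs = {motif.upper(), reverse_complement(motif)}
    sites = set()
    for m in motifs:
        start = seq.find(m)
        while start != -1:
            sites.add(start + 1)            # convert to 1-based coordinate
            start = seq.find(m, start + 1)  # allow overlapping hits
    return sorted(sites)

# Hypothetical read with one forward-strand and one reverse-strand site.
read = "AAAA" + "GCTCTTC" + "TTTTT" + "GAAGAGC" + "AA"
print(find_label_sites(read))
```

A sequence with very few hits yields a map with too few signals to align, so counting the sites per read this way can be a quick sanity check before running the conversion.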

@TahaAslani
Author

Thank you very much for your response.
OMTools is a great package. It helped a lot!
Now my problem is that I keep receiving "unmapped" as the output of OMBlast, even when I try to blast one of these virtual cmap files against itself (where there should be a match). Any suggestions? Maybe I should try other input parameters in OMBlast, because right now I am just using the default settings.
Many thanks!

@aldenleung
Collaborator

There could be many reasons, depending on the nature of virtual cmap files you provided as reference/query. For example, you may have too few signals or there exist many similar patterns in your virtual maps. You could visit our website http://opticalmapping.info/tutorials/alignment/ for more details on parameter tuning. To quickly get some non-specific outputs, you could tune the "minscore", "minjoinscore", and "minconf" parameters.

@TahaAslani
Author

Thanks a lot for your detailed answer, Alden. I apologize for bothering you frequently. I created this virtual cmap file by combining all of the contigs with a large number of nick sites from different enzymes. Even though there are a reasonable number of nick sites in the contigs (for example, the first contig has 39 sites), it still cannot be blasted against itself. What do you think could be the reason? Note that the distance between adjacent nick sites is very small (something like 200). Do you think this might prevent the algorithm from generating matched outputs?
Many thanks.
VIRTUALCMAP.zip
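
One way to examine the spacing concern raised above is to measure the gaps between adjacent sites directly. The Python sketch below takes a list of site positions and reports what fraction of neighboring gaps fall under a chosen threshold. `adjacent_gaps` and `close_gap_fraction` are hypothetical helpers, and the threshold of 1000 (in the same units as the positions) is only an illustrative assumption, not a documented OMBlast limit.

```python
def adjacent_gaps(positions):
    """Return the distances between consecutive sites after sorting."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def close_gap_fraction(positions, min_gap=1000):
    """Return the fraction of adjacent gaps smaller than min_gap.

    Assumption: min_gap=1000 is an arbitrary illustrative threshold.
    """
    gaps = adjacent_gaps(positions)
    if not gaps:
        return 0.0  # fewer than two sites, so there are no gaps to judge
    return sum(1 for g in gaps if g < min_gap) / len(gaps)

# Hypothetical site positions with several gaps of about 200.
positions = [1000, 1200, 1400, 5000, 5200]
print(adjacent_gaps(positions))
print(close_gap_fraction(positions))
```

If most gaps in a map are far below the spacing typical of the optical mapping data it is compared against, that is worth reporting alongside the attached file when asking about alignment failures.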
