rGFAΒΆ
rGFA (https://github.com/lh3/gfatools/blob/master/doc/rGFA.md)
is a subset of GFA1, in which only particular line types (S and L)
are allowed, and the S lines are required to contain the tags
SN
(of type Z
), SO
and SR
(of type i
).
When working with rGFA files, it is convenient to use the dialect="rgfa"
option in the constructor Gfa()
and in
func:Gfa.from_file()
.
This ensures that additional validations are performed: GFA version must be 1,
only rGFA-compatible lines (S,L) are allowed and that the required tags are
required (with the correct datatype). The validations can also be executed
manually using Gfa.validate_rgfa()
.
Furthermore, the stable_sequence_names
attribute of the GFA objects
returns the set of stable sequence names contained in the SN
tags
of the segments.
>>> g = gfapy.Gfa("S\tS1\tCTGAA\tSN:Z:chr1\tSO:i:0\tSR:i:0", dialect="rgfa")
>>> g.segment_names
['S1']
>>> g.stable_sequence_names
['chr1']
>>> g.add_line("S\tS2\tACG\tSN:Z:chr1\tSO:i:5\tSR:i:0")