Releases: nanoporetech/modkit
v0.3.1rc1
v0.3.0
Fixes
- [validate] Fix bug where observations were not being balanced correctly.
- [dmr] Revert noodles to 0.50.0, partially fixes #178.
Adds
- [find-motifs] Add command to find motif sequences that are enriched for base modification.
- [entropy] Add command to calculate methylation entropy.
- [dmr] Add --segment option to categorize modified positions into contiguous "same" and "different" groups.
- [pileup] Add --header option to emit a column header.
Deprecations
- [pileup] Change output to always be all-tab delimited; the next version will error with the --only-tabs flag.
v0.2.8-rc1
v0.2.7
Fixes
- [dmr] Header was incorrect with multiple samples
- [pileup] Improve performance when using --include-bed, only process contigs in the BED file.
- [dmr, single-site] When using multiple samples, don't fail a position when one or more samples doesn't have a modification call at that position.
- [extract] Expose queue size to reduce memory usage with long reads.
- [validate] Report number of calls filtered out with thresholds.
v0.2.6
Fixes
- [dmr, single-site] Don't require that there are equal numbers of samples for single site DMR with multiple samples. Fixes #140.
- [dmr, pairwise, region] Protect when zero bedmethyl records are found for a region, fixes #146.
Adds
- [validate] Adds on-the-fly filtering of reads by alignment identity and/or alignment length.
v0.2.5
Fixes
- [extract] Only emit mapped reads when --region is provided, but still emit unmapped bases in those reads unless --mapped-only is passed.
- [extract] Performance improvement due to better tracking of interval boundaries.
- [repair] Updates the MN tag on repaired records.
Adds
- [dmr, single-site] Refactor dmr pair without regions (i.e. single site analysis) to increase performance.
- [dmr, single-site] Add estimated MAP-based p-value to output.
- [all] Allows BED3 input for all options that use --include-bed. Strand will be assumed to be BOTH (equivalent to '.').
- [extract] Increases the kmer size limit to 50.
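The BED3 behaviour described above can be illustrated with a minimal Python sketch. This is not modkit's implementation; the `BedInterval` type and `parse_bed_line` function are hypothetical names used only to show how a three-column record might fall back to the '.' (both strands) value when no strand column is present.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    """Hypothetical container for one BED record."""
    chrom: str
    start: int
    end: int
    strand: str  # "+", "-", or "." meaning both strands


def parse_bed_line(line: str) -> BedInterval:
    """Parse a tab-delimited BED line, defaulting strand to "." for BED3."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(fields)}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # BED6 carries the strand in column 6; BED3 has no strand column,
    # so treat the interval as covering both strands.
    strand = fields[5] if len(fields) >= 6 else "."
    if strand not in {"+", "-", "."}:
        raise ValueError(f"unexpected strand value: {strand!r}")
    return BedInterval(chrom, start, end, strand)
```

For example, `parse_bed_line("chr1\t100\t200")` yields an interval with strand `"."`, while a six-column line keeps whatever strand it declares.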
v0.2.5-rc2
Fixes
- [all] Reads with entirely implicit canonical calls are no longer skipped for "modbase info empty" or similar.
Adds
- Adds the --no-implicit-calls flag to modkit update-tags so that, when changing from the "probability of modification" modes (".", and "") to explicit mode (?), implicit canonical calls can be skipped. This is important, for example, if the modification calling model did not have the ? mode available.
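For readers unfamiliar with the mode flags mentioned above, here is a rough Python sketch of how the flag can be read from an MM tag value. It assumes the layout in the SAM optional-fields specification, where each entry is a base, a strand, one or more modification codes, and an optional "." or "?" flag. The function name `mm_modes` and the "implicit"/"explicit" labels are illustrative only and are not part of modkit.

```python
import re
from typing import Dict

# Entry prefix: canonical base, strand, modification code(s), optional flag.
_MM_ENTRY = re.compile(r"^([ACGTUN])([+-])([a-z]+|[0-9]+)([.?]?)")


def mm_modes(mm_value: str) -> Dict[str, str]:
    """Map each modification entry in an MM tag value to its mode.

    "?" means bases not listed have unknown status (explicit mode);
    "." or no flag means bases not listed are assumed unmodified,
    i.e. they are implicit canonical calls.
    """
    modes: Dict[str, str] = {}
    for entry in mm_value.split(";"):
        if not entry:
            continue  # a trailing ";" leaves an empty final piece
        match = _MM_ENTRY.match(entry)
        if match is None:
            raise ValueError(f"unrecognised MM entry: {entry!r}")
        base, strand, codes, flag = match.groups()
        modes[f"{base}{strand}{codes}"] = "explicit" if flag == "?" else "implicit"
    return modes
```

Calling `mm_modes("C+m?,5,12,0;A+a.,3;")` would report the `C+m` entry as explicit and the `A+a` entry as implicit.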
v0.2.5-rc1
Adds
- modkit validate sub-command for rigorous testing of modified base calling models when ground truth labels are known.
Fixes
- [all] Improve performance when using commands on transcriptome reference (or any reference with many sequences less than ~100kb).
v0.2.4
Adds
- [extract, adjust-mods, update-tags, call-mods] Parse MN tag in order to use secondary and supplementary alignments.
Fixes
- [all] Improve performance slightly when using short and frequent motifs with --motif option.
v0.2.3
Adds
- [dmr, multi] Allow site-level scoring by omitting the --regions argument. Sites will be collected from the input bedMethyl files.
- [dmr] Friendlier handling of missing files and when regions aren't found in the bedMethyl input files.
- [dmr] Allow filtering on valid coverage without changing input.
- [extract] Output "read calls" extract table, a TSV with the base modification calls for every modified position in every read, using the same thresholding algorithm as pileup.
- [extract] Allow filtering of calls by reference motif (like in pileup) as well as BED regions (and exclude regions).
Fixes
- [extract] Improve performance, especially on longer reads.
- [extract] Improve performance with long reads (actually a bug fix)
- [extract] num_soft_clipped_start and num_soft_clipped_end were incorrect on some reverse-mapped reads