NULL pointer on using bam_itr_next #1179

@torbjoernk

After updating from htslib v1.9 to v1.11, we get regular crashes when calling bam_itr_next. We always use indexed BAM files:

htsFile *fp_in = hts_open(bam_file,"rb");
hts_idx_t *idx_in = bam_index_load(bam_file);
hts_itr_t *iter = sam_itr_queryi(idx_in, tid_t, beg_t, end_t);
bam_itr_next(fp_in, iter, b);

This now crashes because fp is NULL at sam.c:958. The call stack is:

sam.h:1083

#define bam_itr_next(htsfp, itr, r) hts_itr_next((htsfp)->fp.bgzf, (itr), (r), 0)

hts.c:3320-3335

int hts_itr_next(BGZF *fp, hts_itr_t *iter, void *r, void *data)
{
    int ret, tid;
    hts_pos_t beg, end;
    if (iter == NULL || iter->finished) return -1;
    if (iter->read_rest) {
        if (iter->curr_off) { // seek to the start
            if (bgzf_seek(fp, iter->curr_off, SEEK_SET) < 0) {
                hts_log_error("Failed to seek to offset %"PRIu64"%s%s",
                              iter->curr_off,
                              errno ? ": " : "", strerror(errno));
                return -2;
            }
            iter->curr_off = 0; // only seek once
        }
        ret = iter->readrec(fp, data, r, &tid, &beg, &end);

Via

sam.c:1158-1166

hts_itr_t *sam_itr_queryi(const hts_idx_t *idx, int tid, hts_pos_t beg, hts_pos_t end)
{
    const hts_cram_idx_t *cidx = (const hts_cram_idx_t *) idx;
    if (idx == NULL)
        return hts_itr_query(NULL, tid, beg, end, sam_readrec_rest);
    else if (cidx->fmt == HTS_FMT_CRAI)
        return cram_itr_query(idx, tid, beg, end, sam_readrec);
    else
        return hts_itr_query(idx, tid, beg, end, sam_readrec);

iter->readrec (a function pointer of type hts_readrec_func) is set to sam_readrec:

sam.c:954-958

static int sam_readrec(BGZF *ignored, void *fpv, void *bv, int *tid, hts_pos_t *beg, hts_pos_t *end)
{
    htsFile *fp = (htsFile *)fpv;
    bam1_t *b = bv;
    fp->line.l = 0;

We therefore suspect that the bam_itr_next macro is defined incorrectly: it passes 0 as the data parameter of hts_itr_next, and sam_readrec then casts that parameter to htsFile * and dereferences it.
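
To make the suspected mechanism concrete, here is a minimal self-contained sketch that does not use htslib at all. Every name in it (mock_file, mock_itr, mock_readrec, mock_itr_next, MOCK_BAM_ITR_NEXT, MOCK_ITR_NEXT_WITH_DATA) is a hypothetical stand-in that only mirrors the shape of the excerpts above. Unlike the real sam_readrec, the mock callback checks for NULL and returns an error, so the failure can be observed without crashing.

```c
#include <stddef.h>

/* Hypothetical stand-ins for htsFile and hts_itr_t; NOT the real htslib types. */
typedef struct { int line_len; } mock_file;
typedef int (*mock_readrec_func)(void *ignored, void *data, int *rec);
typedef struct { mock_readrec_func readrec; } mock_itr;

/* Same shape as sam_readrec above: the first argument is ignored and the
 * file handle is taken from the data argument. */
static int mock_readrec(void *ignored, void *data, int *rec)
{
    mock_file *fp = (mock_file *)data;
    (void)ignored;
    if (fp == NULL)
        return -2;        /* assumption: the excerpt above has no such check */
    fp->line_len = 0;     /* corresponds to fp->line.l = 0 */
    *rec = 42;            /* pretend a record was read */
    return 0;
}

/* Same shape as hts_itr_next: forwards data unchanged to the callback. */
static int mock_itr_next(void *bgzf, mock_itr *iter, int *rec, void *data)
{
    if (iter == NULL)
        return -1;
    return iter->readrec(bgzf, data, rec);
}

/* Mirrors the bam_itr_next macro quoted above: data is hard-coded to 0. */
#define MOCK_BAM_ITR_NEXT(fp, itr, r) mock_itr_next((void *)(fp), (itr), (r), 0)

/* Hypothetical variant that passes the file handle through as data. */
#define MOCK_ITR_NEXT_WITH_DATA(fp, itr, r) mock_itr_next((void *)(fp), (itr), (r), (fp))
```

With an iterator whose readrec is mock_readrec, MOCK_BAM_ITR_NEXT returns -2 because the callback receives a NULL handle, while MOCK_ITR_NEXT_WITH_DATA succeeds. In the real code the NULL handle would be dereferenced instead, which matches the crash we see.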

Question to the maintainers: are we (read: the users of htslib) supposed to use the generic sam_* API from now on, instead of the older bam_* API?
