Skip to content

Conversation

@liaoxin01
Copy link
Contributor

@liaoxin01 liaoxin01 commented Mar 7, 2023

Proposed changes

Issue Number: close #xxx

Problem summary

void CsvReader::_split_line(const Slice& line) {
    _split_values.clear();
    if (_file_format_type == TFileFormatType::FORMAT_PROTO) {
        PDataRow** ptr = reinterpret_cast<PDataRow**>(line.data);
        PDataRow* row = *ptr;
        for (const PDataColumn& col : (row)->col()) {
            int len = col.value().size();
            uint8_t* buf = new uint8_t[len];
            memcpy(buf, col.value().c_str(), len);
            _split_values.emplace_back(buf, len);
        }
        delete row;
        delete[] ptr;
    } 
 ...
}
PInternalServiceImpl::send_data(google::protobuf::RpcController* controller,
                                     const PSendDataRequest* request, PSendDataResult* response,
                                     google::protobuf::Closure* done) {
    brpc::ClosureGuard closure_guard(done);
    TUniqueId fragment_instance_id;
    fragment_instance_id.hi = request->fragment_instance_id().hi();
    fragment_instance_id.lo = request->fragment_instance_id().lo();
    auto pipe = _exec_env->fragment_mgr()->get_pipe(fragment_instance_id);
    if (pipe == nullptr) {
        response->mutable_status()->set_status_code(1);
        response->mutable_status()->add_error_msgs("pipe is null");
    } else {
        for (int i = 0; i < request->data_size(); ++i) {
            PDataRow* row = new PDataRow();
            row->CopyFrom(request->data(i));
            pipe->append_and_flush(reinterpret_cast<char*>(&row), sizeof(row),
                                   sizeof(row) + row->ByteSizeLong());
        }
        response->mutable_status()->set_status_code(0);
    }
}

There are two problems when using begin, insert into, and commit operations.

  1. The memory of buf(uint8_t* buf = new uint8_t[len]) in _split_line function didn't be released when clear _split_values.
  2. The memory of PDataRow may leak when the load fails. The memory of row(PDataRow* row = new PDataRow()) in the send_data function can't be released when some error occurs.

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 7, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2023

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2023

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 059a021 into apache:branch-1.2-lts Mar 7, 2023
@liaoxin01 liaoxin01 deleted the fix_mem_leak_1.2 branch February 6, 2024 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. area/vectorization dev/1.2.3-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants