UnicodeDecodeError while reading some commits #58

Closed
@gotec

Describe the bug
When reading certain commits (likely ones containing non-UTF-8 characters), the commit object breaks: a UnicodeDecodeError is raised when trying to access most of its properties. For details, see the example below.
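
To illustrate the suspected cause, here is a minimal sketch using only the Python standard library. It does not reproduce PyDriller's or GitPython's internals. It only shows how strictly decoding non-UTF-8 bytes as UTF-8 fails, and how a lenient error handler avoids the exception. The author name and the helper names `decode_strict` and `decode_lenient` are hypothetical, and the assumption that the affected commit stores Latin-1 text is unverified.

```python
# Hypothetical commit metadata: an author name stored as Latin-1, not UTF-8.
# (Assumption for illustration; the real commit's encoding is not confirmed.)
raw_author = "Jörg Müller".encode("latin-1")


def decode_strict(data: bytes) -> str:
    """Decode as UTF-8; raises UnicodeDecodeError on invalid bytes."""
    return data.decode("utf-8")


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace")


try:
    decode_strict(raw_author)
    strict_failed = False
except UnicodeDecodeError:
    # Bytes 0xF6 and 0xFC are not valid UTF-8 start bytes.
    strict_failed = True

lenient = decode_lenient(raw_author)
print(strict_failed, lenient)
```

If the cause is indeed an encoding mismatch, a lenient handler such as `errors="replace"` trades exact text for robustness: the property access would succeed, but the undecodable characters are lost.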

To Reproduce
For example, commit 13e644bb36a0b1f3ef0c2091ab648978d18f369d in https://github.com/gentoo/gentoo triggers the error:

import pydriller
import git

repo_url = 'https://github.com/gentoo/gentoo.git'
local_directory = '/tmp'
repo_name = 'gentoo'

git.Git(local_directory).clone(repo_url)

git_repo = pydriller.GitRepository(local_directory + '/' + repo_name)
commit = git_repo.get_commit('13e644bb36a0b1f3ef0c2091ab648978d18f369d')

commit.author_date  # accessing this property raises UnicodeDecodeError

OS Version:
Ubuntu 19.04
