**Try** to extract all available EBDs by introducing dynamic test parametrization#8

Merged: hf-kklein merged 20 commits into main from more_extraction on Dec 19, 2022

Conversation

@hf-kklein
Contributor

@hf-kklein hf-kklein commented Dec 19, 2022

This PR introduces a unit test that makes it possible to quickly get an overview of what is still stuck and why.
The idea is that this test shows what works and what does not, so that the individual problems found this way can then be addressed one by one (in their own tests).

SUBPASS [100%]
Skipped: Error while scraping 'E_0462': The cell content '' does not belong to a ja/nein cell
Skipped: Error while scraping 'E_0402': EBD Table 'E_0402' was not found.
Skipped: Error while scraping 'E_0405': EBD Table 'E_0405' was not found.
Skipped: Error while scraping 'E_0458': EBD Table 'E_0458' was not found.
Skipped: Error while scraping 'E_0406': EBD Table 'E_0406' was not found.
Skipped: Error while scraping 'E_0452': EBD Table 'E_0452' was not found.
Skipped: Error while scraping 'E_0407': EBD Table 'E_0407' was not found.
Skipped: Error while scraping 'E_0503': EBD Table 'E_0503' was not found.
Skipped: Error while scraping 'E_0408': EBD Table 'E_0408' was not found.
Skipped: Error while scraping 'E_0409': EBD Table 'E_0409' was not found.
Skipped: Error while scraping 'E_0410': EBD Table 'E_0410' was not found.
Skipped: Error while scraping 'E_0411': EBD Table 'E_0411' was not found.
Skipped: Error while scraping 'E_0415': EBD Table 'E_0415' was not found.
Skipped: Error while scraping 'E_0412': EBD Table 'E_0412' was not found.
Skipped: Error while scraping 'E_0416': EBD Table 'E_0416' was not found.
Skipped: Error while scraping 'E_0453': The cell content 'a97' does not belong to a ja/nein cell
Skipped: Error while scraping 'E_0460': EBD Table 'E_0460' was not found.
Skipped: Error while scraping 'E_0418': EBD Table 'E_0418' was not found.
Skipped: Error while scraping 'E_0419': EBD Table 'E_0419' was not found.
Skipped: Error while scraping 'E_0420': EBD Table 'E_0420' was not found.
Skipped: Error while scraping 'E_0421': EBD Table 'E_0421' was not found.
Skipped: Error while scraping 'E_0423': EBD Table 'E_0423' was not found.
Skipped: Error while scraping 'E_0422': EBD Table 'E_0422' was not found.
Skipped: Error while scraping 'E_0413': EBD Table 'E_0413' was not found.
Skipped: Error while scraping 'E_0414': EBD Table 'E_0414' was not found.
Skipped: Error while scraping 'E_0464': EBD Table 'E_0464' was not found.
Skipped: Error while scraping 'E_0424': EBD Table 'E_0424' was not found.
Skipped: Error while scraping 'E_0425': EBD Table 'E_0425' was not found.
Skipped: Error while scraping 'E_0465': EBD Table 'E_0465' was not found.
Skipped: Error while scraping 'E_0426': EBD Table 'E_0426' was not found.
Skipped: Error while scraping 'E_0427': EBD Table 'E_0427' was not found.
Skipped: Error while scraping 'E_0428': EBD Table 'E_0428' was not found.
Skipped: Error while scraping 'E_0466': EBD Table 'E_0466' was not found.
Skipped: Error while scraping 'E_0429': EBD Table 'E_0429' was not found.
Skipped: Error while scraping 'E_0430': EBD Table 'E_0430' was not found.
Skipped: Error while scraping 'E_0431': EBD Table 'E_0431' was not found.
Skipped: Error while scraping 'E_0432': EBD Table 'E_0432' was not found.
Skipped: Error while scraping 'E_0436': EBD Table 'E_0436' was not found.
Skipped: Error while scraping 'E_0434': EBD Table 'E_0434' was not found.
Skipped: Error while scraping 'E_0435': EBD Table 'E_0435' was not found.
Skipped: Error while scraping 'E_0467': EBD Table 'E_0467' was not found.
Skipped: Error while scraping 'E_0446': EBD Table 'E_0446' was not found.
Skipped: Error while scraping 'E_0447': EBD Table 'E_0447' was not found.
Skipped: Error while scraping 'E_0448': EBD Table 'E_0448' was not found.
Skipped: Error while scraping 'E_0449': EBD Table 'E_0449' was not found.
Skipped: Error while scraping 'E_0455': list index out of range
Skipped: Error while scraping 'E_0454': EBD Table 'E_0454' was not found.
Skipped: Error while scraping 'E_0438': EBD Table 'E_0438' was not found.
Skipped: Error while scraping 'E_0484': EBD Table 'E_0484' was not found.
Skipped: Error while scraping 'E_0493': EBD Table 'E_0493' was not found.
Skipped: Error while scraping 'E_0485': EBD Table 'E_0485' was not found.
Skipped: Error while scraping 'E_0494': EBD Table 'E_0494' was not found.
Skipped: Error while scraping 'E_0480': EBD Table 'E_0480' was not found.
Skipped: Error while scraping 'E_0482': EBD Table 'E_0482' was not found.
Skipped: Error while scraping 'E_0491': EBD Table 'E_0491' was not found.
Skipped: Error while scraping 'E_0463': EBD Table 'E_0463' was not found.
Skipped: Error while scraping 'E_0445': EBD Table 'E_0445' was not found.
Skipped: Error while scraping 'E_0461': EBD Table 'E_0461' was not found.
Skipped: Error while scraping 'E_0048': EBD Table 'E_0048' was not found.
Skipped: Error while scraping 'E_0046': EBD Table 'E_0046' was not found.
Skipped: Error while scraping 'E_0047': list index out of range
Skipped: Error while scraping 'E_0049': list index out of range
Skipped: Error while scraping 'E_0005': EBD Table 'E_0005' was not found.
Skipped: Error while scraping 'E_0013': EBD Table 'E_0013' was not found.
Skipped: Error while scraping 'E_0014': list index out of range
Skipped: Error while scraping 'E_0004': list index out of range
Skipped: Error while scraping 'E_0051': EBD Table 'E_0051' was not found.
Skipped: Error while scraping 'E_0016': EBD Table 'E_0016' was not found.
Skipped: Error while scraping 'E_0017': list index out of range
Skipped: Error while scraping 'E_0052': list index out of range
Skipped: Error while scraping 'E_0055': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0069': EBD Table 'E_0069' was not found.
Skipped: Error while scraping 'E_0058': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0045': EBD Table 'E_0045' was not found.
Skipped: Error while scraping 'E_0026': list index out of range
Skipped: Error while scraping 'E_0042': list index out of range
Skipped: Error while scraping 'E_0043': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0070': list index out of range
Skipped: Error while scraping 'E_0060': The cell content '--' does not belong to a ja/nein cell
Skipped: Error while scraping 'E_0061': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0031': EBD Table 'E_0031' was not found.
Skipped: Error while scraping 'E_0032': EBD Table 'E_0032' was not found.
Skipped: Error while scraping 'E_0033': EBD Table 'E_0033' was not found.
Skipped: Error while scraping 'E_0094': EBD Table 'E_0094' was not found.
Skipped: Error while scraping 'E_0095': EBD Table 'E_0095' was not found.
Skipped: Error while scraping 'E_0096': list index out of range
Skipped: Error while scraping 'E_0097': list index out of range
Skipped: Error while scraping 'E_0077': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0084': ("'result_code' must match regex '^[A-Z]\\d+$' ('A**' doesn't)", Attribute(name='result_code', default=NOTHING, validator=<optional validator for <matches_re validator for pattern re.compile('^[A-Z]\d+$')> or None>, repr=True, eq=True, eq_key=None, order=True, order_key=None, hash=None, init=True, metadata=mappingproxy({}), type=typing.Optional[str], converter=None, kw_only=True, inherited=False, on_setattr=None), re.compile('^[A-Z]\d+$'), 'A**')
Skipped: Error while scraping 'E_0303': EBD Table 'E_0303' was not found.
Skipped: Error while scraping 'E_0305': EBD Table 'E_0305' was not found.
Skipped: Error while scraping 'E_0300': EBD Table 'E_0300' was not found.
Skipped: Error while scraping 'E_0301': EBD Table 'E_0301' was not found.
Skipped: Error while scraping 'E_0304': EBD Table 'E_0304' was not found.
Skipped: Error while scraping 'E_0306': EBD Table 'E_0306' was not found.
Skipped: Error while scraping 'E_0200': EBD Table 'E_0200' was not found.
Skipped: Error while scraping 'E_0201': EBD Table 'E_0201' was not found.
Skipped: Error while scraping 'E_0232': EBD Table 'E_0232' was not found.
Skipped: Error while scraping 'E_0202': EBD Table 'E_0202' was not found.
Skipped: Error while scraping 'E_0203': EBD Table 'E_0203' was not found.
Skipped: Error while scraping 'E_0240': EBD Table 'E_0240' was not found.
Skipped: Error while scraping 'E_0204': cannot access local variable 'role' where it is not associated with a value
Skipped: Error while scraping 'E_0245': EBD Table 'E_0245' was not found.
Skipped: Error while scraping 'E_0246': EBD Table 'E_0246' was not found.
Skipped: Error while scraping 'E_0247': EBD Table 'E_0247' was not found.
Skipped: Error while scraping 'E_0241': EBD Table 'E_0241' was not found.
Skipped: Error while scraping 'E_0210': EBD Table 'E_0210' was not found.
Skipped: Error while scraping 'E_0211': EBD Table 'E_0211' was not found.
Skipped: Error while scraping 'E_0243': EBD Table 'E_0243' was not found.
Skipped: Error while scraping 'E_0259': EBD Table 'E_0259' was not found.
Skipped: Error while scraping 'E_0260': EBD Table 'E_0260' was not found.
Skipped: Error while scraping 'E_0261': EBD Table 'E_0261' was not found.
Skipped: Error while scraping 'E_0217': EBD Table 'E_0217' was not found.
Skipped: Error while scraping 'E_0248': EBD Table 'E_0248' was not found.
Skipped: Error while scraping 'E_0219': EBD Table 'E_0219' was not found.
Skipped: Error while scraping 'E_0220': EBD Table 'E_0220' was not found.
Skipped: Error while scraping 'E_0221': EBD Table 'E_0221' was not found.
Skipped: Error while scraping 'E_0222': EBD Table 'E_0222' was not found.
Skipped: Error while scraping 'E_0225': EBD Table 'E_0225' was not found.
Skipped: Error while scraping 'E_0226': EBD Table 'E_0226' was not found.
Skipped: Error while scraping 'E_0227': EBD Table 'E_0227' was not found.
Skipped: Error while scraping 'E_0228': EBD Table 'E_0228' was not found.
Skipped: Error while scraping 'E_0229': EBD Table 'E_0229' was not found.
Skipped: Error while scraping 'E_0230': EBD Table 'E_0230' was not found.
Skipped: Error while scraping 'E_0231': EBD Table 'E_0231' was not found.
Skipped: Error while scraping 'E_0251': EBD Table 'E_0251' was not found.
Skipped: Error while scraping 'E_0253': EBD Table 'E_0253' was not found.
Skipped: Error while scraping 'E_0258': EBD Table 'E_0258' was not found.
Skipped: Error while scraping 'E_0254': The cell content 'cluster: ablehnung
es handelte sich bei der bestellung um eine einmalige übermittlung.' does not belong to a ja/nein cell
Skipped: Error while scraping 'E_0803': EBD Table 'E_0803' was not found.
Skipped: Error while scraping 'E_0801': EBD Table 'E_0801' was not found.
Skipped: Error while scraping 'E_0802': EBD Table 'E_0802' was not found.
Skipped: Error while scraping 'E_0902': EBD Table 'E_0902' was not found.
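
To turn output like the above into the overview this PR is after, the skip reasons can be grouped by kind. The following is only a rough sketch: the `sample` lines are copied from the output above, while `PATTERN`, `categorize` and `tally` are hypothetical helper names that are not part of this repository.

```python
import re
from collections import Counter

# A few lines copied verbatim from the skip output above
sample = """\
Skipped: Error while scraping 'E_0462': The cell content '' does not belong to a ja/nein cell
Skipped: Error while scraping 'E_0402': EBD Table 'E_0402' was not found.
Skipped: Error while scraping 'E_0405': EBD Table 'E_0405' was not found.
Skipped: Error while scraping 'E_0455': list index out of range
"""

# Hypothetical pattern: splits a skip line into the EBD key and the reason
PATTERN = re.compile(r"^Skipped: Error while scraping '(?P<key>E_\d+)': (?P<reason>.*)$")


def categorize(reason: str) -> str:
    """Map a raw error message to a coarse category (categories are assumptions)."""
    if "was not found" in reason:
        return "table not found"
    if "ja/nein cell" in reason:
        return "unexpected ja/nein cell content"
    if "list index out of range" in reason:
        return "list index out of range"
    return "other"


def tally(log: str) -> Counter:
    """Count how often each category occurs in the given log text."""
    counts: Counter = Counter()
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:  # continuation lines of multi-line messages are ignored
            counts[categorize(match["reason"])] += 1
    return counts


if __name__ == "__main__":
    for category, count in tally(sample).most_common():
        print(f"{count:3d}  {category}")
```

Each category would then roughly correspond to one follow-up issue instead of one per EBD key.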

@hf-kklein hf-kklein changed the title from "**Try** to extract all available EBDs" to "**Try** to extract all available EBDs by introducing dynamic test parametrization" Dec 19, 2022
Base automatically changed from extract_more_tables to main December 19, 2022 13:53
Comment thread unittests/test_highlevel.py Outdated
Comment on lines +88 to +103
    @pytest.mark.parametrize(
        "get_ebd_keys_and_files",
        [
            pytest.param(
                "ebd20221128.docx",  # this is used as positional argument for the indirect fixture
            ),
        ],
        indirect=["get_ebd_keys_and_files"],
    )
    def test_extraction(self, datafiles, get_ebd_keys_and_files: List[Tuple[str, str]], subtests):
        """
        tests the extraction and conversion without specific assertions
        """
        for ebd_key, filename in get_ebd_keys_and_files:
            # I tried for 1.5h to dynamically create test cases for each entry but the parametrization really f***ed me
            with subtests.test(ebd_key):
Contributor

What kind of magic is happening here? ^^
Is datafiles now also "ebd20221128.docx"? A bit more explanation (for me) of what exactly pytest is doing here would be welcome. And of what subtests is. I assume it is some pytest feature?

Contributor Author

4f9eb00

It is a custom fixture with indirect parametrization (see the very top of the file), combined with a pytest plugin.

                    assert isinstance(actual, EbdTable)
                except Exception as error:
                    error_msg = f"Error while scraping '{ebd_key}': {str(error)}"
                    pytest.skip(error_msg)
Contributor

Is there no way to raise an error in the subtests here instead of a skip?

Contributor

Ah, so you use a skip here so that you can push without having to fix everything perfectly right away? ^^ Then maybe add a TODO comment.

Contributor Author

"raise an error [...] instead of a skip?"

Yes: remove the try/except block :D

Contributor Author

"Ah, so you use a skip here so that you can push without having to fix everything perfectly right away?"

Yes, exactly. That way you do not have to hunt for errors/TODOs in the code or the docx; they are served on a silver platter. I have already opened #9, #10 and #11 because of this. I would never have found those by sheer luck.

Contributor Author

a39e740: now it is also noted in the code.

@hf-kklein hf-kklein enabled auto-merge (squash) December 19, 2022 16:39
@hf-kklein hf-kklein merged commit 3a9cba1 into main Dec 19, 2022
@hf-kklein hf-kklein deleted the more_extraction branch December 19, 2022 16:39