How to prepare a dataset to train the "Quality Scorer" classifier? #449
Unanswered
kdcyberdude
asked this question in
Q&A
Basically, "Quality Scorer" is a fastText classifier trained to assign high scores to pages that resemble "high quality" content such as Wikipedia pages and books. "Document Coherence Scorer" assigns high scores to pages whose paragraphs are more "consistent", based on the cosine similarity of their embeddings.
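To make that a bit more concrete, here is a rough sketch of both ideas. It is not the project's actual code. The label names `hq`/`lq`, the hyperparameters, the helper names, the bag-of-words `toy_embed` stand-in, and the choice to average similarity over consecutive paragraphs are all assumptions made for illustration. Only `fasttext.train_supervised` and the `__label__` line format come from the fastText library itself.

```python
import math
import re
from collections import Counter


def to_fasttext_line(text, label):
    """Format one document as a fastText supervised-training line."""
    # fastText reads one example per line, so collapse all whitespace,
    # including newlines, into single spaces.
    flat = re.sub(r"\s+", " ", text).strip()
    return f"__label__{label} {flat}"


def write_training_file(high_quality, low_quality, path):
    """Write positives (e.g. Wikipedia, books) and negatives (e.g. random
    web pages) to one file. Label names are hypothetical."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in high_quality:
            f.write(to_fasttext_line(doc, "hq") + "\n")
        for doc in low_quality:
            f.write(to_fasttext_line(doc, "lq") + "\n")


def train_quality_scorer(path):
    """Train a classifier on the file above; needs `pip install fasttext`."""
    import fasttext  # lazy import so the rest of the sketch runs without it
    # Hyperparameters are placeholders, not the values used by the project.
    return fasttext.train_supervised(input=path, epoch=5, wordNgrams=2)


def toy_embed(paragraph):
    """Stand-in embedding (bag-of-words counts); swap in a real encoder."""
    return Counter(re.findall(r"\w+", paragraph.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence_score(document, embed=toy_embed):
    """Mean cosine similarity between consecutive paragraph embeddings."""
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return 1.0  # assumption: a single paragraph counts as coherent
    vecs = [embed(p) for p in paragraphs]
    sims = [cosine(u, v) for u, v in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims)
```

With a trained model, `model.predict(flat_text)` returns the predicted labels with their probabilities, and the probability assigned to `__label__hq` could serve as the quality score. The input text must not contain newlines, which is why the helper flattens whitespace first.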
I want to know the implementation details of the "Quality Scorer" and "Document Coherence Scorer" filters.