A novel text classification problem and its solution

Zadrożny, Sławomir; Kacprzyk, Janusz; Gajewski, Marek; Wysocki, Maciej

Journal

Czasopismo Techniczne

Languages of publication

PL

Abstracts

PL

A new text categorization problem is introduced. As in the classical problem, there is a set of documents and a set of categories. However, in addition to being assigned to a category, each document belongs to a sequence of documents, referred to as a case, and all documents in the same case are assumed to belong to the same category. An example is a set of news articles whose categories may be sport, politics, entertainment, etc. Within each category there are cases, i.e., sequences of documents describing, for example, the evolution of some event. The problem considered is how to assign a document to the proper category and to the proper case within that category. In the paper we formalize the problem and discuss two approaches to its solution.

Keywords

PL

text categorization, sequences of documents, sequence mining, hidden Markov models

Dates

online

2015-04-21

Identifiers

YADDA identifier

bwmeta1.element.ojs-nameId-6e3e8ea9-5a94-37a7-828b-e6cd5da23db6-year-2015-article-1732