Publication details

Topic Modelling of the Czech Supreme Court Decisions



Year of publication 2020
Type Appeared in Conference without Proceedings
MU Faculty or unit Faculty of Law

Description The Czech Supreme Court produces several thousand decisions per year. These decisions are published as almost unprocessed full text with only minimal metadata (date of the decision, docket number), which makes case law research very time-consuming. New automatic methods of processing court decisions therefore need to be developed so that relevant case law can be retrieved more efficiently. Topic modelling methods have the potential to cluster large numbers of documents automatically or to provide new categories of relevant metadata for these documents. In this paper, two topic modelling methods, latent Dirichlet allocation and non-negative matrix factorization, are applied to a corpus of Czech Supreme Court decisions. For each method, several models are trained and compared by their coherence scores in order to find the best number of topics. The authors then perform a manual qualitative analysis of the most coherent models.
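The comparison of models by coherence score can be illustrated with a minimal sketch. The description does not say which coherence measure the authors used, so the UMass measure below is an assumption, and the tokenized documents and candidate topics are hypothetical toy data rather than output of the actual latent Dirichlet allocation or non-negative matrix factorization models. In practice the top words of each topic would come from the trained models.

```python
import math
from itertools import combinations

def umass_coherence(topic_words, documents):
    """UMass coherence of one topic (an assumed measure, not confirmed by the source).

    Sums log((D(higher, lower) + 1) / D(higher)) over all word pairs, where
    `higher` is ranked above `lower` in the topic and D counts documents.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for higher, lower in combinations(topic_words, 2):
        denom = doc_freq(higher)
        if denom == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(higher, lower) + 1) / denom)
    return score

def mean_coherence(topics, documents):
    """Average coherence over all topics of one model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)

def best_model(candidates, documents):
    """Return the number of topics whose model has the highest mean coherence."""
    return max(candidates, key=lambda k: mean_coherence(candidates[k], documents))

# Hypothetical toy corpus: each document is a list of tokens.
documents = [
    ["court", "appeal", "damages"],
    ["court", "appeal", "contract"],
    ["sentence", "offence", "court"],
    ["sentence", "offence", "prison"],
]

# Hypothetical top words per topic for two candidate models (2 and 3 topics).
candidates = {
    2: [["court", "appeal"], ["sentence", "offence"]],
    3: [["court", "prison"], ["appeal", "offence"], ["damages", "sentence"]],
}

if __name__ == "__main__":
    for k, topics in candidates.items():
        print(k, round(mean_coherence(topics, documents), 3))
    print("best number of topics:", best_model(candidates, documents))
```

On this toy data the two-topic model scores higher because its top words co-occur in the same documents, while the three-topic model pairs words that never appear together. A coherence score of this kind only narrows down the candidates; it does not replace the manual qualitative analysis the description mentions.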
