verBERT: Automating Brazilian Case Law Document Multi-label Categorization Using BERT

11 Mar 2022  ·  Felipe R. Serras, Marcelo Finger ·

In this work, we carried out a study about the use of attention-based algorithms to automate the categorization of Brazilian case law documents. We used data from the Kollemata Project to produce two distinct datasets with adequate class systems. Then, we implemented a multi-class and multi-label version of BERT and fine-tuned different BERT models with the produced datasets. We evaluated several metrics, adopting the micro-averaged F1-Score as our main metric for which we obtained a performance value of F1-micro=0.72 corresponding to gains of 30 percent points over the tested statistical baseline. In this work, we carried out a study about the use of attention-based algorithms to automate the categorization of Brazilian case law documents. We used data from the \textit{Kollemata} Project to produce two distinct datasets with adequate class systems. Then, we implemented a multi-class and multi-label version of BERT and fine-tuned different BERT models with the produced datasets. We evaluated several metrics, adopting the micro-averaged F1-Score as our main metric for which we obtained a performance value of $\langle \mathcal{F}_1 \rangle_{micro}=0.72$ corresponding to gains of 30 percent points over the tested statistical baseline.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods