Comparing CNN-LSTM and BERTimbau: an analysis of AI models in legal document classification

Lucas de França Carneiro Agra; Guilherme Neves Acorsi; Oberdan Rocha Pinheiro

Authors

Lucas de França Carneiro Agra SENAI CIMATEC https://orcid.org/0009-0003-2963-5931
Guilherme Neves Acorsi SENAI CIMATEC https://orcid.org/0009-0009-1369-9459
Oberdan Rocha Pinheiro SENAI CIMATEC

Keywords:

Convolutional Neural Network, Long Short-Term Memory, BERTimbau, Judicial

Abstract

This study addresses the need of the Attorney General's Office of the State of Bahia (PGE) to automate the classification of initial petitions, a task made harder by the lack of standardized file naming. The approach currently in use relies on regular expressions (regex) and reaches an average accuracy of 80%. To overcome its limitations, the work proposes Deep Learning models and compares two of them: a hybrid model that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network, and the BERTimbau model. The goal is both to identify these essential documents more accurately and to improve procedural efficiency through automation. In preliminary results, the CNN-LSTM model achieved an accuracy of 99.34% and BERTimbau 98.51%, which indicates that both techniques have strong potential to optimize the judicial workflow in the digital era.
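To make the regex baseline concrete, the sketch below shows how a filename-based rule and its accuracy could be expressed. It is only an illustration: the pattern, the function names, and the sample filenames are hypothetical, since the abstract does not give the expressions actually used by the PGE, and the toy data does not reproduce any reported figure.

```python
import re

# Hypothetical pattern: the real expressions used by the PGE are not given in the abstract.
INITIAL_PETITION = re.compile(r"peti[cç][aã]o[\s_\-]*inicial", re.IGNORECASE)


def is_initial_petition(filename: str) -> bool:
    """Return True when the filename looks like an initial petition."""
    return INITIAL_PETITION.search(filename) is not None


def accuracy(predictions, labels):
    """Fraction of predictions that equal the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up (filename, is_initial_petition) pairs, for illustration only.
samples = [
    ("Peticao_Inicial_0001.pdf", True),
    ("doc_12345.pdf", True),       # non-standard name: the rule cannot recognize it
    ("procuracao.pdf", False),
    ("contestacao.pdf", False),
]

predictions = [is_initial_petition(name) for name, _ in samples]
labels = [label for _, label in samples]
baseline_accuracy = accuracy(predictions, labels)
```

The second sample shows the failure mode the abstract points to: when a file name carries no recognizable pattern, a rule that looks only at the name has nothing to match, whereas a model that reads the document text does not depend on naming conventions.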


