Integrating natural language processing machines to optimize the classification of legal texts with different linguistic patterns
Keywords:
Classifier, Law, Machine Ensemble, Natural Language Processing
Abstract
This study proposes and evaluates a machine learning ensemble strategy to improve the accuracy of classifying legal issues in three areas of Brazilian law: Labor Law, Family Law, and Consumer Law. The approach combines two natural language processing expert systems, trained on 'popular' and 'non-popular' language texts, with a classifier that distinguishes between the two kinds of text. The ensemble classifier achieved an overall accuracy of 96%, significantly outperforming the individual expert systems. However, the strategy increases computational cost, which should be considered when deciding whether to deploy such a system.
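The routing idea behind such an ensemble can be illustrated with a minimal sketch. It assumes one expert per language style and a gating classifier that selects between them; the abstract does not spell out this wiring, so it is an assumption. The keyword-based classifiers, the keyword lists, and every name below (`make_keyword_classifier`, `style_classifier`, `popular_expert`, `formal_expert`, `ensemble_classify`) are hypothetical stand-ins for illustration only, not the trained models evaluated in the study.

```python
from typing import Callable, Dict

# A classifier here is any function mapping a text to a label.
Classifier = Callable[[str], str]


def make_keyword_classifier(keywords: Dict[str, str], default: str) -> Classifier:
    """Build a toy classifier that returns the label of the first keyword found.

    Hypothetical stand-in for a trained NLP model.
    """
    def classify(text: str) -> str:
        lowered = text.lower()
        for keyword, label in keywords.items():
            if keyword in lowered:
                return label
        return default
    return classify


# Gating classifier (assumed): decides which language style a text uses.
style_classifier = make_keyword_classifier(
    {"hereby": "non-popular", "plaintiff": "non-popular"},
    default="popular",
)

# Expert for texts in 'popular' language (toy keyword lists, invented here).
popular_expert = make_keyword_classifier(
    {"boss": "Labor Law", "divorce": "Family Law", "refund": "Consumer Law"},
    default="Consumer Law",
)

# Expert for texts in 'non-popular' (formal legal) language.
formal_expert = make_keyword_classifier(
    {"employer": "Labor Law", "custody": "Family Law", "supplier": "Consumer Law"},
    default="Consumer Law",
)


def ensemble_classify(text: str) -> str:
    """Route the text to the expert that matches its detected language style."""
    style = style_classifier(text)
    expert = popular_expert if style == "popular" else formal_expert
    return expert(text)


print(ensemble_classify("My boss fired me without notice"))
print(ensemble_classify("The plaintiff hereby requests custody of the minor"))
```

In this sketch every prediction runs two models in sequence, the gating classifier and then one expert, which is one plausible source of the added computational cost noted in the abstract.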