Authors: Bakir, D.; Yildiz, B.; Aktas, M.S.
Subject Area: Software Engineering
Date Available: 2024-07-05
Date Accessioned: 2024-07-05
Publication Year: 2023
ISBN: 979-8-3503-2445-7
DOI: 10.1109/BigData59044.2023.10386689
Scopus ID: 2-s2.0-85184987087
DOI Link: https://doi.org/10.1109/BigData59044.2023.10386689
Handle: https://hdl.handle.net/20.500.14411/4147
Sponsors: Ankura; IEEE Dataport
Abstract: In the complex legal domain, Question Answering (QA) systems are only useful if they can provide correct, context-aware, and logically sound answers. Traditional evaluation methods, which rely on surface-level similarity measures, cannot capture the nuanced accuracy and reasoning required of legal answers, so the evaluation methodology itself must change. To address the limitations of current methods, this study presents a new model-based evaluation metric designed specifically for legal QA systems. We examine the principles such a metric must satisfy, the challenges of deploying it in practice, the selection of suitable technological frameworks, and the design of sound evaluation procedures. We describe a theoretical framework grounded in legal standards and computational linguistics, explain how the metric was developed, and discuss how it can be applied in practice. Our results, drawn from extensive experiments, show that the proposed metric outperforms existing ones: it is more reliable, more accurate, and more useful for evaluating legal QA systems. © 2023 IEEE.
Language: en
Access Rights: info:eu-repo/semantics/closedAccess
Keywords: Large Language Model; Model-Based Evaluation Metric; Natural Language Processing; Question Answering Systems; Transformer Models
Title: Developing and Evaluating a Model-Based Metric for Legal Question Answering Systems
Type: Conference Object
Pages: 2745-2754
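Note: The abstract contrasts model-based evaluation with surface-level similarity measures. The sketch below is an illustrative example only, not the metric proposed in the paper: it compares a simple token-overlap score with a transformer-embedding similarity score for a candidate legal answer against a reference. The model name `all-MiniLM-L6-v2` and the helper functions are assumptions chosen for demonstration.

```python
# Illustrative sketch only: contrasts a surface-overlap score with a
# model-based (embedding similarity) score for a legal QA answer.
# The model choice and helpers are assumptions, not the paper's metric.
from sentence_transformers import SentenceTransformer, util


def token_overlap_score(candidate: str, reference: str) -> float:
    """Surface-level similarity: Jaccard overlap of lowercase tokens."""
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    return len(cand & ref) / max(len(cand | ref), 1)


def model_based_score(candidate: str, reference: str,
                      model_name: str = "all-MiniLM-L6-v2") -> float:
    """Model-based similarity: cosine similarity of transformer embeddings."""
    model = SentenceTransformer(model_name)
    emb = model.encode([candidate, reference], convert_to_tensor=True)
    return float(util.cos_sim(emb[0], emb[1]))


if __name__ == "__main__":
    reference = "The contract is voidable because consent was obtained under duress."
    candidate = "Because one party was coerced into agreeing, the agreement can be rescinded."
    # Token overlap is near zero even though the answers are legally equivalent,
    # while the embedding-based score reflects the shared meaning.
    print(f"token overlap: {token_overlap_score(candidate, reference):.2f}")
    print(f"model-based:   {model_based_score(candidate, reference):.2f}")
```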