Search Results

  • Article
    Citation - WoS: 6
    Citation - Scopus: 10
    Beyond Rouge: a Comprehensive Evaluation Metric for Abstractive Summarization Leveraging Similarity, Entailment, and Acceptability
    (World Scientific Publ Co Pte Ltd, 2024) Briman, Mohammed Khalid Hilmi; Yıldız, Beytullah
    The vast amount of textual information on the internet has amplified the importance of text summarization models. Abstractive summarization generates original words and sentences that may not exist in the source document being summarized. Such abstractive models may suffer from shortcomings such as poor linguistic acceptability and hallucinations. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a metric commonly used to evaluate abstractive summarization models. However, due to its n-gram-based approach, it ignores several critical linguistic aspects. In this work, we propose Similarity, Entailment, and Acceptability Score (SEAScore), an automatic evaluation metric for evaluating abstractive text summarization models using the power of state-of-the-art pre-trained language models. SEAScore comprises three language models (LMs) that extract meaningful linguistic features from candidate and reference summaries and a weighted sum aggregator that computes an evaluation score. Experimental results show that our LM-based SEAScore metric correlates better with human judgment than standard evaluation metrics such as ROUGE-N and BERTScore.
  • Article
    Citation - WoS: 9
    Citation - Scopus: 11
    Using Deep Learning Approaches for Coloring Silicone Maxillofacial Prostheses: a Comparison of Two Approaches
    (Wolters Kluwer Medknow Publications, 2023) Kurt, Meral; Kurt, Zuhal; Isik, Sahin
    Aim: This study aimed to compare the performance of two deep learning algorithms, attention-based gated recurrent unit (GRU), and the artificial neural networks (ANNs) algorithm for coloring silicone maxillofacial prostheses. Settings and Design: This was an in vitro study. Materials and Methods: A total of 21 silicone samples in different colors were produced with four pigments (white, yellow, red, and blue). The color of the samples was measured with a spectrophotometer, then the L*, a*, and b* values were recorded. The relationship between the L*, a*, and b* values of each sample and the amount of each pigment in the compound of the same sample was used as the training dataset, entered into each algorithm, and the prediction models were obtained. While generating the prediction model for each sample, the data of the corresponding sample assigned as the target color were excluded. L*, a*, and b* values of each target sample were entered into the obtained models separately, and recipes indicating the ratios for mixing the four pigments were predicted. The mean absolute error (MAE) and root mean square error (RMSE) values between the original recipe used in the production of each silicone and the recipe created by both prediction models for the same silicone were calculated. Statistical Analysis Used: Data were analyzed with the Student t-test (alpha=0.05). Results: The mean RMSE values and MAE values for the ANN algorithm (0.029 ± 0.0152 and 0.045 ± 0.0235, respectively) were found significantly higher than the attention-based GRU model (0.001 ± 0.0005 and 0.002 ± 0.0008, respectively) (P < 0.001). Conclusions: Attention-based GRU model provided better performance than the ANN algorithm with respect to the MAE and RMSE values.
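
The second abstract compares each predicted pigment recipe with the original one using MAE and RMSE. As a rough illustration of those two error measures only, the sketch below uses their standard textbook definitions. The function names and the recipe values are hypothetical and are not taken from the study; how the authors aggregated errors across pigments and samples is not stated in the abstract.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical pigment ratios (white, yellow, red, blue); NOT data from the study.
original_recipe = [0.50, 0.30, 0.15, 0.05]
predicted_recipe = [0.48, 0.31, 0.16, 0.05]

print(f"MAE:  {mae(original_recipe, predicted_recipe):.4f}")
print(f"RMSE: {rmse(original_recipe, predicted_recipe):.4f}")
```

Under these definitions, the RMSE of a single recipe comparison is never smaller than its MAE, because squaring gives larger errors more weight.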