Yıldız, Beytullah

Yıldız, Beytullah
Yildiz, B
B., Yildiz
B., Yıldız
Beytullah, Yildiz
Y., Beytullah
Beytullah, Yıldız
Yildiz, Beytullah
Associate Professor Dr. (Doçent Doktor)
  Article
    Citation Count: 6
    Improving word embedding quality with innovative automated approaches to hyperparameters
    (Wiley, 2021) Yildiz, Beytullah; Software Engineering
    Deep learning practices have a great impact in many areas. Big data and significant hardware developments are the main reasons behind deep learning success. Recent advances in deep learning have led to significant improvements in text analysis and classification. Progress in the quality of word representation is an important factor among these improvements. In this study, we aimed to develop word2vec word representation, also called embedding, by automatically optimizing hyperparameters. Minimum word count, vector size, window size, negative sample, and iteration number were used to improve word embedding. We introduce two approaches for setting hyperparameters that are faster than grid search and random search. Word embeddings were created using documents of approximately 300 million words. We measured the quality of word embedding using a deep learning classification model on documents of 10 different classes. It was observed that the optimization of the values of hyperparameters alone increased classification success by 9%. In addition, we demonstrate the benefits of our approaches by comparing the semantic and syntactic relations between word embedding using default and optimized hyperparameters.
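    The abstract names random search as one of the baselines the proposed approaches are compared against. The sketch below shows that baseline only, not the paper's two approaches, over the five hyperparameters the abstract lists. The candidate values, the `toy_evaluate` objective, and the function names are assumptions made for illustration; the dictionary keys follow gensim's `Word2Vec` parameter names, but no embeddings are trained here.

```python
import random

# Hypothetical search space (values are illustrative, not taken from the paper).
# Keys mirror gensim's Word2Vec parameter names for the five hyperparameters.
SEARCH_SPACE = {
    "min_count": [1, 2, 5, 10, 20],      # minimum word count
    "vector_size": [50, 100, 200, 300],  # embedding dimensionality
    "window": [2, 5, 8, 10],             # context window size
    "negative": [5, 10, 15, 20],         # number of negative samples
    "epochs": [5, 10, 20],               # number of training iterations
}


def sample_config(space, rng):
    """Draw one configuration by picking a random value for each hyperparameter."""
    return {name: rng.choice(values) for name, values in space.items()}


def random_search(evaluate, space, trials, seed=0):
    """Evaluate `trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(space, rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_evaluate(config):
    """Stand-in objective (assumption). A real run would train embeddings with
    the configuration and return the downstream classification accuracy."""
    return -abs(config["vector_size"] - 200) - abs(config["window"] - 5)


best_config, best_score = random_search(toy_evaluate, SEARCH_SPACE, trials=50)
```

    Each trial in a real setting requires training embeddings and a classifier, which is why search strategies that need fewer trials than grid or random search matter.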
  Article
    Citation Count: 4
    Optimizing bitmap index encoding for high performance queries
    (Wiley, 2021) Yildiz, Beytullah; Software Engineering
    Many sources such as historical archives, sensor readings, health systems, and machine records produce ever-increasing but often unchanging data. These accumulating data create a need for faster processing. Bitmap index, which can take advantage of multi-core and multiprocessor systems, is designed to process data that increase over time but do not change frequently. It has a well-known advantage, especially in queries on data with low cardinality. However, bitmap index can handle high cardinality data efficiently because it can use its own compression algorithm. Bitmap index has many encoding schemes that affect query processing time. In this study, we developed an algorithm that improves query performance by using optimal encoding among bitmap encodings. With this optimization algorithm, we witnessed up to 40% performance increase in queries made with bitmap indexes created with different encodings. Furthermore, in comparison with a commonly used relational database, we found significant improvements in the number of query operations per second performed on optimized encoded bitmap indexes generated by the introduced algorithm.
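    To make the idea of a bitmap index concrete, the sketch below builds the simplest scheme, an equality-encoded index, using Python integers as bit vectors. It is an illustration only: the column data and function names are hypothetical, and it does not reproduce the paper's encoding-selection algorithm or any compression.

```python
from collections import defaultdict


def build_equality_bitmaps(column):
    """Map each distinct value to an int whose bit i is set when column[i] == value."""
    bitmaps = defaultdict(int)
    for row, value in enumerate(column):
        bitmaps[value] |= 1 << row
    return dict(bitmaps)


def rows_of(bitmap):
    """Decode a bitmap back into an ascending list of row numbers."""
    rows, row = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Two low-cardinality columns of a hypothetical, append-only table.
status = ["open", "closed", "open", "pending", "closed", "open"]
region = ["eu", "eu", "us", "us", "eu", "eu"]

status_index = build_equality_bitmaps(status)
region_index = build_equality_bitmaps(region)

# status == "open" AND region == "eu" becomes one bitwise AND.
open_in_eu = rows_of(status_index["open"] & region_index["eu"])

# status == "open" OR status == "pending" becomes one bitwise OR.
open_or_pending = rows_of(status_index["open"] | status_index["pending"])
```

    Equality encoding needs one bitmap per distinct value, so other encodings trade the number of bitmaps against the number of bitwise operations per query; this is the kind of trade-off an encoding optimizer has to weigh.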
  Article
    Citation Count: 0
    Beyond ROUGE: A Comprehensive Evaluation Metric for Abstractive Summarization Leveraging Similarity, Entailment, and Acceptability
    (World Scientific Publ Co Pte Ltd, 2024) Briman, Mohammed Khalid Hilmi; Yıldız, Beytullah; Software Engineering
    A vast amount of textual information on the internet has amplified the importance of text summarization models. Abstractive summarization generates original words and sentences that may not exist in the source document to be summarized. Such abstractive models may suffer from shortcomings such as linguistic acceptability and hallucinations. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a metric commonly used to evaluate abstractive summarization models. However, due to its n-gram-based approach, it ignores several critical linguistic aspects. In this work, we propose Similarity, Entailment, and Acceptability Score (SEAScore), an automatic evaluation metric for evaluating abstractive text summarization models using the power of state-of-the-art pre-trained language models. SEAScore comprises three language models (LMs) that extract meaningful linguistic features from candidate and reference summaries and a weighted sum aggregator that computes an evaluation score. Experimental results show that our LM-based SEAScore metric correlates better with human judgment than standard evaluation metrics such as ROUGE-N and BERTScore.
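    The weighted sum aggregator mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the three component scores are taken as given numbers in [0, 1] rather than computed by language models, the weights are placeholders rather than the values used in the paper, and the result is normalized by the weight total so it stays in the same range as its inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentScores:
    """Per-summary scores, assumed to lie in [0, 1] (hypothetical container)."""
    similarity: float     # semantic similarity of candidate and reference
    entailment: float     # entailment score between candidate and reference
    acceptability: float  # linguistic acceptability of the candidate


def weighted_sum_score(scores, weights=(1.0, 1.0, 1.0)):
    """Combine the three component scores into one normalized weighted sum."""
    values = (scores.similarity, scores.entailment, scores.acceptability)
    if len(weights) != len(values):
        raise ValueError("expected exactly three weights")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


example = ComponentScores(similarity=0.9, entailment=0.6, acceptability=0.3)
equal_weight_score = weighted_sum_score(example)
similarity_heavy_score = weighted_sum_score(example, weights=(2.0, 1.0, 1.0))
```

    Raising the weight of one component pulls the aggregate toward that component's score, which is how such a metric can be tuned to agree with human judgment.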
  Article
    Citation Count: 0
    Efficient Text Classification with Deep Learning on Imbalanced Data Improved with Better Distribution
    (2022) Yıldız, Beytullah; Software Engineering
    Technological developments and the spread of the internet are causing an exponential increase in the data generated daily. A significant part of this data deluge comes from text data produced by applications such as social media, communication tools, and customer services. Processing this large amount of text data requires automation. Significant successes have recently been achieved in text processing. In particular, text classification performance has become quite satisfactory with deep learning applications. In this study, we propose an innovative data distribution algorithm that reduces the data imbalance problem in order to further increase text classification success. Experimental results show that the algorithm that optimizes the data distribution yields an improvement of approximately 3.5% in classification accuracy and of more than 3 in F1 score.
  Conference Object
    Citation Count: 0
    Developing and Evaluating a Model-Based Metric for Legal Question Answering Systems
    (Institute of Electrical and Electronics Engineers Inc., 2023) Yıldız, Beytullah; Software Engineering
    In the complex domain of law, Question Answering (QA) systems are useful only if they give correct, context-aware, and logically sound answers. Traditional evaluation methods, which rely on superficial similarity measures, cannot capture the nuanced accuracy and reasoning required in legal answers, so evaluation methods need a fundamental change. To address the shortcomings of current methods, this study presents a new model-based evaluation metric designed for legal QA systems. We examine the foundational principles required for such a metric, as well as the challenges of putting it into practice: finding the right technological frameworks and creating sound evaluation methods. We discuss a theoretical framework grounded in legal standards and computational linguistics, how the metric was created, and how it can be applied in practice. Our results, obtained from thorough tests, show that the proposed metric is more reliable, accurate, and useful than existing ones for evaluating legal question answering systems. © 2023 IEEE.
  Conference Object
    Citation Count: 0
    Reinforcement Learning for Intrusion Detection
    (Springer Science and Business Media Deutschland GmbH, 2023) Yıldız, Beytullah; Software Engineering
    Network-based technologies such as cloud computing, web services, and Internet of Things systems are becoming widely used due to their flexibility and preeminence. On the other hand, the exponential proliferation of network-based technologies has exacerbated network security concerns. Intrusions account for a significant share of the security concerns surrounding network-based technologies. Developing a robust intrusion detection system is crucial to solving the intrusion problem and ensuring the secure delivery of network-based technologies and services. In this paper, we propose a novel approach that uses deep reinforcement learning to detect intrusions and make network applications more secure, reliable, and efficient. For the reinforcement learning approach, Deep Q-learning is used alongside a custom-built Gym environment that mimics network attacks and guides the learning process. The NSL-KDD dataset is used to create the reinforcement learning environment to train and evaluate the proposed model. The experimental results show that our proposed reinforcement learning approach outperforms other related solutions in the literature, achieving an accuracy that exceeds 93%. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  Master Thesis
    A comprehensive evaluation metric for abstractive summarization using similarity, entailment, and acceptability
    (2023) Yıldız, Beytullah; Software Engineering
    Automatically generating meaningful summaries from long texts is of great importance in many fields. The emergence of new neural network architectures such as the Transformer has led to the development of many large language models capable of producing high-quality summaries. However, the summaries these models produce bring an important problem with them. Standard automatic evaluation metrics that measure the quality of summarization models, such as ROUGE, fall short of providing a comprehensive evaluation. In this study, we present a new model-based metric called SEAScore, which uses model-generated summaries together with human-written reference summaries. The metric draws on several Natural Language Processing methods, including semantic similarity, natural language inference, and linguistic acceptability. SEAScore uses features extracted by pre-trained language models to produce a score that measures the quality of summarization models. In this thesis, we conducted experiments with three summarization models to measure the quality of the new metric. According to the experimental results, SEAScore achieves a higher correlation with human-produced evaluation scores than well-known standard metrics.
  Conference Object
    Citation Count: 1
    Enhancing Image Resolution with Generative Adversarial Networks
    (Institute of Electrical and Electronics Engineers Inc., 2022) Yıldız, Beytullah; Software Engineering
    Super-resolution is the process of generating high-resolution images from low-resolution images. It has a variety of practical applications, such as high-definition content creation, surveillance imaging, gaming, and medical imaging. Super-resolution has been the subject of much research over the past few decades, as improving image resolution offers many advantages. Going beyond previously presented methods, Generative Adversarial Networks offer a very promising solution. In this work, we use a Generative Adversarial Networks-based approach to obtain 4x resolution images that are perceptually better than those of previous solutions. Our extensive experiments, including perceptual comparison, Peak Signal-to-Noise Ratio, and classification success metrics, show that our approach is quite promising for image super-resolution. © 2022 IEEE.
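    Peak Signal-to-Noise Ratio, one of the metrics named in the abstract, is defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed pixels. The sketch below computes it for flat lists of pixel values; the function name and the 8-bit default of 255 for MAX are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in decibels between two equal-length pixel lists."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2-pixel example: each pixel is off by 10 intensity levels.
slightly_off = psnr([100, 100], [110, 90])
identical = psnr([10, 20, 30], [10, 20, 30])
```

    Higher values mean the reconstruction is numerically closer to the reference. PSNR does not always track perceived quality, which is presumably why the abstract pairs it with perceptual comparison.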
  Master Thesis
    Reinforcement learning for ad click prediction
    (2023) Yıldız, Beytullah; Software Engineering
    For click-through rate (CTR) prediction, which is critical in online advertising, traditional methods struggle to capture the dynamics of user preferences and ad relevance, whereas reinforcement learning (RL) algorithms such as Thompson Sampling, which balance the exploration of new strategies against the exploitation of successful ones, offer an effective solution. In this research, we present a new RL-based approach comprising a custom OpenAI Gym environment that simulates real-world ad impressions and clicks, and a Thompson Sampling implementation that predicts a dynamic CTR reflecting the continual change in user preferences and ad relevance. The findings suggest that Thompson Sampling delivers superior performance in CTR prediction, with a confidence level approximately 10% higher than other RL strategies, and that online ad selection can therefore improve substantially, yielding higher CTRs and potentially increased revenue for ad publishers.
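    Thompson Sampling for ad selection can be sketched with a Beta-Bernoulli model: each ad keeps a Beta posterior over its CTR, one value is sampled from every posterior per impression, and the ad with the highest sample is shown. The code below is a minimal illustration, not the thesis implementation: it replaces the custom OpenAI Gym environment with a plain simulated click draw, keeps the true CTRs fixed rather than dynamic, and all names and numbers are hypothetical.

```python
import random


def simulate_thompson_sampling(true_ctrs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling against simulated, fixed CTRs.

    Returns per-ad click and no-click counts after `rounds` impressions.
    """
    rng = random.Random(seed)
    n_ads = len(true_ctrs)
    clicks = [0] * n_ads     # observed clicks per ad
    no_clicks = [0] * n_ads  # observed impressions without a click per ad

    for _ in range(rounds):
        # Sample a plausible CTR for each ad from its Beta(clicks+1, no_clicks+1) posterior.
        samples = [rng.betavariate(clicks[a] + 1, no_clicks[a] + 1) for a in range(n_ads)]
        ad = max(range(n_ads), key=samples.__getitem__)

        # Simulated user response (stand-in for an environment step).
        if rng.random() < true_ctrs[ad]:
            clicks[ad] += 1
        else:
            no_clicks[ad] += 1

    return clicks, no_clicks


true_ctrs = [0.02, 0.05, 0.30]  # hypothetical click probabilities
clicks, no_clicks = simulate_thompson_sampling(true_ctrs, rounds=5000)
impressions = [c + n for c, n in zip(clicks, no_clicks)]
```

    Because uncertain ads occasionally produce high samples, they still get shown early on (exploration), while ads with strong click histories win most draws later (exploitation).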
  Article
    Citation Count: 14
    Text classification using improved bidirectional transformer
    (Wiley, 2022) Tezgider, Murat; Yıldız, Beytullah; Software Engineering
    Text data have an important place in our daily life. A huge amount of text data is generated every day. As a result, automation becomes necessary to handle these large text data. Recently, we have witnessed important developments with the adoption of new approaches in text processing. Attention mechanisms and transformers are emerging as methods with significant potential for text processing. In this study, we introduced a bidirectional transformer (BiTransformer) constructed using two transformer encoder blocks that utilize bidirectional position encoding to take into account the forward and backward position information of text data. We also created models to evaluate the contribution of attention mechanisms to the classification process. Four models, including long short-term memory, attention, transformer, and BiTransformer, were used to conduct experiments on a large Turkish text dataset consisting of 30 categories. The effect of using pretrained embeddings on the models was also investigated. Experimental results show that the classification models using transformer and attention give promising results compared with classical deep learning models. We observed that the proposed BiTransformer showed superior performance in text classification.