Browsing by Author "Gokcay, Erhan"
Now showing 1 - 7 of 7
Article (Citation Count: 0)
Entropy based streaming big-data reduction with adjustable compression ratio (Springer, 2023)
Gökçay, Erhan; Software Engineering

The Internet of Things is a concept in which numerous physical devices are linked to the internet to collect, generate, and distribute data for processing. Data storage and processing become more challenging as the number of devices increases. One solution is to reduce the amount of stored data in such a way that processing accuracy does not suffer significantly. The reduction can be lossy or lossless, depending on the type of data. This article presents a novel lossy algorithm for reducing the amount of data stored in the system. The reduction process aims to reduce the volume of data while maintaining classification accuracy and properly adjusting the reduction ratio. A nonlinear cluster distance measure is used to create subgroups so that samples can be assigned to the correct clusters even when the cluster shape is nonlinear. Each sample is assumed to arrive one at a time during the reduction, which makes the algorithm suitable for streaming data. The user can adjust the degree of reduction, and the reduction algorithm strives to minimize classification error. The algorithm is not dependent on any particular classification technique. Subclusters are formed and readjusted after each sample, and representative points are calculated to summarize the data in each subcluster. The resulting data summary can be saved and used for future processing. The effectiveness of the proposed method is measured by the accuracy difference between the regular and reduced datasets, evaluated with several different classifiers. The results show that the nonlinear information-theoretic cluster distance measure improves the reduction rates with higher accuracy values compared to existing studies. At the same time, the reduction rate can be adjusted as desired, a feature lacking in current methods. The characteristics are discussed, and the results are compared to previously published algorithms.

Article (Citation Count: 7)
A generalized Arnold's Cat Map transformation for image scrambling (Springer, 2022)
Turan, Mehmet; Gökçay, Erhan; Buker, Mohamed; Tora, Hakan; Mathematics; Software Engineering; Airframe and Powerplant Maintenance

This study presents a new approach to generating the transformation matrix for Arnold's Cat Map (ACM). The matrices of standard and modified ACM are well known. Since the structure of the possible matrices is known, one can easily select one of them and recover the image within several trials. The proposed method, however, generates a much larger set of transform matrices, making it difficult to estimate the matrix used for scrambling. There is no fixed structure for the matrix, as there is in standard or modified ACM, which makes the transform matrix much harder to discover. Different types, orders, and numbers of operations can be used to generate the transform matrix. The quality of the shuffling process and the strength of the proposed method against brute-force attacks are tested on several benchmark images.
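To make the scrambling concrete: below is a minimal Python sketch of the classic cat-map pixel permutation, together with one plausible way to build a larger family of valid matrices by composing unimodular shear operations. The shear-composition scheme and all function names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def acm_matrix(a=1, b=1):
    # Standard ACM uses [[1, 1], [1, 2]]; the parameterized (modified)
    # form keeps det = 1, so the map stays an invertible, periodic
    # permutation of the pixel grid.
    return np.array([[1, a], [b, a * b + 1]], dtype=np.int64)

def scramble(img, M, iterations=1):
    """Apply the cat-map coordinate permutation to a square image."""
    n = img.shape[0]
    out = img
    for _ in range(iterations):
        nxt = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                nx, ny = (M @ np.array([x, y])) % n
                nxt[nx, ny] = out[x, y]
        out = nxt
    return out

def generalized_matrix(ops):
    # Hypothetical illustration of a larger matrix family: multiplying
    # unimodular shears in any type, order, and number still yields
    # det = 1 but removes the fixed structure of standard ACM.
    M = np.eye(2, dtype=np.int64)
    for kind, k in ops:  # e.g. [("L", 3), ("U", 2), ("L", 5)]
        S = np.array([[1, 0], [k, 1]]) if kind == "L" else np.array([[1, k], [0, 1]])
        M = M @ S
    return M
```

Because every factor has determinant 1, any such product is again a valid, invertible cat-map matrix, which is what makes the family large while preserving recoverability.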
Article (Citation Count: 1)
An information-theoretic instance-based classifier (Elsevier Science Inc, 2020)
Gökçay, Erhan; Software Engineering

Classification algorithms are used in many areas to determine new class labels given a training set. Many classification algorithms, linear or not, require a training phase that determines model parameters through iterative optimization of a cost function. The training phase can adjust and fine-tune the boundary line between classes, but the process may get stuck in a local optimum, which may or may not be close to the desired solution. Another disadvantage of training is that the arrival of a new sample requires retraining the model. This work presents a new information-theoretic approach to instance-based supervised classification. The boundary line between classes is calculated solely from the data points, without any external parameters or weights, and it is given in closed form. The separation between classes is nonlinear and smooth, which reduces memorization problems. Since the method does not require a training phase, classified samples can be incorporated into the training set directly, simplifying streaming classification. The boundary line can be replaced with an approximation or regression model for parametric calculations. The features and performance of the proposed method are discussed and compared with similar algorithms.
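The abstract does not give the closed-form boundary itself, so the sketch below stands in with a Parzen-window (Gaussian-kernel) score per class, which shares the properties described: no trained weights, a smooth nonlinear boundary implied directly by the stored samples, and trivial incorporation of new samples. The kernel choice and the sigma parameter are assumptions, not the paper's exact information-theoretic measure.

```python
import numpy as np

def class_score(x, pts, sigma=1.0):
    """Average Gaussian kernel between x and the stored instances."""
    d2 = np.sum((pts - x) ** 2, axis=1)
    return np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))

def classify(x, class_sets, sigma=1.0):
    # Closed-form decision: no training phase, no iterative optimization;
    # the (smooth, nonlinear) boundary lies where the class scores tie.
    return max(class_sets, key=lambda c: class_score(x, class_sets[c], sigma))

# Streaming use: append a newly classified sample to its class set and
# the boundary updates immediately, with no retraining step, e.g.:
# class_sets["a"] = np.vstack([class_sets["a"], x])
```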
Article (Citation Count: 1)
A New Multi-Target Compiler Architecture for Edge-Devices and Cloud Management (Gazi Univ, 2022)
Gökçay, Erhan; Software Engineering

Edge computing is the concept of handling computation at edge-devices. Transferring computation from servers to edge-devices decreases the massive amount of data transfer that edge-devices generate. There are several efficient management tools for setup and connection purposes, but they cannot provide a unified programming system from a single source code or project. Even though each device can be controlled efficiently, a global view of the computation is missing in a programming project that spans several edge-devices for computation and data analysis, and the devices must be programmed individually. A generic workflow engine might automate part of the problem using standard interfaces and predefined objects running on edge-devices. Nevertheless, that approach fails at fine-tuning each edge-device, since the computation cannot be moved easily among devices. This paper introduces a new compiler architecture to control and program edge-devices from a single source code. The source code can be distributed to multiple edge-devices using simple compiler directives, and the transfer of the code to, and communication with, multiple devices is handled transparently. Fine-tuning the source code and moving code between devices become efficient in both editing effort and time. The proposed architecture is a lightweight system with fine-tuned computation and distribution among devices.

Article (Citation Count: 3)
A novel data encryption method using an interlaced chaotic transform (Pergamon-Elsevier Science Ltd, 2024)
Gökçay, Erhan; Tora, Hakan; Software Engineering; Airframe and Powerplant Maintenance

We present a novel data encryption approach that uses a cascaded application of a chaotic map. The chaotic map used in both the permutation and diffusion stages is Arnold's Cat Map (ACM), whose transformation is periodic, so the encrypted data can be recovered. The original form of ACM is a two-dimensional mapping, which makes it suitable for randomizing pixel locations in an image. Since the pixel values stay intact during that transformation, the permutation alone cannot encrypt an image, and known-plaintext attacks can recover the transformation matrix. The proposed approach uses ACM to shuffle both the positions and the values of two-dimensional data in an interlaced and nested process. This combination extends the period of the transformation, which becomes significantly longer than that of the initial transformation. Furthermore, the possible combinations of the nested process vastly expand the key space, and the interlaced pixel and value transformation makes the encryption highly resistant to known-plaintext attacks. The encrypted data passes all random-data tests proposed by the National Institute of Standards and Technology. Any type of data, including ASCII text, can be encrypted as long as it can be rearranged into a two-dimensional format.
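As a rough illustration of the interlacing idea, the sketch below alternates an ACM permutation of pixel coordinates with an ACM-style mixing of pairs of pixel values modulo 256. Pairing adjacent values this way is a simplification assumed for illustration; the paper's nested scheme and its key handling are more elaborate.

```python
import numpy as np

ACM = np.array([[1, 1], [1, 2]], dtype=np.int64)  # classic cat-map matrix

def permute_positions(img):
    """One ACM round on pixel coordinates (square image assumed)."""
    n = img.shape[0]
    out = np.empty_like(img)
    for x in range(n):
        for y in range(n):
            nx, ny = (ACM @ np.array([x, y])) % n
            out[nx, ny] = img[x, y]
    return out

def diffuse_values(img):
    """One ACM round on consecutive value pairs, modulo 256."""
    flat = img.reshape(-1, 2).astype(np.int64)  # assumes an even pixel count
    mixed = (flat @ ACM.T) % 256                # det = 1, so this is invertible
    return mixed.astype(img.dtype).reshape(img.shape)

def encrypt(img, rounds=4):
    # Interlacing the two transforms couples positions with values,
    # lengthening the overall period relative to either map alone.
    for _ in range(rounds):
        img = diffuse_values(permute_positions(img))
    return img
```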
Conference Object (Citation Count: 1)
An on Demand Virtual CPU Architecture based on Cloud Infrastructure (Scitepress, 2017)
Gökçay, Erhan; Software Engineering

Cloud technology provides different computational models, including but not limited to infrastructure, platform, and software as a service. The motivation for a cloud system is to share resources in an optimal and cost-effective way by creating virtualized resources that can be distributed easily, although the distribution is not necessarily parallel. Another disadvantage is that small computational units, such as smart devices and less powerful computers, are excluded from resource sharing. Different systems may also have interoperability problems, since operating systems and CPU designs differ. This paper describes an on-demand, dynamically created computational architecture, inspired by CPU design and called the Cloud CPU, that can use any type of resource, including smart devices. The computational and data transfer requirements of each unit are minimized, so the service can be created on demand, each time with a different functionality. The distribution of the calculation over not-so-fast internet connections is compensated for by massively parallel operation. The minimized computational requirements also reduce interoperability problems and increase fault tolerance because of the increased number of units in the system.

Conference Object (Citation Count: 0)
A Stream Clustering Algorithm using Information Theoretic Clustering Evaluation Function (Scitepress, 2018)
Gökçay, Erhan; Software Engineering

There are many stream clustering algorithms, which can be divided roughly into density-based algorithms and hyperspherical distance-based algorithms. Only density-based algorithms can detect nonlinear clusters, and all of them assume that the data stream is an ordered sequence of points. Many algorithms need to receive data in buckets and process it in online and offline iterations with several passes over the data. In this paper, we propose a streaming clustering algorithm that uses a distance function able to separate highly nonlinear clusters in one pass. The distance function is based on information-theoretic measures and is called the Clustering Evaluation Function. The algorithm can handle data one point at a time and find the correct number of clusters even when the clusters are highly nonlinear. The data points can arrive in any random order, and the number of clusters does not need to be specified. Each point is compared against the already discovered clusters, and clusters are joined or divided at each step using an iteratively updated threshold.
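A one-pass skeleton of that loop is sketched below. The Clustering Evaluation Function itself is defined in the paper, so a Euclidean centroid distance stands in as a placeholder here; the adaptation rate alpha and the omitted merge/split logic are likewise simplified assumptions.

```python
import numpy as np

def centroid_dist(p, cluster):
    # Placeholder distance: the paper uses the information-theoretic
    # Clustering Evaluation Function here, which is what lets it
    # separate highly nonlinear clusters.
    return float(np.linalg.norm(np.mean(cluster, axis=0) - p))

def stream_cluster(points, dist=centroid_dist, threshold=1.0, alpha=0.05):
    """One-pass skeleton: each arriving point joins the nearest cluster
    or starts a new one, and the threshold is updated iteratively. The
    full algorithm also joins or divides clusters as the threshold
    evolves, which is omitted here."""
    clusters = []
    for p in points:
        if not clusters:
            clusters.append([p])
            continue
        best = min(clusters, key=lambda c: dist(p, c))
        d = dist(p, best)
        if d <= threshold:
            best.append(p)          # absorb into the nearest cluster
        else:
            clusters.append([p])    # start a new cluster
        threshold = (1 - alpha) * threshold + alpha * d
    return clusters
```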