Data Science (DS)
Recent Submissions

CURIE: a cellular automaton for concept drift detection
(2021-11-01) Data stream mining extracts information from large quantities of data flowing fast and continuously (data streams). They are usually affected by changes in the data distribution, giving rise to a phenomenon referred to as ...
Package wsbackfit for Smooth Backfitting Estimation of Generalized Structured Models
(2021-01-01) A package is introduced that provides the weighted smooth backfitting estimator for a large family of popular semiparametric regression models. This family is known as generalized structured models, comprising, for example, ...
ROCnReg: An R Package for Receiver Operating Characteristic Curve Inference With and Without Covariates
(2021-01-01) This paper introduces the package ROCnReg that allows estimating the pooled ROC curve, the covariate-specific ROC curve, and the covariate-adjusted ROC curve by different methods, both from (semi)parametric and nonparametric ...
Phenomics data processing: A plot-level model for repeated measurements to extract the timing of key stages and quantities at defined time points
(2021) Decision-making in breeding increasingly depends on the ability to capture and predict crop responses to changing environmental factors. Advances in crop modeling as well as high-throughput field phenotyping (HTFP) hold ...
Alternative Representations for Codifying Solutions in Permutation-Based Problems
(2020-07-01) Since their introduction, Estimation of Distribution Algorithms (EDAs) have proved to be very competitive algorithms to solve many optimization problems. However, despite recent developments, in the case of permutation-based ...
LUNAR: Cellular automata for drifting data streams
(2021-01-08) With the advent of fast data streams, real-time machine learning has become a challenging task, demanding many processing resources. In addition, they can be affected by the concept drift effect, by which learning methods ...
AT-MFCGA: An Adaptive Transfer-guided Multifactorial Cellular Genetic Algorithm for Evolutionary Multitasking
(2021-09-01) Transfer Optimization is an incipient research area dedicated to solving multiple optimization tasks simultaneously. Among the different approaches that can address this problem effectively, Evolutionary Multitasking resorts ...
On solving cycle problems with Branch-and-Cut: extending shrinking and exact subcycle elimination separation algorithms
(2021-01-01) In this paper, we extend techniques developed in the context of the Travelling Salesperson Problem for cycle problems. Particularly, we study the shrinking of support graphs and the exact algorithms for subcycle elimination ...
Statistical assessment of experimental results: a graphical approach for comparing algorithms
(2021-08-25) Non-deterministic measurements are common in real-world scenarios: the performance of a stochastic optimization algorithm or the total reward of a reinforcement learning agent in a chaotic environment are just two examples ...
Simulation approach for assessing the performance of the γ-EWMA control chart
(2021-02-22) i) Purpose: The purpose of this paper is to evaluate the performance of a modified EWMA control chart ($\gamma$-EWMA control chart), which considers data distribution and incorporates its correlation structure, simulating ...
Altered effective connectivity in sensorimotor cortices: a novel signature of severity and clinical course in depression
(2021) Functional neuroimaging research on depression has traditionally targeted neural networks associated with the psychological aspects of depression. In this study, instead, we focus on alterations of sensorimotor function ...
From habitat to management: a simulation framework for improving statistical methods in fisheries science
(2021-07-07) Monte Carlo simulation consists of computer experiments that involve creating data by pseudorandom sampling and has been shown to be a powerful tool for studying the performance of statistical methods. In this thesis Monte ...
A Review on Outlier/Anomaly Detection in Time Series Data
(2021) Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for ...
Water leak detection using self-supervised time series classification
(2021) Leaks in water distribution networks cause a loss of water that needs to be compensated to ensure a continuous supply for all customers. This compensation is achieved by increasing the flow of the network, which entails ...
A cheap feature selection approach for the K-means algorithm
(2021-05) The increase in the number of features that need to be analyzed in a wide variety of areas, such as genome sequencing, computer vision or sensor networks, represents a challenge for the K-means algorithm. In this regard, ...
Statistical model for reproducibility in ranking-based feature selection
(2020-11-05) The stability of feature subset selection algorithms has become crucial in real-world problems due to the need for consistent experimental results across different replicates. Specifically, in this paper, we analyze the ...
On the symmetry of the Quadratic Assignment Problem through Elementary Landscape Decomposition
(2021-07) When designing metaheuristic strategies to optimize the quadratic assignment problem (QAP), it is important to take into account the specific characteristics of the instance to be solved. One of the characteristics that ...
Exploring Gaps in DeepFool in Search of More Effective Adversarial Perturbations
(2021) Adversarial examples are inputs subtly perturbed to produce a wrong prediction in machine learning models, while remaining perceptually similar to the original input. To find adversarial examples, some attack strategies ...
Delineation of site‐specific management zones using estimation of distribution algorithms
(2021) In this paper, we present a novel methodology to solve the problem of delineating homogeneous site-specific management zones (SSMZ) in agricultural fields. This problem consists of dividing the field into small regions for ...
On the fair comparison of optimization algorithms in different machines
(2021) An experimental comparison of two or more optimization algorithms requires the same computational resources to be assigned to each algorithm. When a maximum runtime is set as the stopping criterion, all algorithms need to ...