{"id":464,"date":"2021-07-14T08:23:08","date_gmt":"2021-07-14T08:23:08","guid":{"rendered":"http:\/\/nbmds.uva.es\/?page_id=464"},"modified":"2022-01-13T12:38:04","modified_gmt":"2022-01-13T12:38:04","slug":"mini-symposia","status":"publish","type":"page","link":"http:\/\/nbmds.uva.es\/mini-symposia\/","title":{"rendered":"mini-symposia"},"content":{"rendered":"

MS1. High-dimensional Bayesian networks

Speaker: Antonio Salmerón (U. of Almería, Spain)
Co-authors: Helge Langseth, Thomas D. Nielsen, Andrés R. Masegosa
Title: High dimensional hybrid Bayesian networks: Is there life beyond the exponential family?
Abstract: Within the context of hybrid Bayesian networks, the problem of high dimensionality is challenging both from the point of view of parameter estimation and probabilistic inference. While parameter estimation can be efficiently carried out, especially for models within the exponential family of distributions, it sometimes comes along with limitations on the network structure or costly probabilistic inference/Bayesian updating schemes. On the other hand, probabilistic models based on mixtures of truncated basis functions (MoTBFs) have turned out to be compatible with efficient probabilistic inference schemes. However, MoTBFs do not belong to the exponential family, which makes the parameter estimation process more problematic due to, for instance, the non-existence of fixed-dimension sufficient statistics (other than the sample itself). In this work we explore some reparameterizations of MoTBF distributions that make possible the use of efficient likelihood-based parameter estimation procedures.
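
For context on the contrast drawn above (a standard fact, not a result of this work): an exponential-family density admits a sufficient statistic of fixed dimension, which is what keeps likelihood-based estimation simple,
\[
p(x \mid \theta) = h(x)\,\exp\!\big\{\eta(\theta)^{\top} T(x) - A(\theta)\big\},
\]
so a sample x_1, ..., x_n enters the likelihood only through the sum of T(x_i). MoTBF densities admit no such fixed-dimension reduction, which is the difficulty the reparameterizations above aim to work around.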

Speaker: Jose L. Moreno (Technical University of Madrid, Spain)
Co-authors: Nikolas Bernaola, Pedro Larrañaga, Concha Bielza
Title: Learning and visualizing massive Bayesian networks with FGES-Merge and BayeSuites
Abstract: In this work we present a new algorithm, FGES-Merge, for learning massive Bayesian networks of the order of tens of thousands of nodes by using properties of the topology of the network and improving the parallelization of the arc search procedure. We use the algorithm to learn a network for the full human genome using expression data from the brain. To aid with the interpretation of the results, we present the BayeSuites web tool, which allows for the visualization of the network and gives a GUI for inference and search over the network, avoiding the typical scalability problems of networks of this size.
Slides

Speaker: Ofelia Paula Retamero Pascual (U. of Granada, Spain)
Co-authors: Manuel Gómez-Olmedo, Andrés Cano Utrera
Title: Approximation in Value-Based Potentials
Abstract: When dealing with complex models (i.e., models with many variables, a high degree of dependency between variables, or many states per variable), the efficient representation of quantitative information in probabilistic graphical models (PGMs) is a challenging task. To address this problem, Value-Based Potentials (VBPs) leverage repeated values to reduce memory requirements when managing Bayesian Networks or Influence Diagrams. In this work, we propose how to approximate VBPs to achieve a greater reduction in the memory space required and thus be able to deal with more complex models.
Slides
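
As a rough illustration of the value-driven idea (a hypothetical sketch, not the VBP data structure of the talk): instead of storing one probability per configuration, a potential with many repeated values can be indexed by its distinct values.

```python
# Hypothetical sketch: store a potential by its distinct values rather than
# one entry per configuration, exploiting repeated probabilities.
from collections import defaultdict
from itertools import product

# A dense conditional probability table over three binary variables
dense = {cfg: p for cfg, p in zip(product([0, 1], repeat=3),
                                  [0.2, 0.8, 0.2, 0.8, 0.5, 0.5, 0.2, 0.8])}

# Value-based representation: value -> list of configurations sharing it
value_based = defaultdict(list)
for cfg, p in dense.items():
    value_based[p].append(cfg)

print(len(dense), "entries vs", len(value_based), "distinct values")
print(dict(value_based))
```

The fewer distinct values a potential has, the larger the memory saving; approximating a VBP, as proposed above, amounts to merging nearby values so that even fewer distinct entries need to be stored.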

Speaker: Borja Sánchez-López (IIIA-CSIC, Spain)
Co-authors: Jesús Cerquides
Title: Convergent and fast natural gradient based optimization method DSNGD and adaptation to large dimensional Bayesian networks
Abstract: Information geometry has shown that probabilistic models are twisted and distorted manifolds compared to standard Euclidean spaces. In such cases, where every point of the manifold describes a probability distribution, the Fisher information metric (FIM) becomes handy to correctly observe the space and its local measure as it actually is. For example, the gradient of a function defined on such a manifold is not even well defined until we apply metric information to it. Once the FIM is considered, the steepest ascent direction is available and well defined; this is the so-called natural gradient.
Dual stochastic natural gradient descent (DSNGD) is our version of a natural gradient based algorithm to optimize the conditional log-likelihood of a class variable Y given features X. It is convergent and its computational complexity is linear when X is discrete. We define DSNGD and take a glance at its convergence property. Some experiments are discussed, paying special attention to the performance enhancement gained from the convergence property with respect to standard non-convergent stochastic natural gradient descent (SNGD). We extend DSNGD to Bayesian networks where the log-odds ratio of P(Y|X) is an affine function of the features. Since DSNGD has low computational complexity, it scales nicely as the dimension of the manifold grows.
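
For readers unfamiliar with the natural gradient, the standard definitions behind the abstract are (notation generic, not taken from the talk; the DSNGD-specific construction is not reproduced here)
\[
\widetilde{\nabla} f(\theta) = F(\theta)^{-1}\,\nabla f(\theta),
\qquad
\theta_{t+1} = \theta_t + \eta_t\, F(\theta_t)^{-1}\, \nabla_\theta \log p(y_t \mid x_t;\theta_t),
\]
where F(θ) is the Fisher information matrix and the right-hand side is the generic stochastic natural-gradient update on the conditional log-likelihood, of which DSNGD is a convergent, low-complexity variant.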

MS2. Functional Data Analysis (I)

Speaker: M. Carmen Aguilera Morillo (Universitat Politècnica de València)
Co-authors: Pavel Hernández Amaro, María Durban (Universidad Carlos III de Madrid)
Title: Penalized methods for functional data with variable domain: application to chronic obstructive pulmonary disease
Abstract: Most statistical techniques for functional data analysis have been developed for situations where all functions have the same domain. However, many real datasets do not satisfy this assumption, and new approaches to estimate functional regression models for functional data with variable domain are required. In this work we focus on variable-domain functional regression models, whose estimation is based on basis representations with B-splines and a discrete penalty. This research is motivated by a real study, carried out in collaboration with the Hospital de Galdakao (Vizcaya) and the Universidad del País Vasco, whose objective is to study the impact of physical activity on the progression of the disease, in terms of the number of hospitalisations, in patients with Chronic Obstructive Pulmonary Disease.

Speaker: Antonio Cuevas
Co-authors: José R. Berrendero, Beatriz Bueno-Larraz, Antonio Coín (Universidad Autónoma de Madrid)
Title: On an alternative formulation of the functional logistic model
Abstract: The problem of predicting a binary response Y from a functional explanatory variable X=X(t) arises very often in practice. A common approach, considered by several authors in the recent literature, is the L2-based functional logistic model. We explore here an alternative approach based on the theory of Reproducing Kernel Hilbert Spaces. We will show how this alternative model offers some theoretical and practical advantages in terms of generality (as it encompasses, as particular cases, many easy-to-interpret models, including the L2 one) and ease of estimation of the involved parameters.
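
For reference, the L2-based functional logistic model mentioned as the common approach is usually written as (the RKHS alternative discussed in the talk is not reproduced here)
\[
\log \frac{P(Y=1 \mid X)}{P(Y=0 \mid X)} = \beta_0 + \int_{T} X(t)\,\beta(t)\,dt,
\]
with a slope function β in L2(T); the RKHS formulation replaces this integral functional with one defined through the covariance structure of X, which is what yields the generality and estimation advantages claimed above.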

Speaker: Stanislav Nagy (Charles University, Prague)
Title: Functional depth: Recent progress and perspectives
Abstract: Depth is a tool of nonparametric statistics. Its objective is to generalise quantiles, rankings, and orderings to multivariate and non-Euclidean data. While a rich body of literature on various depths and depth-like procedures exists, many open problems still stimulate research in the area. We consider the depth of random functions. First, we revisit the very definition of the standard depths for functional data and introduce procedures allowing adaptive selection of a depth in functional data analysis. Second, we draw connections between functional depth research and topics firmly established in the statistical machine learning literature.
Slides

Speaker: Piercesare Secchi (Politecnico di Milano)
Co-authors: Alessandra Menafoglio, Laura Sangalli, Riccardo Scimone (Politecnico di Milano)
Title: Object Oriented Spatial Statistics (O2S2) for densities: an application to the analysis of mortality from all causes in Italy during the COVID-19 pandemic
Abstract: Along the unifying perspective offered by Object Oriented Spatial Statistics (O2S2), we analyze the densities of the time of death during the calendar year for the Italian provinces and municipalities in 2020, the first year of the COVID-19 pandemic. The official daily data on mortality from all causes are provided by ISTAT, the Italian National Institute of Statistics. Densities are regarded as functional data belonging to the Bayes space B^2. In this space, we use function-on-function linear models to predict the expected mortality densities in 2020, based on those observed in the previous years, and we compare predictions with actual observations to assess the impact of the pandemic. Through spatial downscaling of the provincial data, we identify spatial clusters of municipalities characterized by mortality densities that are anomalous with respect to their surroundings. The analysis could be extended to indexes different from death counts, measured at a granular spatio-temporal scale, and used as proxies for quantifying the local disruption generated by the pandemic.

MS3. Spatio-temporal Data Science

Speaker: Stefano Castruccio (University of Notre Dame, USA)
Title: Calibration of Spatial Forecasts from Citizen Science Urban Air Pollution Data with Sparse Recurrent Neural Networks
Abstract: With their continued increase in coverage and quality, data collected from personal air quality monitors has become an increasingly valuable tool to complement existing public health monitoring systems over urban areas. However, the potential of using such 'citizen science data' for automatic early warning systems is hampered by the lack of models able to capture the high-resolution, nonlinear spatio-temporal features stemming from local emission sources such as traffic, residential heating and commercial activities. In this work, we propose a machine learning approach to forecast high-frequency spatial fields which has two distinctive advantages over standard neural network methods in time: 1) sparsity of the neural network via a spike-and-slab prior, and 2) a small parametric space. The introduction of stochastic neural networks generates additional uncertainty, and in this work we propose a fast approach for forecast calibration, both marginal and spatial. We focus on assessing exposure to urban air pollution in San Francisco, and our results suggest an improvement of 35.7% in the mean squared error over a standard time series approach with a calibrated forecast for up to 5 days.
Video
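
The spike-and-slab prior mentioned above, in one common parametrisation (the talk's exact specification may differ), places a point mass at zero on each network weight,
\[
w_j \mid \gamma_j \;\sim\; \gamma_j\,\mathcal{N}(0,\sigma^2) + (1-\gamma_j)\,\delta_0,
\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi),
\]
so weights with γ_j = 0 are pruned and the network becomes sparse, which is what keeps the parametric space small.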

Speaker: Aritz Adin (Universidad Pública de Navarra)
Co-authors: Erick Orozco-Acosta and María Dolores Ugarte
Title: Scalable Bayesian models for spatio-temporal count data
Abstract: Spatio-temporal disease mapping studies the geographical distribution of a disease in space and its evolution in time. Many statistical techniques have been proposed in recent years for analyzing disease risks, most of them including spatial and temporal random effects to smooth risks by borrowing information from neighbouring regions and time periods. Despite the enormous expansion of modern computers and the development of new software and estimation techniques to perform fully Bayesian inference, dealing with massive data is still computationally challenging. In this work, we propose a scalable Bayesian modeling approach to smooth mortality or incidence risks in a high-dimensional spatio-temporal disease mapping context. The method is based on the well-known "divide and conquer" approach, so that local models can be fitted simultaneously, reducing the computational time substantially. Model fitting and inference are carried out using the integrated nested Laplace approximation (INLA) technique. The methods and algorithms proposed in this work are being implemented in the R package "bigDM", available at https://github.com/spatialstatisticsupna/bigDM. We illustrate the model's behaviour by estimating lung cancer mortality risks in almost 8000 municipalities of Spain during the period 1991-2015. A simulation study is also conducted to evaluate the performance of this new scalable modeling approach in comparison with usual spatio-temporal models in disease mapping.
Slides
Video

Speaker: Ying Sun (KAUST)
Title: DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction
Abstract: In spatial statistics, a common objective is to predict the values of a spatial process at unobserved locations by exploiting spatial dependence. In geostatistics, Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, when considering non-linear prediction for non-Gaussian and categorical data, the Kriging prediction is not necessarily optimal, and the associated variance is often overly optimistic. We propose to use deep neural networks (DNNs) for spatial prediction. Although DNNs are widely used for general classification and prediction, they have not been studied thoroughly for data with spatial dependence. In this work, we propose a novel neural network structure for spatial prediction by adding an embedding layer of spatial coordinates with basis functions. We show in theory that the proposed DeepKriging method has multiple advantages over Kriging and over classical DNNs that use only the spatial coordinates as features. We also provide density prediction for uncertainty quantification without any distributional assumption and apply the method to PM2.5 concentrations across the continental United States.
Video
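
A minimal sketch of the basis-function embedding idea (hypothetical knot grid, bandwidth, network size and simulated data; not the DeepKriging implementation): spatial coordinates are expanded into radial basis features before being passed to an ordinary feed-forward network.

```python
# Sketch: embed spatial coordinates with Gaussian radial basis functions,
# then fit a standard feed-forward network on the embedding.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
s = rng.uniform(0, 1, size=(500, 2))                  # observation locations
y = np.sin(6 * s[:, 0]) * np.cos(6 * s[:, 1]) + 0.1 * rng.normal(size=500)

# Radial basis embedding on a coarse 10 x 10 knot grid (hypothetical choices)
knots = np.array([(i, j) for i in np.linspace(0, 1, 10)
                          for j in np.linspace(0, 1, 10)])
bandwidth = 0.15
phi = np.exp(-((s[:, None, :] - knots[None, :, :]) ** 2).sum(-1)
             / (2 * bandwidth ** 2))

# Plain DNN trained on the basis-function features instead of raw coordinates
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(phi, y)

s_new = rng.uniform(0, 1, size=(5, 2))
phi_new = np.exp(-((s_new[:, None, :] - knots[None, :, :]) ** 2).sum(-1)
                 / (2 * bandwidth ** 2))
print(model.predict(phi_new))                         # spatial predictions
```

The embedding gives the network a notion of spatial proximity that raw coordinates do not, which is the intuition behind the advantages over coordinate-only DNNs claimed in the abstract.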

Speaker: Marc Genton (KAUST)
Title: Large-Scale Spatial Data Science with ExaGeoStat
Abstract: Spatial data science aims at analyzing the spatial distributions, patterns, and relationships of data over a predefined geographical region. For decades, the size of most spatial datasets was modest enough to be handled by exact inference. Nowadays, with the explosive increase of data volumes, High-Performance Computing (HPC) can serve as a tool to handle massive datasets for many spatial applications. Big data processing becomes feasible with the availability of parallel processing hardware systems such as shared and distributed memory, multiprocessors and GPU accelerators. In spatial statistics, parallel and distributed computing can alleviate the computational and memory restrictions in large-scale Gaussian process inference and prediction. In this talk, we will describe cutting-edge HPC techniques and their applications in solving large-scale spatial problems with the new software ExaGeoStat.
Video

MS4. Interpretability and explainability of algorithms

Speaker: Enrique Valero-Leal
Co-authors: Pedro Larrañaga, Concha Bielza
Title: Explaining Bayesian networks using MAP-independence: Some new properties
Abstract: In discrete Bayesian networks, MAP-independence tries to define a notion of variable relevance in a probabilistic inference and uses it as an explanation. In our work, we delve further into this idea, exploring some properties of the original proposal, extending them to the continuous domain, and laying the groundwork for new methodologies for explaining Bayesian networks.
Slides
Video

Speaker: José Luis Salmerón, Universidad Pablo de Olavide de Sevilla
Title: Opening the black-box of deep learning architecture with Ranked-LRP
Abstract: Understanding what Deep Learning models are doing is not always trivial. This is especially true for complex models such as Deep Neural Networks, which are the best-suited algorithms for modeling very complex and nonlinear relationships. But this need for understanding has become a must, since privacy regulations (GDPR and others) are restricting the use of these models in specific industries. There are several methods to address the explainability issues that Machine Learning models raise. This work focuses on opening the black box of so-called deep neural architectures. The research extends the technique called Layer-wise Relevance Propagation (LRP), enhancing it to compute the most critical paths in different deep neural architectures using multicriteria analysis. We call this technique Ranked-LRP, and it was tested on four different datasets and tasks, including classification and regression tasks. The results show the worth of our proposal.
Video
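
For context, the basic relevance redistribution rule from the LRP literature, on which extensions such as Ranked-LRP build (notation standard, not taken from the talk; the multicriteria ranking of paths is not reproduced here), propagates the relevance R_k of a neuron k back to its inputs j in proportion to their contributions:
\[
R_j = \sum_k \frac{a_j\, w_{jk}}{\sum_{j'} a_{j'}\, w_{j'k}}\, R_k,
\]
where a_j are the input activations and w_{jk} the connecting weights. Applying such a rule layer by layer yields per-input relevance scores, and paths carrying large relevance are candidates for the "most critical paths" mentioned above.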

Speaker: Pablo Morala, Universidad Carlos III de Madrid
Title: Can neural networks be explained using polynomial regressions and Taylor series?
Abstract: While neural networks are one of the main current trends in machine learning and artificial intelligence, they are still considered not easily interpretable and are therefore usually referred to as black boxes. Here we present a new approach to this problem by finding a relationship between the weights of a trained feed-forward neural network and the coefficients of a polynomial regression that performs almost equivalently to the original neural network. This is achieved through Taylor expansion at the activation functions of each neuron, and then the resulting expressions are combined in order to obtain a combination of the original network weights associated with each term of a polynomial regression. The order of this polynomial regression is determined by the order used in the Taylor expansion and the number of layers in the neural network. This proposal has been empirically tested covering a wide range of different situations, showing its effectiveness and opening the door to extending this methodology to a broader range of types of neural networks. This kind of relationship between modern machine learning techniques and more traditional statistical approaches can help solve interpretability concerns and provide new tools to develop their theoretical foundations. In this case, polynomial regression coefficients have a much easier interpretation than neural network weights, and it significantly reduces the number of parameters.
Slides
Video
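
A simplified illustration of the kind of identity exploited (shown only for one hidden layer and a second-order expansion; the general construction covers deeper networks and higher orders): for a network with output weights v_k, input weights w_k and biases b_k,
\[
\hat y = \sum_k v_k\,\sigma\big(w_k^{\top} x + b_k\big)
\;\approx\;
\sum_k v_k\Big[\sigma(b_k) + \sigma'(b_k)\,w_k^{\top} x + \tfrac{1}{2}\,\sigma''(b_k)\,\big(w_k^{\top} x\big)^2\Big],
\]
which, once expanded, is a polynomial in the inputs whose coefficients are explicit combinations of the trained weights, i.e., the coefficients of an (approximately) equivalent polynomial regression.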

Speaker: Jasone Ramírez-Ayerbe
Co-authors: Emilio Carrizosa, Dolores Romero Morales
Title: Counterfactual Explanations via Mathematical Optimization
Abstract: Due to the increasing use of complex machine learning models, often seen as "black boxes", it has become more and more important to be able to understand and explain their behaviour, and thus to ensure transparency and fairness. An effective class of post-hoc explanations are counterfactual explanations, i.e. minimal perturbations of the predictor variables needed to change the prediction for a specific instance. We propose a multi-objective mathematical formulation for different state-of-the-art models based on scores, including tree ensemble classifiers and linear models. We formulate the problem at the individual and group level. Real-world data has been used to illustrate our method.
Slides
Video
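
In its simplest single-objective form (a generic textbook formulation, not the multi-objective model of the talk), a counterfactual explanation for an instance x_0 with prediction f(x_0) solves
\[
\min_{x' \in \mathcal{X}} \; \lVert x' - x_0 \rVert
\quad \text{s.t.} \quad f(x') \neq f(x_0),
\]
i.e., the smallest perturbation of the predictor variables that changes the prediction. The formulation above trades off several such objectives, exploits the score structure of the classifier, and is posed both for individual instances and for groups.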

MS5. High-dimensional variable selection

Speaker: Amparo Baíllo
Title: Ensemble distance-based regression and classification for large sets of mixed-type data
Abstract: The distance-based linear model (DB-LM) extends the classical linear regression to the framework of mixed-type predictors or when the only available information is a distance matrix between regressors (as it sometimes happens with big data). The main drawback of these DB methods is their computational cost, particularly due to the eigendecomposition of the Gram matrix. In this context, ensemble regression techniques provide a useful alternative to fitting the model to the whole sample. This work analyzes the performance of three subsampling and aggregation techniques in DB regression on two specific large, real datasets. We also analyze, via simulations, the performance of bagging and DB logistic regression in the classification problem with mixed-type features and large sample sizes.
Slides

Speaker: Anabel Forte Deltell
Title: Bayesian methods for variable selection. Challenges of the XXI Century.
Abstract: Model selection and, in particular, variable selection is without doubt one of the most difficult procedures in science. Throughout history it has been approached from different points of view as well as from different paradigms, such as frequentist or Bayesian statistics. Specifically, in this talk we will review how Bayesian statistics can deal with variable selection, trying to understand the advantages of this paradigm. We will also point to the new challenges that the era of high-dimensional data adds to this already difficult task, and how Bayesian methods may deal with them.

Speaker: Álvaro Méndez Civieta
Co-authors: M. Carmen Aguilera-Morillo; Rosa E. Lillo
Title: fPQR: A quantile-based dimension reduction technique for regression
Abstract: Partial least squares (PLS) is a well-known dimensionality reduction technique used as an alternative to ordinary least squares (OLS) in collinear or high-dimensional scenarios. Being based on OLS estimators, PLS is sensitive to the presence of outliers or heavy-tailed distributions. In contrast, quantile regression (QR) is a technique that provides estimates of the conditional quantiles of a response variable as a function of the covariates. The use of quantiles makes the estimates more robust against the presence of heteroscedasticity or outliers than OLS estimators. In this work, we introduce the fast partial quantile regression algorithm (fPQR), a quantile-based technique that shares the main advantages of PLS: it is a dimension reduction technique that obtains uncorrelated scores maximizing the quantile covariance between predictors and responses. Additionally, it is a robust, quantile-linked methodology suitable for dealing with outliers, heteroscedasticity or heavy-tailed datasets. The median estimator of the fPQR algorithm is a robust alternative to PLS, while other quantile levels can provide additional information on the tails of the responses.
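
One standard definition of the quantile covariance that such methods maximise is (following Li, Li and Tsai, 2015; the exact variant used in fPQR may differ)
\[
\mathrm{qcov}_{\tau}(Y, X) = \mathrm{Cov}\big( I\{\,Y - Q_{\tau}(Y) > 0\,\},\, X \big),
\]
where Q_τ(Y) is the τ-quantile of Y. Because Y enters only through a bounded indicator, the quantity is insensitive to heavy tails in the response, which is the source of the robustness claimed above.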

Speaker: Pepa Ramírez-Cobo
Co-authors: Rafael Blanquero, Emilio Carrizosa, M. Remedios Sillero-Denamiel
Title: Variable selection for Naïve Bayes classification
Abstract: The Naïve Bayes classifier has proven to be a tractable and efficient method for classification in multivariate analysis. However, features are usually correlated, a fact that violates the Naïve Bayes assumption of conditional independence, and may deteriorate the method's performance. Moreover, datasets are often characterized by a large number of features, which may complicate the interpretation of the results as well as slow down the method's execution.
In this paper we propose a sparse version of the Naïve Bayes classifier that is characterized by three properties. First, the sparsity is achieved by taking into account the correlation structure of the covariates. Second, different performance measures can be used to guide the selection of features. Third, performance constraints on groups of higher interest can be included.
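
A generic illustration of the effect of feature selection on Naïve Bayes (a plain univariate filter with scikit-learn and an arbitrary choice of k; this is not the correlation-aware, constraint-based procedure proposed in the talk):

```python
# Compare Gaussian Naive Bayes on all features with a sparse variant that
# keeps only a small subset of the features.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)

full = GaussianNB()
sparse = make_pipeline(SelectKBest(f_classif, k=5), GaussianNB())

print("all features:", cross_val_score(full, X, y, cv=5).mean())
print("5 features  :", cross_val_score(sparse, X, y, cv=5).mean())
```

The point of the sparse variant is that far fewer features often retain most of the predictive performance while easing interpretation; the method above further chooses those features with the correlation structure and performance constraints in mind.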

MS6. Fair learning

Speaker: Adrián Pérez-Suay, Universitat de València
Title: From learning with fair regularizers to physics aware models
Abstract: In recent years, Machine Learning (ML) models have increased their capabilities and led to solutions of real-world problems. Some of those problems directly affect people's lives, for instance autonomous driving, learning from social networks or bank loan prediction. When dealing with real data scenarios, Machine Learning models can lead to biased decisions with respect to protected variables, which could incur moral and/or legal violations. In this talk we cover some independence regularizers to overcome these model limitations. In particular, we review the Fair Kernel Learning (FKL) method and introduce its probabilistic formulation, the Fair Gaussian Process. Furthermore, we introduce a new setting for using the FKL method to obtain more physically plausible models.
Slides
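
Schematically, independence regularizers of this kind augment the empirical risk with a dependence measure between the model output and the protected variable S (the sketch below is generic; FKL's kernel dependence estimator is not reproduced here):
\[
\min_{f} \; \frac{1}{n}\sum_{i=1}^{n} L\big(y_i, f(x_i)\big) \;+\; \lambda\, \widehat{D}\big(f(X), S\big),
\]
where D-hat is an empirical dependence measure such as HSIC and λ controls the accuracy versus fairness trade-off.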

Speaker: Paula Gordaliza (BCAM)
Title: Mathematical frameworks for fair learning: review of methods and study of the price for fairness
Abstract: A review of the main fairness definitions and fair learning methodologies proposed in the literature in recent years is presented from a mathematical point of view. Following an independence-based approach, we consider how to build fair algorithms and study the consequences on the degradation of their performance compared to the possibly unfair case. This corresponds to the price for fairness under the statistical parity or equality of odds criteria. Novel results giving the expressions of the optimal fair classifier and the optimal fair predictor (under a linear regression Gaussian model) in the sense of equality of odds are presented.
Slides
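
For reference, the two criteria named above, for a binary prediction, protected attribute S and true label Y, are
\[
\text{statistical parity:}\quad P(\hat{Y}=1 \mid S=0) = P(\hat{Y}=1 \mid S=1),
\]
\[
\text{equality of odds:}\quad P(\hat{Y}=1 \mid Y=y, S=0) = P(\hat{Y}=1 \mid Y=y, S=1), \quad y \in \{0,1\}.
\]
The "price for fairness" studied in the talk is the loss in predictive performance incurred when one of these constraints is imposed on an otherwise unconstrained predictor.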

Speakers: Jaume Abella and Francisco J. Cazorla (Barcelona Supercomputing Center)
Title: Certification Aspects in Future AI-Based High-Integrity Systems
Abstract: The trend towards increased autonomy functions in high-integrity systems, like those in planes and cars, causes disruptive changes to the certification process. At the software level, the challenge relates to the increasing use of Artificial Intelligence (AI) based software to provide the levels of accuracy required. At the hardware level, it relates to the use of high-performance heterogeneous multi-core processors to provide the required level of computing performance, and to the impact multi-cores have on functional safety, including software timing aspects. In this talk we will cover some of the main challenges brought by both AI software and multi-cores to the certification process of high-integrity systems. We will also discuss potential research paths to address those challenges.
Video

Speaker: Hristo Inouzhe (BCAM)
Title: Attraction-Repulsion clustering: an approach to fair clustering through diversity enhancement
Abstract: We consider the problem of diversity-enhancing clustering, i.e., developing clustering methods which produce clusters that favour diversity with respect to a set of protected attributes such as race, sex, age, etc. In the context of fair clustering, diversity plays a major role when fairness is understood as demographic parity. To promote diversity, we introduce perturbations to the distance in the unprotected attributes that account for protected attributes in a way that resembles attraction-repulsion of charged particles in physics. These perturbations are defined through dissimilarities with a tractable interpretation. Cluster analysis based on attraction-repulsion dissimilarities penalizes homogeneity of the clusters with respect to the protected attributes and leads to an improvement in diversity. An advantage of our approach, which falls into a pre-processing set-up, is its compatibility with a wide variety of clustering methods and with non-Euclidean data. We illustrate the use of our procedures with both synthetic and real data and provide a discussion of the relation between diversity, fairness, and cluster structure.
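
A toy sketch of the pre-processing idea (the multiplicative perturbation and its strength delta below are hypothetical choices for illustration, not the dissimilarities defined in the work): distances between points with different protected attributes are shrunk (attraction) and distances within the same protected group are inflated (repulsion), after which any off-the-shelf clustering method can be applied.

```python
# Perturb a pairwise dissimilarity matrix using a binary protected attribute,
# then cluster with standard hierarchical clustering.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))                 # unprotected attributes
protected = rng.integers(0, 2, size=60)      # binary protected attribute

D = squareform(pdist(X))                     # base Euclidean dissimilarities
same = protected[:, None] == protected[None, :]
delta = 0.5                                  # hypothetical perturbation strength
D_fair = D * np.where(same, 1 + delta, 1 - delta)   # repel same, attract different
np.fill_diagonal(D_fair, 0.0)

Z = linkage(squareform(D_fair, checks=False), method="average")
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels))                   # cluster sizes
```

Because only the dissimilarities are modified, the same recipe works with non-Euclidean data and with any clustering algorithm that accepts a precomputed dissimilarity matrix, which is the compatibility advantage highlighted above.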

MS7. Optimal transport for data science

Speaker: Marc Hallin, ECARES and Department of Mathematics, Université libre de Bruxelles
Title: From Multivariate Quantiles to Copulas and Statistical Depth, and Back
Abstract: The univariate concept of quantile function (the inverse of a distribution function) plays a fundamental role in Probability and Statistics. In dimension two and higher, however, inverting traditional distribution functions does not lead to any satisfactory notion. In their quest for the Grail of an adequate definition, statisticians dug out two extremely fruitful theoretical pathways: copula transforms, where marginal quantiles are privileged over global ones, and depth functions, where a center-outward ordering substitutes the more traditional South-West/North-East one. We show how a recent center-outward redefinition of the concept of distribution function, based on measure transportation ideas, reconciles and fine-tunes these two approaches, and eventually yields a notion of multivariate quantile that matches, in arbitrary dimension d, all the properties that make univariate quantiles a successful and vital tool of statistical inference.
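
In the measure-transportation approach referred to above, the center-outward distribution function of a probability measure P on R^d is defined, under regularity conditions and in notation not taken from the talk, as
\[
F_{\pm} := \nabla\varphi, \qquad (\nabla\varphi)_{\#} P = U_d,
\]
the (a.e. unique) gradient of a convex function pushing P forward to the spherical uniform distribution U_d on the unit ball. Its inverse plays the role of a center-outward quantile function, and its contours yield nested multivariate quantile regions.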

Speaker: José Antonio Carrillo de la Plata, Mathematical Institute, University of Oxford
Title: Consensus-Based Interacting Particle Systems and Mean-field PDEs for Optimization and Sampling
Abstract: We will start with a quick review of consensus models for swarming. Stability of patterns in these models will be briefly discussed. Then we provide an analytical framework for investigating the efficiency of a consensus-based model for tackling global optimization problems. We justify the optimization algorithm in the mean-field sense, showing convergence to the global minimizer for a large class of functions. An efficient algorithm for large dimensional problems is introduced. Theoretical results on consensus estimates will be illustrated by numerical simulations.
We then develop these ideas to propose a novel method for sampling and also optimization tasks based on a stochastic interacting particle system. We explain how this method can be used for the following two goals: (i) generating approximate samples from a given target distribution, and (ii) optimizing a given objective function. This approach is derivative-free and affine invariant, and is therefore well-suited for solving complex inverse problems, allowing one (i) to sample from the Bayesian posterior and (ii) to find the maximum a posteriori estimator. We investigate the properties of this family of methods in terms of various parameter choices, both analytically and by means of numerical simulations.
This talk is a summary of works in collaboration with Y.-P. Choi, O. Tse, C. Totzeck, F. Hoffmann, A. Stuart and U. Vaes.
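
One standard form of the consensus-based dynamics discussed (as found in the consensus-based optimization literature; parameters and variants differ across the works summarised here) evolves particles X^i towards a weighted consensus point,
\[
dX_t^i = -\lambda\,\big(X_t^i - \bar X_t^{\ast}\big)\,dt + \sigma\,\big|X_t^i - \bar X_t^{\ast}\big|\,dW_t^i,
\qquad
\bar X_t^{\ast} = \frac{\sum_{i} X_t^i\, e^{-\beta f(X_t^i)}}{\sum_{i} e^{-\beta f(X_t^i)}},
\]
where f is the objective; by the Laplace principle the weighted average concentrates near the global minimizer of f as β grows, and no gradients of f are needed, which is why the approach is derivative-free.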

Speaker: Alberto González Sanz, Institut de Mathématiques de Toulouse and ANITI
Title: Central Limit Theorems for General Transportation Costs
Abstract: One of the main ways to quantify the distance between distributions is the well-known Wasserstein metric. In Statistics and Machine Learning applications it is increasingly common to deal with measures supported on a high-dimensional space. Some recent results show that the Wasserstein metric suffers from the curse of dimensionality, which means that its empirical approximation becomes worse as the dimension grows. We will explain a new method, based on the Efron-Stein inequality and on the sequential compactness of the closed unit ball in $L^2(P)$ for the weak topology, that improves a result of del Barrio and Loubes (2019) and states that, even if the empirical Wasserstein metric converges at a slow rate, its oscillations around its mean are asymptotically Gaussian with rate $\sqrt{n}$, $n$ being the sample size, which means that the curse of dimensionality is avoided in such a case. Finally, we will present some applications of these results to statistical and data science problems.
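
Written out, the type of result described is, schematically and with the precise assumptions on the cost and the measures left to the talk,
\[
\sqrt{n}\,\Big( \mathcal{T}_c(P_n, Q) - \mathbb{E}\,\mathcal{T}_c(P_n, Q) \Big) \;\xrightarrow{\ d\ }\; \mathcal{N}(0, \sigma^2),
\]
where T_c denotes the optimal transportation cost and P_n the empirical measure of a sample of size n from P: even if T_c(P_n, Q) approaches T_c(P, Q) at a slow, dimension-dependent rate, its fluctuations around its mean are Gaussian at the parametric rate.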

Speaker: Jean-Michel Loubes, Institut de Mathématiques de Toulouse and ANITI
Title: Optimal transport for kernel Gaussian processes
Abstract: We propose to define Gaussian processes indexed by multidimensional distributions. In the framework where the distributions can be modeled as i.i.d. realizations of a measure on the set of distributions, we prove that the kernel defined as the quadratic distance between the transportation maps, which transport each distribution to the barycenter of the distributions, provides a valid covariance function. In this framework, we study the asymptotic properties of this process, proving microergodicity of the parameters.

MS8. Adversarial Machine Learning

Speakers: D. Ríos Insua, R. Naveiro (ICMAT), J. Poulos (Harvard)
Title: Adversarial Machine Learning. An overview
Abstract: Adversarial machine learning aims at robustifying machine learning algorithms against possible actions from adversaries. Most earlier work in AML has modelled the confrontation between learning systems and adversaries as a 2-agent game from a game theoretic perspective. After briefly overviewing previous work, we shall present an alternative framework based on adversarial risk analysis.
Video

Speakers: F. Ruggeri (CNR-IMATI), V. Gallego, A. Redondo (ICMAT)
Title: Bayesian approaches to protecting classifiers from attacks
Abstract: A major area within adversarial machine learning deals with producing classifiers that are robust to adversarial data manipulations. This talk will present formal Bayesian approaches to this problem, considering settings in which robustification takes place at training time and at operation time.
Video

Speakers: R. Naveiro (ICMAT), T. Ekin (Texas State), A. Torres (ICMAT)
Title: Augmented probability simulation for optimization in adversarial machine learning
Abstract: Adversarial machine learning from an adversarial risk analysis perspective entails a cumbersome computational procedure in which one first simulates from the attacker problem to forecast attacks and then includes such forecasts in the defender problem to be optimized. We shall present how the procedure may be streamlined with the aid of augmented probability simulation approaches.
Video

Speakers: D. García-Rasines, C. Guevara, S. Rodríguez-Santana (ICMAT)
Title: Adversarial machine learning for financial applications
Abstract: Numerous business applications entail dynamic competitive decision environments under uncertainty. We shall sketch how adversarial machine learning methods may be used in such domains, illustrating the ideas with problems in relation to pension funds, loans and the stock market.
Video

MS9. Probabilistic Learning

Speaker: Santiago Mazuelas, Basque Center for Applied Mathematics (BCAM), Bilbao, Spain
Title: Minimax Classification with 0-1 Loss and Performance Guarantees