Data Analytics Research Seminar

Upcoming seminars

6 March 2025, 10:45-12:00

Johanna Ziegel (ETH Zurich)

- Title: (Conformal) isotonic distributional regression
- Abstract: Isotonic distributional regression (IDR) is a nonparametric distributional regression approach under a monotonicity constraint. It has found application as a generic method for uncertainty quantification, in statistical postprocessing of weather forecasts, and in distributional single index models. IDR has favorable in-sample calibration and optimality properties, which allow it to be conformalized to obtain out-of-sample online guarantees.

Joint work with Sam Allen, Georgios Gavrilopoulos, Tilmann Gneiting, Alexander Henzi, Eva-Maria Walz
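
To make the monotonicity constraint in the abstract above concrete, here is a minimal sketch of the textbook one-dimensional case, not the speakers' implementation. It assumes a single real covariate with distinct values and a response that tends to increase with it, so that the conditional CDF at a fixed threshold is non-increasing in the covariate and can be fitted by pool-adjacent-violators. The function names `pava_nondecreasing` and `idr_cdf` are hypothetical, and the conformal step mentioned in the abstract is not shown.

```python
def pava_nondecreasing(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators (unit weights)."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge the last two blocks while their means violate monotonicity.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def idr_cdf(x, y, threshold):
    """Estimate P(Y <= threshold | X = x_i) under the assumption that Y is
    stochastically increasing in X (distinct covariate values assumed)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    indicators = [1.0 if y[i] <= threshold else 0.0 for i in order]
    # A non-increasing fit equals the reversed non-decreasing fit of the reversed data.
    fitted = pava_nondecreasing(indicators[::-1])[::-1]
    return [x[i] for i in order], fitted


xs, cdf = idr_cdf([1, 2, 3, 4], [1.0, 3.0, 2.0, 4.0], threshold=2.5)
print(list(zip(xs, cdf)))
```

Repeating the fit over a grid of thresholds gives a full estimated conditional distribution for each covariate value.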

20 March 2025, 10:30-11:45

Arthur Gretton (Gatsby Computational Neuroscience Unit)

- Title: TBA
- Abstract: TBA

3 April 2025, 10:30-11:45

Pierre Del Moral (INRIA Bordeaux)

- Title: TBA
- Abstract: TBA

5 June 2025, 10:30-11:45

Julie Josse (INRIA Montpellier)

- Title: TBA
- Abstract: TBA

Past seminars 2024-2025

3 October 2024, 10:30-11:45

Jun Yang (University of Copenhagen)

- Title: Stereographic Markov Chain Monte Carlo
- Abstract: High-dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information, and local moves results in empirically observed "stickiness" and poor theoretical mixing properties, namely a lack of geometric ergodicity. In this talk, we introduce a new class of MCMC samplers that map the original high-dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop random-walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy-tailed distributions and also empirically exhibit rapid convergence in high dimensions. In the best scenario, the proposed samplers can enjoy the "blessings of dimensionality": convergence is faster in higher dimensions.

Joint work with Krzysztof Łatuszyński and Gareth O. Roberts

17 October 2024, 10:30-11:45

Chiara Amorino (Universitat Pompeu Fabra)

- Title: Minimax rate for multivariate data under componentwise local differential privacy constraint
- Abstract: Our research analyses the balance between maintaining privacy and preserving statistical accuracy when dealing with multivariate data that is subject to componentwise local differential privacy (CLDP). With CLDP, each component of the private data is made public through a separate privacy channel. This allows for varying levels of privacy protection for different components or for the privatization of each component by different entities, each with their own distinct privacy policies. It also covers the practical situations where it is impossible to privatize jointly all the components of the raw data.

We develop general techniques for establishing minimax bounds that shed light on the statistical cost of privacy in this context, as a function of the privacy levels $\alpha_1, \dots , \alpha_d$ of the $d$ components.

We demonstrate the versatility and efficiency of these techniques by presenting various statistical applications. Specifically, we examine nonparametric density and covariance estimation under CLDP, providing upper and lower bounds that match up to constant factors, as well as an associated data-driven adaptive procedure. Furthermore, we quantify the probability of extracting sensitive information from one component by exploiting the fact that, on another component which may be correlated with the first, a smaller degree of privacy protection is guaranteed. If time permits, we will finally discuss how to extend this concept to time-dependent data.

14 November 2024, 10:30-11:45

Ingrid Van Keilegom (KU Leuven)

- Title: Copula Based Cox Proportional Hazards Models for Dependent Censoring
- Abstract: Most existing copula models for dependent censoring in the literature assume that the parameter defining the copula is known. However, prior knowledge on this dependence parameter is often unavailable. In this article we propose a novel model under which the copula parameter does not need to be known. The model is based on a parametric copula model for the relation between the survival time (T) and the censoring time (C), whereas the marginal distributions of T and C follow a semiparametric Cox proportional hazards model and a parametric model, respectively. We show that this model is identified, and propose estimators of the nonparametric cumulative hazard and the finite-dimensional parameters. It is shown that the estimators of the model parameters and the cumulative hazard function are consistent and asymptotically normal. We also investigate the performance of the proposed method using finite-sample simulations. Finally, we apply our model and estimation procedure to a follicular cell lymphoma dataset. Supplementary materials for this article are available online.

28 November 2024, 10:45-12:00

François-Xavier Briol (University College London)

- Title: Robust and Conjugate Gaussian Process Regression
- Abstract: To enable closed form conditioning, a common assumption in Gaussian process (GP) regression is independent and identically distributed Gaussian observation noise. This strong and simplistic assumption is often violated in practice, which leads to unreliable inferences and uncertainty quantification. Unfortunately, existing methods for robustifying GPs break closed-form conditioning, which makes them less attractive to practitioners and significantly more computationally expensive. In this work, we demonstrate how to perform provably robust and conjugate Gaussian process (RCGP) regression at virtually no additional cost using generalised Bayesian inference. RCGP is particularly versatile as it enables exact conjugate closed form updates in all settings where standard GPs admit them. To demonstrate its strong empirical performance, we deploy RCGP for problems ranging from Bayesian optimisation to sparse variational Gaussian processes.

12 December 2024, 10:30-11:45

Gil Kur (ETH Zürich)

- Title: On the Role of Gaussian Covariates in Minimum Norm Interpolation
- Abstract: In the literature on benign overfitting in linear models, also referred to as minimum norm interpolation, it is typically assumed that the covariates follow a Gaussian distribution. Existing proofs heavily rely on the Gaussian Minimax Theorem (GMT), making them inapplicable to other distributions in the linear setting. In our work, we are the first to establish matching rates for sub-Gaussian covariates in $\ell_p$-linear regression through a novel approach inspired by modern functional analysis. In this talk, we provide an overview of this proof and explore the role of Gaussian covariates in benign overfitting from a purely geometric perspective.

9 January 2025, 10:30-11:45

O. Deniz Akyildiz (Imperial College London)

- Title: Diffusion-based Maximum Marginal Likelihood Estimation: Overdamped, Accelerated, and Proximal
- Abstract: In this talk, I will summarize recent progress and challenges in maximum marginal likelihood estimation (MMLE) – focusing on the methods based on Langevin diffusions. I will first introduce the problem and the necessary background on Langevin diffusions, together with recent results on Langevin-based MMLE estimators [1-3], detailing the interacting particle Langevin algorithm (IPLA) [3] which is a recent Langevin-based MMLE method with explicit theoretical guarantees akin to Langevin Monte Carlo methods. I will then cover extensions of these methods, specifically acceleration [4] and methods for MMLE in nondifferentiable statistical models [5] with convergence and complexity results. Finally, if time permits, I will talk about the application of IPLA to inverse problems [6].

[1] Kuntz, Juan, Jen Ning Lim, and Adam M. Johansen. “Particle algorithms for maximum likelihood training of latent variable models.” International Conference on Artificial Intelligence and Statistics. PMLR, 2023.

[2] De Bortoli, Valentin, et al. “Efficient stochastic optimisation by unadjusted Langevin Monte Carlo: Application to maximum marginal likelihood and empirical Bayesian estimation.” Statistics and Computing 31 (2021): 1-18.

[3] Akyildiz, Ö. D., Crucinio, F. R., Girolami, M., Johnston, T., & Sabanis, S. (2023). Interacting particle Langevin algorithm for maximum marginal likelihood estimation. arXiv preprint arXiv:2303.13429.

[4] Oliva, P. F. V., & Akyildiz, O. D. (2024). Kinetic Interacting Particle Langevin Monte Carlo. arXiv preprint arXiv:2407.05790.

[5] Encinar, P. C., Crucinio, F. R., & Akyildiz, O. D. (2024). Proximal Interacting Particle Langevin Algorithms. arXiv preprint arXiv:2406.14292.

[6] Glyn-Davies, A., Duffin, C., Kazlauskaite, I., Girolami, M., & Akyildiz, Ö. D. (2024). Statistical Finite Elements via Interacting Particle Langevin Dynamics. arXiv preprint arXiv:2409.07101.

30 January 2025, 10:00-11:15

Giuseppe Cavaliere (University of Bologna/University of Exeter)

- Title: Bootstrap Diagnostic tests
- Abstract: Violations of the assumptions underlying classical asymptotic theory frequently lead to unreliable statistical inference. In this talk I propose a novel bootstrap-based diagnostic procedure to detect such violations. The suggested approach (i) focuses on the distance between the conditional distribution of a bootstrap statistic and the (limiting) Gaussian distribution, and (ii) proposes a method to assess whether this distance is large enough to indicate the invalidity of the asymptotic approximation. The method, which is computationally straightforward, involves applying standard normality tests to a set of bootstrap repetitions of a reference estimator or test statistic, in order to assess significant deviations from the Gaussian distribution. I discuss under what conditions the randomness in the data mixes with the randomness in the bootstrap repetitions in a way such that the diagnostics asymptotically (a) induce no pre-testing bias under the null, (b) can be performed using the same critical values in a broad range of applications, and (c) consistently detect deviations from asymptotic Gaussianity. To demonstrate the practical relevance and broad applicability of our diagnostic procedure, I discuss five scenarios where the asymptotic Gaussian approximation fails: (i) detecting infinite variance innovations in a location model for i.i.d. data; (ii) identifying non-stationary behavior in autoregressive time series; (iii) parameters near or at the boundary of the parameter space; (iv) invalidity of the delta method due to (near-)rank deficiency in the implied Jacobian matrix; and (v) weak instruments in instrumental variable regression.

Joint with Luca Fanelli and Iliyan Georgiev
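
The mechanics described in the abstract above (apply a standard normality test to bootstrap repetitions of a reference statistic) can be sketched as follows. This is only an illustration under assumptions of our own, not the authors' procedure: the reference statistic is the sample mean, the normality test is Jarque-Bera with the usual 5% chi-square critical value, and the names `jarque_bera`, `bootstrap_means` and `diagnostic` are hypothetical. The abstract does not say which test or how many bootstrap repetitions are appropriate.

```python
import random
import statistics

CHI2_2DF_5PCT = 5.991  # 5% critical value of a chi-square with 2 degrees of freedom


def jarque_bera(values):
    """Jarque-Bera statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("constant sample: skewness and kurtosis are undefined")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def bootstrap_means(data, n_boot, rng):
    """Sample means of n_boot resamples drawn with replacement from data."""
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]


def diagnostic(data, n_boot=2000, seed=0):
    """Return the JB statistic of the bootstrap means and whether it exceeds
    the 5% critical value (a flag that the Gaussian approximation is suspect)."""
    rng = random.Random(seed)
    jb = jarque_bera(bootstrap_means(data, n_boot, rng))
    return jb, jb > CHI2_2DF_5PCT


# A sample dominated by one extreme observation, mimicking very heavy tails.
jb, flagged = diagnostic([0.0] * 99 + [1000.0])
print(f"JB = {jb:.1f}, Gaussian approximation flagged: {flagged}")
```

In the extreme example, each bootstrap mean is driven by how many times the single large observation is resampled, so the bootstrap distribution is visibly skewed and the test flags it.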

13 February 2025, 10:45-12:00

Olivier Scaillet (University of Geneva and Swiss Finance Institute)

- Title: Sparse spanning portfolios and under-diversification with second-order stochastic dominance
- Abstract: We develop and implement methods for determining whether relaxing sparsity constraints on portfolios improves the investment opportunity set for risk-averse investors. We formulate a new estimation procedure for sparse second-order stochastic spanning based on a greedy algorithm and Linear Programming. We show the optimal recovery of the sparse solution asymptotically whether spanning holds or not. From large equity datasets, we estimate the expected utility loss due to possible under-diversification, and find that there is no benefit from expanding a sparse opportunity set beyond 45 assets. The optimal sparse portfolio invests in 10 industry sectors and cuts tail risk when compared to a sparse mean-variance portfolio. On a rolling-window basis, the number of assets shrinks to 25 in crisis periods, while standard factor models cannot explain the performance of the sparse portfolios.

Archive 2013-2024

2023-2024

5 October 2023, 10:30-11:45

Sirio Legramanti (University of Bergamo)

- Title: Weighting covariates in Bayesian nonparametric clustering: an application to transportation networks
- Abstract: In clustering, observed individual data are often accompanied by covariates that can assist the clustering process itself. This is the case, for example, of transportation networks, where each node has spatial coordinates, and it is often desirable that clusters of nodes are spatially cohesive. In fact, the obtained clusters may be used to inform public policy decisions, and it may be preferable that such policies are uniform over neighboring areas. Naturally, depending on the application, different notions of closeness can be used to define such neighborhoods, thus potentially requiring a transformation of the spatial covariates.

Motivated by real-world data about subscriptions to the public transportation system of Bergamo (Italy) and its surroundings, we propose a method to incorporate properly transformed spatial covariates into a state-of-the-art stochastic block model, while inferring the weight of covariates. (Joint work with Valentina Ghidini and Raffaele Argiento)

19 October 2023, 10:30-11:45

Badr-Eddine Chérief-Abdellatif (LPSM, CNRS)

- Title: Label Shift Quantification via Distribution Feature Matching
- Abstract: Quantification learning deals with the task of estimating the target label distribution under label shift. In this talk, we present a unifying framework, distribution feature matching (DFM), that recovers as particular instances various estimators introduced in previous literature. We derive a general performance bound for DFM procedures and extend this analysis to study robustness of DFM procedures in the misspecified setting under departure from the exact label shift hypothesis, in particular in the case of contamination of the target by an unknown distribution.

16 November 2023, 10:30-11:45

Nikolaus Schweizer (Tilburg University)

- Title: Solving Maxmin Optimization Problems via Population Games
- Abstract: The Iteratively Reweighted Least Squares (IRLS) method is well known in numerical analysis as a useful technique for solving minmax function approximation problems on a finite grid. We extend the method so that it can be applied to find maxmin solutions of more general multicriteria problems. As in the original IRLS method, the key idea is to find the maxmin decision as an optimizer of a suitably weighted sum of monotonic transformations of the criterion functions. The method is effective when transformations can be found that make the resulting weighted-sum problems easy to solve. The relevant weights are determined by an iterative scheme. For this, we use a discrete-time version of the celebrated replicator equation of evolutionary game theory, also known in machine learning as the exponential multiplicative weights algorithm. The iterative process can be viewed as the co-evolution of a population of "testers" jointly with the decision maker, which produces the maxmin solution from a symmetric Nash equilibrium in a population game. This establishes a connection to game theory that is quite different from the usual one via two-person zero-sum games. Examples are provided to show the use of the generalized IRLS method in collective investment and in decision making under uncertainty.

(Joint work with Anne Balter and Johannes M. Schumacher)

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4264811
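
The weight update mentioned in the abstract above can be illustrated with a generic exponential multiplicative weights loop. This is a sketch under simplifying assumptions, not the generalized IRLS method of the paper: the decision set is finite, no monotonic transformation is applied to the criteria, and the output is the empirical frequency of the decisions played, read as a mixed maxmin decision. The function name `multiplicative_weights_maxmin` and the toy criteria are hypothetical.

```python
import math


def multiplicative_weights_maxmin(decisions, criteria, eta=0.1, n_iter=1000):
    """Alternate between a weighted-sum best response and an exponential
    reweighting of the criteria; return decision frequencies and final weights."""
    m = len(criteria)
    weights = [1.0 / m] * m
    counts = {d: 0 for d in decisions}
    for _ in range(n_iter):
        # Decision maker: maximise the current weighted sum of the criteria.
        best = max(
            decisions,
            key=lambda d: sum(w * f(d) for w, f in zip(weights, criteria)),
        )
        counts[best] += 1
        # Reweighting: criteria on which `best` scores poorly gain relative weight.
        weights = [w * math.exp(-eta * f(best)) for w, f in zip(weights, criteria)]
        total = sum(weights)
        weights = [w / total for w in weights]
    freqs = {d: c / n_iter for d, c in counts.items()}
    return freqs, weights


# Toy problem: two decisions and two conflicting criteria.
decisions = [0.0, 1.0]
criteria = [lambda x: x, lambda x: 1.0 - x]
freqs, weights = multiplicative_weights_maxmin(decisions, criteria)
worst_case = min(sum(freqs[d] * f(d) for d in decisions) for f in criteria)
print(freqs, weights, worst_case)
```

In this toy problem the two decisions are played about equally often, so the worst-case average criterion is close to the maxmin value of 0.5.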

14 December 2023, 10:30-11:45

Artem Prokhorov (University of Sydney)

- Title: A machine learning attack on illegal trading
- Abstract: We design an adaptive framework for the detection of illegal trading behavior. Its key component is an extension of a pattern recognition tool, originating from the field of signal processing and adapted to modern electronic systems of securities trading. The new method combines the flexibility of dynamic time warping with contemporary approaches from extreme value theory to explore large-scale transaction data and accurately identify illegal trading patterns. Importantly, our method does not need access to any confirmed illegal transactions for training. We use a high-frequency order book dataset provided by an international investment firm to show that the method achieves remarkable improvements over alternative approaches in the identification of suspected illegal insider trading cases.

https://doi.org/10.1016/j.jbankfin.2022.106735

1 February 2024, 10:30-11:45

Gábor Lugosi (Universitat Pompeu Fabra)

- Title: Network archaeology: a review of recent results
- Abstract: Large networks that change dynamically over time are ubiquitous in various areas such as social networks and epidemiology. These networks are often modeled by random dynamics which, despite being relatively simple, give a quite accurate macroscopic description of real networks. "Network archaeology" is an area of combinatorial statistics in which one studies statistical problems of inferring the past properties of such growing networks. In this talk we discuss some simple network models and review recent results on revealing the past of the networks.

29 February 2024, 10:30-11:45

Claire Boyer (LPSM, Sorbonne Université)

- Title: Some statistical insights on physics-informed machine learning
- Abstract: Physics-informed machine learning combines the expressiveness of data-based approaches with the interpretability of physical models. In this context, we consider a general regression problem where the empirical risk is regularized by a partial differential equation that quantifies the physical inconsistency. We prove that for linear differential priors, the problem can be formulated as a kernel regression task, giving a rigorous framework to analyze physics-informed ML. In particular, the physical prior can help in boosting the estimator convergence.

The direct implementation of physics-informed kernel estimators can be tedious, and practitioners often resort to physics-informed neural networks (PINNs) instead. We offer some food for thought and statistical insight into the proper use of PINNs.

14 March 2024, 10:30-11:45

Matteo Barigozzi (Università di Bologna)

- Title: High-dimensional dynamic matrix factor models
- Abstract: High-dimensional matrix-variate time series data are becoming increasingly popular in economics and finance. This has stimulated the development of matrix factor models to achieve significant dimension reduction. This paper proposes an approximate dynamic matrix factor model that accounts for the time series nature of the data, and develops an EM algorithm to perform quasi-maximum likelihood estimation of the model parameters. The algorithm is further extended to estimate the dynamic matrix factor model on a dataset with an arbitrary pattern of missing data. We prove consistency of the estimated row and column loadings matrices and of the matrix factors. The finite sample properties of the proposed estimation strategies are assessed through a large simulation study and an application to a financial dataset.

Matteo Barigozzi and Luca Trapin

28 March 2024, 10:30-11:45

Davide La Vecchia (University of Geneva)

- Title: Saddlepoint techniques for the statistical analysis of time series
- Abstract: Saddlepoint techniques provide numerically accurate, small sample approximations to the distribution of estimators and test statistics. While a complete theory on saddlepoint techniques is available in the case of independent observations, much less attention has been devoted to the time series setting. This talk contributes to filling this gap. Under short and/or long range serial dependence, for Gaussian and non-Gaussian processes, the talk shows how to derive and implement saddlepoint approximations for Whittle's estimator, a frequency domain M-estimator. The derivation is based on the treatment of the standardized periodogram ordinates as i.i.d. random variables. Comparisons of the saddlepoint techniques to other methods are presented: the numerical exercises show that the saddlepoint approximations yield accuracy improvements over extant methods, while preserving analytical tractability and avoiding resampling. The talk starts with a gentle introduction to saddlepoint techniques in the i.i.d. setting and with a review of the basic frequency domain tools for time series analysis. The results are based on joint works with E. Ronchetti and A. Moor.

25 April 2024, 10:30-11:45

Vincent Fortuin (Helmholtz AI/TUM)

- Title: Use Cases for Bayesian Deep Learning in the Age of ChatGPT
- Abstract: Many researchers have pondered the same existential questions since the release of ChatGPT: Is scale really all you need? Will the future of machine learning rely exclusively on foundation models? Should we all drop our current research agenda and work on the next large language model instead? In this talk, I will try to make the case that the answer to all these questions should be a convinced “no” and that now, maybe more than ever, should be the time to focus on fundamental questions in machine learning again. I will provide evidence for this by presenting three modern use cases of Bayesian deep learning in the areas of self-supervised learning, interpretable additive modeling, and neural network sparsification. Together, these will show that the research field of Bayesian deep learning is very much alive and thriving and that its potential for valuable real-world impact is only just unfolding.

2 May 2024, 10:30-11:45

Gérard Ben Arous (NYU)

- Title: Dynamical spectral transition for optimization in very high dimensions
- Abstract: In recent work with Reza Gheissari (Northwestern) and Aukosh Jagannath (Waterloo), we gave a general context for the existence of projected low dimensional “effective dynamics” of Stochastic Gradient Descent in very high dimensional Data Science problems. These effective dynamics (and, in particular, their so-called “critical regime”) define a dynamical system in finite dimensions which may be quite complex, and which governs the performance of the learning algorithm.

The next step is to understand how the system finds these “summary statistics”. This is done in the latest work with the same authors and with Jiaoyang Huang (Wharton, U-Penn). This is based on a dynamical spectral transition of Random Matrix Theory: along the trajectory of the optimization path, the Gram matrix or the Hessian matrix develops outliers which carry these effective dynamics.

I will naturally first come back to the Random Matrix Tools needed here (the behavior of the edge of the spectrum and the BBP transition).

I will then illustrate the use of this point of view on a few central examples of ML: classification for Gaussian mixtures, and the XOR task.

References: NeurIPS 2022, Best paper award, CPAM March 2024, ICLR May 2024, and arXiv 2310.03010.

23 May 2024, 10:30-11:45

Céline Duval (Université de Lille)

- Title: Geometry of excursion sets: computing the surface area from discretized points
- Abstract: The excursion sets of a smooth random field carry relevant information in their various geometric measures. After an introduction of these geometrical quantities showing how they are related to the parameters of the field, we focus on the problem of discretization. From a computational viewpoint, one never has access to the continuous observation of the excursion set, but rather to observations at discrete points in space. It has been reported that for specific regular lattices of points in dimensions 2 and 3, the usual estimate of the surface area of the excursions remains biased even when the lattice becomes dense in the domain of observation. We show that this limiting bias is invariant to the locations of the observation points and that it only depends on the ambient dimension. (based on joint works with H. Biermé, R. Cotsakis, E. Di Bernardino and A. Estrade)

6 June 2024, 10:30-11:45

Peter Radchenko (University of Sydney)

- Title: Modeling with Categorical Features via Exact Fusion and Sparsity Regularization
- Abstract: We study the high-dimensional linear regression problem with categorical predictors that have many levels. We propose a new estimation approach, which performs model compression via two mechanisms by simultaneously encouraging (a) clustering of the regression coefficients to collapse some of the categorical levels together; and (b) sparsity of the regression coefficients. We formulate our estimator as a solution to a mixed integer program, and provide a row generation procedure to speed up the computation. We also present a fast approximate algorithm for our method that obtains high-quality feasible solutions via block coordinate descent; the main building block of our algorithm is an exact solver for the univariate case. We establish new theoretical guarantees for both the prediction and the cluster recovery performance of our estimator. Our numerical experiments on synthetic and real datasets demonstrate that our proposed estimator tends to outperform the state-of-the-art.

13 June 2024, 10:30-11:45

Gilles Stoltz (Laboratoire de mathématiques d'Orsay, CNRS - Université Paris-Saclay & HEC Paris)

- Title: Contextual stochastic bandits with budget constraints and fairness application
- Abstract: We review the setting and fundamental results of contextual stochastic bandits, where at each round some vector-valued context x_t is observed and K actions are available, each action a providing a stochastic reward with expectation given by some (partially unknown) function of x_t and a. The aim is to maximize the cumulative rewards obtained, or equivalently, to minimize the regret. This requires maintaining a good balance between the estimation (a.k.a., exploration) of the function and the exploitation of the estimates built. The literature also considers additional budget constraints (leading to so-called contextual bandits with knapsacks): actions now provide rewards but also costs. The literature has also shown that costs may model fairness constraints. We will review these two lines of work and describe our own contribution in this respect, related to a more direct strategy, able to handle $\sqrt{T}$ cost constraints over T rounds, which is exactly what is needed for fairness applications. The recent results discussed at the end of the talk will be based on the joint work by Evgenii Chzhen, Christophe Giraud, Zhen Li, and Gilles Stoltz, Small total-cost constraints in contextual bandits with knapsacks, with application to fairness, NeurIPS, 2023.

2022-2023

20 October 2022 from 10.30am to 11.45am (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Lu Yu (CREST-ENSAE)

- Title: Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance
- Abstract: We study stochastic convex optimization under infinite noise variance. Specifically, when the stochastic gradient is unbiased and has uniformly bounded (1 + κ)-th moment, for some κ ∈ (0, 1], we quantify the convergence rate of the Stochastic Mirror Descent algorithm with a particular class of uniformly convex mirror maps, in terms of the number of iterations, dimensionality and related geometric parameters of the optimization problem. Interestingly this algorithm does not require any explicit gradient clipping or normalization, which have been extensively used in several recent empirical and theoretical works. We complement our convergence results with information-theoretic lower bounds showing that no other algorithm using only stochastic first-order oracles can achieve improved rates. Our results have several interesting consequences for devising online/streaming stochastic approximation algorithms for problems arising in robust statistics and machine learning.

3 November 2022 from 10.30am to 11.45am (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Nicolas Schreuder (Genova University)

- Title: Fair statistical learning: a study of the Demographic Parity constraint
- Abstract: In various domains, statistical algorithms trained on personal data take pivotal decisions which influence our lives on a daily basis. Recent studies show that a naive use of these algorithms in sensitive domains may lead to unfair and discriminating decisions, often inheriting or even amplifying biases present in data. In the first part of the talk, I will introduce and discuss the question of fairness in machine learning through concrete examples of biases coming from the data and/or from the algorithms. In a second part, I will demonstrate how statistical learning theory can help us better understand and overcome some of those biases. In particular, I will present a selection of recent results from two of my papers on the Demographic Parity constraint:

- A minimax framework for quantifying risk-fairness trade-off in regression (with E. Chzhen), Ann. Statist. 50(4): 2416-2442 (Aug. 2022).

- Fair learning with Wasserstein barycenters for non-decomposable performance measures (with S. Gaucher and E. Chzhen), arXiv preprint arXiv:2209.00427.

17 November 2022 from 10.30am to 11.45am (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Alfred Galichon (NYU)

- Title: Estimating Matching Models: from theory to empirics
- Abstract: I will review a methodology for the estimation of models of matching, with a focus on family economics. The theoretical foundations, the econometrics toolbox, and some empirical results will be discussed. This talk is partly a review of the existing literature, and partly based on two new papers:

- https://arxiv.org/abs/2204.00362.

- http://humcap.uchicago.edu/RePEc/hka/wpaper/Chiappori_Fiorio_Galichon_etal_2022_assortative-matching-income.pdf.

8 December 2022 from 10.30am to 11.45am (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Guillaume A. Pouliot (The University of Chicago)

- Title: An Exact t-Test
- Abstract: I give a short review and selective survey of randomization inference. Surprisingly, the methodological question of how to produce marginal exact and asymptotically robust inference for a regression coefficient in the multivariate linear model with general design matrix appears to be unresolved in the literature. We produce a test statistic which delivers such inference.

6 April 2023 from 12.00pm to 1.15pm (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Dion Bongaerts (RSM Erasmus University)

- Title: Reverse Engineering Mutual Fund Trades
- Abstract: In this paper we present a novel method for imputing daily mutual fund trades from data on fund returns, total net assets, and fund holdings at the daily, monthly, and quarterly frequencies, respectively. Therefore, our method works with standard CRSP mutual fund data. We set up an (under-identified) system of linear equations and solve the under-identification issue by an iterative method that applies random and adaptive constraints on trade incidence. The method produces daily, position-level trade estimates with associated confidence levels. Validation and simulation studies using proprietary fund trading data show high accuracy, especially for larger and more relevant trades.

30 May 2023 from 12.00pm to 1.15pm (1h15 per talk including 30 minutes of broad introduction and 15 min questions)

Cesare Robotti (Warwick Business School)

- Title: Priced Risk in Corporate Bonds
- Abstract: Recent studies document strong empirical support for multifactor models that aim at explaining the cross-sectional variation in corporate bond expected excess returns. We revisit these findings and provide evidence that common factor pricing in corporate bonds is exceedingly difficult to establish. Based on portfolio- and bond-level analyses, we demonstrate that previously proposed bond risk factors, with traded liquidity as the only marginal exception, do not provide any incremental pricing information to the corporate bond market factor. This implies that the bond CAPM is never outperformed by other traded and nontraded factor models in pairwise and multiple model comparison tests.

2021-2022

9 June 2022 from 2.00 to 4.00pm (45 minutes per talk plus a 30-minute coffee break) in Room N517
- Alexandra Carpentier (Universität Potsdam).
- Karim Lounici (Ecole Polytechnique).

12 May 2022 from 2.00 to 4.00pm (45 minutes per talk plus a 30-minute coffee break)
- Victor-Emmanuel Brunel (ENSAE).
- George Deligiannidis (University of Oxford).

12 April 2022 from 2.00 to 4.00pm (45 minutes per talk plus a 30-minute coffee break) in Room N517
- Gilles Stupfler (ENSAI). Asymmetric least squares techniques for extreme risk assessment
- Robert Adamek (Maastricht University). Local Projection Inference in High Dimensions

3 March 2022 from 2.00 to 4.00pm (45 minutes per talk plus a 30-minute coffee break) on Zoom
- Giacomo Zanella (Bocconi University). Robust leave-one-out cross-validation for high-dimensional Bayesian models
- Matthew Graham (University College London). Manifold MCMC methods for Bayesian inference in diffusion models

13 December 2021 from 2.30 to 4.30pm (45 minutes per talk plus a 30-minute coffee break) in Room N517
- Christian Brownlees (Universitat Pompeu Fabra). Empirical Risk Minimization for Time Series: Nonparametric Performance Bounds for Prediction
- Anders Kock (University of Oxford). Consistency of p-norm based tests in high dimensions: characterization, monotonicity, domination

24 November 2021 from 2.30 to 4.30pm (45 minutes per talk plus a 30-minute coffee break) in Room N517
- Umut Simsekli (INRIA). Towards Building a Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
- Valentin De Bortoli (University of Oxford). Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling

2018-2019

November 22, 2018 - 1:00 pm to 3:00 pm - ESSEC Cergy (N305)

Prof. Taoufik Bouezmarni (Laval University)

Extended Lorenz curves for general random variables

Prof. Matei Demetrescu (Kiel University)

Nonlinear Predictability of Stock Returns? Parametric vs. nonparametric inference in predictive regressions

October 16, 2018 - 5:15 pm to 6:15 pm - ESSEC La Défense (CNIT), s. 344

Prof. Arijit Chakrabarty (Indian Statistical Institute, Kolkata)

Spectra of Adjacency and Laplacian Matrices of inhomogeneous Erdős-Rényi Graphs

2017-2018 Program:

TIME SERIES WORKSHOP 2018, Wednesday April 11, 2018, 2:30 pm to 5:40 pm, Room N516

Organizers: Prof. Luc Bauwens, CORE - UCL, Fellow of the Institute of Advanced Studies UCP Université Paris-Seine, Guillaume Chevillon, ESSEC Business School, Prof. Jeroen Rombouts, ESSEC Business School

March 29, 2018: 5th Empirical Finance Workshop - Cergy (KLAB)
December 14, 2017 - 1:00 pm to 3:00 pm - Cergy (Room N305):

Prof. Xavier D’haultfoeuille (ENSAE - CREST)

Testing Rational Expectations Using Data Combination

Prof. Artem Prokhorov (University of Sydney)

On Semiparametric Estimation using Bernstein Copulas

2016-2017 Program:

July 4, 2017 - from 10:30 am to 12:00 pm - Cergy Room N105:

Prof. Aurore Delaigle (University of Melbourne)

Analyzing Partially Observed Functional Data

April 21, 2017 - from 1:00 pm to 4:00 pm - Cergy Room N305:

Prof. Valentina Corradi (University of Surrey)

Improved Tests for robust forecast comparison

Prof. Jean-David Fermanian (CREST)

The behavior of dealers and clients on the European corporate bond market: the case of Multi-dealer-to-client platforms

Prof. Bas Werker (Tilburg University)

Arbitrage Pricing Theory for Idiosyncratic Variance Factors

March 30-31, 2017: 25th Annual Symposium of the Society for Nonlinear Dynamics and Econometrics (SNDE)
March 15, 2017: 4th Empirical Finance Workshop - Cergy (KLAB)
March 2, 2017 - from 1:45 pm to 4:00 pm - Cergy Room N305:

Prof. Karim ABADIR (Imperial College London)

Macro and financial markets: The memory of an elephant

Prof. Joerg Breitung (University of Cologne)

Multivariate tests for asset price bubbles

February 24, 2017 - from 2:00 pm to 5:00 pm - IBM Bois Colombes:

Internet of Things & Predictive Analytics

Reda Gomery (Deloitte), Marc Van Der Laan (AT&T), Thomas Watteyne (INRIA), Georges Uzbelger (IBM)

November 25, 2016 - from 11:45 am to 1:15 pm - Cergy, Room N405:

Prof. Juhyun Park (Lancaster University)

Estimation of functional sparsity in nonparametric varying coefficient models

November 17-18, 2016: The 2016 8th French Econometrics Conference (FEC2016)
November 15, 2016, from 1:15 to 4:00 pm (Room E125):

Yu-Wei Hsieh (University of Southern California)

Seminar on the Econometrics of Matching models

2015-2016 Program

March 16, 2016: 3rd Empirical Finance Workshop
May 31, 2016:

Prof. Christophe CROUX (Katholieke Universiteit Leuven)
Sparse Cointegration
Prof. Nikolay GOSPODINOV (Federal Reserve Bank of Atlanta)
Spurious Inference in Reduced-Rank Asset-Pricing Models
Prof. Otilia BOLDEA (Tilburg University)
Break-point Estimation in Panel data with fixed effects

November 5-6, 2015: Advances in Time Series and Forecasting Conference

September 25, 2015: Workshop on Time Series Econometrics

September 24, 2015:

Prof. Cristina DAVINO (Università di Macerata, Italy) - Quantile Regression: an overview of properties and applications

2014-15 Program

June: Siem Jan Koopman (VU, Amsterdam)
May: Second Workshop on ICT and Innovation Forecasting; From Theory to Practice & Applications
April:
1. Esther Ruiz (UC3 Madrid)
2. Genaro Sucarrat (BI Norwegian Business School)
March: WORKSHOP ON MODELLING & FORECASTING MOMENT RISK PREMIA
Jan-March 2015: the seminars are part of the Working Group on Risk - CREAR
10th December (Banque de France): ESSEC/Banque de France workshop on Expectations and Forecasting
6th and 7th November (La Défense): European Seminar on Bayesian Econometrics
October:
1. Ingrid VAN KEILEGOM (Université Catholique de Louvain)
2. Paul DOUKHAN (Université de Cergy-Pontoise)

2013-14 Program