Anastasis was a member of the SADA group until 2022. He is now an Assistant Professor at McMaster University in Hamilton, Ontario.
Expertise:
- Mathematics: Approximation theory, analysis on metric spaces, geometric topology, mathematical finance, optimal transport.
- Theoretic Machine Learning: Geometric deep learning, approximation theory of deep neural networks, meta-learning.
Select Papers
- A. Kratsios, B. Zamanlooy, I. Dokmanić, and T. Liu: Universal Approximation Under Constraints is Possible with Transformers, ICLR - International Conference on Learning Representations, 2022 (Spotlight and Top 3%).
- A. Kratsios and C. Hyndman: NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation, JMLR - Journal of Machine Learning Research - Volume 22, 2021.
- A. Kratsios and P. Casgrain: Optimizing Optimizers: Regret-optimal gradient descent algorithms, COLT - 34th Conference on Learning Theory, 2021.
Select Preprints
- A. Acciaio, A. Kratsios, and G. Pammer: Metric Hypertransformers are Universal Adapted Maps, 2022.
- A. Kratsios and L. Papon: Universal Approximation Theorems for Differentiable Geometric Deep Learning, Second Round at JMLR - Journal of Machine Learning Research, 2021.
Most Recent Publication
- A. Kratsios and B. Zamanlooy: Learning Sub-Patterns in Piecewise Continuous Functions, Neurocomputing, accepted 2022.
Research Interests
Geometric Deep Learning
- Developed the first neural model that can provably approximate any function while implementing exact constraint satisfaction, with B. Zamanlooy, I. Dokmanić, and T. Liu.
- Developed a simple framework for building universal approximators between any differentiable manifolds, with L. Papon and E. Bilokopytov.
- Introduced the first known class of UAP (Universal Approximation Property)-invariant feature maps, with C. Hyndman.
- Identified the first known homotopic obstructions to non-Euclidean universal approximation with L. Papon.
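The universal approximation results above build on the classical fact that one-hidden-layer ReLU networks are dense in the continuous functions on a compact interval. A minimal constructive sketch (a standard textbook argument, not the cited papers' method): the piecewise-linear interpolant of a function is exactly a shallow ReLU network whose weights are the slope changes at the knots.

```python
import numpy as np

# Constructive sketch of ReLU universal approximation (classical result):
# the piecewise-linear interpolant of f at knots x_0 < ... < x_n equals the
# one-hidden-layer ReLU network
#     g(x) = y_0 + sum_i c_i * ReLU(x - x_i),
# where c_i is the slope change at knot x_i.

relu = lambda z: np.maximum(z, 0.0)

def relu_interpolant(knots, values):
    slopes = np.diff(values) / np.diff(knots)   # slope on each segment
    c = np.diff(slopes, prepend=0.0)            # slope change entering each segment
    return lambda x: values[0] + np.sum(c * relu(x[:, None] - knots[:-1]), axis=1)

# Approximate sin on [0, pi] with 50 segments (= 50 ReLU units).
knots = np.linspace(0.0, np.pi, 51)
g = relu_interpolant(knots, np.sin(knots))

x = np.linspace(0.0, np.pi, 1000)
err = np.max(np.abs(g(x) - np.sin(x)))
print(f"sup-norm error with 50 ReLU units: {err:.2e}")
```

The sup-norm error shrinks like the squared knot spacing, which is the quantitative heart of such density arguments.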
Foundations of Data Science
- Developed the first probability-measure-valued universal approximator and showed that it can approximate any regular conditional distribution.
- Introduced the first regret-optimal gradient descent algorithms via a control-theoretic and variational approach to meta-optimization, with P. Casgrain.
- Developed the first deep neural model capable of uniformly approximating any piecewise continuous function with finitely many pieces with B. Zamanlooy.
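Regret, the performance metric in the meta-optimization work above, is the cumulative suboptimality an algorithm accrues along its trajectory. A minimal illustrative sketch (plain gradient descent on a quadratic, not the regret-optimal dynamics from the paper) showing the quantity the framework is designed to minimize:

```python
import numpy as np

# Illustrative sketch: track the cumulative regret sum_t [f(x_t) - f(x*)]
# of vanilla gradient descent on the convex quadratic f(x) = 0.5*||Ax - b||^2.
# Regret-optimal algorithms choose their dynamics to minimize this sum directly.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizer of f
f_star = f(x_star)

x = np.zeros(5)
eta = 1.0 / np.linalg.norm(A.T @ A, 2)  # step size 1/L for L-smooth f
regret = 0.0
for t in range(500):
    regret += f(x) - f_star
    x -= eta * grad(x)

print(f"cumulative regret after 500 steps: {regret:.4f}")
print(f"final suboptimality: {f(x) - f_star:.2e}")
```

The iterates converge, but the regret settles at a positive constant: it records the cost of the whole trajectory, not just the endpoint, which is what distinguishes the regret-optimal viewpoint from asymptotic convergence analysis.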
Mathematical Finance
- Introduced the first penalty for arbitrage-free learning with C. Hyndman.
- Introduced the fastest matrix completion algorithm for rapid low-rank + sparse decomposition of assets' covariance matrices, with J. Teichmann, C. Herrera, F. Krach, and Google Research's P. Ruyssen.
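To make the decomposition problem concrete, here is a minimal sketch of splitting a covariance matrix into a low-rank part plus a sparse part. This uses classical alternating truncated-SVD and soft-thresholding steps purely for illustration; the cited work (Denise) instead trains a deep network to produce the decomposition directly.

```python
import numpy as np

# Illustrative sketch of the low-rank + sparse decomposition problem
# (classical alternating thresholding, NOT the cited papers' algorithm).

def lowrank_plus_sparse(M, rank, sparse_thresh, n_iter=100):
    """Split M ~ L + S with rank(L) <= rank and S entrywise soft-thresholded."""
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # Low-rank step: best rank-r approximation of the residual M - S.
        U, sigma, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank]
        # Sparse step: soft-threshold the residual M - L.
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - sparse_thresh, 0.0)
    return L, S

# Synthetic PSD "covariance" matrix: rank-3 factor part + sparse diagonal spikes.
rng = np.random.default_rng(1)
B = rng.standard_normal((30, 3))
spikes = np.zeros((30, 30))
idx = rng.integers(0, 30, size=8)
spikes[idx, idx] = 5.0              # sparse spikes on the diagonal keep M symmetric
M = B @ B.T + spikes

L, S = lowrank_plus_sparse(M, rank=3, sparse_thresh=0.5)
err = np.linalg.norm(M - L - S) / np.linalg.norm(M)
print(f"relative reconstruction error: {err:.2e}")
```

The entrywise residual is bounded by the soft-threshold level by construction, so the relative error stays small whenever the low-rank component dominates the matrix.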
Publications
2022
@inproceedings{kratsios2022universal,
title={Universal Approximation Under Constraints is Possible with Transformers},
author={Kratsios, Anastasis and Zamanlooy, Behnoosh and Dokmanić, Ivan and Liu, Tianlin},
booktitle={(ICLR) The Tenth International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=JGO8CvG5S9},
}
@article{KRATSIOS2022,
title = {Learning Sub-Patterns in Piecewise Continuous Functions},
journal = {Neurocomputing},
year = {2022},
issn = {0925-2312},
doi = {10.1016/j.neucom.2022.01.036},
url = {https://www.sciencedirect.com/science/article/pii/S092523122200056X},
author = {Anastasis Kratsios and Behnoosh Zamanlooy},
}
@article{kratsiosMHT2022,
author = {{Acciaio}, Beatrice and {Kratsios}, Anastasis and {Pammer}, Gudmund},
title = "{Metric Hypertransformers are Universal Adapted Maps}",
journal = {arXiv e-prints},
keywords = {Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Mathematics - Metric Geometry, Mathematics - Probability, Quantitative Finance - Computational Finance, 68T07, 49Q22, 41A65, 30L99, 60G25, 60H35},
year = 2022,
month = jan,
eid = {arXiv:2201.13094},
pages = {arXiv:2201.13094},
}
@article{AB_2022_B,
title={Piecewise-Linear Activations or Analytic Activation Functions: Which Produce More Expressive Neural Networks?},
author={Kratsios, Anastasis and Zamanlooy, Behnoosh},
year={2022},
month = {April},
journal = {arXiv e-prints}
}
2021
@article{kratsiosUAPGeneral2020,
abstract = {The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models' potential. This paper works towards overcoming these challenges by presenting a characterization, a representation, a construction method, and an existence result, each of which applies to any universal approximator on most function spaces of practical interest. Our characterization result is used to describe which activation functions allow the feed-forward architecture to maintain its universal approximation capabilities when multiple constraints are imposed on its final layers and its remaining layers are only sparsely connected. These include a rescaled and shifted Leaky ReLU activation function but not the ReLU activation function. Our construction and representation result is used to exhibit a simple modification of the feed-forward architecture, which can approximate any continuous function with non-pathological growth, uniformly on the entire Euclidean input space. This improves the known capabilities of the feed-forward architecture.},
author = {Kratsios, Anastasis},
doi = {10.1007/s10472-020-09723-1},
issn = {1573-7470},
journal = {Annals of Mathematics and Artificial Intelligence},
number = {5},
pages = {435--469},
title = {The Universal Approximation Property},
url = {https://doi.org/10.1007/s10472-020-09723-1},
volume = {89},
year = {2021}
}
@article{kratsios2021universal,
title={Universal Approximation Theorems for Differentiable Geometric Deep Learning},
author={Anastasis Kratsios and Leonie Papon},
year={2021},
eprint={2101.05390},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@article{kratsios2021neu,
author = {Anastasis Kratsios and Cody Hyndman},
title = {NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation},
journal = {Journal of Machine Learning Research},
year = {2021},
volume = {22},
number = {92},
pages = {1--51},
url = {http://jmlr.org/papers/v22/18-803.html}
}
@misc{kratsios2021denise,
title={Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices},
author={Calypso Herrera and Florian Krach and Anastasis Kratsios and Pierre Ruyssen and Josef Teichmann},
year={2021},
eprint={2004.13612},
archivePrefix={arXiv},
primaryClass={stat.ML}
}
@misc{kratsios2021canonical,
title={A Canonical Transform for Strengthening the Local $L^p$-Type Universal Approximation Property},
author={Anastasis Kratsios and Behnoosh Zamanlooy},
year={2021},
eprint={2006.14378},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@InProceedings{pmlr-v134-kratsios21a,
title = {Optimizing Optimizers: Regret-optimal gradient descent algorithms},
author = {Casgrain, Philippe and Kratsios, Anastasis},
booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
pages = {883--926},
year = {2021},
editor = {Belkin, Mikhail and Kpotufe, Samory},
volume = {134},
series = {Proceedings of Machine Learning Research},
month = {15--19 Aug},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v134/casgrain21a/casgrain21a.pdf},
url = {https://proceedings.mlr.press/v134/casgrain21a.html},
abstract = {This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics which we show is equivalent to performing \emph{dual-preconditioned gradient descent} on the value function generated by its regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.}
}
@ARTICLE{kratsios2021universalRCD,
author = {{Kratsios}, Anastasis},
title = "{Universal Regular Conditional Distributions}",
journal = {arXiv e-prints},
keywords = {Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Mathematics - Metric Geometry, Mathematics - Probability, Statistics - Machine Learning, 68T07, 28A50, 49Q22, 54C65},
year = 2021,
month = may,
eid = {arXiv:2105.07743},
pages = {arXiv:2105.07743},
archivePrefix = {arXiv},
eprint = {2105.07743},
primaryClass = {cs.LG},
}
2020
@inproceedings{KratsiosNeurips2020,
author = {Kratsios, Anastasis and Bilokopytov, Ievgen},
booktitle = {(NeurIPS) Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {10635--10646},
publisher = {Curran Associates, Inc.},
title = {Non-Euclidean Universal Approximation},
url = {https://proceedings.neurips.cc/paper/2020/file/786ab8c4d7ee758f80d57e65582e609d-Paper.pdf},
volume = {33},
year = {2020}
}
@article{KratsiosAFReg,
author = {Kratsios, Anastasis and Hyndman, Cody},
title = {Deep Arbitrage-Free Learning in a Generalized HJM Framework via Arbitrage-Regularization},
journal = {Risks},
volume = {8},
year = {2020},
number = {2},
article-number = {40},
url = {https://www.mdpi.com/2227-9091/8/2/40},
issn = {2227-9091},
abstract = {A regularization approach to model selection, within a generalized HJM framework, is introduced, which learns the closest arbitrage-free model to a prespecified factor model. This optimization problem is represented as the limit of a one-parameter family of computationally tractable penalized model selection tasks. General theoretical results are derived and then specialized to affine term-structure models where new types of arbitrage-free machine learning models for the forward-rate curve are estimated numerically and compared to classical short-rate and the dynamic Nelson-Siegel factor models.},
doi = {10.3390/risks8020040}
}
@article{EMT_2021,
author = {Wang, Renjie and Hyndman, Cody and Kratsios, Anastasis},
title = {The entropic measure transform},
journal = {Canadian Journal of Statistics},
volume = {48},
number = {1},
pages = {97--129},
keywords = {Affine term-structure, defaultable bond price, forward-backward stochastic differential equations, forward price, free energy, futures price, optimal stochastic control, quadratic term-structure, relative entropy},
doi = {10.1002/cjs.11537},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/cjs.11537},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/cjs.11537},
year = {2020}
}