### Context and goal

The amount of digital data is increasing at an exponential rate: an estimated 1.8 zettabytes were created in 2011, up from 1.2 zettabytes the year before. This data
deluge is now seen as a remarkable opportunity to extract previously unknown information
from the data, and therefore as a major lever for scientific advances. Yet such big data
raises many hard issues, including storage, transfer, processing, and interpretation. These
issues are clearly among the major challenges of the coming decades.
This evolution is mirrored in both **mathematical optimization and machine learning** research by the growing number of
large datasets now available to researchers, along with the benchmarks
and challenges associated with these datasets, which exhibit large scales in every dimension of
learning problems: the number of examples, the number of features, the number of tasks, and
the number of models. The availability of such large datasets makes it possible to build richer and more
accurate models and algorithms.

The goal of **Gargantua** is to form an alliance of researchers from **mathematical optimization** and **machine learning**
to tackle these challenges.

The project **Gargantua** is planned to merge with its "brother" Mastodons project **Display** (see the recent workshop).

### Highlights

#### Conditional Gradient Algorithms

- Smoothed Composite Conditional Gradient
- Composite Conditional Gradient
- Linearly-Convergent Conditional Gradient
- Affine-invariant Conditional Gradient
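As a quick illustration of the conditional gradient (Frank-Wolfe) family listed above, here is a minimal sketch of the basic algorithm with the classical 2/(t+2) step size. This is a generic textbook sketch, not the project's implementations; the l1-ball linear minimization oracle is a toy example chosen for illustration.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, n_iters=100):
    """Basic Frank-Wolfe (conditional gradient) method: each iterate
    is a convex combination of linear-minimization-oracle outputs,
    so it stays feasible without any projection step."""
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        s = lmo(g)                  # argmin over the feasible set of <g, s>
        gamma = 2.0 / (t + 2.0)     # classical step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# Toy example: minimize ||x - b||^2 over the l1 ball of radius 1.
b = np.array([0.8, -0.3, 0.1])
grad = lambda x: 2.0 * (x - b)

def lmo_l1(g, radius=1.0):
    # A vertex of the l1 ball minimizing the linear function <g, .>:
    # put all the mass on the coordinate with largest |g_i|.
    i = np.argmax(np.abs(g))
    s = np.zeros_like(g)
    s[i] = -radius * np.sign(g[i])
    return s

x_star = frank_wolfe(grad, lmo_l1, np.zeros(3), n_iters=500)
```

The appeal in the large-scale setting is that the linear oracle is often much cheaper than a projection onto the same set (e.g. for trace-norm balls, a single leading singular pair instead of a full SVD).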

#### Incremental and Stochastic Proximal Gradient Algorithms

- Incremental Majorization-Minimization
- Stochastic Majorization-Minimization
- Fast Incremental Gradient
- Efficient Stochastic Gradient
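To make the stochastic proximal gradient theme above concrete, here is a hedged toy sketch (plain stochastic proximal gradient on a lasso instance, not the project's incremental or majorization-minimization algorithms): sample one data point, take a gradient step on its loss, then apply the proximal operator of the l1 penalty.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def stochastic_prox_grad(A, b, lam, n_epochs=50, step=0.01, seed=0):
    """Stochastic proximal gradient for the lasso
    min_x (1/2n) ||Ax - b||^2 + lam * ||x||_1: sample one row,
    take a gradient step on its squared residual, then
    soft-threshold (the l1 proximal step)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            g = (A[i] @ x - b[i]) * A[i]   # stochastic gradient of the data term
            x = soft_threshold(x - step * g, step * lam)
    return x

# Toy lasso instance with a sparse ground truth.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 5))
x_true = np.array([1.0, 0.0, 0.0, -1.0, 0.0])
b = A @ x_true
obj = lambda x: 0.5 * np.mean((A @ x - b) ** 2) + 0.1 * np.sum(np.abs(x))
x_hat = stochastic_prox_grad(A, b, lam=0.1)
```

The incremental variants listed above improve on this plain scheme by reusing stored per-example information to reduce the variance of the gradient estimate.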

#### Nonsmooth optimization and decomposition methods

- Level bundle methods with uncontrolled inexact oracles
- Stabilized Benders Decomposition
- Stochastic Composite Mirror-Prox
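The mirror-prox entry above relies on mirror maps adapted to the geometry of the feasible set. As a hedged toy illustration of that machinery (simple mirror descent rather than mirror-prox, on an assumed linear objective), the entropy mirror map on the probability simplex turns the prox step into a multiplicative update:

```python
import numpy as np

def entropic_mirror_descent(subgrad, x0, step=0.1, n_iters=200):
    """Mirror descent with the entropy mirror map on the probability
    simplex: the mirror step becomes an exponentiated-gradient
    (multiplicative) update followed by renormalization."""
    x = x0.copy()
    for _ in range(n_iters):
        x = x * np.exp(-step * subgrad(x))
        x /= x.sum()
    return x

# Minimizing a linear cost over the simplex concentrates
# the mass on the cheapest coordinate.
c = np.array([1.0, 0.2, 0.5])
x = entropic_mirror_descent(lambda x: c, np.full(3, 1.0 / 3.0))
```

The entropy geometry gives dimension dependence in log d rather than d, which is precisely what makes such methods attractive at large scale.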

#### Metric learning for partitioning and alignment problems

- Large margin metric learning for alignment
- Large margin metric learning for partitioning
- Optimal Model Selection for Multiple Change-point Detection
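To make the "large margin metric learning" entries concrete, here is a hedged toy sketch (not the project's algorithms) of learning a Mahalanobis metric from pair constraints: similar pairs are pushed below one margin, dissimilar pairs above another, with a projection back onto the PSD cone after each subgradient step. All margins and data below are illustrative assumptions.

```python
import numpy as np

def learn_metric(X, pairs, y, step=0.05, n_iters=100):
    """Toy large-margin Mahalanobis metric learning.
    Similar pairs (y=+1) are pushed to squared distance <= 1,
    dissimilar pairs (y=-1) to squared distance >= 2; after each
    subgradient step, M is projected back onto the PSD cone."""
    d = X.shape[1]
    M = np.eye(d)
    for _ in range(n_iters):
        G = np.zeros((d, d))
        for (i, j), s in zip(pairs, y):
            diff = X[i] - X[j]
            dist = diff @ M @ diff
            if s == 1 and dist > 1.0:     # similar pair too far apart
                G += np.outer(diff, diff)
            elif s == -1 and dist < 2.0:  # dissimilar pair too close
                G -= np.outer(diff, diff)
        M -= step * G
        w, V = np.linalg.eigh(M)          # project onto the PSD cone
        M = (V * np.maximum(w, 0.0)) @ V.T
    return M

# Two classes separated along the first axis only:
# the learned metric stretches that axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.1], [1.0, 0.1]])
pairs = [(0, 2), (1, 3), (0, 1)]
y = [1, 1, -1]
M = learn_metric(X, pairs, y)
```

The same ingredients (pairwise hinge losses plus a PSD constraint) underlie large-margin formulations for alignment and partitioning, where the pairs come from matched time series or from must-link/cannot-link constraints.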

### Scientific Events

- Gargantua-Display workshop
- Summer School "High-dimensional Learning and Optimization"
- Tutorial "Frank-Wolfe and Greedy Optimization" (ICML 2014)
- Grenoble Optimization Day
- Workshop "Optimization and Statistical Learning" (Les Houches)

### Current project members

- **Massih-Reza Amini**, LIG, UJF, Grenoble
- **Sylvain Arlot**, SIERRA-INRIA and ENS, Paris
- **Alexandre d'Aspremont**, CNRS and ENS, Paris
- **Francis Bach**, SIERRA-INRIA and ENS, Paris
- **Bernhard Beckermann**, UPP, Lille
- **Christophe Biernacki**, MODAL-INRIA and UPP, Lille
- **Alain Celisse**, UPP, Lille
- **Zaid Harchaoui (Leader)**, LEAR-INRIA and LJK, Grenoble
- **Roland Hildebrand**, CNRS and LJK, Grenoble
- **Julien Jacques**, UPP, Lille
- **Anatoli Juditsky**, UJF and LJK, Grenoble
- **Simon Lacoste-Julien**, SIERRA-INRIA and ENS, Paris
- **Julien Mairal**, LEAR-INRIA and LJK, Grenoble
- **Jerome Malick**, CNRS and LJK, Grenoble

### Meetings in 2014

- December 9th, 2014
- September 10th, 2014
- July 11th, 2014
- April 24th, 2014
- January 31st, 2014

### Meetings in 2013

- November 26th, 2013
- September 10th, 2013
- June 11th, 2013

### Annual workshop, November 26th, 2013

The workshop was held in the Seminar room 1 of LJK (Tour IRMA, 51 rue des Mathematiques, Campus de Saint Martin d'Heres, 38041 Grenoble).

- 9:15. Introduction
- First half-day
  - 9:30. **Francis Bach, SIERRA-Inria and ENS, Paris**, *Beyond stochastic gradient descent for large-scale machine learning* [slides.pdf]
  - 10:30. **Zaid Harchaoui, LEAR-Inria and LJK, Grenoble**, *Frank-Wolfe/conditional gradient algorithms for large-scale machine learning* [slides.pdf]
  - **Spotlights**
  - 11:30. **Massih-Reza Amini, LIG, Grenoble**, *On Flat versus Hierarchical Classification in Large-Scale Taxonomies* [slides.pdf]
  - 12:30. Lunch break
- Second half-day
  - 14:00. **Jerome Malick, CNRS and LJK, Grenoble**, *Exploiting uncontrolled information in nonsmooth optimization* [slides.pdf]
  - 15:00. **Julien Mairal, LEAR-Inria and LJK, Grenoble**, *Incremental and Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization* [slides.pdf]
  - 16:00. Coffee break
  - 16:30. **Anatoli Juditsky**, *Hypothesis testing with convex optimization* [slides.pdf]
  - 17:30. Planning for 2014