Technical Committee

June 2, 2021

Agenda of the day

9:10 am – 10 am Presentation of the technical achievements and ongoing work by O. Grisel
10 am – 12 pm Feedback and presentations from each partner of the consortium
1 pm – 2 pm Collaborative drafting session for the updated roadmap
2 pm – 5 pm Afternoon discussions on Discord


Scikit-learn @ Fondation Inria

Alexandre Gramfort (Inria, advisory committee of the consortium)

Olivier Grisel (consortium engineer)

Guillaume Lemaître (consortium engineer)

Jérémie Du Boisberranger (consortium engineer)

Chiara Marmo (consortium COO)

Loic Esteve (Inria engineer)

Mathis Batoul (Inria intern)

Julien Jerphanion (Inria engineer)

Gaël Varoquaux (Consortium Director)

Consortium partners

AXA:

  • Thibault Laugel
  • Valeriy Ischenko
  • Xavier Renard


  • Léo Dreyfus-Schmidt
  • Samuel Ronsin


  • Norbert Preining


  • Xavier Dupré


  • Sébastien Conort

Scikit-learn community

Adrin Jalali (community member of the advisory board, at Zalando)

Joel Nothman (community member of the advisory board, at the University of Sydney)

Priority list for the consortium at Inria, year 2021–2022

From the discussion during the technical committee, the scikit-learn Consortium at Inria defined the following list of priorities for the coming year:

On the community side:

  • Organise more regular technical sprints (possibly by inviting past sprint contributors, to try to foster long-term relationships and hopefully recruit new maintainers).
    • Better preparation of issues
    • Plan further in advance
  • Renew the organization of beginners’ workshops for prospective contributors, probably before a sprint.
  • Organize a workshop on uncertainty quantification and calibration, possibly followed by a two-day sprint.
  • Conduct a new edition of its 2013 survey among all scikit-learn users.
  • Quantification of fairness issues and potential de-biasing.

Detailed minutes of the meeting

Presentation of partners’ comments and priorities


  • Feedback on the Fujitsu Sprint.
    • Beginner introduction to scikit-learn development in Japanese
    • Resources from the MOOC and scikit-learn doc translated to Japanese
    • 25 participants
    • 11 PRs / 3 merged
    • 9 gave feedback
    • Planning recurring sprints twice a year
    • Planning to enlarge the audience to Korea and China. Potential interest from other partners in the same time zone (Valeriy Ischenko from AXA is located in Singapore)
    • Also targeting returning contributors
  • Fujitsu wants to contribute to the Open Source ecosystem and make Japanese companies more Open Source aware. Language is a barrier to wider adoption.
    • Possible todo: identify a “quick start” to help onboarding, which could then be translated to a local language
  • No specific technical priority: the main goal is to get Fujitsu to use and contribute to OSS


  • AXA is interested in the general improvements as well as in the recent MOOC
  • Focus on interpretability in particular when dealing with regulators
  • Understanding what users expect, and trying to figure out guidelines on the usage and limitations of such methods
  • Education problem around naive usage of interpretability methods
    • Further improvements on documentation and MOOC chapter on interpretability could help
  • End of June: exercise to implement interpretability tools for a credit scoring use case
  • Interested in following general scikit-learn developments (features, performance, documentation)
  • Data Scientists at AXA are asked to study the interpretability of their models. They mostly rely on SHAP and LIME, as those projects are well known and accessible.
  • Also fairness:
    • Research at AXA on fairness assessment; mentioned the problem of lack of access to protected attributes in Europe (but the law is evolving)
    • Integration with such tools would benefit from progress on SLEP006
  • Proposal (from Adrin) to add interpretability tools to scikit-learn-extra
  • Adrin also mentioned the new European proposal
  • AXA is also available to help with sprints for the Far East and Middle East
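Alongside SHAP and LIME, scikit-learn ships a built-in, model-agnostic inspection tool that the documentation improvements mentioned above could point users to. A minimal sketch using `sklearn.inspection.permutation_importance` on toy data (the dataset and estimator choices here are illustrative, not from the meeting):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Toy data: only 2 of the 5 features are informative.
X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=2, n_redundant=0,
                           random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# Permutation importance: the score drop observed when one feature is
# shuffled. Model-agnostic, so the same check applies to any estimator.
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
for i, mean in enumerate(result.importances_mean):
    print(f"feature {i}: {mean:.3f}")
```

Like SHAP and LIME, this method is easy to use naively: importances computed on training data can overstate features the model overfits on, which is exactly the education problem raised above.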


  • Interested in following the development synergies among consortium / community members
  • Interested in the MOOC
  • Focus on management of operational risk (related to documentation and data scientist education)
  • Better pipeline crafting: building / versioning / diffing of pipelines
    • Question: YAML to configure pipelines, with a goal of standardizing (only a subset of features)
      • Easier to diff / read, automate documentation of pipelines
      • Streamline the pipeline auditing process using a web interface for instance
    • Suggestion:
      • A function to persist only a subset of pipelines back and forth
        • Both reading and writing are important
        • TOML or YAML?
      • Section in docs to operationalize / good practice to build a pipeline
        • People write hairy code to build such pipelines
    • Declarative pipelines could help integration with CI/CD
    • The need is currently met via custom configuration-based pipeline deployments
    • TODO: Could be answered by better documentation on deployment / operations good practices
    • MLFlow was mentioned as a candidate for such use cases
  • Development related to feature names / data frame / interpretability is important
  • Interested in improvements of the documentation for MLOps good practices
  • Interested in uncertainty estimation (quantile regression) and calibration
    • Todo: organise workshop on uncertainty and calibration
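A piece of the declarative-pipeline idea discussed above can already be approximated with scikit-learn's standard `get_params` API: flatten the hyperparameters into a plain dict, dump it to YAML/TOML, and diff dicts across versions. A minimal sketch (the `pipeline_config` helper is hypothetical, not a scikit-learn function):

```python
import json
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

def pipeline_config(pipe):
    # Keep only simple, serializable hyperparameter values so the
    # result can be dumped to JSON/YAML and diffed between versions.
    simple = (int, float, str, bool, type(None))
    return {k: v for k, v in pipe.get_params(deep=True).items()
            if isinstance(v, simple)}

old = pipeline_config(make_pipeline(StandardScaler(), LogisticRegression(C=1.0)))
new = pipeline_config(make_pipeline(StandardScaler(), LogisticRegression(C=0.1)))
changed = {k: (old[k], new[k]) for k in old if old[k] != new[k]}
print(json.dumps(changed))  # only logisticregression__C differs
```

This covers reading a pipeline's configuration out; writing one back (instantiating a pipeline from such a file) is the harder half of the round-trip the partners asked for, since it requires mapping names back to estimator classes.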


  • Interested in the performance benchmarking effort, in particular when comparing to ONNX Runtime
  • ONNX dev update:
    • CPU optimizations
    • WASM deployment target
    • ARM deployment target
    • Debate on feature set vs size of the ONNX Runtime project (and runtime binaries?)
    • Ability to support custom models
    • Extensions for text
    • Support for sparse features becoming more important for customers
    • Make it possible to export FunctionTransformer with custom Python code using the NumPy API and a decorator
  • Known issues with sklearn to ONNX conversions:
    • Float vs double in decision trees can cause rare but problematic discrepancies in the decision function
    • Should sklearn try to use floats as much as possible in tree-based models?
    • TODO? Better, cleaner, homogeneous support
    • Mention ONNX in documentation about operationalizing pipelines
  • Seek feedback from users.
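The float-vs-double discrepancy flagged above can be reproduced without ONNX at all: rounding a double-precision split threshold to float32 can move it past a sample's feature value, routing the sample down a different branch. A minimal NumPy sketch with a hypothetical threshold:

```python
import numpy as np

# A split threshold learned in float64; ONNX runtimes often store it
# (and the input features) in float32.
threshold64 = 0.1                      # not exactly representable in binary
threshold32 = np.float32(threshold64)  # rounds slightly upward for 0.1

# A sample falling between the two thresholds is routed to different
# branches of the tree depending on the precision used.
x = (threshold64 + float(threshold32)) / 2.0
goes_left_double = x <= threshold64               # False -> right branch
goes_left_single = np.float32(x) <= threshold32   # True  -> left branch
print(goes_left_double, goes_left_single)
```

Such flips are rare (the sample must land inside a one-ulp window around a threshold), which matches the "rare but problematic" characterization above.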


  • Focus on MLOps problems (version machine learning experiments, compare models, pre-deployment model update checks)
    • Detecting distribution drift (research topic) + correcting drift
    • Model inspection and model failure detection: machine learning diagnostics (leakage, overfitting, parameter choice, compare to dummy classifier performance, class imbalance vs choice of metrics)
    • Stress test center: test the model with perturbations / missing values, change of feature scales: detect and notify or assert robustness
  • Model document generator: summarize modeling choices
  • Uncertainty estimation / calibration
    • Education issue: a calibrated classifier does not always work; you have to check
    • Conformal Predictions / quantile regression coverage (still new / research)
  • AutoML effort:
    • Recent blog post: “Distributed Hyperparameter Search: How It’s Done in Dataiku”
    • Happy with inclusion of Successive Halving
    • Benchmarks with multiple datasets / hyper params: interested in joining efforts / contributing (potential TODO)
    • Interest in programmatically defining good starting points for HP grids (potential TODO)
      • Ran experiments to rank which HP should be tuned first
      • Avoiding obviously bad default behaviors
      • Dataiku changed defaults HPs based on the results of their experiments
  • Moving away from single-Python-version deployments
    • Some customers won’t upgrade to Python 3 and are stuck on Python 2
  • Interested in feature engineering:
    • Spline features in particular
  • Interested in categorical features in scikit-learn-contrib / impact coding
  • Similar single- vs double-precision issues when deploying exported trees (no action wanted)
  • TODO: Interested in moving forward with the PR on monotonicity constraints for decision trees
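The education point that a calibrated classifier "does not always work, you have to check" can be made concrete: wrap a model in scikit-learn's `CalibratedClassifierCV` and then compare a proper scoring rule on held-out data rather than trusting calibration by construction. A sketch on synthetic data (the dataset and model choices are illustrative):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
cal = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    method="isotonic", cv=3,
).fit(X_tr, y_tr)

# Calibration is not guaranteed to help: compare a proper scoring rule
# (here the Brier score, lower is better) on held-out data.
score_raw = brier_score_loss(y_te, raw.predict_proba(X_te)[:, 1])
score_cal = brier_score_loss(y_te, cal.predict_proba(X_te)[:, 1])
print(f"raw: {score_raw:.3f}  calibrated: {score_cal:.3f}")
```

Whether the calibrated score actually improves depends on the model, the calibration method, and the amount of data available for fitting the calibrator, which is why the check itself is the lesson.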