TECHNICAL COMMITTEE / November 5, 2020

Presentation of the technical achievements and ongoing work (O. Grisel). Priority list for the consortium at Inria, year 2020–2021. From the discussion during the technical committee, the scikit-learn Consortium at Inria defined the following list of priorities for the coming year: continue helping with project maintenance to keep to the target of two releases a year (+ bugfix releases); continue developments of the…

Scaling up the benchmark infrastructure of scikit-learn

The scikit-learn Consortium at Inria foundation offers an internship on scaling up the benchmark infrastructure of scikit-learn. The goals of the internship are: developing an automated benchmark suite to monitor scikit-learn’s efficiency against third-party libraries such as daal4py, cuML and ONNX; analyzing the results, that is, identifying which scikit-learn models underperform the most and trying to understand the root cause by reading the source code and analyzing the…

Implementing a faster KMeans in scikit-learn 0.23

Version 0.23 of scikit-learn was released a few days ago, bringing new features, bug fixes and optimizations. In this post we focus on the rework of KMeans, a long-running effort started almost two years ago. Better scalability on machines with many cores was the main objective of this journey, and it forced us to tackle core challenges of low-level parallelism. KMeans clustering. Before describing the optimization…

TECHNICAL COMMITTEE / February 3, 2020

Presentation of the technical achievements and ongoing work (O. Grisel). Priority list for the consortium at Inria, year 2020–2021. From the discussion during the technical committee, the scikit-learn Consortium at Inria defined the following list of priorities for the coming year: continue helping with project maintenance to keep to the target of two releases a year (+ bugfix releases); development of the model…

Paris sprint of the Decade: happy birthday scikit-learn!

At the end of January 2020, a scikit-learn sprint took place in the Paris offices of Dataiku. Sonia and Léo from Dataiku deserve special thanks for taking such good care of us! We had three days of coding in excellent company, preceded by a beginners' workshop aimed at lowering the entry cost of a first pull request to scikit-learn. As a side effect the team had the…

Time to come out! scikit-learn 0.22

A new look and many new features for this scikit-learn 0.22 release. Just ahead of Santa's visit, some special elves worked really hard this past month to keep to the target of releasing scikit-learn twice a year. Come take a look at some of the many surprises this remarkable package contains: With big data come big responsibilities; New features for plotting and interpretability; Models fitted by…

Fujitsu joins the Consortium

Fujitsu Laboratories join the Consortium. Fujitsu will thereby contribute to the sustainability of the scikit-learn development community. More information is available via the press releases published by Inria and Fujitsu.

TECHNICAL COMMITTEE / July 4, 2019

Priority list for the consortium at Inria, year 2019–2020. From the discussion during the technical committee, the scikit-learn Consortium at Inria defined the following list of priorities for the coming year: continue helping with project maintenance to keep to the target of two releases a year; development of the “inspect” module: help finalize the pull requests for the newly introduced…

Scikit-learn sprint in Paris

Three weeks ago, we organized a scikit-learn sprint in AXA’s offices in Paris. No fewer than 37 people attended the sprint during the week. Such an effort is equivalent to 6 person-months! While the sprint was organized by the scikit-learn foundation @ Inria, it united a much wider group of contributors and was funded by other organizations (see below). Improvements to scikit-learn. This sprint saw the…