A joblib sprint for better parallelization in Python
Contributing to the whole Python ecosystem is crucial and has always been a strong commitment for…
How are the Consortium's priorities defined?
The scikit-learn @ Inria Consortium defines a roadmap every six to eight months during the Technical Committee. The roadmaps defined so far can be consulted here. Why a roadmap? The members of the Consortium provide their financial support with nothing expected in return. Defining a roadmap for its software development and, more generally, its activities is an important step in the…
Generalized Linear Models have landed in scikit-learn
While scikit-learn already had some Generalized Linear Models (GLMs) implemented, e.g. LogisticRegression, losses other than mean squared error and log-loss were missing. As the world is almost (surely) never normally distributed, regression tasks might benefit a lot from the new PoissonRegressor, GammaRegressor and TweedieRegressor estimators: using those GLMs for positive, skewed data is much more appropriate than ordinary least squares and might lead to more adequate models. Starting…
Advisory Committee / February 8th 2021
Presentation of the activities of the Consortium during the last year (C. Marmo, G. Varoquaux). Questions and comments: Fujitsu: Fujitsu actively participates in the Consortium's remote events. Fujitsu would be glad to increase contributions to scikit-learn from Japan. Fujitsu suggests organizing a sprint for the Japan time zone, and starting a discussion with the team there about good practices for organizing online sprints. Microsoft: More information about the MOOC is…
Parallel machine learning engineer
The scikit-learn team at Inria is looking for an engineer to speed up machine learning with scikit-learn by improving its use of parallel computing. The work will be spread across scikit-learn and its parallel computing dependencies (joblib, threadpoolctl and possibly CPython). Cython+ is a development project funded by the French government to build parallel computing infrastructure in Python for numerical applications. See the job offer for more details.
Scaling up the benchmark infrastructure of scikit-learn
The scikit-learn Consortium at Inria foundation proposes an internship for scaling up the benchmark infrastructure of scikit-learn. The goals of the internship are: Development of an automated benchmark suite to monitor scikit-learn’s efficiency against third-party libraries like daal4py and cuML. Analysis of the results: identify which scikit-learn models are the most under-performing and try to understand the root cause by reading the source code and analyzing the space and…
Implementing a faster KMeans in scikit-learn 0.23
Version 0.23 of scikit-learn was released a few days ago, bringing new features, bug fixes and optimizations. In this post we will focus on the rework of KMeans, a long-running effort started almost two years ago. Better scalability on machines with many cores was the main objective of this journey. It forced us to tackle core challenges of low-level parallelism. KMeans clustering: Before describing the optimization…
Paris sprint of the Decade: happy birthday scikit-learn!
At the end of January 2020, a scikit-learn sprint took place in the Paris offices of Dataiku. Sonia and Léo from Dataiku deserve special thanks for taking such good care of us! We had three days of coding in excellent company, preceded by a beginner's workshop aimed at lowering the entry cost of a first pull request on scikit-learn. As a side effect the team had the…
Time to come out! scikit-learn 0.22
A new look and many new features for this 0.22 scikit-learn release. Just a bit ahead of Santa's visit, this past month some special Elves have worked really hard to meet the target of releasing scikit-learn twice a year. Come take a look at some of the many surprises this remarkable package contains. With big data come big responsibilities; New features for plotting and interpretability; Models fitted by…
Fujitsu joins the Consortium
Fujitsu Laboratories is joining the Consortium. Fujitsu will thereby help support the community that develops scikit-learn. For more information, the Inria and Fujitsu press releases are available online.