An academic journal in statistics and machine learning promoting reproducibility and alternative publication mode
April 8, 2025
Editorial board
IT support
Communication
Stat. learning, DR INRAE
Paris-Saclay University
Statistics, DR CNRS
IMT Toulouse
CS/Stats/ML, IR CNRS
IMAG, Montpellier University
Machine Learning
CR MinesParisTech
Machine learning, CR CNRS
Grenoble Alpes University
Statistics, MCF
Institut Agro Rennes-Angers
Stats/dev, IR
Jean Leray, Univ. Nantes
Official launch at the end of 2021
AAP Science Ouverte 🤩
Origin (~ 2020s)
Mission carried out at the French statistical society (SFdS)
Assessment
😔 Multiplication of “traditional” journals, often predatory journals…
😱 ↘ in publication quality and in the time dedicated to each article (on both the author and reviewer sides) [1]
😔 Not enough recognition of “negative” results or of source code/case studies
😱 Issue with scientific reproducibility (analyses, experiments) [2–7]
Point of view
Scientific perimeter
Promote contributions in stat/ML that provide insight into which models/methods are appropriate to address a scientific question
Open access
Reproducible
Numerical reproducibility is a necessary condition (Source code and data should be available)
Fundamentally, it provides three things:
Tools to reproduce the results (that’s like cooking)
A “recipe” to reproduce the results (still like cooking)
A path to understanding the results and the process that led to them (unlike cooking…)
Notebook and literate programming
text (markdown) + math (\(\LaTeX\)) + code (Python/R/Julia), references (bib\(\TeX\))
Environment management, Compilation, Multi-format publication (pdf, html)
Continuous integration/Continuous deployment (CI/CD)
with template notebook document + doc + pre-configured compilation and publication setup
Let’s go, locally (same spirit as Jupyter/Rmarkdown notebooks)
Configuration file versioned and used during the CI compilation/publication action
A git push command will trigger the compilation of your article (including computations) and its publication as a GitHub page
See the preconfigured .github/workflows/build.yml file for the GitHub action configuration
If the CI process succeeds, both HTML and PDF versions are published on the GitHub page associated with the repository
https://openreview.net/group?id=Computo
Submit:
After a “traditional” review process, a three-step procedure:
including
🥲 Fully operational + doi, ISSN
🙂 15 published articles, 5 under review
🙂 x presentations (Montpellier, Toronto, Humastica, Grenoble, RR2023, etc.)
🙂 French reproducible research network
🤯 Difficult to find reviewers
🙂 Referencing and Visibility: Mir@bel, Open Policy/Sherpa Romeo -> DOAJ
🤯 Google Scholar: dark black box
🤔 How to build on institutional support?
🤔 Changing practices in the scientific community?
github: dynamic, large user community, but not institutional and with limited computing resources. Switch to a French institutional gitlab forge?
markdown
Rmarkdown
Pandoc
Credit: Pratik89Roy CC-BY-SA-4.0 from Wikimedia
The global scientific workflow of a reproducible process split into two types of steps
Process to obtain (intermediate) results outside of the notebook environment, for one or more reasons (not mutually exclusive):
Notebook rendering with the results of the external process
Requirement
If the notebook contains everything to produce the final document
\(\Rightarrow\) “Direct reproducibility” in the sense that the notebook is the only thing needed to reproduce the results.
Ultimately, the workflow must end with a direct reproducibility step which concludes the whole process.
data produced by the external process \(\Rightarrow\) transferred to the notebook environment.
Requirement
Not only must the intermediate results be provided, but also the code that transfers them into the notebook environment.
There are a variety of software solutions to do so.
joblib.Memory, a caching mechanism for Python functions: it saves the result of a function call to disk and loads it back later.
The .RData file format, which can be loaded back in R with the load() function.
Saving the results in a plain file format (.csv, .tsv, .json, etc.) is also a solution.
The stored results (.joblib directory or .RData file) can be committed to the git repository and loaded directly in the notebook environment.
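As a rough illustration of the plain-file option, the sketch below separates an external step, which writes intermediate results to a .json file, from the code a notebook would run to load them back. The function names, the file name and the stored values are hypothetical placeholders, and a temporary directory stands in for the git repository. Only the Python standard library is used.

```python
import json
import tempfile
from pathlib import Path


def run_external_step(output_path: Path) -> None:
    """Stand-in for a computation run outside the notebook environment."""
    # Hypothetical intermediate results; a real study would compute these.
    results = {"method": "baseline", "scores": [0.5, 0.75, 1.0]}
    output_path.write_text(json.dumps(results, indent=2), encoding="utf-8")


def load_in_notebook(input_path: Path) -> dict:
    """Transfer code that the notebook itself would contain and run."""
    return json.loads(input_path.read_text(encoding="utf-8"))


# A temporary directory plays the role of the versioned repository here.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "intermediate_results.json"
    run_external_step(path)          # external process
    loaded = load_in_notebook(path)  # notebook rendering step
    mean_score = sum(loaded["scores"]) / len(loaded["scores"])
```

In this sketch the result file and the loading function would both be committed, so the notebook rendering step stays directly reproducible even when the external step is too costly to rerun.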