Proceedings of SciPy 2022

SciPy 2022, the 21st annual Scientific Computing with Python conference, was held in Austin, TX, July 11-17, 2022. The conference proceedings include 39 peer-reviewed articles. The full proceedings, posters and slides, and organizing committee listing can be found at https://proceedings.scipy.org/articles/majora-212e5952-046.

Low Level Feature Extraction for Cilia Segmentation

Cilia are organelles found on the surface of some cells in the human body that sweep rhythmically to transport substances. Dysfunction of ciliary motion is often indicative of diseases known as ciliopathies, which disrupt the functionality of macroscopic structures within the lungs, kidneys and other organs.

Meekail Zain, Eric Miller, Shannon P Quinn et al.
https://doi.org/10.25080/majora-212e5952-026
Enabling Active Learning Pedagogy and Insight Mining with a Grammar of Model Analysis

Modern engineering models are complex, with dozens of inputs, uncertainties arising from simplifying assumptions, and dense output data. While major strides have been made in the computational scalability of complex models, relatively less attention has been paid to user-friendly, reusable tools to explore and make sense of these models.

Zachary del Rosario
https://doi.org/10.25080/majora-212e5952-025
Automatic random variate generation in Python

The generation of random variates is an important tool that is required in many applications. Various software programs or packages contain generators for standard distributions like the normal, exponential or Gamma, e.g., the programming language R and the packages SciPy and NumPy in Python.

Christoph Baumgarten and Tirth Patel
https://doi.org/10.25080/majora-212e5952-007
atoMEC: An open-source average-atom Python code

Average-atom models are an important tool in studying matter under extreme conditions, such as those conditions experienced in planetary cores, brown and white dwarfs, and during inertial confinement fusion.

Timothy J. Callow, Daniel Kotik, Eli Kraisler et al.
https://doi.org/10.25080/majora-212e5952-006
Monaco: A Monte Carlo Library for Performing Uncertainty and Sensitivity Analyses

This paper introduces *monaco*, a Python library for conducting Monte Carlo simulations of computational models, and performing uncertainty analysis (UA) and sensitivity analysis (SA) on the results.

W. Scott Shambaugh
https://doi.org/10.25080/majora-212e5952-024
A Python Pipeline for Rapid Application Development (RAD)

Rapid Application Development (RAD) is the ability to rapidly prototype an interactive interface through frequent feedback, so that it can be quickly deployed and delivered to stakeholders and customers.

Scott D. Christensen, Marvin S. Brown, Robert B. Haehnel et al.
https://doi.org/10.25080/majora-212e5952-023
Variational Autoencoders For Semi-Supervised Deep Metric Learning

Deep metric learning (DML) methods generally do not incorporate unlabelled data. We propose borrowing components of the variational autoencoder (VAE) methodology to extend DML methods to train on semi-supervised datasets.

Nathan Safir, Meekail Zain, Curtis Godwin et al.
https://doi.org/10.25080/majora-212e5952-022
Wailord: Parsers and Reproducibility for Quantum Chemistry

Data driven advances dominate the applied sciences landscape, with quantum chemistry being no exception to the rule. Dataset biases and human error are key bottlenecks in the development of reproducible and generalized insights.

Rohit Goswami
https://doi.org/10.25080/majora-212e5952-021
RocketPy: Combining Open-Source and Scientific Libraries to Make the Space Sector More Modern and Accessible

In recent years the space sector has seen exponential growth, with new companies emerging in it. In addition, more people are eager to take part in the aerospace revolution, which motivates students and hobbyists to build more High Powered and Sounding Rockets.

João Lemes Gribel Soares, Mateus Stano Junqueira, Oscar Mauricio Prada Ramirez et al.
https://doi.org/10.25080/majora-212e5952-020
Improving PyDDA's atmospheric wind retrievals using automatic differentiation and Augmented Lagrangian methods

Meteorologists require information about the spatiotemporal distribution of winds in thunderstorms in order to analyze how physical and dynamical processes govern thunderstorm evolution. Knowledge of such processes is vital for predicting severe and hazardous weather events.

Robert Jackson, Rebecca Gjini, Sri Hari Krishna Narayanan et al.
https://doi.org/10.25080/majora-212e5952-01f
pyDAMPF: a Python package for modeling mechanical properties of hygroscopic materials under interaction with a nanoprobe
Willy Menacho, Gonzalo Marcelo Ramírez-Ávila, and Horacio V. Guzman
https://doi.org/10.25080/majora-212e5952-01e
popmon: Analysis Package for Dataset Shift Detection

popmon is an open-source Python package to check the stability of a tabular dataset.

Simon Brugman, Tomas Sostak, Pradyot Patil et al.
https://doi.org/10.25080/majora-212e5952-01d
Experience report of physics-informed neural networks in fluid simulations: pitfalls and frustration

Though PINNs (physics-informed neural networks) are now deemed a complement to traditional CFD (computational fluid dynamics) solvers rather than a replacement, their ability to solve the Navier-Stokes equations without given data is still of great interest.

Pi-Yueh Chuang and Lorena A. Barba
https://doi.org/10.25080/majora-212e5952-005
The Geoscience Community Analysis Toolkit: An Open Development, Community Driven Toolkit in the Scientific Python Ecosystem

The Geoscience Community Analysis Toolkit (GeoCAT) team develops and maintains data analysis and visualization tools on structured and unstructured grids for the geosciences community in the Scientific Python Ecosystem (SPE).

Orhan Eroglu, Anissa Zacharias, Michaela Sizemore et al.
https://doi.org/10.25080/majora-212e5952-01c
Design of a Scientific Data Analysis Support Platform

Software data analytic workflows are a critical aspect of modern scientific research and play a crucial role in testing scientific hypotheses.

Nathan Martindale, Jason Hite, Scott Stewart et al.
https://doi.org/10.25080/majora-212e5952-01b
Temporal Word Embeddings Analysis for Disease Prevention

Human languages' semantics and structure constantly change over time through mediums such as culturally significant events. By viewing the semantic changes of words during notable events, contexts of existing and novel words can be predicted for similar, current events.

Nathan Jacobi, Ivan Mo, Albert You et al.
https://doi.org/10.25080/majora-212e5952-01a
Global optimization software library for research and education

Machine learning models are often represented by functions given by computer programs. Optimization of such functions is a challenging task because traditional derivative based optimization methods with guaranteed convergence properties cannot be used.

Nadia Udler
https://doi.org/10.25080/majora-212e5952-019
Phylogeography: Analysis of genetic and climatic data of SARS-CoV-2

As the SARS-CoV-2 pandemic reaches its peak, researchers around the globe are combining efforts to investigate the genetics of different variants to better deal with its distribution. This paper discusses phylogeographic approaches to examine how patterns of divergence within SARS-CoV-2 coincide with geographic features, such as climatic conditions.

Aleksandr Koshkarov, Wanlin Li, My-Linh Luu et al.
https://doi.org/10.25080/majora-212e5952-018
Search for Extraterrestrial Intelligence: GPU Accelerated TurboSETI

A common technique adopted by the Search For Extraterrestrial Intelligence (SETI) community is monitoring electromagnetic radiation for signs of extraterrestrial technosignatures using ground-based radio observatories.

Luigi Cruz, Wael Farah, and Richard Elkins
https://doi.org/10.25080/majora-212e5952-004
pyAudioProcessing: Audio Processing, Feature Extraction, and Machine Learning Modeling

pyAudioProcessing is a Python based library for processing audio data, constructing and extracting numerical features from audio, building and testing machine learning models, and classifying data with existing pre-trained audio classification models or custom user-built models.

Jyotika Singh
https://doi.org/10.25080/majora-212e5952-017
A New Python API for Webots Robotics Simulations

Webots is a popular open-source package for 3D robotics simulations. It can also be used as a 3D interactive environment for other physics-based modeling, virtual reality, teaching or games. Webots has provided a simple API allowing Python programs to control robots and/or the simulated world, but this API is inefficient and does not provide many "pythonic" conveniences.

Justin C. Fisher
https://doi.org/10.25080/majora-212e5952-016
poliastro: a Python library for interactive astrodynamics

Space is more popular than ever, with the growing public awareness of interplanetary scientific missions, as well as the increasingly large number of satellite companies planning to deploy satellite constellations.

Juan Luis Cano Rodríguez and Jorge Martínez Garrido
https://doi.org/10.25080/majora-212e5952-015
Likeness: a toolkit for connecting the social fabric of place to human dynamics

The ability to produce richly-attributed synthetic populations is key for understanding human dynamics, responding to emergencies, and preparing for future events, all while protecting individual privacy. The Likeness toolkit accomplishes these goals.

Joseph V. Tuccillo and James D. Gaboardi
https://doi.org/10.25080/majora-212e5952-014
Keeping your Jupyter notebook code quality bar high (and production ready) with Ploomber

This paper walks through the ploomber interactive tutorial.

Ido Michael
https://doi.org/10.25080/majora-212e5952-013
Awkward Packaging: building Scikit-HEP

Scikit-HEP has grown rapidly over the last few years, not just to serve the needs of the High Energy Physics (HEP) community, but in many ways, the Python ecosystem at large. AwkwardArray, boost-histogram/hist, and iminuit are examples of libraries that are used beyond the original HEP focus. In this paper we will look at key packages in the ecosystem.

Henry Schreiner, Jim Pivarski, and Eduardo Rodrigues
https://doi.org/10.25080/majora-212e5952-012
Incorporating Task-Agnostic Information in Task-Based Active Learning Using a Variational Autoencoder

It is often much easier and less expensive to collect data than to label it. Active learning (AL) responds to this issue by selecting which unlabeled data are best to label next.

Curtis Godwin, Meekail Zain, Nathan Safir et al.
https://doi.org/10.25080/majora-212e5952-011
Codebraid Preview for VS Code: Pandoc Markdown Preview with Jupyter Kernels

Codebraid Preview is a VS Code extension that provides a live preview of Pandoc Markdown documents with optional support for executing embedded code. Unlike typical Markdown previews, all Pandoc features are fully supported because Pandoc itself generates the preview.

Geoffrey M. Poore
https://doi.org/10.25080/majora-212e5952-010
Pylira: deconvolution of images in the presence of Poisson noise

All physical and astronomical imaging observations are degraded by the finite angular resolution of the camera and telescope systems. The recovery of the true image is limited by both how well the instrument characteristics are known and by the magnitude of measurement noise.

Axel Donath, Aneta Siemiginowska, Vinay Kashyap et al.
https://doi.org/10.25080/majora-212e5952-00f
Python vs. the pandemic: a case study in high-stakes software development

When it became clear in early 2020 that COVID-19 was going to be a major public health threat, politicians and public health officials turned to academic disease modelers like us for urgent guidance.

Cliff C. Kerr, Robyn M. Stuart, Dina Mistry et al.
https://doi.org/10.25080/majora-212e5952-00e
Bayesian Estimation and Forecasting of Time Series in statsmodels

Statsmodels, a Python library for statistical and econometric analysis, has traditionally focused on frequentist inference, including in its models for time series data.

Chad Fulton
https://doi.org/10.25080/majora-212e5952-00d
USACE Coastal Engineering Toolkit and a Method of Creating a Web-Based Application

In the early 1990s the Automated Coastal Engineering Systems, ACES, was created with the goal of providing state-of-the-art computer-based tools to increase the accuracy, reliability, and cost-effectiveness of Corps coastal engineering endeavors.

Amanda Catlett, Theresa R. Coumbe, Scott D. Christensen et al.
https://doi.org/10.25080/majora-212e5952-003
Papyri: better documentation for the scientific ecosystem in Jupyter

We present here the idea behind Papyri, a framework we are developing to provide a better documentation experience for the scientific ecosystem.

Matthias Bussonnier and Camille Carvalho
https://doi.org/10.25080/majora-212e5952-00c
Python for Global Applications: teaching scientific Python in context to law and diplomacy students

For students across domains and disciplines, the message has been communicated loud and clear: data skills are an essential qualification for today’s job market.

Anna Haensch and Karin Knudson
https://doi.org/10.25080/majora-212e5952-00b
The myth of the normal curve and what to do about it

Reliance on the normal curve as a tool for measurement is almost a given. It shapes our grading systems, our measures of intelligence, and importantly, it forms the mathematical backbone of many of our inferential statistical tests and algorithms.

Allan Campopiano
https://doi.org/10.25080/majora-212e5952-00a
A Novel Pipeline for Cell Instance Segmentation, Tracking and Motility Classification of Toxoplasma Gondii in 3D Space

Toxoplasma gondii is the parasitic protozoan that causes disseminated toxoplasmosis, a disease that is estimated to infect around one-third of the world's population. TSeg is developed for segmenting, tracking, and classifying the motility phenotypes of T. gondii in 3D microscopic images.

Seyed Alireza Vaezi, Gianni Orlando, Mojtaba Fazli et al.
https://doi.org/10.25080/majora-212e5952-009
Utilizing SciPy and other open source packages to provide a powerful API for materials manipulation in the Schrödinger Materials Suite

The use of several open source scientific packages in the Schrödinger Materials Science Suite will be discussed.

Alexandr Fonari, Farshad Fallah, and Michael Rauch
https://doi.org/10.25080/majora-212e5952-008
Galyleo: A General-Purpose Extensible Visualization Solution

Galyleo is an open-source, extensible dashboarding solution integrated with JupyterLab.

Rick McGeer, Andreas Bergen, Mahdiyar Biazi et al.
https://doi.org/10.25080/majora-212e5952-002
Semi-Supervised Semantic Annotator (S3A): Toward Efficient Semantic Labeling

Most semantic image annotation platforms suffer severe bottlenecks when handling large images, complex regions of interest, or numerous distinct foreground regions in a single image. We have developed the Semi-Supervised Semantic Annotator (S3A) to address each of these issues and facilitate rapid collection of ground truth pixel-level labeled data.

Nathan Jessurun, Daniel E. Capecci, Olivia P. Dizon-Paradis et al.
https://doi.org/10.25080/majora-212e5952-001
The Advanced Scientific Data Format (ASDF): An Update

We report on progress in developing and extending the new Advanced Scientific Data Format (ASDF), which we have developed for the data from the James Webb and Nancy Grace Roman Space Telescopes, since we reported on it at a previous SciPy conference.

Perry Greenfield, Edward Slavich, William Jamieson et al.
https://doi.org/10.25080/majora-212e5952-000