Posters and Slides

Accepted Paper Slides

It’s Time for the Atmospheric Science Community to ACT Together

The Atmospheric data Community Toolkit (ACT) is an open-source Python library for working with n-dimensional atmospheric time-series datasets. ACT contains functions for every aspect of the research lifecycle.

Adam Theisen

10.25080/majora-1b6fd038-020

Adopting static typing in scientific projects

Are you interested in adding typing to your existing codebase, but are not sure how to get started? Are you worried about managing the typing process without pausing your project’s development?

In this talk, we’ll embrace the fact that a large project’s transition toward typing will likely happen over the course of many months, concurrently with ongoing development. However, that doesn’t mean that getting started with typing has to be difficult! We’ll share with you two examples of adopting typing in existing open-source codebases (100k and 40k lines of Python). We’ll particularly focus on the typing experience from the perspective of project maintainers, contributors, and users of these Python libraries.

We will discuss useful tools and strategies, surprising difficulties, the types of bugs and errors we found, and how the addition of typing changes the overall development experience. By the end of this talk, you’ll be able to confidently manage the migration toward typing in your own codebase.

Predrag Gruevski, Colin Carroll

10.25080/majora-1b6fd038-021
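
As a rough illustration of the kind of incremental typing the talk above describes, the sketch below annotates two small functions. It is not taken from the talk or from the codebases it mentions; the function names `mean_or_none` and `as_percent` are hypothetical and chosen only for this example.

```python
from typing import Optional, Sequence


def mean_or_none(values: Sequence[float]) -> Optional[float]:
    """Return the arithmetic mean, or None for an empty sequence."""
    if not values:
        return None
    return sum(values) / len(values)


def as_percent(fractions: Sequence[float]) -> str:
    """Format the mean of some fractions as a percentage string."""
    mean = mean_or_none(fractions)
    # Because the return type is Optional[float], a type checker such as mypy
    # reports arithmetic on a possibly-None value unless None is ruled out first.
    if mean is None:
        return "no data"
    return f"{mean * 100:.1f}%"
```

Annotating one function at a time like this lets a checker surface unhandled `None` cases without requiring the rest of the module to be typed first.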

cuCIM - A GPU image I/O and processing library

A presentation introducing RAPIDS cuCIM, a library for image I/O and processing on GPUs

Gregory R. Lee, Gigon Bae, Benjamin Zaitlen, John Kirkham, Rahul Choudhury

10.25080/majora-1b6fd038-022

Distributed statistical inference with pyhf powered by funcX

In high energy physics (HEP), a core component of the analysis of data collected at the Large Hadron Collider is performing statistical inference for binned models to extract physics information. The statistical fitting tools used in HEP have traditionally been implemented in C++, but in recent years pyhf, a pure-Python library with automatic differentiation and hardware acceleration, has grown in use for analysis-related statistical inference problems. Fitting multiple different hypotheses for new physics signatures (signals) is a computational problem that lends itself easily to parallelization, but in HPC environments it is hampered by the additional tooling overhead required, which can be very difficult to master. Through use of funcX, a pure-Python high performance function serving system designed to orchestrate scientific workloads across heterogeneous computing resources, pyhf can be used as a highly scalable (fitting) function as a service (FaaS) on HPC systems.

Matthew Feickert

10.25080/majora-1b6fd038-023

Accepted Posters

Towards a Scientific Workflow Description: a yt Project Prototype for Interdisciplinary Analysis

A scientific workflow description offers an alternative to the cognitive overhead of learning a new software package and to the imperative programming paradigms often used with Python. This description is encoded in a JSON schema, accessed by the user through a configuration file, and run using Python modules that attach the configuration file to the code which produces output. We use yt, a computational astrophysics tool, to demonstrate how domain-specific software can operate within a descriptive framework.

Samantha Walkow, Dr. Chris Havlin, Dr. Matthew Turk, Dr. Corentin Cadiou

10.25080/majora-1b6fd038-017
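
The declarative idea in the abstract above, where a configuration file names what to compute and Python modules carry it out, can be sketched with the standard library alone. This is a minimal sketch and not the yt prototype: the `OPERATIONS` registry, the `run_workflow` function, and the configuration keys (`data`, `steps`, `name`, `operation`) are all assumptions made for illustration, and no JSON schema validation is performed.

```python
import json


def _mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical registry mapping operation names in the config to Python callables.
OPERATIONS = {"mean": _mean, "max": max, "min": min}


def run_workflow(config_text):
    """Run every step named in a JSON configuration and collect the results."""
    config = json.loads(config_text)
    data = config["data"]
    results = {}
    for step in config["steps"]:
        operation = step["operation"]
        if operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {operation!r}")
        results[step["name"]] = OPERATIONS[operation](data)
    return results


# Hypothetical configuration: the user states what they want, not how to compute it.
CONFIG = """
{
  "data": [1.0, 2.0, 3.0],
  "steps": [
    {"name": "average", "operation": "mean"},
    {"name": "peak", "operation": "max"}
  ]
}
"""
```

Calling `run_workflow(CONFIG)` returns a dictionary keyed by step name, so the user never writes the loop or the function calls directly.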

Using Python for Analysis and Verification of Mixed-mode Signal Chains for Analog Signal Acquisition

Accurate, precise, and low-noise sensor measurements are essential before any machine can learn about (or artificial-intelligently make decisions about) the physical world. Modern, highly integrated signal acquisition devices can perform analog signal conditioning, digitization, and digital filtering on a single silicon device, greatly simplifying system electronics. However, a complete understanding of the signal chain properties is still required to correctly utilize and debug these modern devices.

Mark Thoren, Cristina Suteu

10.25080/majora-1b6fd038-018

Speeding Up Molecular Dynamics Trajectory Analysis with MPI Parallelization

Edis Jakupovic, Oliver Beckstein

10.25080/majora-1b6fd038-019

Social Media Analysis using Natural Language Processing Techniques

Social media is used every day for viewing and posting content, which in turn influences people around the world in a variety of ways. Social media platforms such as YouTube see a great deal of daily activity in video posting, watching, and commenting. While we can open the YouTube app on our phones and look at videos and comments, that gives only a limited view of what others around us care about and what is trending among other consumers of our favorite topics or videos. Crawling this raw data and analyzing it with Natural Language Processing (NLP) can be tricky given the different styles of language people use today. This effort highlights YouTube’s open Data API and how to use it in Python to get the raw data, data cleaning using NLP techniques and machine learning in Python for social media interactions, and automated extraction of trends and key influential factors from this data using pyYouTubeAnalysis.

Jyotika Singh

10.25080/majora-1b6fd038-01a

Programmatically Identifying Cognitive Biases Present in Software Development

Mitigating bias in AI-enabled systems is a topic of great concern within the research community. We began developing an approach to identify a subset of cognitive biases that may be present in development artifacts (e.g., version control commit messages): anchoring bias, availability bias, confirmation bias, and hyperbolic discounting. We developed multiple natural language processing (NLP) models to identify and classify the presence of bias in text originating from software development artifacts.

Amanda E. Kraft, Matthew Widjaja, Trevor M. Sands, Brad J. Galego

10.25080/majora-1b6fd038-01b

Visualize 3D scientific data in a Pythonic way like matplotlib

Do you want to visualize 3D scientific data in a Pythonic way, as you would with matplotlib? If so, this poster is for you: it is an introduction to PyVista.

Tetsuo Koyama

10.25080/majora-1b6fd038-01c

causal-curve: tools to perform causal inference given a continuous treatment

There are a multitude of scenarios in both research and industry where it would be useful to evaluate the impact of a continuous “treatment” on an outcome of interest within a causal inference framework. Unfortunately, we are not aware of an established Python package that is able to do this. The causal-curve package attempts to fill that gap, providing users with tools to generate causal dose-response curves (AKA causal curves).

Roni Kobrosly

10.25080/majora-1b6fd038-01d

SciPy 2021: An Accurate Implementation of the Studentized Range Distribution for Python

As data becomes more and more accessible, it can be tempting to misuse data analysis techniques to find statistically significant results, a practice known as ‘p-hacking’. Tukey’s HSD (Honestly Significant Difference) test is one of several tests that guard against this practice by using the studentized range distribution to compute p-values that account for the number of comparisons performed. Implementations of Tukey’s HSD already exist within the scientific Python ecosystem, but they rely on approximations of the studentized range distribution that may not behave well outside of their intended range and, even within the intended range, are only accurate to a few digits. In this document, we present a fast, highly accurate, and direct implementation of the studentized range distribution for SciPy, and we demonstrate its speed and accuracy.

Samuel Wallan, Dominic Chmiel, Matt Haberland

10.25080/majora-1b6fd038-01e
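
To make the role of the studentized range in Tukey’s HSD concrete, the sketch below computes the test statistic q for every pair of groups using only the standard library. It is an illustration of the textbook statistic for equally sized groups, not the authors’ implementation, and the function name `studentized_range_statistics` is hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def studentized_range_statistics(groups):
    """Return {(i, j): q} for every pair of equally sized groups."""
    sizes = {len(group) for group in groups}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equally sized groups")
    n = sizes.pop()
    k = len(groups)
    if n < 2 or k < 2:
        raise ValueError("need at least two groups of at least two values")

    means = [mean(group) for group in groups]
    # Pooled within-group variance (mean squared error) with k * (n - 1)
    # degrees of freedom.
    sse = sum((x - m) ** 2 for group, m in zip(groups, means) for x in group)
    mse = sse / (k * (n - 1))
    standard_error = sqrt(mse / n)

    # q is the absolute difference of two group means in units of the
    # standard error of a single group mean.
    return {
        (i, j): abs(means[i] - means[j]) / standard_error
        for i, j in combinations(range(k), 2)
    }
```

The p-value for each pair is then the upper tail probability of the studentized range distribution at q, with k groups and k(n - 1) degrees of freedom. SciPy 1.7 and later expose that distribution as `scipy.stats.studentized_range`, which is the step this sketch leaves out.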

Cell Tracking in 3D using Deep Learning Segmentations

Live-cell imaging is a widely used technique to study cell migration and dynamics over time. Automated analysis of fluorescently membrane-labelled cells can be highly challenging due to their irregular shape, variability in size, and dynamic movement across Z planes, which make them difficult to detect and track. We introduce a detailed analysis pipeline to perform segmentation with accurate shape information, combined with BTrackmate, a customized codebase of the popular ImageJ/Fiji software Trackmate, to perform cell tracking inside the tissue of interest. We also created an interface in Napari to visualize the tracks along a chosen view, making it possible to follow a cell along the plane of motion. We provide a detailed protocol to implement this pipeline in a new dataset, together with the required Jupyter notebooks.

Varun Kapoor, Claudia Carabana

10.25080/majora-1b6fd038-01f

SciPy Tools Plenaries

Awkward Array

Tools update on Awkward Array.

Jim Pivarski

10.25080/majora-1b6fd038-024

SciPy Tools Plenary on Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. This presentation summarizes changes over the past year, new features, and future plans.

Elliott Sales de Andrade

10.25080/majora-1b6fd038-025

NumPy – Annual Update

Presentation about the highlights and milestones of the NumPy project in 2020-2021

Inessa Pawson

10.25080/majora-1b6fd038-026

SciPy Tools Plenary: Jupyter Updates

Project Jupyter creates open source software, standards, and services for interactive computing. This presentation covers recent milestones and ideas for people to contribute across the Jupyter ecosystem.

Isabela Presedo-Floyd, Matthias Bussonnier

10.25080/majora-1b6fd038-027

Scientific Python Ecosystem Coordination

Planning for the Next Decade of Scientific Python: outline of first phase

K. Jarrod Millman, Stéfan van der Walt

10.25080/majora-1b6fd038-028

SciPy: SciPy 2021 Tools Track

2021 updates and outlooks in SciPy

Pamphile T. Roy

10.25080/majora-1b6fd038-029

SciPy Tools Plenary: scikit-image annual update

A brief update on recent improvements and future plans for scikit-image.

Gregory R. Lee

10.25080/majora-1b6fd038-02a

Lightning Talks

Social Media Analysis using Natural Language Processing Techniques

Demonstration of social media noise and cleaning methods, followed by trend analysis on YouTube with NLP and statistics using pyYouTubeAnalysis.

Jyotika Singh

10.25080/majora-1b6fd038-015

seaborn-image: image data visualization in Python

High-level API for attractive and descriptive image visualization in Python, built on top of matplotlib.

Sarthak Jariwala

10.25080/majora-1b6fd038-016