Proceedings of SciPy 2015
SciPy 2015, the 14th annual Scientific Computing with Python conference, was held July 6-12, 2015 in Austin, Texas. 30 peer reviewed articles were published in the conference proceedings. Full proceedings and organizing committee can be found at https://
The notion of capturing each execution of a script and workflow and its associated metadata is enormously appealing and should be at the heart of any attempt to make scientific simulations repeatable and reproducible.
This article presents an open-source Python software package, dubbed RTGraph, to visualize, process and record physiological signals (electrocardiography, electromyography, etc.) in real-time. RTGraph has a multiprocess architecture.
We introduce BigBang, a new Python toolkit for analyzing online collaborative communities such as those that build open source software. Mailing lists serve as critical communications infrastructure for many communities, including several of the open source software development communities that build scientific Python packages.
The humble mathematical relation, a fundamental (if implicit) component in computational algorithms, is conspicuously absent in most standard container collections, including Python’s. In this paper, we present the basics of a relation container, and why you might use it instead of other methods.
In this paper we demonstrate how Python can be used throughout the entire life cycle of a graduate program in Data Science. In interdisciplinary fields, such as Data Science, the students often come from a variety of different backgrounds where, for example, some students may have strong mathematical training but less experience in programming.
Advances in sequencing, proteomics, transcriptomics and metabolomics are giving us new insights into the microbial world and dramatically improving our ability to understand microbial community composition and function at high resolution.
The deformation of the Earth surface reflects the action of several forces that act inside the planet. To understand how the Earth surface evolves complex models must be built to reconcile observations with theoretical numerical simulations.
Probabilistic graphical models are useful tools for modeling systems governed by probabilistic structure. Bayesian networks are one class of probabilistic graphical model that have proven useful for characterizing both formal systems and for reasoning with those systems.
TrendVis is a plotting package that uses matplotlib to create information-dense, sparkline-like, quantitative visualizations of multiple disparate data sets in a common plot area against a common variable.
The National Oceanic and Atmospheric Administration (NOAA) Air Resources Laboratory's HYSPLIT (HYbrid Single Particle Lagrangian Transport) model Drax98, Drax97 uses a hybrid Langrangian and Eulerian calculation method to compute air parcel trajectories and particle dispersion and deposition simulations.
Dask enables parallel and out-of-core computation. We couple blocked algorithms with dynamic and memory aware task scheduling to achieve a parallel and out-of-core NumPy clone. We show how this extends the effective scale of modern hardware to larger datasets and discuss how these ideas can be more broadly applied to other parallel collections.
This paper describes a tool for astronomical research implemented as an IPython notebook with a widget interface. The notebook uses Astropy, a community-developed package of fundamental tools for astronomy, and Astropy affiliated packages, as the back end.
Hydrological terrain analysis is important for applications such as environmental resource, agriculture, and flood risk management. It is based on processing of high-resolution, tiled digital elevation model (DEM) data for geographic regions of interest.
This paper will take the audience through the story of how an electrical and computer engineering faculty member has come to embrace Python, in particular IPython Notebook (IPython kernel for Jupyter), as an analysis and simulation tool for both teaching and research in signal processing and communications.
Time series analysis has been a dominant technique for assessing relations within datasets collected over time and is becoming increasingly prevalent in the scientific community; for example, assessing brain networks by calculating pairwise correlations of time series generated from different areas of the brain.
The growing availability of large, multidimensional data sets has created demand for high-performance, interactive visualization tools. VisPy leverages the GPU to provide fast, interactive, and beautiful visualizations in a high-level API.
In this work, a new python package, PyRK (Python for Reactor Kinetics), is introduced. PyRK has been designed to simulate, in zero dimensions, the transient, coupled, thermal-hydraulics and neutronics of time-dependent behavior in nuclear reactors.
Automated telescopes are capable of generating images more quickly than they can be inspected by a human, but detailed information on the performance of the telescope is valuable for monitoring and tuning of their operation.
The structural cohesion model is a powerful sociological conception of cohesion in social groups, but its diffusion in empirical literature has been hampered by computational problems. We present useful heuristics for computing structural cohesion that allow a speed-up of one order of magnitude over the algorithms currently available.
Scientific visualization typically requires large amounts of custom coding that obscures the underlying principles of the work and makes it difficult to reproduce the results. Here we describe how the new HoloViews Python package, when combined with the IPython Notebook and a plotting library, provides a rich, interactive interface for flexible and nearly code-free visualization of your results while storing a full record of the process for later reproduction.
Agent-based modeling is a computational methodology used in social science, biology, and other fields, which involves simulating the behavior and interaction of many autonomous entities, or agents, over time.
BLAS, LAPACK, and other libraries like them have formed the underpinnings of much of the scientific stack in Python. Until now, the standard practice in many packages for using BLAS and LAPACK has been to link each Python extension directly against the libraries needed.
The James Webb Space Telescope (JWST) is the successor to the Hubble Space Telescope (HST) and is currently expected to be launched in late 2018. The Space Telescope Science Institute (STScI) is developing the software systems that will be used to provide routine calibration of the science data received from JWST.
Built on Google App Engine (GAE), RealMassive encountered challenges while attempting to scale its recommendation engine to match its nationwide, multi-market expansion. To address this problem, we borrowed a conceptual model from spectral data processing to transform our domain-specific problem into one that the GAE's search engine could solve.
VTK and ParaView are leading software packages for data analysis and visualization. Since their early years, Python has played an important role in each package. In many use cases, VTK and ParaView serve as modules used by Python applications.
This paper introduces PyEDA, a Python library for electronic design automation (EDA). PyEDA provides both a high level interface to the representation of Boolean functions, and blazingly-fast C extensions for fundamental algorithms where performance is essential.
This document describes version 0.4.0 of librosa: a Python package for audio and music signal processing. At a high level, librosa provides implementations of a variety of common functions used throughout the field of music information retrieval.
We have been involved with teaching Python to biomedical scientists since 2005. In all, seven courses have been taught: 5 at the University of Pittsburgh, as a required course for biomedical informatics graduate students.
Probabilistic Graphical Models (PGM) is a technique of compactly representing a joint distribution by exploiting dependencies between the random variables. It also allows us to do inference on joint distributions in a computationally cheaper way than the traditional methods.
Using data from the National Survey of Family Growth (NSFG), we investigate marriage patterns among women in the United States We describe and predict age at first marriage for successive generations based on decade of birth.