THE PROCEEDINGS

Python in Science Conferences

The SciPy Conference is a cross-disciplinary gathering focused on the use and development of the Python language in scientific research. The event strives to bring together users and developers of scientific tools, as well as academic research and state-of-the-art industry.

From the 2023 Proceedings

Data Reduction Network

Multidimensional categorical data is widespread but not easily visualized using standard methods. For example, questionnaire data generally consists of questions with categorical responses. Popular methods of handling categorical data include one-hot encoding and enumeration, which applies an unwarranted and potentially misleading notional order to the data. To address this, we introduce a novel visualization method named Data Reduction Network.
Haoyin Xu, Haw-minn Lu, José Unpingco
https://doi.org/10.25080/gerudo-f2bc6f59-012

libyt: a Tool for Parallel In Situ Analysis with yt

In the era of exascale computing, storage and analysis of large-scale data have become more important and more difficult. We present libyt, an open source C++ library that allows researchers to analyze and visualize data using yt or other Python packages in parallel during simulation runtime.
Shin-Rong Tsai, Hsi-Yu Schive, Matthew J. Turk
https://doi.org/10.25080/gerudo-f2bc6f59-011

Pandera: Going Beyond Pandas Data Validation

Data quality remains a core concern for practitioners in machine learning, data science, and data engineering, and many specialized packages have emerged to fulfill the need of validating and monitoring data and models. This paper outlines pandera’s motivation and challenges that took it from being a pandas-only data validation framework to one that is extensible to other non-pandas-compliant dataframe-like libraries.
Niels Bantilan
https://doi.org/10.25080/gerudo-f2bc6f59-010

aPhyloGeo-Covid: A Web Interface for Reproducible Phylogeographic Analysis of SARS-CoV-2 Variation using Neo4j and Snakemake

The gene sequencing data, along with the associated lineage tracing and research data generated throughout the Coronavirus disease 2019 (COVID-19) pandemic, constitute invaluable resources that profoundly empower phylogeography research. To optimize the utilization of these resources, we have developed an interactive analysis platform called aPhyloGeo-Covid.
Wanlin Li, Nadia Tahiri
https://doi.org/10.25080/gerudo-f2bc6f59-00f

The annual SciPy Conference allows participants from academic, commercial, and governmental organizations to:

  • showcase their latest Scientific Python projects,
  • learn from skilled users and developers, and
  • collaborate on code development.

The conference generally consists of multiple days of tutorials followed by two to three days of presentations, and concludes with one to two days of developer sprints on projects of interest to the attendees.
