publications
publications, conference proceedings, selected reports, citeable scientific software, ...
2026
- A Novel Model to Monitor and Measure Political and Governance Bias of News Channel's Content on YouTube: Experimental Insights and Reflections. Chopra, Saransh, Saini, Nikunj, Mishra, Deepanshu, Verma, Kritika, Kumar, Sachin, and Verma, Ajit Kumar. Under Review, 2026.
Social media platforms such as Facebook, Twitter (now X), Instagram, and YouTube have emerged as powerful ecosystems for information generation, dissemination, and consumption, engaging vast user bases across the globe. Business corporations and public institutions increasingly leverage these platforms to gauge public sentiment, influence discourse, and market products or services. Simultaneously, media organizations and journalists utilize digital platforms and web portals to propagate news, current affairs, governance initiatives, and political developments, thereby shaping public awareness and opinion. Among these platforms, YouTube has witnessed a significant user shift toward video-centric content, especially during politically sensitive periods such as elections. Despite this trend, there remains a critical research gap in quantifying the impact of YouTube-based news channels on governance narratives and ideological discourse. Notably, YouTube videos have, in several instances, triggered public outrage and mass mobilization, particularly during general and state elections in India. This study investigates the role of YouTube in ideological framing and issue-based discourse during the 2019 Indian General Elections and the 2023 Karnataka State Elections. It introduces a novel dataset comprising videos from the top 10 English news channels (ranked by TRP) over the three months leading up to each election. The research proposes a fine-tuned, text-based classification model capable of accurately and efficiently categorizing YouTube video content as pro-government, anti-government, or neutral. Furthermore, the study presents a temporal analysis of content trends across electoral cycles, highlighting shifts in thematic focus and ideological tone. The proposed model has significant implications for political parties, media strategists, and governance bodies, offering a valuable tool for monitoring public discourse, assessing policy perception, and guiding strategic communication. It also provides a framework for real-time citizen feedback on government initiatives and policy interventions.
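The classification model is described above only at a high level. As a rough, hedged illustration of what a fine-tuned three-class stance classifier looks like, the sketch below uses a Hugging Face-style transformer; the base model, label names, and input text are placeholders, not the paper's actual configuration.

```python
# Hypothetical sketch of a three-class stance classifier for video text;
# the base model and labels are illustrative, not the paper's setup.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["pro-government", "anti-government", "neutral"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)  # would be fine-tuned on the labelled election dataset before use

def classify(text: str) -> str:
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(classify("Government launches new rural employment scheme"))
```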
2025
- A new SymPy backend for Vector: Uniting experimental and theoretical physicists. Chopra, Saransh, and Pivarski, Jim. EPJ Web Conf., 2025.
Vector is a Python library for 2D, 3D, and Lorentz vectors, especially arrays of vectors, to solve common physics problems in a NumPy-like way. Vector can currently perform numerical computations, and through this paper, we introduce a new symbolic backend that extends Vector's utility to theoretical physicists. The numerical backends of Vector enable users to create pure Python objects, NumPy arrays, and Awkward arrays of vectors. The object and Awkward backends are also implemented in Numba to leverage Just-In-Time (JIT) compiled vector calculations. The new symbolic backend, built on top of SymPy expressions, showcases Vector's ability to support a wide range of use cases and allows SymPy methods and functions to work on vector classes. Moreover, apart from a few software packages, high energy physics has maintained a strict separation between the tools used by theorists and those used by experimentalists; Vector's SymPy backend aims to bridge this gap, providing a unified computational framework for both communities.
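To make the backend split concrete, the sketch below builds one numerical object-backend vector and one symbolic SymPy-backend vector; it follows Vector's public API, though the particular values and symbols are made up.

```python
import sympy
import vector

# Numerical object backend: concrete values, physics properties on demand.
v = vector.obj(px=3.0, py=4.0, pz=0.0, E=5.0)
print(v.pt)    # 5.0, the transverse momentum
print(v.mass)  # invariant mass from (E, px, py, pz)

# Symbolic SymPy backend: the same properties as SymPy expressions.
px, py = sympy.symbols("px py", real=True)
s = vector.VectorSympy2D(x=px, y=py)
print(s.rho)   # sqrt(px**2 + py**2), a SymPy expression
```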
- Vector: JIT-compilable mathematical manipulations of ragged Lorentz vectors. Chopra, Saransh, Schreiner, Henry, Rodrigues, Eduardo, Eschle, Jonas, and Pivarski, Jim. Journal of Open Source Software, 2025.
Mathematical manipulation of vectors is a crucial component of data analysis pipelines in high energy physics, enabling physicists to transform raw data into meaningful results that can be visualized. More specifically, high energy physicists work with 2D and 3D Euclidean vectors, and with 4D Lorentz vectors that represent physical quantities such as position, momentum, and force. Given that high energy physics data are not uniform, vector manipulation frameworks or libraries are expected to work readily on non-uniform or ragged data, that is, data with variable-sized rows (or a nested data structure with variable-sized entries); the library is thus expected to perform operations on an entire ragged structure in a minimum number of passes. Furthermore, optimizing memory usage and processing time has become essential with the increasing computational demands at the Large Hadron Collider (LHC), the world's largest particle accelerator. Vector is a Python library for creating and manipulating 2D, 3D, and Lorentz vectors, especially arrays of vectors, to solve common physics problems in a NumPy-like (Harris et al., 2020) way. The library enables physicists to operate on high energy physics data in a high-level language without compromising speed. The library is already in use at the LHC and is part of frameworks, such as Coffea (Gray et al., 2023), employed by physicists across multiple high energy physics experiments.
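To make the "ragged" part concrete, the sketch below zips variable-length Awkward Arrays into momentum vectors and operates on the whole nested structure at once; the numbers are invented, but the vector.zip/register_awkward idiom is the library's documented usage.

```python
import awkward as ak
import vector

vector.register_awkward()  # give named Awkward records vector behaviors

# Two events with different numbers of muons: a ragged structure.
muons = vector.zip(
    {
        "pt": ak.Array([[42.0, 33.1], [7.8]]),
        "eta": ak.Array([[1.1, -0.8], [2.2]]),
        "phi": ak.Array([[0.3, 1.9], [-2.4]]),
        "mass": ak.Array([[0.106, 0.106], [0.106]]),
    }
)

print(muons.pt)  # ragged transverse momenta, computed in one pass

# Invariant mass of every muon pair in every event, no Python loops.
pairs = ak.combinations(muons, 2, fields=["a", "b"])
print((pairs.a + pairs.b).mass)
```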
- Design and Implementation of On-Board Autonomy for the CHESS Flight Software. Chopra, Saransh. EPFL Semester Project Report, 2025.
The space sector has recently witnessed a boom in CubeSats, a class of nano-satellites measuring 10 cm × 10 cm × 10 cm (1U) or a multiple of it. The easy accessibility and cost-effectiveness of CubeSats have enabled people from academia, industry, and other sectors to build and launch satellites as secondary payloads on a bigger launch vehicle, facilitating scientific research in space. EPFL Spacecraft Team (EST)'s flagship mission, Constellation of High-Energy Swiss Satellites (CHESS), aims to launch two 3U CubeSats on two distinct orbits (circular and elliptical) around Earth to conduct spectroscopic analysis of Earth's exospheric composition. In orbit, a CubeSat's operations are governed by its Flight Software (FS). This software is responsible for interfacing with all hardware sub-systems, managing communications with ground stations, processing telecommands and telemetry, and ensuring overall system robustness through Fault Detection, Isolation, and Recovery. The On-Board Autonomy of the FS is responsible for maintaining nominal operations in space without ground intervention, making automated decisions, and stabilizing the satellite in the event of unexpected faults. This project sets up the FS for CHESS and designs and implements the On-Board Autonomy (EventAction) for the CHESS mission. The FS implemented as part of this project will be deployed on Pathfinder 0, the first fully-integrated test satellite, planned to launch in Low Earth Orbit in 2027. Concretely, this project contributes the following to EST's CHESS mission (a sketch of the trigger-handling logic follows this list):
1. The initial infrastructure and design of the CHESS FS.
2. An implementation of EventAction as an F´ (F Prime) component.
3. A Finite State Machine governing the satellite's global operating modes and managing transitions between them.
4. A design for communication between EventAction and the different FS sub-system managers via Triggers, created by a continuous stream of state-change-worthy information.
5. A computational algorithm to process incoming triggers, make meaningful decisions, and execute appropriate responses, including transitioning to a global SAFE state or its sub-states and initiating stabilization procedures pending ground intervention.
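The sketch below conveys the decision-making shape of the trigger-driven mode machine described above. The actual EventAction is an F´ (C++) component; this Python version, with invented mode names and trigger severities, is illustrative only.

```python
# Illustrative only: the real EventAction is an F´ component in C++.
# Mode names and trigger severities here are invented for the sketch.
from enum import Enum, auto


class Mode(Enum):
    NOMINAL = auto()
    SAFE = auto()            # global SAFE state, awaiting ground intervention
    SAFE_STABILIZE = auto()  # SAFE sub-state: stabilization in progress


class EventAction:
    """Consumes a stream of triggers and drives global mode transitions."""

    def __init__(self) -> None:
        self.mode = Mode.NOMINAL

    def on_trigger(self, source: str, severity: int) -> None:
        # A trigger is any state-change-worthy report from a sub-system manager.
        if severity >= 2 and self.mode is Mode.NOMINAL:
            self.mode = Mode.SAFE  # fault detected: fall back to SAFE
        if source == "adcs" and self.mode is Mode.SAFE:
            self.mode = Mode.SAFE_STABILIZE  # attitude fault: stabilize first


ea = EventAction()
ea.on_trigger(source="eps", severity=2)   # power fault -> SAFE
ea.on_trigger(source="adcs", severity=3)  # tumbling -> SAFE_STABILIZE
print(ea.mode)  # Mode.SAFE_STABILIZE
```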
- Towards handling 10Pb/s of data through Machine Learning at CERN's Large Hadron Collider. Derme, Francesco, Fumagalli, Pietro, and Chopra, Saransh. EPFL Machine Learning for Science Project Report, 2025.
High Energy Physics (HEP) experiments, such as the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN), produce petabytes of data every second. Physicists are now actively integrating Machine Learning techniques into various parts of the pipeline to collect and analyze this data. Given the massive scale of these experiments, and the upcoming High Luminosity upgrade to the LHC (HL-LHC), it is becoming increasingly important to accelerate the inference of ML models beyond the supported capabilities of present-day frameworks. The System for Optimized Fast Inference code Emit (SOFIE) is a C++ library developed by CERN for fast ML inference. SOFIE parses a trained ML model into a highly-optimized C++ function, making it possible to run the inference process with minimal overhead and dependencies. CERN's Machine Learning For Experimental Physics team has recently been experimenting with adding heterogeneous computing support to SOFIE using the Alpaka library, allowing it to run inference on any device (including GPUs) while maintaining a single codebase. This paper extends SOFIE's Alpaka backend with four new kernels, and adds related tests and documentation, allowing SOFIE to support inference on GPUs for more ML models. It further benchmarks the newly added operators against PyTorch implementations to showcase the increase in performance and the readiness to be used at scale.
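For context, "parsing a trained ML model into a C++ function" with SOFIE follows the pattern of ROOT's published tutorials, sketched below; the file names are placeholders, and this snippet shows SOFIE's general usage rather than the Alpaka kernels added by this project.

```python
# Pattern from ROOT's SOFIE tutorials; file names are placeholders.
import ROOT

# Parse a trained ONNX model into SOFIE's intermediate representation.
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
model = parser.Parse("trained_model.onnx")

# Emit a self-contained, dependency-light C++ inference function.
model.Generate()
model.OutputGenerated("trained_model.hxx")
# The generated header compiles into any C++ pipeline; the Alpaka
# backend additionally lets the generated kernels run on GPUs.
```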
2024
- PyHEP.dev 2024 Workshop Summary Report, August 26-30 2024, Aachen, Germany. Alshehri, Azzah, Bürger, Jan, Chopra, Saransh, Eich, Niclas, Eppelt, Jonas, Erdmann, Martin, Eschle, Jonas, Fackeldey, Peter, Farkas, Maté, Feickert, Matthew, Fillinger, Tristan, Fischer, Benjamin, Gerlach, Lino Oscar, Hartmann, Nikolai, Heidelbach, Alexander, Held, Alexander, Ivanov, Marian I., Molina, Josué, Nikitenko, Yaroslav, Osborne, Ianna, Padulano, Vincenzo Eduardo, Pivarski, Jim, Praz, Cyrille, Rieger, Marcel, Rodrigues, Eduardo, Shadura, Oksana, Smieško, Juraj, Stark, Giordon Holtsberg, Steinfeld, Judith, and Warkentin, Angela. 2024.
The second PyHEP.dev workshop, part of the “Python in HEP Developers” series organized by the HEP Software Foundation (HSF), took place in Aachen, Germany, from August 26 to 30, 2024. This gathering brought together nearly 30 Python package developers, maintainers, and power users to engage in informal discussions about current trends in Python, with a primary focus on analysis tools and techniques in High Energy Physics (HEP). The workshop agenda encompassed a range of topics, such as defining the scope of HEP data analysis, exploring the Analysis Grand Challenge project, evaluating statistical models and serialization methods, assessing workflow management systems, examining histogramming practices, and investigating distributed processing tools like RDataFrame, Coffea, and Dask. Additionally, the workshop dedicated time to brainstorming the organization of future PyHEP.dev events, upholding the tradition of alternating between Europe and the United States as host locations. This document, prepared by the session conveners in the weeks following the workshop, serves as a summary of the key discussions, salient points, and conclusions that emerged.
- Predicting efficacy of antiseizure medication treatment with machine learning algorithms in North Indian population. Kaushik, Mahima, Mahajan, Siddhartha, Machahary, Nitin, Thakran, Sarita, Chopra, Saransh, Tomar, Raj Vardhan, Kushwaha, Suman S., Agarwal, Rachna, Sharma, Sangeeta, Kukreti, Ritushree, and Biswal, Bibhu. Epilepsy Research, 2024.
Purpose: This study aimed to develop a classifier using supervised machine learning to effectively assess the impact of clinical, demographic, and biochemical factors in accurately predicting antiseizure medication (ASM) treatment response in people with epilepsy (PWE).
Methods: Data were collected from 786 PWE at the Outpatient Department of Neurology, Institute of Human Behavior and Allied Sciences (IHBAS), New Delhi, India from 2005 to 2015. Patients were followed up at the 2nd, 4th, 8th, and 12th month over the span of 1 year for the drugs being administered and their dosage, the serum drug levels, the frequency of seizure control, drug efficacy, the adverse drug reactions (ADRs), and their compliance with ASMs. Several features, including demographic details, medical history, and auxiliary examinations (electroencephalogram (EEG) or Computed Tomography (CT)), were chosen to discern between patients with distinct remission outcomes. Remission outcomes were categorized into "good responder (GR)" and "poor responder (PR)" based on the number of seizures experienced by the patients over the study duration. Our dataset was utilized to train seven classical machine learning algorithms, i.e., Extreme Gradient Boost (XGB), K-Nearest Neighbor (KNN), Support Vector Classifier (SVC), Decision Tree (DT), Random Forest (RF), Naïve Bayes (NB), and Logistic Regression (LR), to construct classification models.
Results: Our research findings indicate that 1) among the seven algorithms examined, XGB and SVC demonstrated superior predictive performance for ASM treatment outcomes, with an accuracy of 0.66 each and ROC-AUC scores of 0.67 (XGB) and 0.66 (SVC) in distinguishing between PR and GR patients; 2) the most influential factors in discerning PR from GR patients are a family history of seizures (no), education (literate), and multitherapy, with Chi-square (χ2) values of 12.1539, 8.7232, and 13.620 and odds ratios (OR) of 2.2671, 0.4467, and 1.9453, respectively; and 3) our surrogate analysis revealed that the null hypothesis for both XGB and SVC was rejected at a 100% confidence level, underscoring the significance of their predictive performance. These findings underscore the robustness and reliability of XGB and SVC in our predictive modelling framework.
Significance: Utilizing XGB- and SVC-based machine learning classifiers, we successfully forecasted the likelihood of a patient's response to ASM treatment, categorizing them as either PR or GR, post-completion of standard epilepsy examinations. The classifiers' predictions were found to be statistically significant, suggesting their potential utility in improving treatment strategies, particularly in the personalized selection of ASM regimens for individual epilepsy patients.
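As a rough illustration of the modelling setup (synthetic stand-in data, not the IHBAS cohort or its features), the two best-performing algorithms could be compared as in the sketch below.

```python
# Illustrative sketch of the GR/PR classification setup; features and data
# here are synthetic stand-ins, not the study's clinical dataset.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from xgboost import XGBClassifier

X, y = make_classification(n_samples=786, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, clf in [
    ("XGB", XGBClassifier(eval_metric="logloss")),
    ("SVC", SVC(probability=True)),
]:
    clf.fit(X_tr, y_tr)
    proba = clf.predict_proba(X_te)[:, 1]  # P(good responder)
    print(name, accuracy_score(y_te, clf.predict(X_te)), roc_auc_score(y_te, proba))
```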
- Computational upgrades to the high energy physics data analysis pipeline for future LHC/HL-LHC runs. Chopra, Saransh. Bachelor Thesis, 2024.
High Energy Physics experiments, such as the Large Hadron Collider at CERN, produce petabytes of data every year. Physicists require scalable and efficient scientific software to analyze and perform physics on the obtained data. The initial frameworks and scientific software developed for analyzing HEP data, such as ROOT, GEANT4, and BOOST, were written in C and C++; hence, such software had, and still has, a steep learning curve, especially for physicists with no programming background. Multiple HEP ecosystems have since emerged in languages that are comparatively easy to pick up, such as the IRIS-HEP ecosystem in Python. This thesis began as a bid to implement the remaining pieces of Automatic Differentiation (AD) in Awkward Arrays, Vector, and Coffea, but soon expanded to multiple other computational upgrades to the IRIS-HEP ecosystem. More specifically, this thesis extends the support of AD in Awkward Arrays, implements the Unified Histogram Interface for rebinning in boost-histogram, migrates Coffea's vector algebra backend to Scikit-HEP/vector, and implements a symbolic backend in Vector. The work also includes several computational upgrades specifically in Vector to serve its rapidly growing user base. Finally, this thesis includes the development of a new Python package, cuda-histogram, to support histogramming on GPUs in HEP data analysis pipelines. The work carried out in the past six months has already been integrated into the data analysis pipelines of physicists all around the globe. Furthermore, the upcoming upgrade of the Large Hadron Collider to the High-Luminosity Large Hadron Collider demands an even more fine-grained suite of software, and the work carried out during this thesis contributes to these upgrades.
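One of the thesis items, rebinning through the Unified Histogram Interface in boost-histogram, can be illustrated with the library's public slicing API; the histogram and data below are made up.

```python
import boost_histogram as bh
import numpy as np

# A 1D histogram with 10 regular bins over [0, 1), filled with toy data.
h = bh.Histogram(bh.axis.Regular(10, 0, 1))
h.fill(np.random.default_rng(0).uniform(size=1_000))

# UHI slicing: merge adjacent bins by a factor of 2 at access time.
h2 = h[:: bh.rebin(2)]
print(h.axes[0].size, "->", h2.axes[0].size)  # 10 -> 5
```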
2023
- Formalising Mathematics and Computing in Agda. Chopra, Saransh. McMasterU Visiting Research Student Report, 2023.
Type systems were originally devised to aid programmers in code development by validating variable passing, method interaction, and overall code structure to detect runtime errors at compile time. However, the Curry-Howard Isomorphism has propelled the evolution of type systems beyond their initial purpose, transforming them into tools not only for programming assistance but also for supporting mathematicians in theorem proving. This evolution has given rise to strongly typed functional programming languages, including Agda, Idris 2, Lean, and Coq, which leverage the Curry-Howard Isomorphism to act as proof assistants. Given the growing demand for strongly typed systems exhibiting this isomorphism, it is important to develop and distribute free and open-source software for researchers. This work aims to enhance Agda's standard library and prepare it for version 2.0.0. The improvements encompass refactoring the library, introducing new functions and proofs, addressing mathematical bugs, streamlining the library's dependency graph to reduce compile time, and incorporating concepts of finiteness. By undertaking these enhancements, the research contributes to the ongoing development of robust tools for both programming and formal mathematical reasoning, ultimately benefiting both the fields of computing and mathematics.
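The Curry-Howard Isomorphism mentioned above is easiest to see in a tiny example: a proposition is a type, and a proof is a program of that type. The report itself works in Agda; the equivalent one-liner below is shown in Lean purely as an illustration.

```lean
-- Curry–Howard in miniature: proving A → (B → A) amounts to writing
-- the constant function of that type (Lean 4 syntax; illustrative only).
theorem const_proof (A B : Prop) : A → (B → A) :=
  fun a _ => a
```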
- A Synergic Deep Learning Approach for Glioma Grading from Brain Tumor MRI Images. Chopra, Saransh, Sandhu, Harshvir, and Bansal, Ishan. Pre-print (somehow never got published), 2023.
Computer-aided diagnosis using deep learning approaches has made tremendous improvements in medical imaging for automatically detecting tumor area, tumor type, and grade. These advancements, however, have been limited by the facts that 1) medical images are often scarce, leading to overfitting, and 2) there is significant inter-class similarity and intra-class variation between the images. This study proposes a Synergic Deep Learning (SDL) model with AlexNet as a backbone for the automatic grading of glioma tumors. The Synergic Deep Learning architecture enables two pre-trained models to mutually learn from each other, allowing them to perform better than vanilla pre-trained models. Our study uses 417 T1-weighted sagittal tumor Magnetic Resonance Imaging (MRI) slices obtained from the REMBRANDT dataset. These 417 slices are pre-processed and augmented before they are fed into the model, which then classifies the tumor into one of three grades: oligodendroglioma, anaplastic glioma, and glioblastoma multiforme. The proposed architecture achieves an accuracy of 98.36%, showing that the model achieves excellent performance metrics even after being trained on an extremely small dataset. Finally, the proposed SDL model, trained on fewer MRI images, performs as well as or better than other models in the literature trained on larger datasets.
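As a hedged sketch of the synergic pairing idea (two backbones plus a synergic head that predicts whether a pair of images shares the same grade), the PyTorch snippet below conveys the structure; layer sizes and names are invented and do not reproduce the paper's exact architecture.

```python
# Sketch of synergic learning: two AlexNet backbones classify images while
# a synergic head checks whether an image pair shares the same grade.
# Dimensions and structure are illustrative, not the paper's exact model.
import torch
import torch.nn as nn
from torchvision.models import alexnet


class SynergicPair(nn.Module):
    def __init__(self, num_grades: int = 3) -> None:
        super().__init__()
        self.net_a = alexnet(num_classes=num_grades)
        self.net_b = alexnet(num_classes=num_grades)
        feat_dim = 256 * 6 * 6  # AlexNet feature size for 224x224 input
        self.synergic = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, 1)
        )

    def forward(self, xa: torch.Tensor, xb: torch.Tensor):
        fa = torch.flatten(self.net_a.avgpool(self.net_a.features(xa)), 1)
        fb = torch.flatten(self.net_b.avgpool(self.net_b.features(xb)), 1)
        logits_a = self.net_a.classifier(fa)   # per-image grade predictions
        logits_b = self.net_b.classifier(fb)
        same_grade = self.synergic(torch.cat([fa, fb], dim=1))  # pair label
        return logits_a, logits_b, same_grade


model = SynergicPair()
xa = xb = torch.randn(2, 3, 224, 224)
la, lb, s = model(xa, xb)
print(la.shape, lb.shape, s.shape)  # (2, 3) (2, 3) (2, 1)
```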
2022
- liionpack: A Python package for simulating packs of batteries with PyBaMM. Tranter, Thomas G., Timms, Robert, Sulzer, Valentin, Planella, Ferran Brosa, Wiggins, Gavin M., Karra, Suryanarayana V., Agarwal, Priyanshu, Chopra, Saransh, Allu, Srikanth, Shearing, Paul R., and Brett, Dan J. Journal of Open Source Software, 2022.
Electrification of transport and other energy-intensive activities is of growing importance as it provides an underpinning method to reduce carbon emissions. With an increase in reliance on renewable sources of energy and a reduction in the use of more predictable fossil fuels in both stationary and mobile applications, energy storage will play a pivotal role, and batteries are currently the most widely adopted and versatile form. Therefore, understanding how batteries work, how they degrade, and how to optimize and manage their operation at large scales is critical to achieving emission reduction targets. The electric vehicle (EV) industry requires a considerable number of batteries even for a single vehicle, sometimes numbering in the thousands if smaller cells are used, and the dynamics and degradation of these systems, as well as of large stationary power systems, are not well understood. As efficiency gains for a single battery diminish for standard commercially available chemistries, gains made at the system level become more important and can potentially be realised more quickly than developing new chemistries. Mathematical models and simulations provide a way to address these challenging questions and can aid the engineers and designers of batteries and battery management systems in providing longer-lasting and more efficient energy storage systems.
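At the usage level, liionpack pairs a circuit netlist with a PyBaMM experiment; the sketch below follows the package's documented entry points, with an arbitrary pack size and cycling protocol.

```python
# Minimal pack simulation following liionpack's documented entry points;
# the pack dimensions and protocol here are arbitrary examples.
import liionpack as lp
import pybamm

# 16 cells in parallel, 2 in series, as a circuit netlist.
netlist = lp.setup_circuit(Np=16, Ns=2)

experiment = pybamm.Experiment(
    ["Discharge at 5 A for 30 minutes", "Rest for 15 minutes"]
)
parameter_values = pybamm.ParameterValues("Chen2020")

output = lp.solve(
    netlist=netlist,
    parameter_values=parameter_values,
    experiment=experiment,
)
print(output.keys())  # pack- and cell-level time series
```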