About me

I am Nima Dindarsafa, currently pursuing a Master's degree in Data Science and Artificial Intelligence at Saarland University. I am also a Research Assistant at the CISPA Helmholtz Center for Information Security in Saarbrücken, Germany. Before that, I completed my Bachelor's degree in Physics at the University of Tehran, Iran.

My research interests center on making artificial intelligence systems safer, more secure, and more reliable. During my Master's studies, my research has focused on identifying datasets used to train large language models, as well as developing fairer methods for assessing the capabilities of these models. I am particularly interested in applying statistical methods to machine learning problems and designing approaches that provide provable guarantees grounded in statistical theory.

During my Bachelor's studies, I was also interested in the intersection of machine learning methods and quantum computing. My Bachelor's thesis explored the application of Boltzmann Machines to boson sampling, reflecting my broader interest in combining ideas from physics, computation, and learning theory.

In addition, I have a strong interest in mathematics, especially optimization methods in neural networks and their role in improving the behavior and performance of modern AI systems.

Publications

Practical Identification of Training Datasets in Large Language Models

Nima Dindarsafa, Bihe Zhao, Piotr Trzaskowski, Filip Szympliński, Krzysztof Wodnicki, Franziska Boenisch, Adam Dziedzic

Under Review · NeurIPS 2026

This research aims to develop a practical black-box Dataset Inference method for determining whether an LLM was trained on a suspect dataset. The method estimates per-token probabilities from label-only generated outputs, which avoids the need for gray-box probability access. It also replaces rigid p-value testing with e-value–based sequential testing, so that evidence can be accumulated adaptively, with anytime-valid guarantees, as more samples are queried. Together, these methodological advances allow dataset inference to be applied to real-world LLM APIs under fully black-box access.

Read the paper →

Figure: the 1D-CNN seismic collapse prediction framework

A 1D-CNN deep learning framework for seismic collapse prediction of jacket offshore platforms with Bayesian neural architecture search

Mohammad Zarrin, Liborio Cavaleri, Amirhosein Rezaei, Nima Dindarsafa

Ocean Engineering Journal · 2026

This study proposes a methodology for seismic limit-state and collapse prediction of jacket-type offshore platforms using a 1D-CNN trained directly on earthquake accelerogram time-series data. It develops a systematic model-selection framework based on Bayesian optimization to tune the neural architecture and hyperparameters, reducing reliance on manual trial-and-error. It further uses nested cross-validation for unbiased performance evaluation and introduces a stochastic optimization approach for stratified K-fold dataset preparation based on incremental dynamic analysis. The optimized 1D-CNN framework is compared with tuned MLP and SVM models and applied to estimate collapse fragility curves for a case-study offshore platform.

Read the paper →