$ whoami
I specialize in designing, developing, and deploying computational tools to solve problems alongside subject matter experts in a wide range of disciplines. My career began with using data science, AI/ML, molecular simulations, and other advanced modeling tools to make data-driven discoveries in fields like materials science, nuclear chemistry, food science, and biology @NIST; later I moved into building data-accelerated analytic, operations, and engineering solutions for partners in the cyber, aeronautical, and astronautical industries. My formal training includes a PhD in Chemical Engineering with a concentration in computational thermodynamics and a certificate in Computational and Information Science @Princeton. As you can see, I tend to name code after birds; check out the public aviary below.
$ quickstart
$ cat /home/mahynski/.profile | more
$ tar -xzvf mahynski.archive.tar.gz
Developing reproducible, transparent modeling pipelines and methods requires standardized open-source tools. While working @NIST, I developed PyChemAuth to help chemometricians, cheminformatics professionals, and other researchers build end-to-end data science workflows from exploratory data analysis, to model optimization and comparison, to public distribution. Most data-driven projects below rely on this package. Check out the workshop and API Examples for more information if you find it helpful.
Developing tools for advanced stable isotope and trace element metrology

- PyChemAuth
- A short course in chemometric carpentry to systematically build these tools
- Trace Element Correlation Explorer Demo
- SITE database @NIST (should be live soon!)
Predicting fluid phase thermodynamic properties with deep learning and coarse-grained modeling

- Modern implementation of thermodynamic extrapolation tools @NIST can be found here: thermoextrap
- This is also implemented in FEASST, an open-source Monte Carlo simulation package
- Harmonizing Statistical Associating Fluid Theory (SAFT) with molecular simulations (coming soon!)
- Industrial Fluid Properties Simulation Challenge
- "Predicting low-temperature free energy landscapes with flat-histogram Monte Carlo methods," N. A. Mahynski, M. A. Blanco, J. R. Errington, V. K. Shen, J. Chem. Phys. 146, 074101 (2017).
- "Predicting structural properties of fluids by thermodynamic extrapolation," N. A. Mahynski, S. Jiao, H. W. Hatch, M. A. Blanco, V. K. Shen, J. Chem. Phys. 148, 194105 (2018).
- "Flat-histogram Monte Carlo as an efficient tool to evaluate adsorption processes involving rigid and deformable molecules," M. Witman, N. A. Mahynski, B. Smit, J. Chem. Theory Comput. 14, 6149–6158 (2018).
- "Flat-histogram extrapolation as a useful tool in the age of big data," N. A. Mahynski, H. W. Hatch, M. Witman, D. A. Sheen, J. R. Errington, V. K. Shen, Molecular Simulation 1–13 (2020).
Authenticating food labeling claims with machine learning and statistical modeling

- Collection of datasets and models on HuggingFace.
- "Comparing Machine Learning Models to Chemometric Ones to Detect Food Fraud: A Case Study in Slovenian Fruits and Vegetables" (coming soon!). Also see the associated GitHub repo.
- Chemometric differentiation of ginseng species (coming soon!)
- Thanks to all the great folks from the IAEA's CRP D52042 Implementation of Nuclear Techniques for AuthentiCaTion of Foods with High-Value Labelling Claims (INTACT Food) Project!

Analyzing trends in biorepositories using explainable machine learning

- Collection of datasets and models on HuggingFace.
- "Building Interpretable Machine Learning Models to Identify Chemometric Trends in Seabirds of the North Pacific Ocean," N. A. Mahynski, J. M. Ragland, S. S. Schuur, V. K. Shen, Environ. Sci. Technol. 56, 14361-14374 (2022). Also see the associated GitHub repo.
- Predicting the geographic provenance of American oysters (coming soon!)
Biomarkers and -omics applications
Understanding complex biochemical systems requires advanced tools, many of which have been greatly improved by advancements in artificial intelligence. Much of my background in this area involves predicting or interpreting spectral measurements, such as mass spectra or HSQC NMR. The majority of this work is ongoing and will be made available here when it is complete!
- FINCHnmr: Identifying compounds in complex biochemical mixtures using HSQC NMR.
- STARLINGrt: Interactive visualization for analyzing gas chromatography mass spectrometry (GCMS) retention times.
- Check out Database Infrastructure for Mass Spectrometry (DIMSpec) and associated training resources.
- Determining fertility biomarkers of Atlantic Salmon (coming soon!)
Identifying materials using non-targeted analysis methods

- Collection of datasets and models on HuggingFace.
- "Classification and authentication of materials using prompt gamma ray activation analysis," N. A. Mahynski, J. I. Monroe, D. A. Sheen, R. L. Paul, H.-H. Chen-Mayer, V. K. Shen, J. of Radioanal. and Nucl. Chem. 332, 3259–3271 (2023). Also see the associated GitHub repo.
- Authenticating Materials with Imaged PGAA Spectra (coming soon!). Also see associated GitHub repo.
Designing colloidal self-assembly by tiling Escher-like patterns

- "Programming interfacial porosity and symmetry with Escherized colloids," N. A. Mahynski, V. K. Shen, J. Chem. Theory Comput. 20, 2209–2218 (2024). Also see the associated GitHub repo.
- "Derivable genetic programming for two-dimensional colloidal materials," N. A. Mahynski, B. Han, D. Markiewitz, J. Chem. Phys. 157, 114112 (2022).
- "Symmetry-derived structure directing agents for two-dimensional crystals of arbitrary colloids," N. A. Mahynski, V. K. Shen, Soft Matter 17, 7853-7866 (2021).
- "Grand canonical inverse design of multicomponent colloidal crystals," N. A. Mahynski, R. Mao, E. Pretti, V. K. Shen, J. Mittal, Soft Matter 16, 3187 (2020).
- "Symmetry-based crystal structure enumeration in two dimensions," E. Pretti, V. K. Shen, J. Mittal, N. A. Mahynski, J. Phys. Chem. A. 124, 3276-3285 (2020).
- "Using symmetry to elucidate the importance of stoichiometry in colloidal crystal assembly," N. A. Mahynski, E. Pretti, V. K. Shen, J. Mittal, Nat. Commun. 10, 2028 (2019).
- For an interactive experience, check out Craig Kaplan's online demo of the tiles (and modifications thereof) that this theory is built on.
Extractive summarization of scientific data and documents with large language models
Natural language processing (NLP) tools have advanced rapidly in recent years. Modern AI tools enable text extraction, document summarization, and corpus querying in natural language, providing a new avenue for interacting with data. Retrieval augmented generation (RAG) is particularly useful for interacting with data that carries privacy concerns. RAG systems let one parse, query, and have a "conversation" with documents in order to retrieve information, create summaries, and extract data. RAG systems:
- Are based on specific document(s)
- Can cite their sources, making them more trustworthy
- Do not require retraining or fine-tuning of the underlying large language model
With the right prompt optimization and topic modeling, their performance can be improved even further for domain-specific applications.
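
To make the retrieval idea concrete, here is a minimal sketch of the "retrieve, then prompt with cited passages" step of a RAG system. It is illustrative only and is not taken from any of the projects listed here: the corpus, the source ids, and the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are all hypothetical, and a simple bag-of-words cosine similarity stands in for the learned embeddings a real system would typically use. No language model is called; the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter

# Hypothetical corpus of (source id, passage) pairs; contents are illustrative only.
DOCUMENTS = [
    ("doc1", "Flat-histogram Monte Carlo simulations estimate free energy landscapes."),
    ("doc2", "Prompt gamma ray activation analysis measures elemental composition."),
    ("doc3", "HSQC NMR spectra help identify compounds in biochemical mixtures."),
]


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k passages most similar to the query, best match first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), source, text)
              for source, text in documents]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(source, text) for _, source, text in scored[:k]]


def build_prompt(query, passages):
    """Assemble a prompt asking the model to answer only from the cited passages."""
    context = "\n".join(f"[{source}] {text}" for source, text in passages)
    return (
        "Answer using only the passages below and cite sources by their bracketed id.\n"
        f"{context}\n"
        f"Question: {query}"
    )


question = "Which method measures elemental composition?"
passages = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Because each retrieved passage keeps its source id, the answer can point back to the document it came from, and the underlying model needs no retraining when the corpus changes; only the document list does.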