This tutorial introduces the Julia programming language, which "makes it easy to express many object-oriented and functional programming patterns". It focuses on (i) setting up the Julia environment, (ii) running a set of simple examples on creating matrices, plotting charts, and executing simple for-loops with CUDA, and (iii) going through introductory machine learning examples (regression and decision trees).

Engineering, Generic

Machine Learning / AI

Tutorial

https://github.com/EuroCC-Greece/ml-julia
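
The tutorial's own examples are written in Julia and live in the repository linked above. As a rough, language-agnostic illustration of the kind of introductory machine learning examples it covers (a regression fit and a decision tree), here is a minimal Python/scikit-learn sketch on synthetic data; it mirrors the ideas only, not the tutorial's Julia code.

```python
# Python/scikit-learn analogue of the tutorial's introductory ML examples
# (the tutorial itself implements these in Julia); the data here are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Regression: recover slope 3 and intercept 2 from noisy samples.
X = rng.uniform(-5, 5, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=200)
reg = LinearRegression().fit(X, y)
print("fitted slope and intercept:", reg.coef_[0], reg.intercept_)

# Decision tree: classify points labelled by a simple nonlinear rule.
Xc = rng.normal(size=(300, 2))
yc = (Xc[:, 0] * Xc[:, 1] > 0).astype(int)
Xtr, Xte, ytr, yte = train_test_split(Xc, yc, random_state=0)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(Xtr, ytr)
print("decision-tree test accuracy:", clf.score(Xte, yte))
```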

PyTorch

Facebook's AI Research lab (FAIR)

An open source machine learning framework that accelerates the path from research prototyping to production deployment.

Earth System Sciences, Engineering, Life Sciences, Materials and Chemical Sciences, Other

Machine Learning / AI

Software

https://pytorch.org/
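
As a minimal, hedged illustration of the research-prototyping side of that workflow, the sketch below fits a tiny fully connected network to synthetic data with PyTorch; the data, model size, and training settings are arbitrary choices for the example, not anything prescribed by the project.

```python
# Minimal PyTorch sketch: fit a small network to synthetic 1-D regression data.
import torch
from torch import nn

torch.manual_seed(0)
X = torch.rand(256, 1) * 6 - 3                 # inputs in [-3, 3]
y = torch.sin(X) + 0.1 * torch.randn_like(X)   # noisy sine targets

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

print(f"final training loss: {loss.item():.4f}")
```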

TensorFlow

Google Brain

TensorFlow is an end-to-end open source platform for machine learning.

Earth System Sciences, Engineering, Life Sciences, Materials and Chemical Sciences, Other

Machine Learning / AI

Software

https://www.tensorflow.org/
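
For comparison, a similarly minimal sketch with TensorFlow's Keras API on the same kind of toy regression; again, the architecture and hyperparameters are illustrative choices only.

```python
# Minimal TensorFlow/Keras sketch: toy regression end to end.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(256, 1)).astype("float32")
y = np.sin(X) + 0.1 * rng.normal(size=(256, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=100, batch_size=32, verbose=0)
print("final training loss:", model.evaluate(X, y, verbose=0))
```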

We employ the diffusion map approach as a nonlinear dimensionality reduction technique to extract a dynamically relevant, low-dimensional description of n-alkane chains in the ideal-gas phase and in aqueous solution. In the case of C8 we find the dynamics to be governed by torsional motions. For C16 and C24 we extract three global order parameters with which we characterize the fundamental dynamics, and determine that the low free-energy pathway of globular collapse proceeds by a “kink and slide” mechanism, whereby a bend near the end of the linear chain migrates toward the middle to form a hairpin and, ultimately, a coiled helix. The low-dimensional representation is subtly perturbed in the solvated phase relative to the ideal gas, and its geometric structure is conserved between C16 and C24. The methodology is directly extensible to biomolecular self-assembly processes, such as protein folding.

Engineering, Materials and Chemical Sciences

Machine Learning / AI

Paper

https://www.pnas.org/content/107/31/13597

Concise, accurate descriptions of physical systems through their conserved quantities abound in the natural sciences. In data science, however, current research often focuses on regression problems, without routinely incorporating additional assumptions about the system that generated the data. Here, we propose to explore a particular type of underlying structure in the data: Hamiltonian systems, where an “energy” is conserved. Given a collection of observations of such a Hamiltonian system over time, we extract phase space coordinates and a Hamiltonian function of them that acts as the generator of the system dynamics. The approach employs an autoencoder neural network component to estimate the transformation from observations to the phase space of a Hamiltonian system. An additional neural network component is used to approximate the Hamiltonian function on this constructed space, and the two components are trained jointly. As an alternative approach, we also demonstrate the use of Gaussian processes for the estimation of such a Hamiltonian. After two illustrative examples, we extract an underlying phase space as well as the generating Hamiltonian from a collection of movies of a pendulum. The approach is fully data-driven and does not assume a particular form of the Hamiltonian function.

Engineering, Materials and Chemical Sciences

Machine Learning / AI

Paper

https://aip.scitation.org/doi/10.1063/1.5128231
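
A minimal sketch of the Hamiltonian-network component only, assuming the phase-space coordinates (q, p) are observed directly (synthetic unit-mass harmonic-oscillator samples stand in for them); in the paper those coordinates are themselves learned from observations by a jointly trained autoencoder, which is omitted here, and the network size and training settings are illustrative.

```python
# Learn a Hamiltonian H(q, p) whose induced flow matches observed time derivatives.
# The autoencoder that maps raw observations to (q, p) in the paper is omitted.
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic phase-space samples for H(q, p) = (q^2 + p^2) / 2 (unit-mass oscillator).
qp = torch.randn(1024, 2)                     # columns: q, p
dq_dt_true = qp[:, 1:2]                       #  dq/dt =  dH/dp = p
dp_dt_true = -qp[:, 0:1]                      #  dp/dt = -dH/dq = -q

H = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(H.parameters(), lr=1e-3)

for step in range(2000):
    x = qp.clone().requires_grad_(True)
    dH_dx = torch.autograd.grad(H(x).sum(), x, create_graph=True)[0]
    dq_dt_pred, dp_dt_pred = dH_dx[:, 1:2], -dH_dx[:, 0:1]   # Hamilton's equations
    loss = ((dq_dt_pred - dq_dt_true) ** 2 + (dp_dt_pred - dp_dt_true) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"fit of the learned Hamiltonian flow, final loss: {loss.item():.5f}")
```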

Molecular simulation is an important and ubiquitous tool in the study of microscopic phenomena in fields as diverse as materials science, protein folding and drug design. While the atomic-level resolution provides unparalleled detail, it can be non-trivial to extract the important motions underlying simulations of complex systems containing many degrees of freedom. The diffusion map is a nonlinear dimensionality reduction technique with the capacity to systematically extract the essential dynamical modes of high-dimensional simulation trajectories, furnishing a kinetically meaningful low-dimensional framework with which to develop insight and understanding of the underlying dynamics and thermodynamics. We survey the potential of this approach in the field of molecular simulation, consider its challenges, and discuss its underlying concepts and means of application. We provide examples drawn from our own work on the hydrophobic collapse mechanism of n-alkane chains, folding pathways of an antimicrobial peptide, and the dynamics of a driven interface.

Engineering, Materials and Chemical Sciences

Machine Learning / AI

Paper

https://www.sciencedirect.com/science/article/pii/S0009261411004957

A central problem in data analysis is the low dimensional representation of high dimensional data and the concise description of its underlying geometry and density. In the analysis of large scale simulations of complex dynamical systems, where the notion of time evolution comes into play, important problems are the identification of slow variables and dynamically meaningful reaction coordinates that capture the long time evolution of the system. In this paper we provide a unifying view of these apparently different tasks, by considering a family of diffusion maps, defined as the embedding of complex (high dimensional) data onto a low dimensional Euclidean space, via the eigenvectors of suitably defined random walks defined on the given datasets.

Engineering, Materials and Chemical Sciences

Machine Learning / AI

Paper

https://www.sciencedirect.com/science/article/pii/S1063520306000534
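
The core construction the abstract refers to can be sketched in a few lines: build a Gaussian kernel on the data, normalize it into the transition matrix of a random walk, and embed each point with the leading non-trivial eigenvectors. The example below uses a noisy circle as stand-in data; the bandwidth heuristic and the number of retained coordinates are illustrative choices, not prescriptions from the paper.

```python
# Minimal NumPy sketch of a diffusion map: Gaussian kernel, random-walk
# normalization, and embedding via the leading non-trivial eigenvectors.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 400)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(400, 2))

# Pairwise squared distances and Gaussian kernel.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
eps = np.median(d2)                       # simple bandwidth heuristic
K = np.exp(-d2 / eps)

# Row-normalize to obtain the transition matrix of a random walk on the data.
P = K / K.sum(axis=1, keepdims=True)

# Eigendecomposition; the top eigenvector is trivial (constant, eigenvalue 1).
evals, evecs = np.linalg.eig(P)
order = np.argsort(-evals.real)
evals, evecs = evals.real[order], evecs.real[:, order]

# Diffusion coordinates: eigenvalue-weighted non-trivial eigenvectors.
psi = evecs[:, 1:3] * evals[1:3]
print("leading eigenvalues:", evals[:4])
print("2-D diffusion-map embedding shape:", psi.shape)
```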

The occurrence of instabilities in chemically reacting systems, resulting in unsteady and spatially inhomogeneous reaction rates, is a widespread phenomenon. In this article, we use nonlinear signal processing techniques to extract a simple, but accurate, dynamic model from experimental data of a system with spatiotemporal variations. The approach consists of a combination of two steps. The proper orthogonal decomposition [POD or Karhunen-Loève (KL) expansion] allows us to determine active degrees of freedom (important spatial structures) of the system. Projection onto these “modes” reduces the data to a small number of time series. Processing these time series through an artificial neural network (ANN) results in a low-dimensional, nonlinear dynamic model with almost quantitative predictive capabilities.

Machine Learning / AI

Paper

https://aiche.onlinelibrary.wiley.com/doi/abs/10.1002/aic.690390110
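
A minimal sketch of the recipe described above, assuming synthetic travelling-wave snapshots in place of the experimental data: proper orthogonal decomposition via an SVD of the snapshot matrix, projection onto a handful of dominant modes, and a small neural network fit as a one-step map on the resulting amplitude time series. The mode count, network size, and toy data are illustrative choices only.

```python
# POD (via SVD) + small ANN as a low-dimensional one-step model; synthetic data.
import numpy as np
from sklearn.neural_network import MLPRegressor

x = np.linspace(0, 2 * np.pi, 128)
t = np.linspace(0, 20, 400)
U = np.array([np.sin(x - 0.5 * ti) + 0.3 * np.sin(2 * (x - 0.2 * ti)) for ti in t])

# (1) POD modes from the mean-subtracted snapshot matrix (rows = snapshots in time).
Um = U - U.mean(axis=0)
_, s, Vt = np.linalg.svd(Um, full_matrices=False)
modes = Vt[:4]                                  # keep 4 dominant spatial modes

# (2) Modal amplitude time series a(t) from projection onto the modes.
a = Um @ modes.T                                # shape (400, 4)

# (3) Discrete-time model a(t) -> a(t+1) with a small neural network.
ann = MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000, random_state=0)
ann.fit(a[:-1], a[1:])
print("one-step prediction R^2:", ann.score(a[:-1], a[1:]))
print("energy captured by 4 modes:", (s[:4] ** 2).sum() / (s ** 2).sum())
```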