moxel.utils

This module provides helper functions for creating voxels.

Note

Currently, interactions are modelled with the Lennard-Jones (LJ) potential.

Attention

Consider playing with the n_jobs parameter to get the best performance for your system:

from timeit import timeit

setup = 'from moxel.utils import voxels_from_file'
n_jobs = [1, 2, 8, 16]  # Modify this according to your system.

for n in n_jobs:
    stmt = f'voxels_from_file("path/to/cif", n_jobs={n})'
    time = timeit(stmt=stmt, setup=setup, number=1)
    print(f'Time with {n} jobs: {time:.3f} s')

class moxel.utils.Grid(grid_size=25, cutoff=10, epsilon=50, sigma=2.5)

Bases: object

A 3D energy grid over a crystal structure.

Parameters:
  • grid_size (int, default=25) – Number of grid points along each dimension.

  • cutoff (float, default=10) – Cutoff radius (Å) for the LJ potential.

  • epsilon (float, default=50) – Epsilon value (ε/K) of the probe atom.

  • sigma (float, default=2.5) – Sigma value (σ/Å) of the probe atom.

structure

Available only after Grid.load_structure() has been called.

Type:

pymatgen.core.structure.Structure

structure_name

Available only after Grid.load_structure() has been called.

Type:

str

cubic_box

Available only after Grid.calculate() has been called.

Type:

bool

voxels

Available only after Grid.calculate() has been called.

Type:

array of shape (grid_size,)*3

calculate(cubic_box=False, length=30, potential='lj', n_jobs=None)

Iterate over the grid and return voxels.

For computational efficiency and to ensure (approximately) the same spatial resolution, the grid is overlaid over a supercell scaled according to the minimum image convention (MIC), see mic_scale_factors().

If the lattice angles differ significantly from 90°, set cubic_box to True to avoid distortions. In this case, the grid is overlaid over a cubic box of size length centered at the origin, but periodicity is no longer guaranteed.

Parameters:
  • potential (str, default='lj') – The potential used to calculate voxels. Currently, only the LJ potential is supported.

  • cubic_box (bool, default=False) – If True, the simulation box is cubic.

  • length (float, default=30) – The size of the cubic box in Å. Takes effect only if cubic_box == True.

  • n_jobs (int, optional) – Number of jobs to run in parallel. If None, then the number returned by os.cpu_count() is used.

Returns:

voxels – The energy voxels as \(e^{-\beta \mathcal{V}}\), to ensure numerical stability.

Return type:

array of shape (grid_size,)*3

Notes

For structures that cannot be processed, the voxels are filled with zeros.
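
To make the cubic-box option concrete, the sketch below generates the Cartesian coordinates of a grid laid over a cubic box of size length centered at the origin. It is an illustration only: the helper name cubic_grid_points is hypothetical, and placing grid points on both faces of the box is an assumption, not necessarily how moxel positions its points.

```python
from itertools import product


def cubic_grid_points(grid_size=25, length=30.0):
    """Return Cartesian points of a cubic grid centered at the origin.

    Hypothetical helper; the endpoint placement is an assumption.
    """
    if grid_size < 2:
        raise ValueError("grid_size must be at least 2")
    step = length / (grid_size - 1)
    # Evenly spaced coordinates from -length/2 to +length/2 along one axis.
    axis = [-length / 2 + i * step for i in range(grid_size)]
    # The Cartesian product of the axis with itself gives grid_size**3 points.
    return list(product(axis, repeat=3))


points = cubic_grid_points(grid_size=5, length=30.0)
print(len(points))  # number of grid points, grid_size**3
print(points[0])    # one corner of the box
```

Because the box is fixed in Cartesian space, the points do not follow the lattice vectors, which is why periodicity is no longer guaranteed in this mode.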

lj_potential(coords)

Calculate the LJ potential at Cartesian or fractional coordinates.

Parameters:

coords (array_like of shape (3,)) – Cartesian coordinates if cubic_box == True, otherwise fractional coordinates.

Returns:

energy – Energy as \(e^{-\beta \mathcal{V}}\), to ensure numerical stability.

Return type:

float
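
The following sketch illustrates why returning \(e^{-\beta \mathcal{V}}\) instead of \(\mathcal{V}\) helps numerically: the 12-6 LJ energy diverges as the distance goes to zero, while its Boltzmann factor stays bounded. The function name lj_boltzmann_factor and the temperature of 298 K are assumptions made for this example, and only a single probe-atom pair is considered rather than a sum over all framework atoms.

```python
import math


def lj_boltzmann_factor(r, epsilon=50.0, sigma=2.5, temperature=298.0):
    """Return exp(-V / T) for a single 12-6 LJ pair at distance r (in Å).

    epsilon is given in K (epsilon / k_B), so beta * V reduces to V / T.
    temperature is an assumed value, not taken from the moxel docs.
    """
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6 * sr6 - sr6)  # grows without bound as r -> 0
    return math.exp(-energy / temperature)      # bounded by exp(epsilon / T)


r_min = 2.5 * 2 ** (1 / 6)  # position of the LJ minimum, where V = -epsilon
for r in (0.5, 2.5, r_min, 10.0):
    print(f"r = {r:6.3f} Å  exp(-beta V) = {lj_boltzmann_factor(r):.4f}")
```

Strongly repulsive positions map to values near 0 and the most attractive position maps to \(e^{\varepsilon / T}\), so the result fits in a narrow, finite range.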

load_structure(pathname)

Load a crystal structure from a file in a format supported by pymatgen.core.Structure.from_file().

Parameters:

pathname (str) – Pathname to the file.

moxel.utils.batch_clean(batch_dirname)

Clean a single batch.

The batch must have the form:

batch
├──voxels.npy
└──names.json

Cleaning is required since the voxels for some structures might be zero, see Grid.calculate(). After cleaning, the directory has the form:

batch
├──voxels.npy
├──names.json
├──clean_voxels.npy
└──clean_names.json

Parameters:

batch_dirname (str) – Pathname to the directory which requires cleaning.

Returns:

exit_status – 0 if no voxels are missing, otherwise 1.

Return type:

int
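
The cleaning step can be pictured as filtering out every sample whose voxels are all zero, together with its name. The sketch below shows that idea with plain nested lists standing in for the arrays, so that it runs without extra dependencies. The function name drop_empty_voxels and the sample names are hypothetical, and this is not moxel's actual implementation.

```python
def drop_empty_voxels(names, voxels):
    """Drop samples whose voxels are all zero.

    names  : list of str, one name per sample.
    voxels : list of nested lists, one 3D grid per sample.
    Returns (clean_names, clean_voxels, exit_status).
    """
    def is_empty(grid):
        return all(value == 0 for plane in grid for row in plane for value in row)

    kept = [(name, grid) for name, grid in zip(names, voxels) if not is_empty(grid)]
    clean_names = [name for name, _ in kept]
    clean_voxels = [grid for _, grid in kept]
    # Mirror the documented convention: 0 if nothing was dropped, otherwise 1.
    exit_status = 0 if len(kept) == len(names) else 1
    return clean_names, clean_voxels, exit_status


zero_grid = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
good_grid = [[[0.0, 0.3], [1.0, 0.0]], [[0.9, 0.0], [0.0, 0.2]]]

clean_names, clean_voxels, status = drop_empty_voxels(
    ["sample_a", "sample_b"], [good_grid, zero_grid]
)
print(clean_names, status)
```

With a NumPy array of shape (n_samples, grid_size, grid_size, grid_size), the same mask could be obtained with voxels.any(axis=(1, 2, 3)).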

moxel.utils.load_json(fname)

Load a .json file.

Parameters:

fname (str) – Pathname to the .json file.

Returns:

names

Return type:

list

moxel.utils.mic_scale_factors(r, lattice_vectors)

Return scale factors to satisfy the minimum image convention [MIC].

Parameters:
  • r (float) – The cutoff radius used in the minimum image convention.

  • lattice_vectors (array of shape (3, 3)) – The lattice vectors of the unit cell. Each row corresponds to a lattice vector.

Returns:

scale_factors – scale_factors[i] scales lattice_vectors[i].

Return type:

array of shape (3,)

References

[MIC]
  W. Smith, "The Minimum Image Convention in Non-Cubic MD Cells", 1989.
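
One common way to satisfy the minimum image convention is to require that the cutoff radius does not exceed half of each perpendicular width of the cell, and to replicate the cell until this holds. The sketch below follows that criterion. It is an assumption about how such scale factors can be computed, not a copy of moxel's code, and the name mic_scale_factors_sketch and the helpers cross and dot are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def mic_scale_factors_sketch(r, lattice_vectors):
    """Return integer scale factors so that every cell width is at least 2 * r."""
    a, b, c = lattice_vectors
    volume = abs(dot(a, cross(b, c)))
    factors = []
    for i in range(3):
        # Perpendicular width along vector i: the cell volume divided by
        # the area of the face spanned by the other two vectors.
        u, v = (lattice_vectors[j] for j in range(3) if j != i)
        normal = cross(u, v)
        width = volume / math.sqrt(dot(normal, normal))
        factors.append(math.ceil(2 * r / width))
    return factors


cell = [(5.0, 0.0, 0.0), (0.0, 20.0, 0.0), (0.0, 0.0, 40.0)]
print(mic_scale_factors_sketch(10.0, cell))
```

For this orthorhombic cell and a cutoff of 10 Å, only the shortest direction needs to be replicated.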

moxel.utils.voxels_from_dir(cif_dirname, out_pathname, grid_size=25, cutoff=10, epsilon=50, sigma=2.5, cubic_box=False, length=30, n_jobs=None)

Calculate voxels from a directory of .cif files and save them under out_pathname as a numpy.array of shape (n_samples, grid_size, grid_size, grid_size), where n_samples is the number of files in cif_dirname.

After processing, the following files are created:

out_pathname
    ├──voxels.npy
    └──names.json

The file names.json stores the names of the materials as a list, which might be useful for later indexing.

Parameters:
  • cif_dirname (str) – Pathname to the directory containing the .cif files.

  • out_pathname (str) – Pathname to the directory under which voxels are stored.

  • grid_size (int, default=25) – Number of grid points along each dimension.

  • cutoff (float, default=10) – Cutoff radius (Å) for the LJ potential.

  • epsilon (float, default=50) – Epsilon value (ε/K) of the probe atom.

  • sigma (float, default=2.5) – Sigma value (σ/Å) of the probe atom.

  • cubic_box (bool, default=False) – If True, the simulation box is cubic.

  • length (float, default=30) – The size of the cubic box in Å. Takes effect only if cubic_box == True.

  • n_jobs (int, optional) – Number of jobs to run in parallel. If None, then the number returned by os.cpu_count() is used.

Notes

  • Samples in output array follow the order in sorted(os.listdir(cif_dirname)).

  • For structures that cannot be processed, the voxels are filled with zeros.
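
Since names.json lists the materials in the same order as the samples in voxels.npy, it can be turned into a lookup from name to row index. The sketch below assumes names.json holds a flat JSON list of strings, as described above. The helper build_name_index is hypothetical, and the material names are placeholders written to a temporary directory so that the example runs on its own.

```python
import json
import os
import tempfile


def build_name_index(out_pathname):
    """Map each material name to its row index in voxels.npy."""
    with open(os.path.join(out_pathname, "names.json")) as fhand:
        names = json.load(fhand)
    return {name: i for i, name in enumerate(names)}


# Stand-in for a real output directory, filled with placeholder names.
with tempfile.TemporaryDirectory() as out_pathname:
    with open(os.path.join(out_pathname, "names.json"), "w") as fhand:
        json.dump(["IRMOF-1", "HKUST-1", "ZIF-8"], fhand)
    index = build_name_index(out_pathname)

print(index["HKUST-1"])  # row of this material in the voxels array
```

With the array loaded through numpy.load, the voxels of a given material would then be voxels[index[name]].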

moxel.utils.voxels_from_file(cif_pathname, grid_size=25, cutoff=10, epsilon=50, sigma=2.5, cubic_box=False, length=30, n_jobs=None, only_voxels=True)

Return voxels from .cif file.

Parameters:
  • cif_pathname (str) – Pathname to the .cif file.

  • grid_size (int, default=25) – Number of grid points along each dimension.

  • cutoff (float, default=10) – Cutoff radius (Å) for the LJ potential.

  • epsilon (float, default=50) – Epsilon value (ε/K) of the probe atom.

  • sigma (float, default=2.5) – Sigma value (σ/Å) of the probe atom.

  • cubic_box (bool, default=False) – If True, the simulation box is cubic.

  • length (float, default=30) – The size of the cubic box in Å. Takes effect only if cubic_box == True.

  • n_jobs (int, optional) – Number of jobs to run in parallel. If None, then the number returned by os.cpu_count() is used.

  • only_voxels (bool, default=True) – Whether to return only the voxels or the whole Grid object.

Returns:

out – If only_voxels == True, array of shape (grid_size,)*3. Otherwise, Grid.

Return type:

array or Grid

Notes

For structures that cannot be processed, the voxels are filled with zeros.

moxel.utils.voxels_from_files(cif_pathnames, out_pathname, grid_size=25, cutoff=10, epsilon=50, sigma=2.5, cubic_box=False, length=30, n_jobs=None)

Calculate voxels from a list of .cif files and store them under out_pathname as numpy.array of shape (n_samples, grid_size, grid_size, grid_size), where n_samples == len(cif_pathnames).

After processing, the following files are created:

out_pathname
    ├──voxels.npy
    └──names.json

The file names.json stores the names of the materials as a list, which might be useful for later indexing.

Parameters:
  • cif_pathnames (list) – List of pathnames to the .cif files.

  • out_pathname (str) – Pathname to the directory under which voxels are stored.

  • grid_size (int, default=25) – Number of grid points along each dimension.

  • cutoff (float, default=10) – Cutoff radius (Å) for the LJ potential.

  • epsilon (float, default=50) – Epsilon value (ε/K) of the probe atom.

  • sigma (float, default=2.5) – Sigma value (σ/Å) of the probe atom.

  • cubic_box (bool, default=False) – If True, the simulation box is cubic.

  • length (float, default=30) – The size of the cubic box in Å. Takes effect only if cubic_box == True.

  • n_jobs (int, optional) – Number of jobs to run in parallel. If None, then the number returned by os.cpu_count() is used.

Notes

  • Samples in output array follow the order in cif_pathnames.

  • For structures that cannot be processed, the voxels are filled with zeros.
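
The two notes above (output order follows the input order, and failures become zero-filled voxels) can be illustrated with a small parallel map. This is a sketch under stated assumptions: fake_calculation and compute_or_zeros are hypothetical stand-ins for the real per-file work, flat lists replace the 3D arrays, and the thread pool is chosen for the example only, since the parallel backend used by moxel is not described here.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fake_calculation(pathname, n_points):
    """Hypothetical stand-in for a per-file voxel calculation."""
    if not pathname.endswith(".cif"):
        raise ValueError(f"cannot parse {pathname}")
    return [1.0] * n_points


def compute_or_zeros(pathname, grid_size=2):
    """Return voxels for one file, or zeros if the file cannot be processed."""
    n_points = grid_size ** 3
    try:
        return fake_calculation(pathname, n_points)
    except ValueError:
        # Failed structures keep their slot, filled with zeros.
        return [0.0] * n_points


def voxels_in_order(pathnames, n_jobs=None):
    """Process files in parallel while preserving the input order."""
    n_jobs = n_jobs or os.cpu_count()
    with ThreadPoolExecutor(max_workers=n_jobs) as executor:
        # Executor.map yields results in input order, not completion order.
        return list(executor.map(compute_or_zeros, pathnames))


results = voxels_in_order(["a.cif", "broken.txt", "c.cif"], n_jobs=2)
print([sum(voxels) for voxels in results])
```

Keeping a zero-filled slot for each failed structure is what keeps the rows of the output aligned with the list of names.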