Generating initial structures

Generating initial pool of structures

After the cluster expansion settings are specified, the next step is to generate initial structures for training the CE model. New training structures are generated with the NewStructures class, which provides several methods for generating structures. The initial pool of structures is generated with the generate_initial_pool() method:

>>> from clease.structgen import NewStructures
>>> ns = NewStructures(settings, generation_number=0, struct_per_gen=10)
>>> ns.generate_initial_pool()

The generate_initial_pool() method generates one structure per concentration at which the number of each constituent element is at its maximum or minimum. For an AuCu alloy there are two such extrema, pure Au and pure Cu, so generate_initial_pool() generates two structures for training.
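The idea of "one structure per extremum" can be illustrated with a minimal sketch in plain Python. It is not CLEASE's implementation: the helper name extremum_compositions and the num_sites parameter are hypothetical, and the sketch assumes every element may range freely from 0 to 100 % concentration.

```python
def extremum_compositions(elements, num_sites):
    """Return one composition per element, with that element at its maximum.

    Hypothetical helper for illustration only; assumes unconstrained
    concentration ranges, so each maximum corresponds to a pure phase.
    """
    compositions = []
    for major in elements:
        # All sites go to `major`; every other element sits at its minimum (zero).
        compositions.append(
            {el: (num_sites if el == major else 0) for el in elements}
        )
    return compositions


pool = extremum_compositions(["Au", "Cu"], num_sites=4)
print(pool)  # [{'Au': 4, 'Cu': 0}, {'Au': 0, 'Cu': 4}]
```

For the binary AuCu case this yields exactly two compositions, matching the two structures described above. With constrained concentration ranges the extrema would no longer be pure phases.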

Note

  • generation_number is used to track at which point you generated the structures.

  • struct_per_gen specifies the maximum number of structures to be generated for that generation number.

The generated structures are automatically stored in the database with several key-value pairs specifying their features. The generated keys are:

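============  ====================================================================================
key           description
============  ====================================================================================
gen           generation number
struct_type   "initial" for input structures, "final" for converged structures after calculation
size          size of the supercell
formula_unit  reduced formula unit representation independent of the cell size
name          name of the structure (formula_unit followed by a number)
converged     Boolean value indicating whether the calculation of the structure is converged
queued        Boolean value indicating whether the calculation is queued in the workload manager
started       Boolean value indicating whether the calculation has started
============  ====================================================================================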
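To make the formula_unit and name keys more concrete, the following sketch shows one way a reduced formula unit that is independent of the cell size could be computed, and how a name could be built from it. The helpers formula_unit and make_record, the underscore separators, and the initial False values of the Boolean flags are assumptions for illustration; the exact strings written by CLEASE may differ, and the size key is omitted here.

```python
from collections import Counter
from functools import reduce
from math import gcd


def formula_unit(symbols):
    """Reduce element counts by their greatest common divisor.

    Hypothetical helper; expects a non-empty list of chemical symbols.
    """
    counts = Counter(symbols)
    divisor = reduce(gcd, counts.values())
    # Sort by element symbol so the result does not depend on site order.
    return "_".join(f"{el}{n // divisor}" for el, n in sorted(counts.items()))


def make_record(symbols, gen, index):
    """Assemble key-value pairs like those in the table (illustrative only)."""
    unit = formula_unit(symbols)
    return {
        "gen": gen,
        "struct_type": "initial",
        "formula_unit": unit,
        "name": f"{unit}_{index}",  # formula_unit followed by a number
        "converged": False,  # assumed starting values for a new structure
        "queued": False,
        "started": False,
    }


small_cell = ["Au", "Cu"]
large_cell = ["Au"] * 4 + ["Cu"] * 4
print(formula_unit(small_cell), formula_unit(large_cell))  # Au1_Cu1 Au1_Cu1
record = make_record(large_cell, gen=0, index=0)
```

Both cells reduce to the same formula unit even though they differ in size, which is the property the formula_unit key is described as having.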

Generating random pool of structures

Since only two structures have been generated so far, more random structures can be generated with the generate_random_structures() method by altering the above script as follows:

>>> from clease.structgen import NewStructures
>>> ns = NewStructures(settings, generation_number=0,
...                    struct_per_gen=10)
>>> ns.generate_random_structures()

The script generates 8 additional random structures so that generation 0 contains 10 structures in total. By default, the generate_random_structures() method generates structures with both random size and random concentration. To generate random structures with a specific cell size, pass a template atoms object of the desired size. For example, you can force the new structures to be \(3 \times 3 \times 3\) supercells by using

>>> from ase.db import connect
>>> ns = NewStructures(settings, generation_number=0,
...                    struct_per_gen=10)
>>>
>>> # get template with the cell size = 3x3x3
>>> atoms = connect('aucu.db').get(id=10).toatoms()
>>>
>>> ns.generate_random_structures(atoms)
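The count of 8 additional structures follows from struct_per_gen minus the structures already present in the generation. The sketch below reproduces that arithmetic and draws random compositions for a fixed cell size. It is plain Python, not CLEASE code: num_to_generate and random_composition are hypothetical helpers, the 27 sites assume a 3x3x3 supercell of a one-atom primitive cell, and the per-site random choice is just one simple sampling scheme, not necessarily the one CLEASE uses.

```python
import random


def num_to_generate(struct_per_gen, num_existing):
    """Number of structures still needed to fill the generation."""
    return max(struct_per_gen - num_existing, 0)


def random_composition(elements, num_sites, rng):
    """Assign a random element to every site and count the result."""
    picks = [rng.choice(elements) for _ in range(num_sites)]
    return {el: picks.count(el) for el in elements}


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two structures (the extrema) already exist in generation 0.
remaining = num_to_generate(struct_per_gen=10, num_existing=2)
print(remaining)  # 8

# One random composition per missing structure, each on 27 sites.
compositions = [
    random_composition(["Au", "Cu"], num_sites=27, rng=rng)
    for _ in range(remaining)
]
```

Every composition sums to the same number of sites, mirroring the fixed cell size, while the Au/Cu ratio varies from structure to structure.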