New Input Generator Framework in Avogadro 2

Avogadro 1.x had quite a large number of input generators that came from very humble beginnings. They were designed to be easy to write, and to give a simple path from a structure in Avogadro to an input file for one of many codes. Our basic approach was to add one C++ class per program we targeted, with one or two special cases. This meant that developing an input generator required learning some of the Avogadro API and compiling a plugin (matching our compiler, Qt, library versions, etc.). It also led to minor differences between the input generators, and a lot of copied and pasted boilerplate code.

Avogadro 2 showing an ethane molecule

When developing the input generators for Avogadro 2 as part of the Open Chemistry project we wanted to make it easier to add new generators. We put a lot of thought into how to make this possible, and how to maintain a native look and feel without making an input generator developer learn C++, Qt, Avogadro, and everything that goes along with setting up a development environment. The new input generator framework is largely language agnostic, with a minimum of assumptions. It currently executes the Python interpreter, but only because we have so far written all of our input generators in Python.

Avogadro 2 NWChem input generator with syntax highlighting

The input generators are executed in a separate process, using several passes to get the display name, the options supported, the syntax highlighting rules and finally to generate the input itself. The current pass is communicated using command-line arguments, and input is passed to the program on standard input, formatted as JSON. The results should be passed back on the standard output stream and, depending on the pass, should be either JSON or the actual input file. We also do some post-processing of the input file, in which the molecular geometry can be inserted following the specified format. This command line API is documented here. The NWChem input generator is the first to add syntax highlighting in an external plugin, while the GAMESS input generator shows an approach using C++ ported from Avogadro 1.x.
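To make the pass structure concrete, here is a minimal sketch of how a host application might drive a generator script. It is not Avogadro's implementation: the toy generator, the `run_pass` and `demo` helpers, and the JSON payload shape are all hypothetical. Only the three flags come from the NWChem script shown in this post.

```python
import json
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# A toy generator script standing in for a real one (hypothetical content).
GENERATOR_SOURCE = textwrap.dedent('''
    import argparse, json, sys

    parser = argparse.ArgumentParser()
    parser.add_argument("--display-name", action="store_true")
    parser.add_argument("--print-options", action="store_true")
    parser.add_argument("--generate-input", action="store_true")
    args = parser.parse_args()

    if args.display_name:
        print("Demo")
    elif args.print_options:
        print(json.dumps({"Title": "untitled"}))
    elif args.generate_input:
        opts = json.load(sys.stdin)
        print("title: " + opts["Title"])
''')


def run_pass(script, flag, stdin_obj=None):
    """Run one pass of the generator in its own process and return its stdout."""
    payload = json.dumps(stdin_obj) if stdin_obj is not None else ""
    result = subprocess.run(
        [sys.executable, str(script), flag],
        input=payload,
        capture_output=True,
        text=True,
        check=True,   # raise if the script exits with an error
        timeout=30,   # a stuck script cannot block the caller forever
    )
    return result.stdout.strip()


def demo():
    """Write the toy generator to disk and run each pass in turn."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo_generator.py"
        script.write_text(GENERATOR_SOURCE)
        name = run_pass(script, "--display-name")
        options = json.loads(run_pass(script, "--print-options"))
        options["Title"] = "ethane"  # pretend the user edited an option
        text = run_pass(script, "--generate-input", options)
    return name, options, text


if __name__ == "__main__":
    print(demo())
```

Because each pass is a fresh process, the script is re-read from disk on every call, and a failure surfaces as an exception in the caller rather than a crash of the host.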

This approach ensures that an input generator cannot crash or hang the interface, that licensing is not an issue (the generator runs in a separate process), and that input generator developers are free to concentrate on turning options into the appropriate input file without worrying about the details of the application it is being used in. With relatively minor modifications Avogadro 2 could look for other file extensions and execute the appropriate interpreter, or simply execute the programs found in a given path. The generator files can be modified directly. If the options change it is currently necessary to restart Avogadro, but changes to the input generation itself are picked up the next time the generator is run. Menu entries are added dynamically at program start up, and this concept could be extended to more of Avogadro. The main block of the NWChem input generator is shown below:

import argparse
import json

# getOptions() and generateInput() are defined elsewhere in the script.
if __name__ == "__main__":
  parser = argparse.ArgumentParser('Generate a NWChem input file.')
  parser.add_argument('--debug', action='store_true')
  parser.add_argument('--print-options', action='store_true')
  parser.add_argument('--generate-input', action='store_true')
  parser.add_argument('--display-name', action='store_true')
  # argparse turns '--print-options' into the key 'print_options'.
  args = vars(parser.parse_args())

  debug = args['debug']

  # One pass is requested per invocation.
  if args['display_name']:
    print("NWChem")
  elif args['print_options']:
    print(json.dumps(getOptions()))
  elif args['generate_input']:
    print(json.dumps(generateInput()))
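The `generateInput()` function called above is not shown in this post. As a rough sketch of what the script side of the generate pass involves, the example below reads a JSON payload from a stream and wraps the generated text in a JSON reply. The function names, the `"options"` key and the `"contents"` reply key are assumptions made for illustration, not the documented schema.

```python
import io
import json
import sys


def generate_input_file(opts):
    # Stand-in for the real generateInputFile(opts) from the NWChem script.
    return 'title "%s"\n' % opts["Title"]


def generate_input(stream=sys.stdin):
    """Read the JSON payload sent by the host and build a reply object."""
    payload = json.load(stream)
    opts = payload["options"]  # assumed key name
    return {"contents": generate_input_file(opts)}  # assumed reply shape


# Simulate the host writing JSON to the script's standard input.
reply = generate_input(io.StringIO('{"options": {"Title": "ethane"}}'))
print(json.dumps(reply))
```

Taking the stream as a parameter keeps the function easy to exercise without a real pipe; in the script itself it would simply read from `sys.stdin`.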

A snippet of the input generation code is shown below, where a variable is populated with what will be the raw input passed to the code.

def generateInputFile(opts):
  # Extract options:
  title = opts['Title']
  calculate = opts['Calculation Type']
  theory = opts['Theory']
  basis = opts['Basis']
  multiplicity = opts['Multiplicity']
  charge = opts['Charge']
  # Preamble
  nwfile = ""
  nwfile += "echo\n\n"
  nwfile += "start molecule\n\n"
  nwfile += "title \"%s\"\n"%title
  # Coordinates
  nwfile += "geometry units angstroms print xyz autosym\n"
  nwfile += "$$coords:Sxyz$$\n"
  nwfile += "end\n\n"
  # More stuff here...
  return nwfile
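The `$$coords:Sxyz$$` line is a placeholder that is filled in during the post-processing step mentioned earlier. A minimal sketch of that kind of substitution is given below, assuming that `Sxyz` means "element symbol followed by x, y and z". The helper names, the column formatting and the sample atoms are hypothetical and do not reflect Avogadro's actual code.

```python
def format_coordinates(atoms):
    """Render one 'symbol x y z' line per atom."""
    return "\n".join(
        "%-2s %12.6f %12.6f %12.6f" % (symbol, x, y, z)
        for symbol, x, y, z in atoms
    )


def insert_geometry(template, atoms, keyword="$$coords:Sxyz$$"):
    """Replace the coordinate placeholder with formatted atom lines."""
    return template.replace(keyword, format_coordinates(atoms))


template = (
    "geometry units angstroms print xyz autosym\n"
    "$$coords:Sxyz$$\n"
    "end\n"
)
# Illustrative atoms only, not a real optimized structure.
atoms = [("C", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 1.089)]
result = insert_geometry(template, atoms)
print(result)
```

Keeping the geometry out of the script means the generator only has to say where the coordinates go, and the host decides how to write them.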

We hope that this framework will make it much easier for researchers to customize their input generator scripts to their needs, and we would welcome your feedback on how we could make it even easier. If there are other languages of interest we could add examples. The major requirement is that the language can create a self-contained script or executable that can use standard input and output, has some string handling capabilities, and supports JSON.
