POOMA banner

POOMA Particles Documentation


Particles are primarily used in one of two ways in large scientific applications. The first is to track sample particles using Monte Carlo techniques, for example, to gather statistics that describe the conditions of a complex physical system. Particles of this kind are often referred to as "tracers". The second is to perform direct numerical simulation of systems that contain discrete point-like entities such as ions or molecules.

In both scenarios, the application contains one or more sets of particles. Each set has some data associated with it that describes its members' characteristics, such as mass or momentum. Particles typically exist in a spatial domain, and they may interact directly with one another or with field quantities defined on that domain.

This document gives an overview of POOMA's support for particles, then discusses some implementation details. The classes introduced in this tutorial are illustrated by two short programs: one that tracks particles under the influence of a simple one-dimensional harmonic oscillator potential, and another that models particles bouncing off the walls of a closed three-dimensional box. Later on, we will show how particles and fields can interact in a simulation code.


POOMA's Particles class is a container for a heterogeneous collection of particle attributes. The class uses dynamic storage for particle data (in the form of a set of POOMA DynamicArray objects), so that particles can be added or deleted as necessary. It contains a layout object that manages the distribution of particle data across multiple patches, and it applies boundary conditions to particles when attribute data values exceed a prescribed range. In addition, global functions are provided for interpolating data between particle and field element positions.

Each Particles object keeps a list of pointers to its elements' attributes. When an application wants to add or delete particles, it invokes a method on the Particles object, which delegates the call to the layout object for the contained attributes. Particles also provides a member function called sync(), which the application invokes in order to update the global particle count and numbering, update the data distribution across patches, and apply the particle boundary conditions.

Applications define a specific type of particles collection by deriving from the Particles class. The derived class declares data members for the attributes needed to characterize this type of particle. (The types of these data members are discussed below.) The constructor for this derived class should call the method Particles::addAttribute() to register each attribute and add it to the list maintained by Particles. In this way, the Particles class can be extended by the application to accommodate any sort of particle description.

The distribution of particle data stored in DynamicArray objects is directed by a particle layout class. Each particle layout class employs a particular strategy to determine the patch in which a particle's data should be stored. For instance, SpatialLayout keeps each particle in the patch that contains field data for elements that are nearest to the particle's current spatial position. This strategy is useful for cases where the particles need to interact with field data or with neighboring particles.

Particle Attributes

Each particle attribute is implemented as a DynamicArray, a class derived from the one-dimensional specialization of POOMA's Array class. DynamicArray extends the notion of a one-dimensional array to allow applications to add or delete elements at will. When particles are destroyed, the empty slots left behind can be filled by moving elements from the end of the list (BackFill) or by sliding all the remaining elements over and preserving the existing order (ShiftUp). At the same time, DynamicArray objects can be used in data-parallel expressions in the same way as ordinary Array objects, so that the application can update particle attributes such as position and velocity using either a single statement or a loop over individual particles.

At first glance, it might seem more sensible to have applications define a struct that stores all the attribute data for one particle in a single data structure, and then use this as a template argument to the Particles class, which would store a DynamicArray of values of this type. POOMA's designers considered this option, but discarded it. The reason is that most compute-intensive operations in scientific applications are implemented as loops in which one or more separate attributes are read or written. In order to make the evaluation of expressions involving attributes as efficient as possible, it is therefore important to ensure that data are arranged as separate one-dimensional arrays for each attribute, rather than as a single array of structures with one structure per particle. This makes typical computational loops such as

for (int i=0; i<n; ++i)
  x[i] += dt * vx[i];
  y[i] += dt * vy[i];

run more quickly, as it makes much better use of the data cache.

Particle Layout

As mentioned above, each Particles object uses a layout object to determine in which patch a particle's data should be stored. The layout manages the program's requests to re-arrange particle data. With SpatialLayout, for example, the application provides a particle position attribute which is used to determine how particle data should be distributed. The particle layout then directs the Particles object to move particle data from one patch to another as dictated by its strategy. The Particles object in turn delegates this task to the layout object for the particle attributes, which tells each of the attributes using this layout to move their data as needed. All of this is handled by a single call to the method Particles::sync().

Derivation of Particles Subclass

In general, creating a new Particles subclass is a three-step process. The first step is to declare a traits class with typedef's specifying the type of engine the particle attributes will use and the type of particle layout. An example of such a traits class is the following:

struct MyParticleTraits
  typedef MultiPatch<DynamicTag,Dynamic> AttributeEngineTag_t;
  typedef UniformLayout                  ParticleLayout_t;

This traits class will be used to specialize the Particles class template when an application-specific subclass representing a concrete set of particles is derived from it. Particles uses public typedef's to give sensible names to these traits parameters, so that the derived subclass can access them (as shown below). The traits approach is used here to provide flexibility in the Particles design for future extensions. In addition to specifying the attribute engine and particle layout types, this traits class could also set some application-specific parameters.

Currently, there is a fairly limited set of valid choices for attribute engine type and particle layout strategy. Because we require that the particle attributes share a common layout and remain synchronized, we must use a MultiPatch engine with a DynamicLayout. The patch engines inside the MultiPatch engine must have dynamic capabilities. They can be either of type Dynamic or, when running across multiple contexts, of type Remote<Dynamic>. As for the particle layout, POOMA currently provides only two possible strategies: UniformLayout, which just tries to keep a similar number of particles in each patch; and SpatialLayout, which organizes the particles into patches based upon their current spatial position. For the user's convenience, a set of pre-defined particle traits classes with specific choices of attribute engine and particle layout type are provided in the header file Particles/CommonParticleTraits.h. These define all the combinations of multi-patch dynamic and remote dynamic engines with both uniform and spatial layouts. Ordinarily, the user can simply choose one of these pre-defined traits classes for their Particles subclass.

The second step is to actually derive a class from Particles. The new class can be templated on whatever the developer desires, as long as a traits class type is provided for the template parameter of the Particles base class. In the example below, the new class being derived from Particles is templated on the same traits class as Particles. For the sake of convenience, typedef's may be provided for the instantiated parent class and for its layout type. The constructor for the user's subclass then usually takes a concrete layout object of the type specified in the typedef above as a constructor argument:

template <class PT>
class MyParticles : public Particles<PT>
  // instantiated type of base class
  typedef Particles<PT> Base_t;

  // type of particle layout (from traits class via base class)
  typedef typename Base_t::ParticleLayout_t ParticleLayout_t;

  // type of attribute engine tag (from traits class via base class)
  typedef typename Base_t::AttributeEngineTag_t EngineTag_t;
  // some particle attributes as public data members
  DynamicArray<double, EngineTag_t> charge;
  DynamicArray<double, EngineTag_t> mass;
  DynamicArray<int, EngineTag_t>    count;

  // constructor passes particle layout to base class
  MyParticles(const ParticleLayout_t &layout)
  : Particles<PT>(layout)
    // register attributes with base class

Note that the attribute elements in this example have different element types; i.e., charge and mass are of type double, while count is of type int. Attribute elements may in general have any type, including any user-defined type.

Finally, the application-specific class MyParticles is instantiated with a traits class such as MyParticleTraits to create an actual set of particles. A particle layout is declared first, and it is passed as a constructor argument to the instance of the user's class to control the distribution of particle data between patches. This layout object typically has one or more constructor arguments that specify such things as the number of patches the particles are to be distributed over. Here is an example of creating a MyParticles object:

const int numPatches = 10;
MyParticleTraits::ParticleLayout_t layout(numPatches);
MyParticles<MyParticleTraits>      particles(layout);

While this may seem complex at first, each level of indirection or generalization is needed in order to provide flexibility. The type of engine and layout to be used, for example, could be passed directly as template parameters to Particles, rather than being combined together in a traits class. However, this would make user-level code fragile in the face of future changes to the library. If other traits are needed later, they can be added to the traits class in one place, rather than needing to be specified every time a new class is derived from Particles. This bundling also makes it easier to specify the same basic properties (engine and layout) for two or more interacting Particles subclasses.

Synchronization and Related Issues

For efficiency reasons, Particles does not automatically move particle data between patches after every operation, but instead waits for the application to call the method sync(). Particles can also be configured to cache requests to delete particles, rather than deleting them immediately.

Particles::sync() is a member function template taking a single argument. This argument must be one of the particle attributes (i.e., a DynamicArray). SpatialLayout assumes that the attribute given to sync() is the particle positions, and it uses this to update the distribution of particle data so that particles are located on the same patch as nearby field data. Applications must therefore be careful not to mistakenly pass a non-position attribute, such as temperature or pressure, to SpatialLayout via the sync() method.

UniformLayout, which divides particles as evenly as possible between patches, without regard to spatial position, only uses the attribute passed to sync() as a template for the current distribution of particle data. Any attribute with the same distribution as the actual particle data can therefore be used.

The use of a parameter in Particles::sync() is one important difference between the implementation of particles in POOMA I and POOMA II. In the old design, all Particles classes came with a pre-defined attribute named R that was the particles' position, and synchronization always referred to this attribute. The new scheme allows applications to change the attribute that is used to represent the position; e.g., to switch back and forth in a time advance algorithm between a "current" position attribute and a "new" position attribute. It also allows particles to be weighted according to some attribute, so that the distribution scheme load-balances by weight.

Of course, before particle data can be distributed, the particles themselves must be created. Particles provides two methods for doing this. The first, globalCreate(N,renum), creates a specified number of particles N, spread as evenly as possible across all patches. The particles are normally renumbered after the creation operation, although this can be overridden by passing the second parameter (renum) with a value of "false". POOMA automatically uses a numbering scheme in which the particles are ordered by patch number and labeled consecutively within a patch. For instance, if patch 0 has 6 particles and patch 1 has 4 particles, then the particles on patch 0 are labeled 0 through 5, and the particles on patch 1 are labeled 6 through 9.

Particles::create(N,patchID,renum), on the other hand, creates a specified number of particles N on each local context, and adds them to a specific patch (or to the last local patch if none is specified). Once again, the particles are renumbered after this operation unless renum is false. Used in conjunction with the Pooma::context() method, this create() method can be utilized to allocate a specific number of particles on each context and in each local patch within a context. If a program contains a series of calls to the create() method, the user may wish set renum to false to avoid renumbering particles until all of the particle creation tasks have been completed.

After particles have been created (or destroyed), they should be renumbered to ensure that each has a unique ID and that the global domain of the particle attributes is consistent. This is critical to the proper behavior of data-parallel expressions involving attributes. The Particles::renumber() method surveys all the patches to find out what the current local domain of each patch is. It then reconstructs a global domain across all the patches, effectively renumbering the particles from 0 to N-1, where N is the total number of particles. The more complex sync() method applies the particle boundary conditions, performs any deferred particle destroy requests, swaps particles between patches according to the particle layout strategy, and then renumbers the particles by calling renumber(). Programs should call renumber() if they have only created or destroyed particles, but have not done deferred destroy requests, modified particle attributes in a way that would require applying boundary conditions (or have no boundary conditions), and do not need to swap particles across patches. Note that calls to globally synchronizing functions such as renumber() or sync() must be done on all contexts in a SPMD fashion.

If a program does not (implicitly or explicitly) call renumber() after creating or destroying particles, the global domain for the particle attributes will be incorrect. If the program then tries to read or write a view of a particle attribute by indexing with some domain object, it will not get the right section of the data. This failure could be silent if the view that the program requests exists. Alternatively, the requested view could be outside of the (incorrect) global domain, in which case the layout object for the particle attributes will suffer a run-time assertion failure. It is the user's responsibility to ensure that the particle attributes are properly synchronized prior to any data-parallel expression.

There are also two ways to destroy particles. The first way, which always destroys the particles immediately, is implemented by the method Particles::destroy(domain,patchID,renum). This function deletes particles from the local patch indicated by patchID, and domain is assumed to refer to a subset of the local domain in that patch. If patchID is not specified, then domain is assumed to refer to a subset of the global domain of the particle attributes, in which case the function may delete particles from multiple patches. The renum argument indicates whether to renumber the particles after the destroy command is performed, and it is true by default.

The second destruction method is Particles::deferredDestroy(domain,patchID). The meanings of the two arguments are the same as in the destroy() method. This is new in POOMA II, and it only does deferred destruction; i.e., caches the requested indices for use later when performDestroy() is called. Since this method doesn't actually destroy particles right away, there is no need for it to call renumber(). This deferredDestroy() method can be useful when there are many separate destroy requests, because it lumps them all together and amortizes the expense of having to move particle data around and shrink the particle attributes. The performDestroy() method, which causes the cached destruction requests to be executed, always performs renumbering. The performDestroy() method can be called explicitly by the user in order to process and flush any cached destroy requests or implicitly by calling the sync() method.

All particle destroys are implemented using one of two possible methods: BackFill or ShiftUp. With the BackFill method, the "holes" that are left behind when particles are deleted are filled with particle data from the end of the list for the given patch. The ShiftUp method, on the other hand, slides all of the remaining particles up the list in order to fill holes. Thus, the ShiftUp destroy method is plainly slower, but it preserves the existing order of the particles within a given patch. The user can select the preferred destroy method using the setDestroyMethod() function.

Example: Simple Harmonic Oscillator

The example for this tutorial simulates the motion of particles under the influence of a simple one-dimensional harmonic oscillator potential. The code, a version of which is included in the POOMA II release in the examples/Particles/Oscillation directory, is as follows:

001  #include <iostream>
002  #include <stdlib.h>
004  #include "Pooma/Particles.h"
005  #include "Pooma/DynamicArrays.h"
007  // Dimensionality of this problem
008  static const int PDim = 1;
010  // A traits class specifying the engine and layout of a Particles class.
011  template <class EngineTag>
012  struct PTraits
013  {
014    // The type of engine to use in the particle attributes.
015    typedef EngineTag AttributeEngineTag_t;
017    // The type of particle layout to use.  Here we use a UniformLayout,
018    // which divides the particle data up so as to have an equal number
019    // on each patch.
020    typedef UniformLayout ParticleLayout_t;
021  };
023  // A Particles subclass that defines position and velocity as
024  // attributes.
025  template <class PT>
026  class Quanta : public Particles<PT>
027  {
028  public:
029    // Useful typedef's to extract from the base class
030    typedef Particles<PT>                         Base_t;
031    typedef double                                AxisType_t;
032    typedef typename Base_t::ParticleLayout_t     ParticleLayout_t;
033    typedef typename Base_t::AttributeEngineTag_t AttributeEngineTag_t;
034    enum { dimensions = PDim };
036    // Constructor sets up layouts and registers attributes
037    Quanta(const ParticleLayout_t &pl)
038    : Particles<PT>(pl)
039    {
040      addAttribute(x);
041      addAttribute(v);
042    }
044    // X position and velocity are attributes (these could be declared
045    // private and accessed with public methods, but this gives nice syntax)
046    DynamicArray< Vector<dimensions,AxisType_t>, AttributeEngineTag_t > x;
047    DynamicArray< Vector<dimensions,AxisType_t>, AttributeEngineTag_t > v;
048  };
050  // Engine tag type for attributes.  Here we use a MultiPatch engine
051  // with the patches being Dynamic engines, and a DynamicTag, which allows
052  // the patches to change sizes during the application.  This is important
053  // since we may change the number of particles in each patch.
054  typedef MultiPatch<DynamicTag,Dynamic> AttrEngineTag_t;
056  // The particle traits class and layout type for this application
057  typedef PTraits<AttrEngineTag_t> PTraits_t;
058  typedef PTraits_t::ParticleLayout_t PLayout_t;
060  // Simulation control constants
061  const double omega      = 2.0;
062  const double dt         = 1.0 / (50.0 * omega);
063  const int nParticle     = 100;
064  const int nPatch        = 4;
065  const int nIter         = 500;
067  // Main simulation routine.
068  int main(int argc, char *argv[])
069  {
070    // Initialize POOMA and Inform object for output to terminal
071    Pooma::initialize(argc,argv);
072    Inform out(argv[0]);
073    out << "Begin Oscillation example code" << std::endl;
075    // Create a uniform layout object to control particle positions.
076    PLayout_t layout(nPatch);
078    // Create Particles, using our special subclass and the particle layout
079    typedef Quanta<PTraits_t> Particles_t;
080    Particles_t p(layout);
082    // Create particles on one patch, then re-distribute (just to show off)
083    p.create(nParticle, 0);
084    for (int ip=0; ip<nPatch; ++ip)
085    {
086      out << "Current size of patch " << ip << " = "
087          << p.attributeLayout().patchDomain(ip).size()
088          << std::endl;
089    }
091    out << "Resyncing particles object ... " << std::endl;
092    p.sync(p.x);
094    // Show re-balanced distribution.
095    for (int ip=0; ip<nPatch; ++ip)
096    {
097      out << "Current size of patch " << ip << " = "
098          << p.attributeLayout().patchDomain(ip).size()
099          << std::endl;
100    }
102    // Randomize positions in domain [-1,+1], and set velocities to zero.
103    // This is done with a loop because POOMA does not yet have parallel RNGs.
104    typedef Particles_t::AxisType_t Coordinate_t;
105    Vector<PDim,Coordinate_t> initPos;
106    srand(12345U);
107    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
108    for (int ip=0; ip<nParticle; ++ip)
109    {
110      for (int idim=0; idim<PDim; ++idim)
111      {
112        initPos(idim) = 2 * (rand() / ranmax) - 1;
113      }
114      p.x(ip) = initPos;
115      p.v(ip) = Vector<PDim,Coordinate_t>(0.0);
116    }
118    // print initial state
119    out << "Time = 0.0:" << std::endl;
120    out << "Quanta positions:" << std::endl << p.x << std::endl;
121    out << "Quanta velocities:" << std::endl << p.v << std::endl;
123    // Advance particles in each time step according to:
124    //         dx/dt = v
125    //         dv/dt = -omega^2 * x
126    for (int it=1; it<=nIter; ++it)
127    {
128      p.x = p.x + dt * p.v;
129      p.v = p.v - dt * omega * omega * p.x;
130      out << "Time = " << it*dt << ":" << std::endl;
131      out << "Quanta positions:" << std::endl << p.x << std::endl;
132      out << "Quanta velocities:" << std::endl << p.v << std::endl;
133    }
135    // Finalize POOMA
136    Pooma::finalize();
137    return 0;
138  }

As discussed earlier, the program begins by creating a traits class that provides typedef's for the names AttributeEngineTag_t and ParticleLayout_t (lines 11-21). An application-specific class called Quanta is then derived from Particles, without specifying the traits to be used (lines 25-48). This class declares two attributes, to store the particles' x coordinate and velocity. The body of its constructor (lines 40-41) adds these attributes to the attribute list, while passing the actual particle layout object specified by the application up to Particles.

Lines 54, 57 and 58 create some convenience typedef's for the engine and layout that the application will use. Lines 61-65 then define constants describing both the physical parameters to the problem (such as the oscillation frequency) and the computational parameters (the number of particles, the number of patches, etc.). In a real application, many of these values would be variables, rather than hard-wired constants.

After the POOMA library is initialized (line 71), an Inform object is created to manage output. An actual layout is then created (line 76), and is used to create a set of particles (line 80). The particles themselves are created by the call to Particles::create() on line 83. The output on lines 84-89 shows that all particles are initially created in patch 0.

The sync() call on line 92 redistributes particles across the available patches, using the x coordinate as a template for the current particle distribution. As the output from lines 95-100 shows, this distributes the particles across patches as evenly as possible.

The particle positions are randomized on lines 108-116. (A loop is used here rather than a data-parallel expression because parallel random number generation has not yet been integrated into the expression evaluation machinery in this release of POOMA.) After some more output to show the particles' initial positions and velocities, the application finally enters the main timestep loop (lines 126-133). In each time step, particle positions and velocities are updated under the influence of a simple harmonic oscillator force and then printed out. Once the specified number of timesteps has been executed, the library is shut down (line 136) and the application exits.

Boundary Conditions

In addition to an AttributeList, each Particles object also stores a ParticleBCList of boundary conditions to be applied to the attributes. These are generalized boundary conditions in the sense that they can be applied not only to a particle position attribute, but to any sort of attribute or expression involving attributes. POOMA provides typical particle boundary conditions including periodicity, reflection, absorption, reversal (reflection of one attribute and negation of another), and kill (destroying a particle). Boundary conditions can be updated explicitly by calling Particles::applyBoundaryConditions(), or implicitly by calling Particles::sync().

Each boundary condition is assembled by first constructing an instance of the type of boundary condition desired, then invoking the addBoundaryCondition() member function of Particles with three parameters: the subject of the boundary condition (i.e., the attribute or expression to be checked against a specified range), its object (the attribute to be modified when the subject is outside the range), and the actual boundary condition object. The boundary condition is then applied each time applyBoundaryConditions() is invoked.

The subject and object of a boundary condition are usually the same, but this is not required. In one common case, the subject is an expression involving particle attributes, while the object is the Particles object itself. For example, an application's boundary condition might specify that particles are to be deleted if their kinetic energy goes above some limit. The subject would be an expression like 0.5*P.m*P.v*P.v, and the object would be P. The object cannot be the expression 0.5*P.m*P.v*P.v because an expression contains no actual data and thus cannot be modified.

Another case involves the reversal boundary condition, which is used to make particles bounce off walls. Bouncing not only reflects the particle position back inside the wall, but also reverses the particle's velocity component in that direction. The reversal boundary condition therefore needs an additional object besides the original subject.

POOMA provides the pre-defined boundary condition classes listed in the table below.
Class Behavior
AbsorbBC If attribute is outside specified lower or upper bounds, it is reset to the boundary value.
KillBC If particles go outside the given bounds, they are destroyed by putting their index into the deferred destroy list.
PeriodicBC Keeps attributes within a given periodic domain.
ReflectBC If attribute exceeds a given boundary, its value is reflected back inside that boundary.
ReverseBC Reflects the value of the subject attribute if it crosses outside the given domain, and reverses (negates) the value of the object attribute.

Example: Elastic Collision

As an example of how particle boundary conditions are used, consider a set of particles bouncing around in a box in three dimensions. The sample code in file examples/Particles/Bounce/Bounce.cpp shows how this can be implemented using POOMA for the case of perfectly elastic collisions.

001  #include "Pooma/Particles.h"
002  #include "Pooma/DynamicArrays.h"
003  #include "Tiny/Vector.h"
004  #include "Utilities/Inform.h"
005  #include <iostream>
006  #include <stdlib.h>
009  // Dimensionality of this problem
010  static const int PDim = 3;
012  // Particles subclass with position and velocity
013  template <class PT>
014  class Balls : public Particles<PT>
015  {
016  public:
017    // Typedefs
018    typedef Particles<PT>                          Base_t;
019    typedef typename Base_t::AttributeEngineTag_t  AttributeEngineTag_t;
020    typedef typename Base_t::ParticleLayout_t      ParticleLayout_t;
021    typedef double                                 AxisType_t;
022    typedef Vector<PDim,AxisType_t>                PointType_t;
024    // Constructor: set up layouts, register attributes
025    Balls(const ParticleLayout_t &pl)
026    : Particles<PT>(pl)
027    {
028      addAttribute(pos);
029      addAttribute(vel);
030    }
032    // Position and velocity attributes (as public members)
033    DynamicArray<PointType_t,AttributeEngineTag_t>  pos;
034    DynamicArray<PointType_t,AttributeEngineTag_t>  vel;
035  };
037  // Use canned traits class from CommonParticleTraits.h
038  // MPDynamicUniform provides MultiPatch Dynamic Engine for
039  // particle attributes and UniformLayout for particle data.
040  typedef MPDynamicUniform PTraits_t;
042  // Type of particle layout
043  typedef PTraits_t::ParticleLayout_t ParticleLayout_t;
045  // Type of actual particles
046  typedef Balls<PTraits_t> Particle_t;
048  // Number of particles in simulation
049  const int NumPart = 100;
051  // Number of timesteps in simulation
052  const int NumSteps = 100;
054  // Number of patches to distribute particles across.
055  // Typically one would use one patch per processor.
056  const int numPatches = 16;
058  // Main simulation routine
059  int main(int argc, char *argv[])
060  {
061    // Initialize POOMA and output stream
062    Pooma::initialize(argc,argv);
063    Inform out(argv[0]);
065    out << "Begin Bounce example code" << std::endl;
066    out << "-------------------------" << std::endl;
068    // Create a particle layout object for our use
069    ParticleLayout_t particleLayout(numPatches);
071    // Create the Particles subclass object
072    Particle_t balls(particleLayout);
074    // Create some particles, recompute the global domain, and initialize
075    // the attributes randomly.  The globalCreate call will create an equal
076    // number of particles on each patch.  The particle positions are initialized
077    // within a 12 X 20 X 28 domain, and the velocity components are all
078    // in the range -4 to +4.
079    balls.globalCreate(NumPart);
080    srand(12345U);
081    Particle_t::PointType_t initPos, initVel;
082    typedef Particle_t::AxisType_t Coordinate_t;
083    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
084    for (int i = 0; i < NumPart; ++i)
085    {
086      for (int d = 0; d < PDim; ++d)
087      {
088        initPos(d) = 4 * (2 * (d+1) + 1) * (rand() / ranmax);
089        initVel(d) = 4 * (2 * (rand() / ranmax) - 1);
090      }
091      balls.pos(i) = initPos;
092      balls.vel(i) = initVel;
093    }
095    // Display the particle positions and velocities.
096    out << "Timestep 0: " << std::endl;
097    out << "Ball positions: "  << balls.pos << std::endl;
098    out << "Ball velocities: " << balls.vel << std::endl;
100    // Set up a reversal boundary condition, so that particles will
101    // bounce off the domain boundaries.
103    Particle_t::PointType_t lower, upper;
104    for (int d = 0; d < PDim; ++d)
105    {
106      lower(d) = 0.0;
107      upper(d) = (d+1) * 8.0 + 4.0;
108    }
109    ReverseBC<Particle_t::PointType_t> bounce(lower, upper);
110    balls.addBoundaryCondition(balls.pos, balls.vel, bounce);
112    // Simulation timestep loop
113    for (int it=1; it <= NumSteps; ++it)
114    {
115      // Advance ball positions (timestep dt = 1)
116      balls.pos += balls.vel;
118      // Invoke boundary conditions
119      balls.applyBoundaryConditions();
121      // Print out the current particle data
122      out << "Timestep " << it << ": " << std::endl;
123      out << "Ball positions: " << balls.pos << std::endl;
124      out << "Ball velocities: " << balls.vel << std::endl;
125    }
127    // Shut down POOMA and exit
128    Pooma::finalize();
129    return 0;
130  }

After defining the dimension of the problem (line 10), this program defines a class Balls, which represents the set of particles (lines 13-35). Its two attributes represent the particles' positions and velocities (lines 33-34). Note how the type of engine used for storing these attributes is defined in terms of the types exported by the traits class with which Balls is instantiated (AttributeEngineTag_t, line 19). Meanwhile the type used to represent the points is defined in terms of the dimension of the problem (line 22), rather than being made dimension-specific. This style of coding makes it much easier to change the simulation as the program evolves.

Rather than defining a particle traits class explicitly, as the oscillation example above did, this program uses one of the pre-defined traits classes given in src/Particles/CommonParticleTraits.h. For the purposes of this example, a multipatch dynamic engine is used for particle attributes, and particle data is laid out uniformly. Once again, a typedef is used to create a symbolic name for this combination, so that the program can be updated by making a single change in one location.

Lines 43-56 define the types used in the simulation, and the constants that control the simulation's evolution. The main body of the program follows. As usual, it begins by initializing the POOMA library, and creating an output handler of type Inform (lines 62-63). Line 69 then creates a layout object describing the domain of the problem.

The Particles object itself comes into being on line 72, although the actual particles aren't created until line 79. Recall that by default, globalCreate() renumbers the particles by calling the Particles::renumber() method. Lines 80-93 then randomize the balls' initial positions and velocities.

Lines 103-110 are the most novel part of this simulation, as they create reflecting boundary conditions for the simulation and add them to the Particles object. Lines 103-108 defines where particles bounce; again, this is done in a dimension-independent fashion in order to make code evolution as easy as possible. Line 104 turns lower and upper into a reversing boundary condition, which line 105 then adds to balls. The main simulation loop now consists of nothing more than advancing the balls in each time step, and calling sync() to enforce the boundary conditions.

Summary of Particles Interface

Particles are a fundamental construct in physical calculations. POOMA's Particles class, and the classes that support it, allow programmers to create and manage sets of particles both efficiently and flexibly. While doing this is a multi-step process, the payoff as programs are extended and updated is considerable. The list below summarizes the most important aspects of the Particles interface.

Particles and Fields

The previous sections have described how POOMA represents a set of particles and allows the user to perform typical operations in a particle simulation. The remainder of this document shows how POOMA Particles and Fields can be combined to create complete simulations of complex physical systems. The first section describes how POOMA interpolates values when gathering and scattering field and particle data. This is followed by a look at the in's and out's of data layout, and a medium-sized example that illustrates how these ideas all fit together. (Note: The current implementation of POOMA Particles allows interaction with the original version of the Field abstraction created for POOMA II. Particles have not yet been modified to work with the new experimental design for POOMA Fields that is implemented in the src/Field directory. Thus, all the discussion here of POOMA Fields refers to the original implementation.)

Particle/Field Interpolation

POOMA's Particles class is designed to be used in conjunction with the Field class. Interpolators are the glue that bind these two together, by specifying how to calculate field values at particle (or other) locations that lie in between the locations of Field elements. These interpolators can also be used to go in the opposite direction, acumulating contributions from particles at arbitrary locations into the elements of a Field.

Interpolators are used to gather values to specific positions in a field's spatial domain from nearby field elements, or to scatter values from such positions into the field. The interpolation stencil describes how values are translated between field element locations and arbitrary points in space. An example of using this kind of interpolation is particle-in-cell (PIC) simulations, in which charged particles move through a discretized domain. The particle interactions are determined by scattering the particle charge density into a field, solving for the self-consistent electric field, and gathering that field back to the particle positions to compute forces on the particles. The last code example in this document describes a simulation of this kind.

POOMA currently offers three types of interpolation stencils: nearest grid point (NGP), cloud-in-cell (CIC), and subtracted dipole scheme (SUDS). NGP is a zeroth-order interpolation that gathers from or scatters to the field element nearest the specified location. CIC is a first-order scheme that performs linear weighting among the 2^D field elements nearest the point in D-dimensional space. SUDS is also first-order, but it uses just the nearest field element and its two neighbors along each dimension, so it is only a 7-point stencil in three dimensions. Other types of interpolation schemes can be added in a straightforward manner.

Interpolation is invoked by calling the global functions gather() and scatter(), both of which take four arguments:

  1. the particle attribute to be gathered to or scattered from (usually a single DynamicArray, although one could scatter an expression involving DynamicArray objects as well);
  2. the Field to be gathered from or scattered to;
  3. the particle positions (normally a DynamicArray that is a member of a Particles subclass); and
  4. an interpolator tag object of type NGP, CIC or SUDS, indicating which interpolation stencil to use.

An example of using the gather() function is:

gather(P.efd, Efield, P.pos, CIC());

where P is a Particles subclass object whose attributes include efd for storing the gathered electric field from the Field Efield and pos for the particle positions. The default constructor of the interpolator tag CIC is used to create a temporary instance of the class to pass to gather(), telling it which interpolation scheme to use.

The particle attribute and position arguments passed to gather() and scatter() should have the same layout, and the positions must refer to the geometry of the Field being used. The interpolator will compute the required interpolated values for the particles on each patch. These functions assume each particle is only interacting with field elements in the Field patch that exactly corresponds to the particle's current patch. Thus, applications must use the SpatialLayout particle layout strategy and make sure that the Field has enough guard layers to accommodate the interpolation stencil.

In addition to the basic gather() and scatter() functions, POOMA offers some variants that optimize other common operations. The first of these, scatterValue(), scatters a single value into a Field rather than a particle attribute with different values for each particle. Its first argument is a single value with a type that is compatible with the Field element type.

The other three optimized methods are gatherCache(), scatterCache(), and scatterValueCache(). Each of these methods has two overloaded variants, which allow applications to cache and reuse interpolated data, such as the nearest grid point for each particle and the distance from the particle's position to that grid point. The difference between the elements of each overloaded pair of methods is that one takes both a particle position attribute and a particle interpolator cache attribute among its arguments, while the other takes only the cache attribute. When the first of these is called, it caches position information in the provided cache attribute. When the second version is called with that cache attribute as an argument, it re-uses that information. This can speed up computation considerably, but it is important to note that applications can only do this safely when the particle positions are guaranteed not to have changed since the last interpolation.

Data Layout for Particles and Fields

The use of particles and fields together in a single application brings up some issues regarding data layout that do not arise when either is used on its own. There are two characteristics of Engine objects that must be considered in order to determine whether they can be used for attributes in Particles objects:

  1. Can the engine use a layout that is "shared" among several engines of the same category, such that the size and layout of the engine is synchronized with the other engines using the layout?

    If this is the case, then creation, destruction, repartitioning, and other operations are done for all the engines sharing the layout. Particles require all their attributes to use a shared layout, so only engines that use a shared layout can be used for particle attributes. The only engine type with this capability in this release of POOMA (i.e., the only engine that is usable in Particles attributes) is MultiPatch.

    MultiPatch can use several different types of layouts and single-patch engines, and all MultiPatch engines use a shared layout. However, only the MultiPatch<DynamicTag,*> specializations of MultiPatch engines are useful for Particles attributes, since only that engine type can have patches of dynamically varying size.

  2. Can the engine change size dynamically?

    The engine type used for particle attributes must have dynamic capabilities. Thus, we should use dynamic single-patch engines inside of MultiPatch. The only engines available in this release of POOMA that meet this requirement are Dynamic and Remote<Dynamic>. Both of these are inherently one-dimensional and support operations such as create(), destroy() and copy(). Remote<Dynamic> is similar to Dynamic, but it is context-aware and useful for multi-context codes.

    Implicit in the discussion above is the fact that there are actually three different types of layout classes that an application programmer must keep in mind:

    1. the layout for the particle attributes;
    2. the layout for the Field given to the particle SpatialLayout (which is used to determine the layout of the space in which the particles move around); and
    3. the actual SpatialLayout that connects the info about the Field layout to the Particles attribute layout.

    The only thing that needs to match between the attribute and Field layouts is the number of patches, which must be exactly the same. The engine type (and thus the layout type) of the attributes and of the Field do not have to match. Typically, both the attributes and the Field will have a MultiPatch engine with the same number of patches, but these engines will have different single-patch engine types (Dynamic vs. Brick) and use different types of layouts (Dynamic vs. Grid).

    Note once again that in the simple case of a UniformLayout particle layout, applications do not need to worry about any Field layout type, only the particle attribute layout (which still needs to be shared) and the particle layout. This commonly arises during the early prototyping (i.e., pre-parallel) stages of application development, when you might limit an application to a single patch for simplicity.

Example: Particle-in-Cell Simulation

Our third and final example of the Particles class is a particle-in-cell program, which simulates the motion of charged particles in a static sinusoidal electrical field in two dimensions. This example brings together the Field classes (discussed elsewhere) with the Particles capabilities we have been describing here.

Because this example is longer than the others in this document, it will be described in sections. For a unified listing of the source code, please see the file examples/Particles/PIC2d/PIC2d.cpp. The first step is to include all of the usual header files:

001  #include "Pooma/Particles.h"
002  #include "Pooma/DynamicArrays.h"
003  #include "Pooma/Fields.h"
004  #include "Utilities/Inform.h"
005  #include <iostream>
006  #include <stdlib.h>
007  #include <math.h>

Once this has been done, the application can define a traits class for the Particles object it is going to create. As always, this contains typedef's for AttributeEngineTag_t and ParticleLayout_t. The traits class for this example also includes an application-specific typedef called InterpolatorTag_t, for reasons discussed below.

008  template <class EngineTag, class Centering, class MeshType, class FL,
009            class InterpolatorTag>
010  struct PTraits
011  {
012    // The type of engine to use in the attributes
013    typedef EngineTag AttributeEngineTag_t;
015    // The type of particle layout to use
016    typedef SpatialLayout< DiscreteGeometry<Centering,MeshType>, FL> 
017      ParticleLayout_t;
019    // The type of interpolator to use
020    typedef InterpolatorTag InterpolatorTag_t;
021  };

The interpolator tag type is included in the traits class because an application might want the Particles subclass to provide the type of interpolator to use. One example of this is the case in which a gather() or scatter() call occurs in a subroutine which is passed an object of a Particles subclass. This subroutine could extract the desired interpolator type from that object using: // Particles-derived type Particles_t already defined typedef typename Particles_t::InterpolatorTag_t InterpolatorTag_t;

In this short example, this is not really necessary because InterpolatorTag_t is being defined and then used within the same file scope. Nevertheless, this illustrates a situation in which the user might want to add new traits to their PTraits class beyond the required traits AttributeEngineTag_t and ParticleLayout_t.

We can now also define the class which will represent the charged particles in the simulation. As in other examples, this is derived from Particles and templated on a traits class, so that such things as its layout and evaluation engine can be quickly, easily and reliably changed. This class has three intrinsic properties: the particle positions R, their velocities V, and their charge/mass ratios qm. The class also has a fourth attribute called E, which is used to record the electric field at each particle's position in order to calculate forces. This calculation will be discussed in greater detail below.

024  template <class PT>
025  class ChargedParticles : public Particles<PT>
026  {
027  public:
028    // Typedefs
029    typedef Particles<PT>                          Base_t;
030    typedef typename Base_t::AttributeEngineTag_t  AttributeEngineTag_t;
031    typedef typename Base_t::ParticleLayout_t      ParticleLayout_t;
032    typedef typename ParticleLayout_t::AxisType_t  AxisType_t;
033    typedef typename ParticleLayout_t::PointType_t PointType_t;
034    typedef typename PT::InterpolatorTag_t         InterpolatorTag_t;
036    // Dimensionality
037    static const int dimensions = ParticleLayout_t::dimensions;
039    // Constructor: set up layouts, register attributes
040    ChargedParticles(const ParticleLayout_t &pl)
041    : Particles<PT>(pl)
042    {
043      addAttribute(R);
044      addAttribute(V);
045      addAttribute(E);
046      addAttribute(qm);
047    }
049    // Position and velocity attributes (as public members)
050    DynamicArray<PointType_t,AttributeEngineTag_t> R;
051    DynamicArray<PointType_t,AttributeEngineTag_t> V;
052    DynamicArray<PointType_t,AttributeEngineTag_t> E;
053    DynamicArray<double,     AttributeEngineTag_t> qm;
054  };

With the two classes that the simulation relies upon defined, the program next defines the dependent types, constants, and other values that the application needs. These include the dimensionality of the problem (which can easily be changed), the type of mesh on which the calculations are done, the mesh's size, and so on:

058  // Dimensionality of this problem
059  static const int PDim = 2;
061  // Engine tag type for attributes
062  typedef MultiPatch<DynamicTag,Dynamic> AttrEngineTag_t;
064  // Mesh type
065  typedef UniformRectilinearMesh< PDim, Cartesian<PDim>, double > Mesh_t;
067  // Centering of Field elements on mesh
068  typedef Cell Centering_t;
070  // Geometry type for Fields
071  typedef DiscreteGeometry<Centering_t,Mesh_t> Geometry_t;
073  // Field types
074  typedef Field< Geometry_t, double,
075                 MultiPatch<UniformTag,Brick> > DField_t;
076  typedef Field< Geometry_t, Vector<PDim,double>,
077                 MultiPatch<UniformTag,Brick> > VecField_t;
079  // Field layout type, derived from Engine type
080  typedef DField_t::Engine_t Engine_t;
081  typedef Engine_t::Layout_t FLayout_t;
083  // Type of interpolator
084  typedef NGP InterpolatorTag_t;
086  // Particle traits class
087  typedef PTraits<AttrEngineTag_t,Centering_t,Mesh_t,FLayout_t,
088                  InterpolatorTag_t> PTraits_t;
090  // Type of particle layout
091  typedef PTraits_t::ParticleLayout_t PLayout_t;
093  // Type of actual particles
094  typedef ChargedParticles<PTraits_t> Particles_t;
096  // Grid sizes
097  const int nx = 200, ny = 200;
099  // Number of particles in simulation
100  const int NumPart = 400;
102  // Number of timesteps in simulation
103  const int NumSteps = 20;
105  // The value of pi (some compilers don't define M_PI)
106  const double pi = acos(-1.0);
108  // Maximum value for particle q/m ratio
109  const double qmmax = 1.0;
111  // Timestep
112  const double dt = 1.0;

The preparations above might seem overly elaborate, but the payoff comes when the main simulation routine is written. After the usual initialization call, and the creation of an Inform object to handle output, the program defines one geometry object to represent the problem domain, and another that includes a guard layer:

115  int main(int argc, char *argv[])
116  {
117    // Initialize POOMA and output stream.
118    Pooma::initialize(argc,argv);
119    Inform out(argv[0]);
121    out << "Begin PIC2d example code" << std::endl;
122    out << "------------------------" << std::endl;
124    // Create mesh and geometry objects for cell-centered fields.
125    Interval<PDim> meshDomain(nx+1,ny+1);
126    Mesh_t mesh(meshDomain);
127    Geometry_t geometry(mesh);
129    // Create a second geometry object that includes a guard layer.
130    GuardLayers<PDim> gl(1);
131    Geometry_t geometryGL(mesh,gl);

The program then creates a pair of Field objects. The first, phi, is a field of doubles and records the electrostatic potential at points in the mesh. The second, EFD, is a field of two-dimensional Vectors and stores the electric field at each mesh point. The types used in these definitions were declared on lines 74-77 above. Note how these definitions are made in terms of other defined types, such as Geometry_t, rather than directly in terms of basic types. This allows the application to be modified quickly and reliably with minimal changes to the code.

133    // Create field layout objects for our electrostatic potential
134    // and our electric field.  Decomposition is 4 x 4 and replicated.
135    Loc<PDim> blocks(4,4);
136    FLayout_t flayout(geometry.physicalDomain(),blocks,ReplicatedTag()),
137      flayoutGL(geometryGL.physicalDomain(),blocks,gl,ReplicatedTag());
139    // Create and initialize electrostatic potential and electric field.
140    DField_t phi(geometryGL,flayoutGL);
141    VecField_t EFD(geometry,flayout);

The application now adds periodic boundary conditions to the electrostatic field phi, so that particles will not see sharp changes in the potential at the edges of the simulation domain. The values of phi and EFD are then set: phi is defined explicitly, while EFD records the gradient of phi.

144    // potential phi = phi0 * sin(2*pi*x/Lx) * cos(4*pi*y/Ly)
145    // Note that phi is a periodic Field
146    // Electric field EFD = -grad(phi)
147    phi.addBoundaryConditions(AllPeriodicFaceBC());
148    double phi0 = 0.01 * nx;
149    phi = phi0 * sin(2.0*pi*phi.x().comp(0)/nx)
150               * cos(4.0*pi*phi.x().comp(1)/ny);
151    EFD = -grad<Centering_t>(phi);

With the fields in place, the application creates the particles whose motions are to be simulated, and adds periodic boundary conditions to this object as well. The globalCreate() call creates the same number of particles on each processor.

153    // Create a particle layout object for our use
154    PLayout_t layout(geometry,flayout);
156    // Create a Particles object and set periodic boundary conditions
157    Particles_t P(layout);
158    Particles_t::PointType_t lower(0.0,0.0), upper(nx,ny);
159    PeriodicBC<Particles_t::PointType_t> bc(lower,upper);
160    P.addBoundaryCondition(P.R,bc);
162    // Create an equal number of particles on each processor
163    // and recompute global domain.
164    P.globalCreate(NumPart);

Note that the definitions of lower and upper could be made dimension-independent by defining them with a loop. If ng is an array of ints of length PDim, then this loop would be:

Particles_t::PointType_t lower, upper;
for (int d=0; d<PDim; ++d)
  lower(d) = 0;
  upper(d) = ng[d];

The application then randomizes the particles' positions and charge/mass ratios using a sequential loop (since parallel random number generation is not yet in POOMA). Once this has finished, the method swap() is called to redistribute the particles based on their positions; i.e., to move each particle to its home processor. The initial positions, velocities, and charge/mass ratios of the particles are then printed out.

166    // Random initialization for particle positions in nx by ny domain
167    // Zero initialization for particle velocities
168    // Random intialization for charge-to-mass ratio from -qmmax to qmmax
169    P.V = Particles_t::PointType_t(0.0);
170    srand(12345U);
171    Particles_t::PointType_t initPos;
172    typedef Particle_t::AxisType_t Coordinate_t;
173    Coordinate_t ranmax = static_cast<Coordinate_t>(RAND_MAX);
174    double ranmaxd = static_cast<double>(RAND_MAX);
175    for (int i = 0; i < NumPart; ++i)
176    {
177      initPos(0) = nx * (rand() / ranmax);
178      initPos(1) = ny * (rand() / ranmax);
179      P.R(i) = initPos;
180      P.qm(i) = qmmax * (2 * (rand() / ranmaxd) - 1);
181    }
183    // Redistribute particle data based on spatial layout
184    P.swap(P.R);
186    out << "PIC2d setup complete." << std::endl;
187    out << "---------------------" << std::endl;
189    // Display the initial particle positions, velocities and qm values.
190    out << "Initial particle data:" << std::endl;
191    out << "Particle positions: "  << P.R << std::endl;
192    out << "Particle velocities: " << P.V << std::endl;
193    out << "Particle charge-to-mass ratios: " << P.qm << std::endl;

The application is now able to enter its main timestep loop. In each time step, the particle positions are updated, and then sync() is called to invoke boundary conditions, swap particles, and then renumber. A call is then made to gather() (line 208) to determine the field at each particle's location. As discussed earlier, this function uses the interpolator to determine values that lie off mesh points. Once the field strength is known, the particle velocities can be updated:

195    // Begin main timestep loop
196    for (int it=1; it <= NumSteps; ++it)
197    {
198      // Advance particle positions
199      out << "Advance particle positions ..." << std::endl;
200      P.R = P.R + dt * P.V;
202      // Invoke boundary conditions and update particle distribution
203      out << "Synchronize particles ..." << std::endl;
204      P.sync(P.R);
206      // Gather the E field to the particle positions
207      out << "Gather E field ..." << std::endl;
208      gather( P.E, EFD, P.R, Particles_t::InterpolatorTag_t() );
210      // Advance the particle velocities
211      out << "Advance particle velocities ..." << std::endl;
212      P.V = P.V + dt * P.qm * P.E;
213    }

Finally, the state of the particles at the end of the simulation is printed out, and the simulation is closed down:

215    // Display the final particle positions, velocities and qm values.
216    out << "PIC2d timestep loop complete!" << std::endl;
217    out << "-----------------------------" << std::endl;
218    out << "Final particle data:" << std::endl;
219    out << "Particle positions: "  << P.R << std::endl;
220    out << "Particle velocities: " << P.V << std::endl;
221    out << "Particle charge-to-mass ratios: " << P.qm << std::endl;
223    // Shut down POOMA and exit
224    out << "End PIC2d example code." << std::endl;
225    out << "-----------------------" << std::endl;
226    Pooma::finalize();
227    return 0;


This document has shown how POOMA's Field and Particles classes can be combined to create complete physical simulations. While more setup code is required than with Fortran-77 or C, the payoff is high-performance programs that are more flexible and easier to maintain.