In compiling nearest-neighbor code (for example, in discretizing partial differential equations or implementing convolutions), a standard technique is to allocate storage on each processor for the local array section so as to include additional space for the elements that have to be moved in from neighboring processors. This additional storage is referred to as "shadow edges." There are conceptually two shadow edges for each array dimension: one at the low end of the local array section and the other at the high end.
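The layout described above can be illustrated with a small simulation. The following Python sketch is not HPF and not how any particular compiler lays out memory; it only models a one-dimensional BLOCK-distributed array whose local storage is extended by a low and a high shadow edge. The function names `block_bounds` and `local_with_shadow` are hypothetical, indices are 0-based, and shadow cells that fall outside the global array are filled with `None`.

```python
def block_bounds(n, nprocs, p):
    """Global index range [lo, hi) owned by processor p under BLOCK."""
    block = -(-n // nprocs)  # ceiling division gives the block size
    lo = min(p * block, n)
    hi = min(lo + block, n)
    return lo, hi


def local_with_shadow(global_a, nprocs, p, low_width, high_width):
    """Local storage laid out as [low shadow | owned section | high shadow]."""
    n = len(global_a)
    lo, hi = block_bounds(n, nprocs, p)
    local = []
    for g in range(lo - low_width, hi + high_width):
        # Cells beyond the ends of the global array have no neighbor to copy from.
        local.append(global_a[g] if 0 <= g < n else None)
    return local


a = list(range(12))
for p in range(3):
    print(p, block_bounds(len(a), 3, p), local_with_shadow(a, 3, p, 1, 1))
```

With 12 elements on 3 processors and a width of 1 on both sides, processor 1 owns global indices 4 through 7, and its local storage holds the elements at indices 3 through 8.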
Within a single routine, the compiler can tell which arrays require shadow edges and allocate this additional space accordingly. However, since the width of the shadow area depends on the size of the computational stencil being used, an array may require different shadow widths in different routines. Thus, without interprocedural analysis, an array argument may need to be copied into a space with the appropriate shadow width on each procedure call. A similar data motion would be required to copy the data back to its original location on exit from the subroutine. This unnecessary data motion can be avoided by allowing the user to specify the required shadow width when the array is declared.
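The cost of mismatched shadow widths can be sketched as follows. This Python fragment is an illustration under simplifying assumptions, not a description of any compiler's calling convention: it models only the storage layout of one processor's local section, and the helper name `repad` is hypothetical. It counts how many owned elements must be moved when the caller's storage and the callee's expected storage use different shadow widths.

```python
def repad(local, have, need):
    """Re-lay out local storage from shadow widths `have` to `need`.

    `local` is [low shadow | owned | high shadow] with widths have = (low, high).
    Returns (storage, moved), where `moved` counts the owned elements copied.
    """
    if have == need:
        return local, 0  # layouts agree: the callee can use the storage in place
    owned = local[have[0]:len(local) - have[1]]
    storage = [None] * need[0] + owned + [None] * need[1]
    return storage, len(owned)


caller_storage = [10, 20, 30, 40]          # declared with no shadow edges

# Callee uses a stencil that needs one shadow element on each side.
inside, moved_in = repad(caller_storage, (0, 0), (1, 1))   # copy on entry
back, moved_out = repad(inside, (1, 1), (0, 0))            # copy on exit
print(moved_in + moved_out)                # elements moved for one call

# If the caller had declared the same widths, nothing would be moved.
_, moved = repad(inside, (1, 1), (1, 1))
print(moved)
```

In this model, every call with mismatched widths moves the whole owned section twice, whereas a declaration with matching widths moves nothing.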
The syntax for declaring shadow widths is as follows:
H817  shadow-directive   is  SHADOW shadow-target shadow-attr-stuff
H818  shadow-target      is  object-name
H819  shadow-attr-stuff  is  ( shadow-spec-list )
H820  shadow-spec        is  width
                         or  low-width : high-width
H821  width              is  int-expr
H822  low-width          is  int-expr
H823  high-width         is  int-expr
A shadow-spec of width is equivalent to a shadow-spec of width:width. Thus, the directives
!HPF$ DISTRIBUTE (BLOCK) :: A
!HPF$ SHADOW (w) :: A
specify that the array A is distributed BLOCK and is to have a shadow width of w at both the low end and the high end. If A is a dummy argument, this gives the compiler enough information to avoid unnecessary data motion at procedure calls.
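The equivalence between the two forms of shadow-spec can be expressed as a small normalization step. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes each width is written as an integer literal, whereas the grammar above allows any int-expr.

```python
def normalize_shadow_spec(spec):
    """Map one shadow-spec ('w' or 'low:high') to (low_width, high_width)."""
    parts = [part.strip() for part in spec.split(":")]
    if len(parts) == 1:
        width = int(parts[0])
        return width, width          # 'w' is shorthand for 'w:w'
    if len(parts) == 2:
        return int(parts[0]), int(parts[1])
    raise ValueError("expected 'width' or 'low-width : high-width': %r" % spec)


def normalize_shadow_spec_list(specs):
    """Normalize a shadow-spec-list such as '1:2, 3', one entry per dimension."""
    return [normalize_shadow_spec(s) for s in specs.split(",")]


print(normalize_shadow_spec("3"))
print(normalize_shadow_spec("1:2"))
print(normalize_shadow_spec_list("1:2, 3"))
```

After normalization, every dimension carries an explicit pair of widths, so later steps do not need to distinguish the two spellings.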
Alternatively, different shadow widths can be specified for the low end and high end of a dimension. For example:
      REAL, DIMENSION (1000) :: A
!HPF$ DISTRIBUTE(BLOCK), SHADOW(1:2) :: A
      ....
      FORALL (i = 2:998)
        A(i) = 0.25 * (A(i) + A(i-1) + A(i+1) + A(i+2))
      END FORALL
specifies that each local section needs only one non-local element at the low end, for the reference A(i-1), while two are needed at the high end, for the references A(i+1) and A(i+2).
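The relation between the stencil and the shadow widths in this example can be checked with a short simulation. The following Python sketch is an illustration, not HPF: it uses 0-based indices, the function names are hypothetical, and the "communication" is modeled by reading the global array when each local section is built. It derives the widths from the stencil offsets and confirms that computing block by block, using only local storage with those shadow edges, reproduces the result of the global computation.

```python
OFFSETS = (0, -1, 1, 2)   # A(i), A(i-1), A(i+1), A(i+2)


def shadow_widths(offsets):
    """Low and high shadow widths required by a 1-D stencil."""
    return max(0, -min(offsets)), max(0, max(offsets))


def stencil_global(a):
    """Reference result with FORALL semantics: all right-hand sides use old values."""
    new = list(a)
    for i in range(1, len(a) - 2):   # 0-based 1..n-3, i.e. 1-based 2..n-2
        new[i] = 0.25 * sum(a[i + d] for d in OFFSETS)
    return new


def stencil_blocked(a, nprocs):
    """Same computation, one BLOCK at a time, reading only local storage."""
    n = len(a)
    low, high = shadow_widths(OFFSETS)
    block = -(-n // nprocs)          # ceiling division gives the block size
    result = list(a)
    for p in range(nprocs):
        lo, hi = min(p * block, n), min((p + 1) * block, n)
        # Local storage: [low shadow | owned section | high shadow].
        local = [a[g] if 0 <= g < n else None
                 for g in range(lo - low, hi + high)]
        for g in range(max(lo, 1), min(hi, n - 2)):
            k = g - lo + low         # position of global index g in local storage
            result[g] = 0.25 * sum(local[k + d] for d in OFFSETS)
    return result


a = [float(i * i % 7) for i in range(1000)]
print(shadow_widths(OFFSETS))                     # (1, 2)
print(stencil_blocked(a, 4) == stencil_global(a))
```

The smallest offset, -1, determines the low width and the largest offset, +2, determines the high width, which matches the SHADOW(1:2) declaration in the example.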