The ON directive restricts the active processor set for a computation to those processors named in its home. The computation controlled is either the following Fortran statement (for an on-directive) or the contained block (for a block-on-directive). We refer to the controlled computation as the ON block.
That is, it advises the compiler to use the named processor(s) to perform the ON block. Like the mapping directives ALIGN and DISTRIBUTE, this is advice rather than an absolute commandment; the compiler may override an ON directive. Also like ALIGN and DISTRIBUTE, the ON directive may affect the efficiency of computation, but not the final results.
The single-statement ON directive sets the active processor set for the first non-comment statement that follows it. It is said to apply to that statement. If the statement is a compound statement (e.g., a DO loop or an IF-THEN-ELSE construct), then the ON directive also applies to all statements nested therein. Similarly, the ON construct applies the initial ON clause to--i.e., sets the active processor set for--all statements up to the matching END ON directive.
The evaluation of any function referred to in the home expression is not affected by the ON directive; these functions are called on all processors active when control reaches the directive. Thus,
      !HPF$ ON HOME( P(1: (ACTIVE_NUM_PROCS() - 1)) ) ...
is a reasonable way to idle one active processor, and is not paradoxically self-referential.
The HOME clause can name a program object, a template, or a processors arrangement. For each of these possibilities, it can specify a single element or multiple elements. This is translated into the processor(s) executing the ON block as follows:
      !HPF$ ON HOME ( A(2:4) )
tells the compiler to perform the statement on the processors owning A(2), A(3), and A(4). If A were distributed BLOCK, this might be one processor; if it were distributed CYCLIC, it would be three processors (assuming that many processors were available). Extra copies of elements created by a SHADOW directive (H817) are not taken into consideration by the HOME clause.
      !HPF$ ON ( P(2:4) )
will execute the following statement on the three processors P(2), P(3), and P(4).
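The BLOCK versus CYCLIC remark in the A(2:4) example can be checked with a small model. The following Python sketch is not HPF code; it only mimics the usual ownership formulas, assuming a one-dimensional array of 16 elements on 4 processors with a BLOCK size of ceil(N/P). The names `block_owner`, `cyclic_owner`, and `home_processors`, and the sizes `N` and `P`, are illustrative assumptions.

```python
from math import ceil


def block_owner(i, n, p):
    """1-based processor owning element i of an n-element array under BLOCK.

    Assumes the default block size ceil(n / p).
    """
    block_size = ceil(n / p)
    return (i - 1) // block_size + 1


def cyclic_owner(i, p):
    """1-based processor owning element i under CYCLIC (block size 1)."""
    return (i - 1) % p + 1


def home_processors(indices, owner):
    """Set of processors named by HOME for the given element indices."""
    return sorted({owner(i) for i in indices})


N, P = 16, 4            # assumed array extent and processor count
section = range(2, 5)   # models the section A(2:4)

block_home = home_processors(section, lambda i: block_owner(i, N, P))
cyclic_home = home_processors(section, lambda i: cyclic_owner(i, P))

print("BLOCK :", block_home)
print("CYCLIC:", cyclic_home)
```

Under these assumptions the BLOCK mapping places all three elements on a single processor, while the CYCLIC mapping spreads them over three, which matches the description above.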
In every case, the ON directive specifies the processor(s) that should perform a computation. More formally, it sets the active processors for the statements governed by the ON directive, as described in Section 9.1. That section also describes how some statements (notably ALLOCATE and dynamic remapping directives) require that particular processors be included in the active set. If one of these constructs occurs in the ON block and the active processor set does not contain all the required processors, then the program is not standard-conforming.
Note that the ON directive only specifies how computation is partitioned among processors; it does not indicate processors that may be involved in data transfer. Also, the ON clause by itself does not guarantee that its body can be executed in parallel with any other operation. However, placing the computation can have a significant effect on data locality. As later examples will show, the combination of ON and INDEPENDENT can also provide control over the load balance of parallel computations.
      DO I = 1, N
      !HPF$ ON HOME( A(MY_FCN(I)) ) BEGIN
        ...
      !HPF$ END ON
      END DO
Here, the generated code can perform an "inspector" (i.e., a skeleton loop that only evaluates the HOME clause of each iteration) to produce a list of iterations assigned to each processor. This list can be produced in parallel, since MY_FCN must be side-effect free (at least, the programmer cannot rely on any side effects). However, distributing the computation of home to all processors may require unstructured communication patterns, possibly negating the advantage of parallelism. In general, more advanced compilers will be able to efficiently invert more complex HOME clauses. It is recommended that the abilities (and limitations) of a particular compiler be documented clearly for users.
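The inspector idea can be modeled in a few lines. The Python sketch below is only an illustration of the technique, not what any HPF compiler generates: it evaluates the HOME expression for each iteration and records which processor should run it. The body of `my_fcn` is a hypothetical stand-in for MY_FCN, and the CYCLIC ownership rule in `owner` is an assumed distribution for A.

```python
from collections import defaultdict


def my_fcn(i):
    """Hypothetical stand-in for MY_FCN; it has no side effects."""
    return (3 * i) % 8 + 1


def owner(index, n_procs):
    """Assumed CYCLIC distribution of A: 1-based owner of A(index)."""
    return (index - 1) % n_procs + 1


def inspector(n_iters, n_procs):
    """Skeleton loop: evaluate only the HOME clause of each iteration.

    Returns a dict mapping each processor to the iterations it executes.
    """
    schedule = defaultdict(list)
    for i in range(1, n_iters + 1):
        schedule[owner(my_fcn(i), n_procs)].append(i)
    return dict(schedule)


schedule = inspector(n_iters=8, n_procs=4)
for proc in sorted(schedule):
    print(f"P({proc}) executes iterations {schedule[proc]}")
```

Because the loop only reads `my_fcn`, the iterations of the inspector could themselves be split among processors; the cost noted above is in combining the per-processor lists afterwards.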
Note that processors "screened out" by the naive implementation may still be required to participate in data transfer. If the underlying architecture allows one-sided communication (e.g., shared memory or GET/PUT), this is not a problem. On message-passing machines, a request-reply protocol may be used. This requires the inactive processors to enter a wait loop until the ON block completes, or requires the runtime system to handle requests asynchronously. Again, it is recommended that the documentation tell programmers which cases are likely to be efficient and which inefficient on a particular system. (End of advice to implementors.)
It should also be noted that the ON clause does not change the semantics of a program, in the same sense that DISTRIBUTE does not change semantics. In particular, an ON clause by itself does not change sequential code into parallel code, because the code in the ON block can still interact with code outside the ON block. (To put it another way, ON does not spawn processes.) (End of advice to users.)
It is legal to nest ON directives, if the set of active processors named by the inner ON directive is included in the set of active processors from the outer directive. The syntax of on-construct automatically ensures that it is properly nested inside other compound statements, and that compound statements properly nest inside of it. As with other Fortran compound statements, transfer of control to the interior of an on-construct from outside the block is prohibited: an on-construct may be entered only by executing the (executable) ON directive. Transfers within a block may occur. However, HPF also prohibits transfers of control from the interior of an on-construct to outside the on-construct, except by "falling through" the END ON directive. Note that this is stricter than in ordinary Fortran. If ON clauses are nested, then the innermost home effectively controls execution of the statement(s). A programmer can think of this as successively restricting the set of processors at each level of ON nesting; clearly, the last restriction must be the strongest. Alternately, the programmer can think of this as a fork-join approach to nested parallelism.
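The "successive restriction" view of nested ON directives can be modeled as a stack of processor sets. The Python sketch below is an illustrative model only; the class name `ActiveSetStack` and its methods are assumptions, and the error raised on a non-subset home stands in for the program being nonconforming rather than for any behavior a compiler is required to have.

```python
class ActiveSetStack:
    """Model of the active processor set under nested ON blocks."""

    def __init__(self, all_procs):
        self._stack = [frozenset(all_procs)]

    @property
    def active(self):
        """Processors active at the current nesting level."""
        return self._stack[-1]

    def enter_on(self, home):
        """Enter an ON block whose home names the given processors."""
        home = frozenset(home)
        if not home <= self.active:
            # The inner home must be included in the outer active set.
            raise ValueError("inner ON names processors outside the active set")
        self._stack.append(home)

    def exit_on(self):
        """Fall through END ON, restoring the enclosing active set."""
        if len(self._stack) == 1:
            raise RuntimeError("END ON without a matching ON")
        self._stack.pop()


stack = ActiveSetStack(range(1, 9))   # assumed arrangement P(1:8)
stack.enter_on(range(2, 6))           # models ON ( P(2:5) ) BEGIN
stack.enter_on({3, 4})                # models a nested ON ( P(3:4) ) BEGIN
inner_active = sorted(stack.active)
stack.exit_on()                       # models the inner END ON
outer_active = sorted(stack.active)
print(inner_active, outer_active)
```

Each `enter_on` can only shrink the set, so the innermost home always determines where a statement executes, and leaving the block restores the enclosing set, as in the fork-join reading.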
If an ON directive includes a NEW clause, the meaning is the same as a NEW clause in an INDEPENDENT directive. The operation of the program would be identical if the NEW variables were allocated anew, and distributed onto the active processors, on every entry to the ON directive's scope, and deallocated on exit from the ON block. That is, the NEW variables are undefined on entry (i.e., assigned before use in the ON block) and undefined on exit (i.e., not used after the ON block, unless first reassigned). In addition, NEW variables cannot be remapped in the ON clause's scope, whether by REALIGN, REDISTRIBUTE, or by argument association (at subroutine calls). If a variable appears in a NEW clause but does not meet these conditions, then the program is not HPF-conforming. NEW variables are not considered by any nested RESIDENT directives, as detailed in Section 9.3.
The NEW variables are implicitly reallocated and remapped onto the active processors on entry to the ON block. For this reason, there are restrictions on their explicit mappings.
      !HPF$ DISTRIBUTE X(BLOCK, *)
      !HPF$ DISTRIBUTE Y ONTO P          ! Nonconforming due to ONTO clause
      !HPF$ ALIGN WITH X :: Z            ! Nonconforming; ALIGN forbidden
      !HPF$ ON (P(1:4)), NEW(X, Y, Z) BEGIN
      !HPF$ END ON