An extrinsic procedure can be defined as explicit SPMD code by specifying the local procedure code that is to execute on each processor. In this section, we describe the contract between the caller and an EXTRINSIC(MODEL="LOCAL") callee. It is important not to confuse the extrinsic procedure, which is conceptually a single procedural entity called from the HPF program, with the individual local procedures that are executed on each node, one apiece. An invocation of an extrinsic procedure results in a separate invocation of a local procedure on each processor. The execution of an extrinsic procedure consists of the concurrent execution of a local procedure on each executing processor. Each local procedure may terminate at any time by executing a RETURN statement. However, the extrinsic procedure as a whole terminates only after every local procedure has terminated; in effect, the processors are synchronized before return to a global HPF caller.
With the exception of returning from a local procedure to the global caller that initiated local execution, there is no implicit synchronization required of the locally executing processors. Variables declared in a local procedure are held in local storage, private to each processor. Accessing data outside the processor requires either preparatory communication to copy data into the processor before running the local code, or explicit communication operations between the separately executing copies of the local procedure. Individual implementations may provide implementation-dependent means for communicating, for example, through a message-passing library or a shared-memory mechanism. Such communication mechanisms are beyond the scope of this specification. Note, however, that many useful portable algorithms that require only independence of control structure can take advantage of local routines, without requiring a communication facility.
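The execution model described above can be mimicked in a few lines of ordinary Python. This is only an illustrative simulation, not HPF and not part of any implementation: threads stand in for processors, and the names `local_procedure` and `extrinsic_call`, the block partitioning, and the shared `results` list (a stand-in for whatever mechanism returns values to the caller) are all assumptions made for the sketch. Each local procedure follows its own control path and may return early, there is no communication between copies, and the caller resumes only after every copy has terminated.

```python
import threading


def local_procedure(proc_id, local_block, results):
    """Hypothetical local procedure: count leading nonnegative elements."""
    count = 0  # private to this invocation, like local storage
    for value in local_block:
        if value < 0:
            results[proc_id] = count
            return  # early RETURN; other copies keep running
        count += 1
    results[proc_id] = count


def extrinsic_call(global_array, num_procs):
    """Simulate one extrinsic invocation: one local procedure per processor."""
    block = -(-len(global_array) // num_procs)  # ceiling division
    results = [None] * num_procs
    threads = []
    for p in range(num_procs):
        # Preparatory "communication": copy this processor's block before
        # the local code runs (list slicing makes a private copy).
        local_block = global_array[p * block:(p + 1) * block]
        t = threading.Thread(target=local_procedure,
                             args=(p, local_block, results))
        threads.append(t)
        t.start()
    # The only synchronization point: the extrinsic procedure as a whole
    # terminates after every local procedure has terminated.
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(extrinsic_call([1, 2, -3, 4, 5, 6, 7, 8, -1, 2, 3, 4], 3))
```

In this sketch the three copies finish after different amounts of work, yet the caller sees one result per processor only once all of them have returned.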
The LOCAL model assumes only that nonsequential array axes are mapped independently to axes of a rectangular processor grid, each array axis to at most one processor axis (no "skew" distributions) and no two array axes to the same processor axis. This restriction suffices to ensure that each physical processor contains a subset of array elements that can be locally arranged in a rectangular configuration. (Computing the global indices of an element from its local indices, or vice versa, may be quite a tangled computation, but it will be possible.) In the case of cyclic distributions, multiple sections of the array may be mapped to the local processors.
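To make the global/local index relationship concrete, the following sketch works through the two simplest cases, a block distribution and a cyclic distribution of a single axis, and then combines them for a two-dimensional array. It is an illustration under stated assumptions rather than the mapping any particular implementation uses: indices are 0-based (HPF arrays are 1-based by default), the block size is taken to be the ceiling of n/p, and all function names are hypothetical.

```python
def block_to_local(g, n, p):
    """Global index g of an n-element axis, block-distributed over p processors."""
    b = -(-n // p)  # assumed block size: ceiling(n / p)
    return g // b, g % b  # (owning processor, local index)


def block_to_global(owner, local, n, p):
    b = -(-n // p)
    return owner * b + local


def cyclic_to_local(g, p):
    """Global index g of an axis distributed cyclically over p processors."""
    return g % p, g // p  # (owning processor, local index)


def cyclic_to_global(owner, local, p):
    return local * p + owner


def to_local_2d(g, shape, grid):
    """Axis 0 block-distributed, axis 1 cyclic, each on its own grid axis."""
    o0, l0 = block_to_local(g[0], shape[0], grid[0])
    o1, l1 = cyclic_to_local(g[1], grid[1])
    return (o0, o1), (l0, l1)  # (processor coordinates, local indices)


if __name__ == "__main__":
    print(block_to_local(7, 10, 4))
    print(cyclic_to_local(7, 4))
    print(to_local_2d((7, 5), (10, 6), (4, 2)))
```

Because each array axis maps to at most one processor axis, the two-dimensional case reduces to applying a one-dimensional rule per axis, which is why the local elements always form a rectangular arrangement.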
It is recommended that if, in any given implementation, an extrinsic type does not obey the conventions described in this section, then its model name or keyword should not contain the word LOCAL.