Kalray MPPA


Synopsis

Kalray-1 (K1) is the name of the Instruction Set Architecture (ISA) of Kalray MPPA processors. The Kalray-1 core implements a 32-bit 5-issue Very Long Instruction Word (VLIW) architecture with a 7-stage instruction pipeline. MPPA processors are manycore processors.

The Kalray MPPA®-256 processor (Multi-Purpose Processing Array) integrates 256 processing engine (PE) cores and 32 resource management (RM) cores on a single chip. These cores are distributed across 16 computing clusters and 4 I/O subsystems. Each I/O subsystem contains 4 RM cores, and each computing cluster contains 1 RM core and 16 PE cores.
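The core counts quoted above follow directly from the topology. As a quick sanity check, the arithmetic can be written out in C. The constant and function names below are illustrative only; they are not part of the Kalray SDK or of ERIKA.

```c
/* Illustrative constants (hypothetical names) describing the MPPA-256 topology. */
enum {
    NUM_CLUSTERS        = 16, /* computing clusters */
    PE_PER_CLUSTER      = 16, /* processing engine cores per computing cluster */
    RM_PER_CLUSTER      = 1,  /* resource management cores per computing cluster */
    NUM_IO_SUBSYSTEMS   = 4,  /* I/O subsystems */
    RM_PER_IO_SUBSYSTEM = 4   /* resource management cores per I/O subsystem */
};

/* Total PE cores on the chip: only the computing clusters contain PE cores. */
int total_pe_cores(void)
{
    return NUM_CLUSTERS * PE_PER_CLUSTER;
}

/* Total RM cores: one per computing cluster plus four per I/O subsystem. */
int total_rm_cores(void)
{
    return NUM_CLUSTERS * RM_PER_CLUSTER
         + NUM_IO_SUBSYSTEMS * RM_PER_IO_SUBSYSTEM;
}
```

With these values, total_pe_cores() yields 256 and total_rm_cores() yields 32, matching the figures in the text.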

ERIKA Enterprise v3 has been ported only to the computing clusters. The chip officially supported by ERIKA v3 is the second generation (called Bostan). In this version, the cluster's RM core is not available to users; it is managed directly by the Kalray SDK through the mOS layer (hypervisor), leaving a "virtual cluster" of 16 PE cores for users.

The Kalray port of ERIKA is part of the European project P-SOCRATES, where ERIKA was supposed to be the back-end of a lightweight OpenMP runtime.

The project's results are available as an open-source project called Upscale SDK.

A parallel framework like OpenMP has requirements that do not fit well with an AMP OS such as an AUTOSAR OS. We therefore added a library to ERIKA that provides services to the OpenMP runtime; we call these services jobs. Although the library is present and supported, it is not yet fully integrated with the rest of ERIKA.

Currently, ERIKA still lacks complete multicore AUTOSAR OS support for this platform (the first release of ERIKA v3 is a single-core OSEK/VDX OS, ready to be extended with multicore features). Once multicore support is complete, we will try to harmonize the jobs library with it.

Configuration and Programming

ERIKA Enterprise is configured through RT-Druid and an OIL file. ERIKA's support for Kalray is built on top of the Kalray Access Core SDK and inherits its environment configuration.

CPU

CPU_DATA must be set to KALRAY_K1.

Example of a CPU_DATA section:

 CPU_DATA = KALRAY_K1 {
   ...
 };

Interrupt Handling

ERIKA lets you install a handler for any IRQ source (physical or virtual) provided by the Kalray-K1 mOS, using the same name that the SDK uses.

 ISR TimerISR {
   CATEGORY = 2;
   SOURCE   = "BSP_IT_TIMER_0";
   ...
 };

OSEK/VDX Extensions

This section describes the OSEK/VDX extensions (optional features) that have been implemented for the Kalray K1.

System Timer

The System Timer can be configured to use one of the cluster's two timers (BSP_IT_TIMER_0 or BSP_IT_TIMER_1).

 COUNTER SystemTimer {
   MINCYCLE = 1;
   MAXALLOWEDVALUE = 2147483647;
   TICKSPERBASE = 1;
   TYPE = HARDWARE {
     DEVICE = "BSP_IT_TIMER_0";
     SYSTEM_TIMER = TRUE;
   };
   SECONDSPERTICK = 0.001;
 };
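To make the numbers in this COUNTER concrete, here is a minimal C sketch of how application code might reason about ticks. It assumes the usual OSEK/VDX counter semantics (the counter runs from 0 up to MAXALLOWEDVALUE and then restarts at 0) and takes SECONDSPERTICK = 0.001 to mean that one tick lasts one millisecond. The macro and function names are hypothetical helpers, not ERIKA APIs.

```c
#include <stdint.h>

/* Values copied from the COUNTER SystemTimer example above. */
#define SYSTEM_TIMER_MAXALLOWEDVALUE 2147483647u
#define SYSTEM_TIMER_NS_PER_TICK     1000000u    /* SECONDSPERTICK = 0.001 s */

/* Convert a duration in milliseconds to counter ticks (hypothetical helper). */
uint32_t ms_to_ticks(uint32_t ms)
{
    /* Go through nanoseconds in 64 bits so the product cannot overflow. */
    uint64_t ns = (uint64_t)ms * 1000000u;
    return (uint32_t)(ns / SYSTEM_TIMER_NS_PER_TICK);
}

/*
 * Ticks elapsed between two counter readings, assuming the counter wraps
 * from MAXALLOWEDVALUE back to 0 (hypothetical helper).
 */
uint32_t elapsed_ticks(uint32_t start, uint32_t now)
{
    if (now >= start) {
        return now - start;
    }
    /* Ticks left until the wrap, one tick for the wrap itself, then 'now'. */
    return (SYSTEM_TIMER_MAXALLOWEDVALUE - start) + now + 1u;
}
```

Under these assumptions a 250 ms period corresponds to 250 ticks, and the counter wraps after roughly 24.8 days of uptime.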

Jobs library

Although not yet harmonized, the jobs library is already available. The scheduling model is fully preemptive on PE0 (Master-Core) and limited-preemptive on PE[1:15]. This is what the upscale-sdk needed, and what we were able to achieve given some issues in the mOS layer of the AccessCore version we worked with.

To access this feature, enable ERIKA's dynamic behavior (TASK creation at runtime from an object pool) with USEDYNAMICAPI = TRUE, enable the API extension with USEEXTENSIONAPI = TRUE, and configure all 16 cores of the cluster.

Inside the USEDYNAMICAPI field, in addition to ERIKA's normal dynamic support options, Kalray-K1 lets you configure the size of the jobs pool (MAX_NUM_JOB).

Inside USEEXTENSIONAPI, you can configure the SCHEDULER behaviour:

  • PARTITIONED is the normal AUTOSAR OS scheduler: each TASK is pinned to one core, and each core handles its own ready queue of TASKs.
  • GLOBAL is a global, work-conserving, priority-based ready queue shared by all cores, where a preempted TASK can migrate to a core that was previously idle.
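The difference between the two policies can be illustrated with a small C sketch. This is a simplified model written for this page, not ERIKA's scheduler: the task descriptor, the linear scan, and the function names are all assumptions made for illustration.

```c
#include <stddef.h>

/* Minimal task descriptor for the sketch (not ERIKA's internal layout). */
struct task {
    int priority;   /* higher value means higher priority */
    int home_core;  /* core the task is pinned to; used only by PARTITIONED */
    int ready;      /* non-zero while the task is waiting to run */
};

/* PARTITIONED: a core considers only the ready tasks pinned to it. */
int pick_partitioned(const struct task *tasks, size_t n, int core)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].ready && tasks[i].home_core == core &&
            (best < 0 || tasks[i].priority > tasks[best].priority)) {
            best = (int)i;
        }
    }
    return best; /* -1 means this core stays idle */
}

/* GLOBAL: any core takes the highest-priority ready task, wherever it last ran. */
int pick_global(const struct task *tasks, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].ready &&
            (best < 0 || tasks[i].priority > tasks[best].priority)) {
            best = (int)i;
        }
    }
    return best; /* -1 means no task is ready anywhere */
}
```

In this model, if two ready tasks are both pinned to core 0, core 1 stays idle under PARTITIONED (pick_partitioned returns -1 for it), whereas under GLOBAL an idle core can pick up the lower-priority one. That is the work-conserving property described above.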

N.B. The conformance classes that support the jobs library are ECC1 or BCC1 with MULTI_STACK enabled. Both ready queue algorithms are supported: the priority queue (O(n), with 127 priorities) and the priority multi-queue (O(1), with 31 priorities).
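For readers unfamiliar with the multi-queue idea, the following C sketch shows the general technique: one queue per priority level plus a bitmap of non-empty levels, so that finding the highest ready priority does not depend on the number of ready tasks. It is not ERIKA's implementation; the structure, the per-level counters (standing in for real per-level FIFOs), and the function names are assumptions, and the 31-level limit is simply taken from the note above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIORITIES 31 /* limit quoted above for the priority multi-queue */

/* Hypothetical ready structure: a bitmap plus one counter per priority level. */
struct ready_multiqueue {
    uint32_t nonempty;              /* bit p set: level p has ready tasks */
    int      count[NUM_PRIORITIES]; /* ready tasks per level (stand-in for FIFOs) */
};

/* Mark one more task as ready at the given priority level. */
void mq_insert(struct ready_multiqueue *q, int prio)
{
    assert(prio >= 0 && prio < NUM_PRIORITIES);
    q->count[prio]++;
    q->nonempty |= (uint32_t)1 << prio;
}

/* Remove one ready task from the given level; clear its bit when it empties. */
void mq_remove(struct ready_multiqueue *q, int prio)
{
    assert(prio >= 0 && prio < NUM_PRIORITIES);
    assert(q->count[prio] > 0);
    if (--q->count[prio] == 0) {
        q->nonempty &= ~((uint32_t)1 << prio);
    }
}

/* Highest non-empty priority level, or -1 if nothing is ready. */
int mq_highest(const struct ready_multiqueue *q)
{
    uint32_t mask = q->nonempty;
    int level = -1;
    /* At most NUM_PRIORITIES iterations, independent of the task count. */
    while (mask != 0u) {
        mask >>= 1;
        level++;
    }
    return level;
}
```

The shift loop is bounded by the number of levels, which is what makes the lookup constant-time with respect to the number of tasks; a production implementation would typically replace it with a count-leading-zeros instruction.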

A configuration example follows:

   CPU_DATA = KALRAY_K1 {
     ID = 0;
     MULTI_STACK = TRUE;
   };
   
   CPU_DATA = KALRAY_K1 {
     ID = 1;
   };
   
   ....
   
   CPU_DATA = KALRAY_K1 {
     ID = 15;
   };
   
   USERESSCHEDULER = FALSE; /* ResScheduler support must be explicitly disabled in a multicore configuration */
   
   USEDYNAMICAPI = TRUE {
     TASK_ARRAY_SIZE     = 128;    /* 16 * 8 */
     SN_ARRAY_SIZE       = 128;    /* 16 * 8 */
     STACKS_MEMORY_SIZE  = 131072; /* 8192 * 16 */
     MAX_NUM_JOB         = 2;
   };
   
   USEEXTENSIONAPI = TRUE {
     /* SCHEDULER = PARTITIONED; Uncomment this and comment the following, to switch back to the normal multicore partitioned scheduler */
     SCHEDULER = GLOBAL;
   };
   
   KERNEL_TYPE = OSEK {
     CLASS = ECC1;
     /* RQ = MQ; Uncomment this to use the priority multi-queue */
   };
 };
 
 /* Workaround: fake TASK, just to have a place to declare APP_SRC */
 TASK Fake_Task {
   CPU_ID   = 0;
   APP_SRC  = "test.c";
   PRIORITY = 1; /* Always Needed */
 };
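The pool sizes in the USEDYNAMICAPI block are given as products in the comments (16 * 8 and 8192 * 16). The C sketch below reproduces that arithmetic so the values can be recomputed for a different budget. The meaning attached to each factor (16 as the number of cores in the cluster, 8 as entries per core, 8192 as a stack budget in bytes per core) is an assumption made here, not something the configuration states, and all names are hypothetical.

```c
/* Factors taken from the comments in the OIL example; their meaning is assumed. */
enum {
    CORES_IN_CLUSTER     = 16,   /* assumed: PE cores in the virtual cluster */
    ENTRIES_PER_CORE     = 8,    /* assumed: TASK / SN slots budgeted per core */
    STACK_BYTES_PER_CORE = 8192  /* assumed: dynamic stack budget per core */
};

/* Value used for TASK_ARRAY_SIZE and SN_ARRAY_SIZE in the example (16 * 8). */
int task_array_size(void)
{
    return CORES_IN_CLUSTER * ENTRIES_PER_CORE;
}

/* Value used for STACKS_MEMORY_SIZE in the example (8192 * 16). */
int stacks_memory_size(void)
{
    return STACK_BYTES_PER_CORE * CORES_IN_CLUSTER;
}
```

With the factors above, the helpers return 128 and 131072, the values that appear in the configuration.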