Multicore Debugging

Debugger for all cores of a multicore chip
Debugging of application cores, DSPs, accelerator cores and special-purpose cores
Debugging of more than 80 core architectures
Support for every multicore topology
Support for all multicore operation modes
Support for AMP and SMP systems
Single debug hardware can be licensed for all cores of a multicore chip


The debugging of system-on-chip designs containing several cores places new demands on development tools. Ideally chip designers and tool producers should agree quickly on suitable techniques that guarantee the user rapid availability of high-quality development tools with free selection of the core combination.

Application-specific ASICs are used to an increasing extent today for embedded designs. At the same time it is becoming more and more common for several cores to be integrated to form a system-on-chip (SoC) in order to achieve optimum functionality and performance. The use of a RISC processor combined with a DSP is widespread, although other combinations are also employed.

The testing and the integration phase of multicore SoC designs of this type place new demands on microprocessor development tools. This article focuses on debugging by addressing several cores via a shared debug interface.  

Addressing several cores via one debug interface

TRACE32 Debugger systems support parallel and cascaded JTAG interfaces. More than one debugger can be connected to the same JTAG interface.

  • Separate debuggers and separate JTAG connectors
  • Separate debuggers and common JTAG connector, cores selected by special line
  • Separate debuggers and common JTAG connector, cores cascaded
  • Common debuggers and common JTAG connector, cores selected by special line
  • Common debuggers and common JTAG connector, cores cascaded
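
For the cascaded configurations, a debugger that addresses one core has to account for the other cores sitting in the same scan chain. The following is a minimal sketch of that bookkeeping, assuming the standard JTAG behavior that a core held in BYPASS contributes one bit to a data-register scan. The `Tap` class, the core names and the instruction-register lengths are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tap:
    """One TAP controller in the chain (names and IR lengths are hypothetical)."""
    name: str
    ir_length: int  # width of the instruction register in bits


def bypass_padding(chain, target):
    """Return padding bit counts for addressing `target` in a cascaded chain.

    `chain` is ordered from the debugger's TDI side to its TDO side.
    All TAPs other than the target are assumed to be held in BYPASS,
    where each contributes one bit to a data-register scan.
    """
    names = [tap.name for tap in chain]
    index = names.index(target)  # raises ValueError for an unknown core
    before = chain[:index]       # TAPs between TDI and the target
    after = chain[index + 1:]    # TAPs between the target and TDO
    return {
        "ir_bits_tdi_side": sum(tap.ir_length for tap in before),
        "ir_bits_tdo_side": sum(tap.ir_length for tap in after),
        "dr_bits_tdi_side": len(before),
        "dr_bits_tdo_side": len(after),
    }


chain = [Tap("risc", 4), Tap("dsp", 8), Tap("accelerator", 5)]
padding = bypass_padding(chain, "dsp")
print(padding)
```

In this example the debugger for the DSP would have to pad instruction scans by 4 bits on one side and 5 bits on the other, and data scans by one bit on each side.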


Advantages and Disadvantages

Most processors today provide on-chip debug logic, and the interface most frequently used to control it is JTAG. In multicore SoC designs, however, a separate JTAG interface is not provided for every core, in order to save pins and so cut costs. Instead, a joint JTAG interface has to suffice for debugging all cores. This raises the question of how the debugger can address several cores via the joint JTAG interface and how synchronous debugging can be implemented:

One debugger supporting all cores on the SoC

At first glance this would appear to be the ideal solution for the developer. The chief advantages of this solution are as follows: one development tool for all cores and hence the need for familiarization with only one user interface and product philosophy. This is certainly the best approach to adopt for SoCs that are produced in large numbers of units and are used for a large number of developments. An example of this type of solution is the TriCore Debugger from Lauterbach that allows debugging of the main core as well as the peripheral control processor (PCP).

By contrast, there is also a counter-trend towards individualized SoCs that are used for the development of only one product. A new SoC that achieves even more optimized performance and functionality is then created for the next product. If one also wants full freedom in the selection of the integrated cores, one needs a tool producer who can provide an optimum product for every architecture that is employed. Naturally, the debugger should also be available fully tested by the time the first chip is produced from the wafer. Although this would obviously be the best solution, it is unlikely to be implemented cheaply and, above all, quickly while maintaining a high standard of quality.

Using a JTAG server

The debuggers use the software interface of the JTAG server instead of having their own JTAG hardware. Highly developed debuggers are already available today for the majority of cores. We might therefore ask: why not connect the existing debuggers to a JTAG server that takes care of addressing the individual cores via the joint JTAG interface?

One can imagine a solution of this kind being implemented along the following lines. Server hardware is connected in front of the joint JTAG interface and is controlled by server software on the host. Each debug client sends its communication request to the server by means of a remote procedure call. The server guarantees exclusive access to the joint JTAG interface and forwards the communication request to the individual core. Figure 1 shows the basic principle of a server solution of this type.
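
The server principle can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `JtagServer` class, its `call` method and the `read_pc` command are hypothetical, a blocking method call stands in for the remote procedure call, and no real JTAG hardware is accessed. The point it shows is that a single worker owns the interface, so requests from several debug clients are serialized.

```python
import queue
import threading


class JtagServer:
    """Toy JTAG server: one worker thread owns the shared interface."""

    def __init__(self, cores):
        self._cores = set(cores)
        self._requests = queue.Queue()
        self.log = []  # order in which requests reached the interface
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        # Only this thread touches the (simulated) JTAG port, so each
        # request has exclusive access while it is being processed.
        while True:
            item = self._requests.get()
            if item is None:  # shutdown marker
                break
            core, command, reply = item
            self.log.append((core, command))
            reply.put(f"{core}: {command} done")

    def call(self, core, command):
        """Blocking request from a debug client, standing in for an RPC."""
        if core not in self._cores:
            raise KeyError(f"unknown core {core!r}")
        reply = queue.Queue(maxsize=1)
        self._requests.put((core, command, reply))
        return reply.get()

    def shutdown(self):
        self._requests.put(None)
        self._worker.join()


server = JtagServer(["risc", "dsp"])
results = []
clients = [
    threading.Thread(target=lambda c=c: results.append(server.call(c, "read_pc")))
    for c in ("risc", "dsp")
]
for t in clients:
    t.start()
for t in clients:
    t.join()
server.shutdown()
print(sorted(results))
```

Every request passes through the same queue, which also hints at the throughput concern with this design: all debug traffic for all cores is funneled through one component.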

This approach, although it appears very elegant at first sight, similarly does not offer a complete solution to the problem. A number of semiconductor manufacturers already offer server solutions for core groupings in their product range, but there is no server standard in sight offering a vendor-neutral solution. The JTAG server solution also proves to be rather inflexible when individual semiconductor manufacturers have expanded the JTAG signals of their core in order to offer additional debug functionality. For a multicore development environment, for example, it would be desirable for each core to have a stop request signal making it possible to stop the core immediately. Another possibility would be to indicate that a core has stopped by means of a stop indication signal. Both signals could be used ideally for synchronous stopping of all cores in a multicore application.

With free selection of an optimum core combination, the number of additional signals can grow quickly. One alternative would simply be to ignore these signals and dispense with the additional functionality. However, these signals are often necessary or at least very helpful and should be used for this reason. The JTAG server would therefore still have to be adapted to the specific application and tested again.

Another critical point, especially for real-time applications, is the large volume of data that has to be handled by the JTAG server. This slows the reaction time of the individual core accordingly. To meet the real-time requirements, time-critical functions have to be moved into the hardware, which brings us back to the application-specific server. A server solution therefore works best when all cores used come from the same semiconductor manufacturer and this manufacturer supplies both the JTAG server and the matching debuggers. Application-specific servers, on the other hand, require agreements between the producers of the debuggers used. This usually proves difficult because of the competitive situation and is also time-consuming and cost-intensive. An added factor is that the server has to be adapted again for every new core combination.

Completely independent debuggers at the joint JTAG interface

Exclusive access must be guaranteed if more than one debugger is connected to the same JTAG port.

This approach represents a simple and open solution in which the best possible core combination can be freely selected for the particular application. During development and integration the developer can rely on mature debuggers without the need to adapt the development environment to a specific application.

The basic idea in this case is that an independent debugger is connected to the joint JTAG interface for every core. In order for this approach to function, each debugger must deactivate its JTAG driver when it is not exchanging data with its core.

Since there are now several independent debuggers using the same JTAG interface it is necessary to ensure that only one debugger operates at the interface at any one time. This can be automated through a system whereby the debug tasks on the host define who is given exclusive access to the JTAG interface by using a semaphore system. Another possible alternative would be a kind of hardware semaphore.
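
The host-side semaphore idea can be sketched as follows. This is an illustrative simulation only: the `SharedJtagPort` and `Debugger` classes are hypothetical, threads stand in for the debug tasks, and in a real setup with separate debugger processes an inter-process semaphore would be needed instead. Each debugger enables its driver only while it holds the semaphore and deactivates it again before releasing.

```python
import threading


class SharedJtagPort:
    """Simulated shared JTAG port that records which debugger drives it."""

    def __init__(self):
        self.semaphore = threading.Semaphore(1)  # one owner at a time
        self.active_driver = None
        self.conflicts = 0
        self.transfers = 0


class Debugger:
    def __init__(self, name, port):
        self.name = name
        self.port = port

    def transfer(self, count):
        for _ in range(count):
            with self.port.semaphore:
                # Enable the JTAG driver only while holding the semaphore.
                if self.port.active_driver is not None:
                    self.port.conflicts += 1  # two drivers enabled at once
                self.port.active_driver = self.name
                self.port.transfers += 1
                # Deactivate the driver again before releasing the port.
                self.port.active_driver = None


port = SharedJtagPort()
debuggers = [Debugger("risc_debugger", port), Debugger("dsp_debugger", port)]
threads = [threading.Thread(target=d.transfer, args=(1000,)) for d in debuggers]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(port.transfers, port.conflicts)
```

Because every access is wrapped in the semaphore, the conflict counter stays at zero while both debuggers complete all of their transfers.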

The question of synchronous starting and stopping of all cores can be solved simply and effectively via special logic on the SoC. For synchronous stopping, for example, every core can be equipped with two additional signals:

  • A stop request signal that enables the core to be stopped immediately
  • A stop indication signal that indicates that the core has stopped

These signals could then be combined on the SoC via a matrix. Each core can then define whether it wants to stop or continue running when another core stops, for example by setting a memory-mapped control register. Figure 3 illustrates a matrix of this type. This matrix could also easily be expanded to cover the peripherals of the individual cores. The main advantages of this solution are the very high degree of synchronization when stopping and the fact that the chip designer provides a standard interface for the debugger producers.
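
The behavior of such a matrix can be modeled in a few lines. This is a sketch under assumptions, not a description of any particular chip: the core names and matrix contents are hypothetical, and the model assumes that a core stopped via the matrix raises its own stop indication, so that the stop can propagate further.

```python
def cores_to_stop(stop_matrix, stopped_core):
    """Return all cores that end up stopped once `stopped_core` stops.

    stop_matrix[a][b] is True when core b wants to stop if core a stops.
    The stop propagates: a core stopped via the matrix raises its own
    stop indication, which may stop further cores.
    """
    stopped = {stopped_core}
    pending = [stopped_core]
    while pending:
        source = pending.pop()
        for target, enabled in stop_matrix[source].items():
            if enabled and target not in stopped:
                stopped.add(target)
                pending.append(target)
    return stopped


# Hypothetical configuration: RISC and DSP stop each other,
# the accelerator runs and stops independently.
stop_matrix = {
    "risc": {"risc": False, "dsp": True, "accelerator": False},
    "dsp": {"risc": True, "dsp": False, "accelerator": False},
    "accelerator": {"risc": False, "dsp": False, "accelerator": False},
}
print(sorted(cores_to_stop(stop_matrix, "risc")))         # ['dsp', 'risc']
print(sorted(cores_to_stop(stop_matrix, "accelerator")))  # ['accelerator']
```

In hardware the matrix entries would correspond to the control-register bits mentioned above, and the propagation would happen in logic rather than in a software loop.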

Extra on-chip logic allows cross connection of the start/stop request and start/stop indication signals of the cores to provide synchronous start and stop.

This solution offers a relatively simple procedure for chip designers and tool producers. If a few basic preconditions are provided, any cores can be integrated in the SoC and the associated development environment can be selected freely.

The advantages in this case can be summarized as follows:

  • No additional agreements between the tool producers are necessary for this solution approach. High-performance development tools are available immediately.
  • The solution that we have presented does not require any application-specific modifications on the part of the tool producer. Standard tools can be employed which can be reused in other projects.
  • The apparent disadvantage that a complete debug system must be acquired for every core turns out, at second glance, to be more of an advantage: during the development phase usually only one core is tested at a time anyway, and the debuggers are used individually for this. The only time when all debuggers are used together is during the integration phase, when the interactions between the cores are tested.

TRACE32 debuggers already support a wide array of cores. A standard user interface and a common product philosophy therefore already exist for using these debuggers in multicore SoC designs.

As long as multicore debugging remains in its infancy and no general standard is defined we regard this third solution as the most effective. TRACE32 Debuggers are already equipped for operating at a joint JTAG interface.

Copyright © 2016 Lauterbach GmbH, Altlaufstr.40, D-85635 Höhenkirchen-Siegertsbrunn, Germany
The information presented is intended to give overview information only.
Changes and technical enhancements or modifications can be made without notice.
Last generated/modified: 14-Dec-2016