Coprocessor management troubleshooting grid

From the 100ask (百问网) Embedded Linux wiki
Revision by Zhouyuebiao as of 11:20, Thursday 7 May 2020 (page created)

Some typical issues related to the management of a coprocessor are listed below. Solutions or debugging methods are proposed for these issues.

If your issue is not listed, look through the articles in the Coprocessor management Linux, Coprocessor management STM32Cube, or troubleshooting grids categories.

Coprocessor firmware loading and control

Symptom Resolution

The coprocessor traces are not available on the Linux® side:

Board $> cat /sys/kernel/debug/remoteproc/remoteproc0/trace0
No such file or directory

This may happen for two reasons:

  • The firmware does not include a resource table, or the resource table does not define a trace entry.
Update the firmware resource table and rebuild the firmware.
  • The ".resource_table" section is empty or not defined in the ELF file.
Use the following command to check the generated ELF file:
 PC $> readelf -l <elf file>
 ...
 02     .data .resource_table .bss ._user_heap_stack
 ....
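
As an illustration of the first case, the sketch below shows what a minimal resource table with a single trace entry could look like on the firmware side. It is not the STM32Cube implementation: the structure layouts mirror the Linux remoteproc resource table format but are redefined here so the example stands alone, and the names demo_rsc_table, demo_trace_buf, demo_resource_table_init, the "cm4_log" entry name, and the 2048-byte buffer size are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSC_TRACE      2u     /* trace entry type in the Linux remoteproc numbering */
#define TRACE_BUF_SIZE 2048u  /* illustrative size, not taken from the article */

/* Table header: version, number of entries, reserved words. */
struct rsc_table_hdr {
    uint32_t ver;
    uint32_t num;
    uint32_t reserved[2];
};

/* Trace entry: tells Linux where the firmware log buffer lives. */
struct rsc_trace {
    uint32_t type;
    uint32_t da;        /* device address of the trace buffer */
    uint32_t len;       /* buffer length in bytes */
    uint32_t reserved;
    uint8_t  name[32];
};

static_assert(sizeof(struct rsc_trace) == 48, "unexpected padding in trace entry");

/* Hypothetical table holding exactly one entry. */
struct demo_resource_table {
    struct rsc_table_hdr hdr;
    uint32_t offset[1];         /* byte offset of each entry from the table start */
    struct rsc_trace trace;
};

/* Buffer the firmware would write its log text into. */
char demo_trace_buf[TRACE_BUF_SIZE];

/* Place the table in the ".resource_table" section when building an ELF file. */
#if defined(__ELF__)
#define RSC_SECTION __attribute__((section(".resource_table"), used))
#else
#define RSC_SECTION
#endif

struct demo_resource_table demo_rsc_table RSC_SECTION = {
    .hdr    = { .ver = 1, .num = 1 },
    .offset = { offsetof(struct demo_resource_table, trace) },
    .trace  = { .type = RSC_TRACE, .len = TRACE_BUF_SIZE, .name = "cm4_log" },
};

/* The buffer address is filled in at run time, before Linux parses the table. */
void demo_resource_table_init(void)
{
    demo_rsc_table.trace.da = (uint32_t)(uintptr_t)demo_trace_buf;
}
```

With a table of this kind linked into the firmware, the readelf output should list ".resource_table" in one of the loadable segments, and the section should be non-empty. The linker script must also keep the section; how that is done depends on the project.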

When starting the coprocessor from the bootloader (U-Boot):

"unsupported fw ver: 0
Remote Processor 0 resource table Not found : 0x00000000-0x0 "

The firmware does not include a resource table. This is only a warning and does not prevent the firmware from starting properly. The "rproc load_rsc" step can be skipped.

Inter processor communication

Symptom Resolution

The firmware is frozen as a consequence of a deadlock in OpenAMP during inter-processor communication with the main processor.

This issue is probably caused by the rpmsg_virtio_rx_callback or rpmsg_virtio_send_offchannel_raw functions (rpmsg_virtio.c) being called in interrupt context. These functions take a mutex in the rpmsg_device struct when accessing the virtio queue index, so calling them from an interrupt handler can deadlock if the interrupted code already holds that mutex. Rework your code so that these functions are not called in interrupt context.
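
One common way to keep such calls out of interrupt context is to have the interrupt handler only record that an event occurred, and to do the actual RPMsg processing from the main loop. The sketch below illustrates that pattern under stated assumptions: demo_mailbox_isr, demo_main_loop_step, and demo_process_rpmsg are hypothetical names, and demo_process_rpmsg is a stand-in for whatever OpenAMP polling call the firmware really uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by the interrupt handler, cleared by the main loop. */
static volatile bool mbox_event_pending;
/* True only while the (simulated) interrupt handler is running. */
static volatile bool in_isr;
static unsigned int processed_events;

/* Hypothetical stand-in for the OpenAMP call that polls the virtqueues.
 * It may take the rpmsg_device mutex, so it must never run in an ISR. */
static void demo_process_rpmsg(void)
{
    assert(!in_isr);
    processed_events++;
}

/* Mailbox interrupt handler: record the event and return immediately. */
void demo_mailbox_isr(void)
{
    in_isr = true;
    mbox_event_pending = true;
    in_isr = false;
}

/* One main-loop iteration. Returns true if pending messages were handled. */
bool demo_main_loop_step(void)
{
    if (!mbox_event_pending) {
        return false;
    }
    /* Clear the flag before processing so that an event arriving
     * during processing is picked up by the next iteration. */
    mbox_event_pending = false;
    demo_process_rpmsg();
    return true;
}

unsigned int demo_processed_events(void)
{
    return processed_events;
}
```

Because the flag coalesces events, two interrupts that arrive before the main loop runs lead to a single processing pass, which is fine as long as that pass drains every buffered message.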

Linux kernel trace:

stm32-ipcc 4c001000.mailbox: Try increasing MBOX_TX_QUEUE_LEN

On each IPCC interrupt, the coprocessor processes all the buffered RPMsgs (one IPCC event can cover several RPMsgs). On the Linux side, one IPCC signal is raised for each RPMsg sent. This can result in an overflow warning on Linux because too many IPCC events are queued. No RPMsgs are dropped, but the message can indicate that the coprocessor is reaching the limit of its capacity to process received messages in time.
Consider reworking the code so that the coprocessor processes messages more efficiently, or decreasing the rate of messages sent by Linux.

Linux kernel trace (example with the rpmsg_tty driver):

 rpmsg_tty virtio0.rpmsg-tty-channel.-1.0: timeout waiting for a tx buffer

This message means that no TX buffer is available to transmit messages to the remote processor. This may happen for two reasons:

  • The firmware implementation is incomplete and the coprocessor does not process any IPCC interrupt (e.g. the interrupt is disabled or the interrupt handler is not defined). Fix the firmware code: refer to IPCC_internal_peripheral for details on the peripheral.
  • The coprocessor is busy, frozen, or has crashed. This can be confirmed by analyzing the coprocessor with a debug tool.

Linux kernel trace:

rpmsg_tty virtio0.rpmsg-tty-channel.-1.0: No memory for tty_prepare_flip_string

There is no more space in the TTY temporary buffer on reception. The root cause is probably that the Linux application did not read the TTY device in time to process the incoming TTY stream. Consider reworking the Linux application so that it takes less time to process messages, or decreasing the message exchange rate.
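
One possible way to keep up with the incoming stream is to read the device in non-blocking mode and drain everything that is available each time it becomes readable, leaving slow processing to a separate step or thread. The sketch below shows this with standard POSIX calls. The helper names demo_set_nonblocking and demo_drain_fd and the 512-byte chunk size are hypothetical, and the descriptor is assumed to come from opening the RPMsg TTY device (typically something like /dev/ttyRPMSG0, which may differ on your system).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Callback invoked for each chunk read; it should return quickly. */
typedef void (*demo_chunk_cb)(const char *data, size_t len, void *ctx);

/* Switch a descriptor to non-blocking mode. Returns 0 on success, -1 on error. */
int demo_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Wait up to timeout_ms for data, then read until the descriptor is empty.
 * Returns the number of bytes read, 0 on timeout, or -1 on error.
 * The descriptor must already be in non-blocking mode. */
long demo_drain_fd(int fd, int timeout_ms, demo_chunk_cb cb, void *ctx)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[512];
    long total = 0;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) {
        return -1;
    }
    if (ready == 0) {
        return 0;                       /* nothing arrived in time */
    }

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            if (cb != NULL) {
                cb(buf, (size_t)n, ctx);
            }
            total += n;
            continue;
        }
        if (n == 0) {
            break;                      /* end of file */
        }
        if (errno == EINTR) {
            continue;                   /* interrupted by a signal, retry */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                      /* descriptor drained */
        }
        return -1;
    }
    return total;
}
```

An application would call demo_set_nonblocking once after opening the device and then call demo_drain_fd in its receive loop, handing each chunk to a queue instead of processing it inline.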

Linux kernel trace:

remoteproc remoteproc0: stm32_rproc_kick: failed (<mbx>, <error value>)

The Linux remoteproc driver cannot use the IPCC mailbox. This may happen for two reasons:

  • The Linux kernel is built without support for the STM32 IPCC mailbox: to enable it, refer to IPCC configuration.
  • The mailboxes are missing from, or incorrectly defined in, the Linux kernel device tree: fix this as described in mailbox DeviceTree.
