699 lines
33 KiB
ArmAsm
699 lines
33 KiB
ArmAsm
|
/*
|
|||
|
* Copyright 2019-2022 Great Scott Gadgets
|
|||
|
*
|
|||
|
* This file is part of HackRF.
|
|||
|
*
|
|||
|
* This program is free software; you can redistribute it and/or modify
|
|||
|
* it under the terms of the GNU General Public License as published by
|
|||
|
* the Free Software Foundation; either version 2, or (at your option)
|
|||
|
* any later version.
|
|||
|
*
|
|||
|
* This program is distributed in the hope that it will be useful,
|
|||
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|||
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|||
|
* GNU General Public License for more details.
|
|||
|
*
|
|||
|
* You should have received a copy of the GNU General Public License
|
|||
|
* along with this program; see the file COPYING. If not, write to
|
|||
|
* the Free Software Foundation, Inc., 51 Franklin Street,
|
|||
|
* Boston, MA 02110-1301, USA.
|
|||
|
*/
|
|||
|
|
|||
|
/*
|
|||
|
|
|||
|
Introduction
|
|||
|
============
|
|||
|
|
|||
|
This file contains the code that runs on the Cortex-M0 core of the LPC43xx.
|
|||
|
|
|||
|
The M0 core is used to implement all the timing-critical usage of the SGPIO
|
|||
|
peripheral, which interfaces to the MAX5864 ADC/DAC via the CPLD.
|
|||
|
|
|||
|
The M0 reads or writes 32 bytes at a time from the SGPIO registers,
|
|||
|
transferring these bytes to or from a shared USB bulk buffer. The M4 core
|
|||
|
handles transferring data between this buffer and the USB host.
|
|||
|
|
|||
|
The SGPIO peripheral is set up and enabled by the M4 core. All the M0 needs to
|
|||
|
do is handle the SGPIO exchange interrupt, which indicates that new data can
|
|||
|
now be read from or written to the SGPIO shadow registers.
|
|||
|
|
|||
|
To implement the different functions of HackRF, the M0 operates in one of
|
|||
|
five modes, configured by the M4:
|
|||
|
|
|||
|
IDLE: Do nothing.
|
|||
|
WAIT: Do nothing, but increment byte counter for timing purposes.
|
|||
|
RX: Read data from SGPIO and write it to the buffer.
|
|||
|
TX_START: Write zeroes to SGPIO until there is data in the buffer.
|
|||
|
TX_RUN: Read data from the buffer and write it to SGPIO.
|
|||
|
|
|||
|
In all modes except IDLE, the M0 advances a byte counter, which increases by
|
|||
|
32 each time that many bytes are exchanged with the buffer (or skipped over,
|
|||
|
in WAIT mode).
|
|||
|
|
|||
|
As the M4 core produces or consumes these bytes, it advances its own counter.
|
|||
|
The difference between the two counter values therefore indicates the number
|
|||
|
of bytes available.
|
|||
|
|
|||
|
If the M4 does not advance its count in time, a TX underrun or RX overrun
|
|||
|
occurs. Collectively, these events are referred to as shortfalls, and the
|
|||
|
handling is similar for both.
|
|||
|
|
|||
|
In an RX shortfall, data is discarded. In TX mode, zeroes are written to
|
|||
|
SGPIO. When in a shortfall, the byte counter does not advance.
|
|||
|
|
|||
|
The M0 maintains statistics on the the number of shortfalls, and the length of
|
|||
|
the longest shortfall.
|
|||
|
|
|||
|
The M0 can be configured to abort TX or RX and return to IDLE mode, if the
|
|||
|
length of a shortfall exceeds a configured limit.
|
|||
|
|
|||
|
The M0 can also be configured to switch modes automatically when its byte
|
|||
|
counter matches a threshold value. This feature can be used to implement
|
|||
|
timed operations.
|
|||
|
|
|||
|
Timing
|
|||
|
======
|
|||
|
|
|||
|
This code has tight timing constraints.
|
|||
|
|
|||
|
We have to complete a read or write from SGPIO every 163 cycles.
|
|||
|
|
|||
|
The CPU clock is 204MHz. We exchange 32 bytes at a time in the SGPIO
|
|||
|
registers, which is 16 samples worth of IQ data. At the maximum sample rate of
|
|||
|
20MHz, the SGPIO update rate is 20 / 16 = 1.25MHz. So we have 204 / 1.25 =
|
|||
|
163.2 cycles available.
|
|||
|
|
|||
|
Access to the SGPIO peripheral is slow, due to the asynchronous bridge that
|
|||
|
connects it to the AHB bus matrix. Section 20.4.1 of the LPC43xx user manual
|
|||
|
(UM10503) specifies the access latencies as:
|
|||
|
|
|||
|
Read: 4 x MCLK + 4 x CLK_PERIPH_SGPIO
|
|||
|
Write: 4 x MCLK + 2 x CLK_PERIPH_SGPIO
|
|||
|
|
|||
|
In our case both these clocks are at 204MHz so reads add 8 cycles and writes
|
|||
|
add 6. These are latencies that add to the usual M0 instruction timings, so an
|
|||
|
ldr from SGPIO takes 10 cycles, and an str to SGPIO takes 8 cycles.
|
|||
|
|
|||
|
These latencies are assumed to apply to all accesses to the SGPIO peripheral's
|
|||
|
address space, which includes its interrupt control registers as well as the
|
|||
|
shadow registers.
|
|||
|
|
|||
|
There are four key code paths, with the following worst-case timings:
|
|||
|
|
|||
|
RX, normal: 152 cycles
|
|||
|
RX, overrun: 76 cycles
|
|||
|
TX, normal: 140 cycles
|
|||
|
TX, underrun: 145 cycles
|
|||
|
|
|||
|
Design
|
|||
|
======
|
|||
|
|
|||
|
Due to the timing constraints, this code is highly optimised.
|
|||
|
|
|||
|
This is the only code that runs on the M0, so it does not need to follow
|
|||
|
calling conventions, nor use features of the architecture in standard ways.
|
|||
|
|
|||
|
The SGPIO handling does not run as an ISR. It polls the interrupt status.
|
|||
|
This saves the cycle costs of interrupt entry and exit, and allows all
|
|||
|
registers to be used freely.
|
|||
|
|
|||
|
All possible registers, including the stack pointer and link register, can be
|
|||
|
used to store values needed in the code, to minimise memory loads and stores.
|
|||
|
|
|||
|
There are no function calls. There is no stack usage. All values are in
|
|||
|
registers and fixed memory addresses.
|
|||
|
|
|||
|
Structure
|
|||
|
=========
|
|||
|
|
|||
|
Each mode has its own loop routine. TX_START and TX_RUN use a single TX loop.
|
|||
|
|
|||
|
Code shared between different modes is implemented in macros and duplicated
|
|||
|
within each mode's own loop.
|
|||
|
|
|||
|
At startup, the main routine sets up registers and memory, then falls through
|
|||
|
to the idle loop.
|
|||
|
|
|||
|
The idle loop waits for a mode to be set, then jumps to that mode's start
|
|||
|
label.
|
|||
|
|
|||
|
Code following the start label is executed only on a transition from IDLE. It
|
|||
|
is at this point that the buffer statistics are reset.
|
|||
|
|
|||
|
Each mode's start code then falls through to its loop label.
|
|||
|
|
|||
|
The first step in each loop is to wait for an SGPIO interrupt and clear it,
|
|||
|
which is implemented by the await_sgpio macro.
|
|||
|
|
|||
|
Then, the mode setting is loaded from memory. If the M4 has reset the mode to
|
|||
|
idle, control jumps back to the idle loop after handling any cleanup needed.
|
|||
|
|
|||
|
Next, any SGPIO operations are carried out. For RX and TX, this begins with
|
|||
|
calculating the buffer margin, and branching if there is a shortfall. Then
|
|||
|
the pointer within the buffer is updated.
|
|||
|
|
|||
|
SGPIO reads and writes are implemented in 16-byte chunks. The four lowest
|
|||
|
registers, r0-r3, are used to temporarily hold the data for each chunk. Data
|
|||
|
is stored in-order in the buffer, but out-of-order in the SGPIO shadow
|
|||
|
registers, due to the SGPIO architecture. A combination of single and
|
|||
|
multiple load/stores is used to reorder the data in each chunk.
|
|||
|
|
|||
|
After completing SGPIO operations, counters are updated and the threshold
|
|||
|
setting is checked. If the byte count has reached the threshold, the next
|
|||
|
mode is set and a jump is made directly to the corresponding loop label.
|
|||
|
Code at the start label of the new mode is not executed, so stats and
|
|||
|
counters are maintained across a sequence of TX/RX/WAIT operations.
|
|||
|
|
|||
|
When a shortfall occurs, a branch is taken to a separate handler routine,
|
|||
|
which branches back to the mode's normal loop when complete.
|
|||
|
|
|||
|
Most of the code for shortfall handling is common to RX and TX, and is
|
|||
|
implemented in the handle_shortfall macro. This is primarily concerned with
|
|||
|
updating statistics, but also handles switching back to IDLE mode if a
|
|||
|
shortfall exceeds the configured limit.
|
|||
|
|
|||
|
There is a rollback mechanism implemented in the shortfall handling. This is
|
|||
|
necessary because it is common for a harmless shortfall to occur during
|
|||
|
shutdown, which produces misleading statistics. The code detects this case
|
|||
|
when the mode is changed to IDLE whilst a shortfall is ongoing. If this
|
|||
|
happens, statistics are rolled back to their values at the beginning of the
|
|||
|
shortfall.
|
|||
|
|
|||
|
The backup of previous values is implemented in handle_shortfall when a new
|
|||
|
shortfall is detected, and the rollback is implemented by the
|
|||
|
checked_rollback routine. This routine is executed by the TX and RX loops
|
|||
|
before returning to the idle loop.
|
|||
|
|
|||
|
Organisation
|
|||
|
============
|
|||
|
|
|||
|
The rest of this file is organised as follows:
|
|||
|
|
|||
|
- Constant definitions
|
|||
|
- Fixed register allocations
|
|||
|
- Macros
|
|||
|
- Ordering constraints
|
|||
|
- Finally, the actual code!
|
|||
|
|
|||
|
*/
|
|||
|
|
|||
|
// Constants that point to registers we'll need to modify in the SGPIO block.
|
|||
|
.equ SGPIO_SHADOW_REGISTERS_BASE, 0x40101100
|
|||
|
.equ SGPIO_EXCHANGE_INTERRUPT_BASE, 0x40101F00
|
|||
|
|
|||
|
// Offsets into the interrupt control registers.
|
|||
|
.equ INT_CLEAR, 0x30
|
|||
|
.equ INT_STATUS, 0x2C
|
|||
|
|
|||
|
// Buffer that we're funneling data to/from.
|
|||
|
.equ TARGET_DATA_BUFFER, 0x20008000
|
|||
|
.equ TARGET_BUFFER_SIZE, 0x8000
|
|||
|
.equ TARGET_BUFFER_MASK, 0x7fff
|
|||
|
|
|||
|
// Base address of the state structure.
|
|||
|
.equ STATE_BASE, 0x20007000
|
|||
|
|
|||
|
// Offsets into the state structure.
|
|||
|
.equ REQUESTED_MODE, 0x00
|
|||
|
.equ ACTIVE_MODE, 0x04
|
|||
|
.equ M0_COUNT, 0x08
|
|||
|
.equ M4_COUNT, 0x0C
|
|||
|
.equ NUM_SHORTFALLS, 0x10
|
|||
|
.equ LONGEST_SHORTFALL, 0x14
|
|||
|
.equ SHORTFALL_LIMIT, 0x18
|
|||
|
.equ THRESHOLD, 0x1C
|
|||
|
.equ NEXT_MODE, 0x20
|
|||
|
.equ ERROR, 0x24
|
|||
|
|
|||
|
// Private variables stored after state.
|
|||
|
.equ PREV_LONGEST_SHORTFALL, 0x28
|
|||
|
|
|||
|
// Operating modes.
|
|||
|
.equ MODE_IDLE, 0
|
|||
|
.equ MODE_WAIT, 1
|
|||
|
.equ MODE_RX, 2
|
|||
|
.equ MODE_TX_START, 3
|
|||
|
.equ MODE_TX_RUN, 4
|
|||
|
|
|||
|
// Error codes.
|
|||
|
.equ ERROR_NONE, 0
|
|||
|
.equ ERROR_RX_TIMEOUT, 1
|
|||
|
.equ ERROR_TX_TIMEOUT, 2
|
|||
|
|
|||
|
// Our slice chain is set up as follows (ascending data age; arrows are reversed for flow):
|
|||
|
// L -> F -> K -> C -> J -> E -> I -> A
|
|||
|
// Which has equivalent shadow register offsets:
|
|||
|
// 44 -> 20 -> 40 -> 8 -> 36 -> 16 -> 32 -> 0
|
|||
|
.equ SLICE0, 44
|
|||
|
.equ SLICE1, 20
|
|||
|
.equ SLICE2, 40
|
|||
|
.equ SLICE3, 8
|
|||
|
.equ SLICE4, 36
|
|||
|
.equ SLICE5, 16
|
|||
|
.equ SLICE6, 32
|
|||
|
.equ SLICE7, 0
|
|||
|
|
|||
|
/* Allocations of single-use registers */
|
|||
|
|
|||
|
buf_size_minus_32 .req r14
|
|||
|
state .req r13
|
|||
|
buf_base .req r12
|
|||
|
buf_mask .req r11
|
|||
|
shortfall_length .req r10
|
|||
|
hi_zero .req r9
|
|||
|
sgpio_data .req r7
|
|||
|
sgpio_int .req r6
|
|||
|
count .req r5
|
|||
|
buf_ptr .req r4
|
|||
|
|
|||
|
/* Macros */
|
|||
|
|
|||
|
.macro await_sgpio name
|
|||
|
// Wait for, then clear, SGPIO exchange interrupt flag.
|
|||
|
//
|
|||
|
// Clobbers:
|
|||
|
int_status .req r0
|
|||
|
scratch .req r1
|
|||
|
|
|||
|
// The worst case timing is assumed to occur when reading the interrupt
|
|||
|
// status register *just* misses the flag being set - so we include the
|
|||
|
// cycles required to check it a second time.
|
|||
|
//
|
|||
|
// We also assume that we can spend a full 10 cycles doing an ldr from
|
|||
|
// SGPIO the first time (2 for ldr, plus 8 for SGPIO-AHB bus latency),
|
|||
|
// and still miss a flag that was set at the start of those 10 cycles.
|
|||
|
//
|
|||
|
// This latter asssumption is probably slightly pessimistic, since the
|
|||
|
// sampling of the flag on the SGPIO side must occur some time after
|
|||
|
// the ldr instruction begins executing on the M0. However, we avoid
|
|||
|
// relying on any assumptions about the timing details of a read over
|
|||
|
// the SGPIO to AHB bridge.
|
|||
|
|
|||
|
\name\()_int_wait:
|
|||
|
// Spin on the exchange interrupt status, shifting the slice A flag to the carry flag.
|
|||
|
ldr int_status, [sgpio_int, #INT_STATUS] // int_status = SGPIO_STATUS_1 // 10, twice
|
|||
|
lsr scratch, int_status, #1 // scratch = int_status >> 1 // 1, twice
|
|||
|
bcc \name\()_int_wait // if !carry: goto int_wait // 3, then 1
|
|||
|
|
|||
|
// Clear the interrupt pending bits that were set.
|
|||
|
str int_status, [sgpio_int, #INT_CLEAR] // SGPIO_CLR_STATUS_1 = int_status // 8
|
|||
|
.endm
|
|||
|
|
|||
|
.macro on_request label
|
|||
|
// Check if a new mode change request was made, and if so jump to the given label.
|
|||
|
mode .req r3
|
|||
|
flag .req r2
|
|||
|
ldr mode, [state, #REQUESTED_MODE] // mode = state.requested_mode // 2
|
|||
|
lsr flag, mode, #16 // flag = mode >> 16 // 1
|
|||
|
bne \label // if flag != 0: goto label // 1 thru, 3 taken
|
|||
|
.endm
|
|||
|
|
|||
|
.macro update_buf_ptr
|
|||
|
// Update the address of the buffer segment we want to write to / read from.
|
|||
|
mov buf_ptr, buf_mask // buf_ptr = buf_mask // 1
|
|||
|
and buf_ptr, count // buf_ptr &= count // 1
|
|||
|
add buf_ptr, buf_base // buf_ptr += buf_base // 1
|
|||
|
.endm
|
|||
|
|
|||
|
.macro update_counts
|
|||
|
// Update counts after successful SGPIO operation.
|
|||
|
|
|||
|
// Update the byte count and store the new value.
|
|||
|
add count, #32 // count += 32 // 1
|
|||
|
str count, [state, #M0_COUNT] // state.m0_count = count // 2
|
|||
|
|
|||
|
// We didn't have a shortfall, so the current shortfall length is zero.
|
|||
|
mov shortfall_length, hi_zero // shortfall_length = hi_zero // 1
|
|||
|
.endm
|
|||
|
|
|||
|
.macro jump_next_mode name
|
|||
|
// Jump to next mode if the byte count threshold has been reached.
|
|||
|
//
|
|||
|
// Clobbers:
|
|||
|
threshold .req r0
|
|||
|
new_mode .req r1
|
|||
|
|
|||
|
// Check count against threshold. If not a match, return to start of current loop.
|
|||
|
ldr threshold, [state, #THRESHOLD] // threshold = state.threshold // 2
|
|||
|
cmp count, threshold // if count != threshold: // 1
|
|||
|
bne \name\()_loop // goto loop // 1 thru, 3 taken
|
|||
|
|
|||
|
// Otherwise, load and set new mode.
|
|||
|
ldr new_mode, [state, #NEXT_MODE] // new_mode = state.next_mode // 2
|
|||
|
str new_mode, [state, #ACTIVE_MODE] // state.active_mode = new_mode // 2
|
|||
|
|
|||
|
// Branch according to new mode.
|
|||
|
cmp new_mode, #MODE_RX // if new_mode == RX: // 1
|
|||
|
beq rx_loop // goto rx_loop // 1 thru, 3 taken
|
|||
|
bgt tx_loop // elif new_mode > RX: goto tx_loop // 1 thru, 3 taken
|
|||
|
cmp new_mode, #MODE_WAIT // if new_mode == WAIT: // 1
|
|||
|
beq wait_loop // goto wait_loop // 1 thru, 3 taken
|
|||
|
b idle // goto idle // 3
|
|||
|
.endm
|
|||
|
|
|||
|
.macro handle_shortfall name
|
|||
|
// Handle a shortfall.
|
|||
|
//
|
|||
|
// Clobbers:
|
|||
|
length .req r0
|
|||
|
num .req r1
|
|||
|
prev .req r1
|
|||
|
longest .req r1
|
|||
|
limit .req r1
|
|||
|
|
|||
|
// Get current shortfall length from high register.
|
|||
|
mov length, shortfall_length // length = shortfall_length // 1
|
|||
|
|
|||
|
// Is this a new shortfall?
|
|||
|
cmp length, #0 // if length > 0: // 1
|
|||
|
bgt \name\()_extend_shortfall // goto extend_shortfall // 1 thru, 3 taken
|
|||
|
|
|||
|
// If so, increase the shortfall count.
|
|||
|
ldr num, [state, #NUM_SHORTFALLS] // num = state.num_shortfalls // 2
|
|||
|
add num, #1 // num += 1 // 1
|
|||
|
str num, [state, #NUM_SHORTFALLS] // state.num_shortfalls = num // 2
|
|||
|
|
|||
|
// Back up previous longest shortfall.
|
|||
|
ldr prev, [state, #LONGEST_SHORTFALL] // prev = state.longest_shortfall // 2
|
|||
|
str prev, [state, #PREV_LONGEST_SHORTFALL] // prev_longest_shortfall = prev // 2
|
|||
|
|
|||
|
\name\()_extend_shortfall:
|
|||
|
|
|||
|
// Extend the length of the current shortfall, and store back in high register.
|
|||
|
add length, #32 // length += 32 // 1
|
|||
|
mov shortfall_length, length // shortfall_length = length // 1
|
|||
|
|
|||
|
// Is this now the longest shortfall?
|
|||
|
ldr longest, [state, #LONGEST_SHORTFALL] // longest = state.longest_shortfall // 2
|
|||
|
cmp length, longest // if length <= longest: // 1
|
|||
|
blt \name\()_loop // goto loop // 1 thru, 3 taken
|
|||
|
str length, [state, #LONGEST_SHORTFALL] // state.longest_shortfall = length // 2
|
|||
|
|
|||
|
// Is this shortfall long enough to trigger a timeout?
|
|||
|
ldr limit, [state, #SHORTFALL_LIMIT] // limit = state.shortfall_limit // 2
|
|||
|
cmp limit, #0 // if limit == 0: // 1
|
|||
|
beq \name\()_loop // goto loop // 1 thru, 3 taken
|
|||
|
cmp length, limit // if length < limit: // 1
|
|||
|
blt \name\()_loop // goto loop // 1 thru, 3 taken
|
|||
|
|
|||
|
// If so, reset mode to idle and return to idle loop, logging an error.
|
|||
|
//
|
|||
|
// Modes are mapped to errors as follows:
|
|||
|
//
|
|||
|
// MODE_RX (2) -> ERROR_RX_TIMEOUT (1)
|
|||
|
// MODE_TX_RUN (4) -> ERROR_TX_TIMEOUT (2)
|
|||
|
//
|
|||
|
// As such, the error code can be obtained by shifting the mode right by 1 bit.
|
|||
|
|
|||
|
mode .req r3
|
|||
|
error .req r2
|
|||
|
ldr mode, [state, #ACTIVE_MODE] // mode = state.active_mode // 2
|
|||
|
lsr error, mode, #1 // error = mode >> 1 // 1
|
|||
|
str error, [state, #ERROR] // state.error = error // 2
|
|||
|
mov mode, #MODE_IDLE // mode = MODE_IDLE // 1
|
|||
|
str mode, [state, #ACTIVE_MODE] // state.active_mode = mode // 2
|
|||
|
b idle // goto idle // 3
|
|||
|
.endm
|
|||
|
|
|||
|
/*
|
|||
|
|
|||
|
Ordering constraints
|
|||
|
====================
|
|||
|
|
|||
|
The following routines are in an unusual order, to preserve the ability to
|
|||
|
use PC-relative conditional branches between them ("b<cond> label"). The
|
|||
|
ordering has been chosen to ensure that all routines are close enough to each
|
|||
|
other for the limited range of these instructions (−256 bytes to +254 bytes).
|
|||
|
|
|||
|
The ordering of routines, and which others each needs to be able to reach, is
|
|||
|
as follows:
|
|||
|
|
|||
|
Routine: Uses conditional branches to:
|
|||
|
|
|||
|
idle tx_loop, wait_loop
|
|||
|
tx_zeros tx_loop
|
|||
|
checked_rollback idle
|
|||
|
tx_loop tx_zeros, checked_rollback, rx_loop, wait_loop
|
|||
|
wait_loop rx_loop, tx_loop
|
|||
|
rx_loop rx_shortfall, checked_rollback, tx_loop, wait_loop
|
|||
|
rx_shortfall rx_loop
|
|||
|
|
|||
|
If any of these routines are reordered, or made longer, you may get an error
|
|||
|
from the assembler saying that a branch is out of range.
|
|||
|
|
|||
|
*/
|
|||
|
|
|||
|
// Entry point. At this point, the libopencm3 startup code has set things up as
|
|||
|
// normal; .data and .bss are initialised, the stack is set up, etc. However,
|
|||
|
// we don't actually use any of that. All the code in this file would work
|
|||
|
// fine if the M0 jumped straight to main at reset.
|
|||
|
.global main
|
|||
|
.thumb_func
|
|||
|
main: // Cycle counts:
|
|||
|
// Initialise registers used for constant values.
|
|||
|
value .req r0
|
|||
|
ldr sgpio_int, =SGPIO_EXCHANGE_INTERRUPT_BASE // sgpio_int = SGPIO_INT_BASE // 2
|
|||
|
ldr sgpio_data, =SGPIO_SHADOW_REGISTERS_BASE // sgpio_data = SGPIO_REG_SS // 2
|
|||
|
ldr value, =(TARGET_BUFFER_SIZE - 32) // value = TARGET_BUFFER_SIZE - 32 // 2
|
|||
|
mov buf_size_minus_32, value // buf_size_minus_32 = value // 1
|
|||
|
ldr value, =TARGET_DATA_BUFFER // value = TARGET_DATA_BUFFER // 2
|
|||
|
mov buf_base, value // buf_base = value // 1
|
|||
|
ldr value, =TARGET_BUFFER_MASK // value = TARGET_DATA_MASK // 2
|
|||
|
mov buf_mask, value // buf_mask = value // 1
|
|||
|
ldr value, =STATE_BASE // value = STATE_BASE // 2
|
|||
|
mov state, value // state = value // 1
|
|||
|
zero .req r0
|
|||
|
mov zero, #0 // zero = 0 // 1
|
|||
|
mov hi_zero, zero // hi_zero = zero // 1
|
|||
|
|
|||
|
// Initialise state.
|
|||
|
str zero, [state, #REQUESTED_MODE] // state.requested_mode = zero // 2
|
|||
|
str zero, [state, #ACTIVE_MODE] // state.active_mode = zero // 2
|
|||
|
str zero, [state, #M0_COUNT] // state.m0_count = zero // 2
|
|||
|
str zero, [state, #M4_COUNT] // state.m4_count = zero // 2
|
|||
|
str zero, [state, #NUM_SHORTFALLS] // state.num_shortfalls = zero // 2
|
|||
|
str zero, [state, #LONGEST_SHORTFALL] // state.longest_shortfall = zero // 2
|
|||
|
str zero, [state, #SHORTFALL_LIMIT] // state.shortfall_limit = zero // 2
|
|||
|
str zero, [state, #THRESHOLD] // state.threshold = zero // 2
|
|||
|
str zero, [state, #NEXT_MODE] // state.next_mode = zero // 2
|
|||
|
str zero, [state, #ERROR] // state.error = zero // 2
|
|||
|
|
|||
|
idle:
|
|||
|
// Wait for a mode to be requested, then set up the new mode and acknowledge the request.
|
|||
|
mode .req r3
|
|||
|
flag .req r2
|
|||
|
zero .req r0
|
|||
|
|
|||
|
// Read the requested mode and check flag to see if this is a new request. If not, ignore.
|
|||
|
ldr mode, [state, #REQUESTED_MODE] // mode = state.requested_mode // 2
|
|||
|
lsr flag, mode, #16 // flag = mode >> 16 // 1
|
|||
|
beq idle // if flag == 0: goto idle // 1 thru, 3 taken
|
|||
|
|
|||
|
// Otherwise, this is a new request. The M4 is blocked at this point,
|
|||
|
// waiting for us to clear the request flag. So we can safely write to
|
|||
|
// all parts of the state.
|
|||
|
|
|||
|
// Set the new mode as both active & next.
|
|||
|
uxth mode, mode // mode = mode & 0xFFFF // 1
|
|||
|
str mode, [state, #ACTIVE_MODE] // state.active_mode = mode // 2
|
|||
|
str mode, [state, #NEXT_MODE] // state.next_mode = mode // 2
|
|||
|
|
|||
|
// Don't reset counts on a transition to IDLE.
|
|||
|
cmp mode, #MODE_IDLE // if mode == IDLE: // 1
|
|||
|
beq ack_request // goto ack_request // 1 thru, 3 taken
|
|||
|
|
|||
|
// For all other transitions, reset counts.
|
|||
|
mov zero, #0 // zero = 0 // 1
|
|||
|
str zero, [state, #M0_COUNT] // state.m0_count = zero // 2
|
|||
|
str zero, [state, #M4_COUNT] // state.m4_count = zero // 2
|
|||
|
str zero, [state, #NUM_SHORTFALLS] // state.num_shortfalls = zero // 2
|
|||
|
str zero, [state, #LONGEST_SHORTFALL] // state.longest_shortfall = zero // 2
|
|||
|
str zero, [state, #THRESHOLD] // state.threshold = zero // 2
|
|||
|
str zero, [state, #PREV_LONGEST_SHORTFALL] // prev_longest_shortfall = zero // 2
|
|||
|
str zero, [state, #ERROR] // state.error = zero // 2
|
|||
|
mov shortfall_length, zero // shortfall_length = zero // 1
|
|||
|
mov count, zero // count = zero // 1
|
|||
|
|
|||
|
ack_request:
|
|||
|
// Clear SGPIO interrupt flag, which the M4 set to get our attention.
|
|||
|
str flag, [sgpio_int, #INT_CLEAR] // SGPIO_CLR_STATUS_1 = flag // 8
|
|||
|
|
|||
|
// Write back requested mode with the flag cleared to acknowledge the request.
|
|||
|
str mode, [state, #REQUESTED_MODE] // state.requested_mode = mode // 2
|
|||
|
|
|||
|
// Dispatch to appropriate loop.
|
|||
|
//
|
|||
|
// This code is arranged such that the branch to rx_loop is the
|
|||
|
// unconditional one - which is necessary since it's too far away to
|
|||
|
// use a conditional branch instruction.
|
|||
|
cmp mode, #MODE_WAIT // if mode < WAIT: // 1
|
|||
|
blt idle // goto idle // 1 thru, 3 taken
|
|||
|
beq wait_loop // elif mode == WAIT: goto wait_loop // 1 thru, 3 taken
|
|||
|
cmp mode, #MODE_RX // if mode > RX: // 1
|
|||
|
bgt tx_loop // goto tx_loop // 1 thru, 3 taken
|
|||
|
b rx_loop // goto rx_loop // 3
|
|||
|
|
|||
|
tx_zeros:
|
|||
|
|
|||
|
// Write zeros to SGPIO.
|
|||
|
mov zero, #0 // zero = 0 // 1
|
|||
|
str zero, [sgpio_data, #SLICE0] // SGPIO_REG_SS[SLICE0] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE1] // SGPIO_REG_SS[SLICE1] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE2] // SGPIO_REG_SS[SLICE2] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE3] // SGPIO_REG_SS[SLICE3] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE4] // SGPIO_REG_SS[SLICE4] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE5] // SGPIO_REG_SS[SLICE5] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE6] // SGPIO_REG_SS[SLICE6] = zero // 8
|
|||
|
str zero, [sgpio_data, #SLICE7] // SGPIO_REG_SS[SLICE7] = zero // 8
|
|||
|
|
|||
|
// If in TX start mode, don't count this as a shortfall.
|
|||
|
ldr mode, [state, #ACTIVE_MODE] // mode = state.active_mode // 2
|
|||
|
cmp mode, #MODE_TX_START // if mode == TX_START: // 1
|
|||
|
beq tx_loop // goto tx_loop // 1 thru, 3 taken
|
|||
|
|
|||
|
// Run common shortfall handling and jump back to TX loop start.
|
|||
|
handle_shortfall tx // handle_shortfall() // 24
|
|||
|
|
|||
|
checked_rollback:
|
|||
|
// Checked rollback handler. This code is run when the M0 is in a TX or RX mode, and is
|
|||
|
// placed back into IDLE mode by the M4. If there is an ongoing shortfall at this point,
|
|||
|
// it is assumed to be a shutdown artifact and rolled back.
|
|||
|
|
|||
|
// If there is no ongoing shortfall, there's nothing to do - jump back to idle loop.
|
|||
|
length .req r0
|
|||
|
mov length, shortfall_length // length = shortfall_length // 1
|
|||
|
cmp length, #0 // if length == 0: // 1
|
|||
|
beq idle // goto idle // 3
|
|||
|
|
|||
|
// Otherwise, roll back the state to ignore the current shortfall, then jump to idle.
|
|||
|
prev .req r0
|
|||
|
ldr prev, [state, #PREV_LONGEST_SHORTFALL] // prev = prev_longest_shortfall // 2
|
|||
|
str prev, [state, #LONGEST_SHORTFALL] // state.longest_shortfall = prev // 2
|
|||
|
ldr prev, [state, #NUM_SHORTFALLS] // prev = num_shortfalls // 2
|
|||
|
sub prev, #1 // prev -= 1 // 1
|
|||
|
str prev, [state, #NUM_SHORTFALLS] // state.num_shortfalls = prev // 2
|
|||
|
|
|||
|
b idle // goto idle // 3
|
|||
|
|
|||
|
tx_loop:
|
|||
|
|
|||
|
// Wait for and clear SGPIO interrupt.
|
|||
|
await_sgpio tx // await_sgpio() // 34
|
|||
|
|
|||
|
// Check if there is a mode change request.
|
|||
|
// If so, we may need to roll back shortfall stats.
|
|||
|
on_request checked_rollback // 4
|
|||
|
|
|||
|
// Check if there is enough data in the buffer.
|
|||
|
//
|
|||
|
// The number of bytes in the buffer is given by (m4_count - m0_count).
|
|||
|
// We need 32 bytes available to proceed. So our margin, which we want
|
|||
|
// to be positive or zero, is:
|
|||
|
//
|
|||
|
// buf_margin = m4_count - m0_count - 32
|
|||
|
//
|
|||
|
// If there is insufficient data, transmit zeros instead.
|
|||
|
buf_margin .req r0
|
|||
|
ldr buf_margin, [state, #M4_COUNT] // buf_margin = m4_count // 2
|
|||
|
sub buf_margin, count // buf_margin -= count // 1
|
|||
|
sub buf_margin, #32 // buf_margin -= 32 // 1
|
|||
|
bmi tx_zeros // if buf_margin < 0: goto tx_zeros // 1 thru, 3 taken
|
|||
|
|
|||
|
// Update buffer pointer.
|
|||
|
update_buf_ptr // update_buf_ptr() // 3
|
|||
|
|
|||
|
// At this point we know there is TX data available.
|
|||
|
// Set active mode to TX_RUN (it might still be TX_START).
|
|||
|
mov mode, #MODE_TX_RUN // mode = TX_RUN // 1
|
|||
|
str mode, [state, #ACTIVE_MODE] // state.active_mode = mode // 2
|
|||
|
|
|||
|
// Write data to SGPIO.
|
|||
|
ldm buf_ptr!, {r0-r3} // r0-r3 = buf_ptr[0:16]; buf_ptr += 16 // 5
|
|||
|
str r0, [sgpio_data, #SLICE0] // SGPIO_REG_SS[SLICE0] = r0 // 8
|
|||
|
str r1, [sgpio_data, #SLICE1] // SGPIO_REG_SS[SLICE1] = r1 // 8
|
|||
|
str r2, [sgpio_data, #SLICE2] // SGPIO_REG_SS[SLICE2] = r2 // 8
|
|||
|
str r3, [sgpio_data, #SLICE3] // SGPIO_REG_SS[SLICE3] = r3 // 8
|
|||
|
ldm buf_ptr!, {r0-r3} // r0-r3 = buf_ptr[0:16]; buf_ptr += 16 // 5
|
|||
|
str r0, [sgpio_data, #SLICE4] // SGPIO_REG_SS[SLICE4] = r0 // 8
|
|||
|
str r1, [sgpio_data, #SLICE5] // SGPIO_REG_SS[SLICE5] = r1 // 8
|
|||
|
str r2, [sgpio_data, #SLICE6] // SGPIO_REG_SS[SLICE6] = r2 // 8
|
|||
|
str r3, [sgpio_data, #SLICE7] // SGPIO_REG_SS[SLICE7] = r3 // 8
|
|||
|
|
|||
|
// Update counts.
|
|||
|
update_counts // update_counts() // 4
|
|||
|
|
|||
|
// Jump to next mode if threshold reached, or back to TX loop start.
|
|||
|
jump_next_mode tx // jump_next_mode() // 13
|
|||
|
|
|||
|
wait_loop:
|
|||
|
|
|||
|
// Wait for and clear SGPIO interrupt.
|
|||
|
await_sgpio wait // await_sgpio() // 34
|
|||
|
|
|||
|
// Check if there is a mode change request.
|
|||
|
// If so, return to idle.
|
|||
|
on_request idle // 4
|
|||
|
|
|||
|
// Update counts.
|
|||
|
update_counts // update_counts() // 4
|
|||
|
|
|||
|
// Jump to next mode if threshold reached, or back to wait loop start.
|
|||
|
jump_next_mode wait // jump_next_mode() // 15
|
|||
|
|
|||
|
rx_loop:
|
|||
|
|
|||
|
// Wait for and clear SGPIO interrupt.
|
|||
|
await_sgpio rx // await_sgpio() // 34
|
|||
|
|
|||
|
// Check if there is a mode change request.
|
|||
|
// If so, we may need to roll back shortfall stats.
|
|||
|
on_request checked_rollback // 4
|
|||
|
|
|||
|
// Check if there is enough space in the buffer.
|
|||
|
//
|
|||
|
// The number of bytes in the buffer is given by (m0_count - m4_count).
|
|||
|
// We need space for another 32 bytes to proceed. So our margin, which
|
|||
|
// we want to be positive or zero, is:
|
|||
|
//
|
|||
|
// buf_margin = buf_size - (m0_count - state.m4_count) - 32
|
|||
|
//
|
|||
|
// which can be rearranged for efficiency as:
|
|||
|
//
|
|||
|
// buf_margin = m4_count + (buf_size - 32) - m0_count
|
|||
|
//
|
|||
|
// If there is insufficient space, jump to shortfall handling.
|
|||
|
buf_margin .req r0
|
|||
|
ldr buf_margin, [state, #M4_COUNT] // buf_margin = state.m4_count // 2
|
|||
|
add buf_margin, buf_size_minus_32 // buf_margin += buf_size_minus_32 // 1
|
|||
|
sub buf_margin, count // buf_margin -= count // 1
|
|||
|
bmi rx_shortfall // if buf_margin < 0: goto rx_shortfall // 1 thru, 3 taken
|
|||
|
|
|||
|
// Update buffer pointer.
|
|||
|
update_buf_ptr // update_buf_ptr() // 3
|
|||
|
|
|||
|
// Read data from SGPIO.
|
|||
|
ldr r0, [sgpio_data, #SLICE0] // r0 = SGPIO_REG_SS[SLICE0] // 10
|
|||
|
ldr r1, [sgpio_data, #SLICE1] // r1 = SGPIO_REG_SS[SLICE1] // 10
|
|||
|
ldr r2, [sgpio_data, #SLICE2] // r2 = SGPIO_REG_SS[SLICE2] // 10
|
|||
|
ldr r3, [sgpio_data, #SLICE3] // r3 = SGPIO_REG_SS[SLICE3] // 10
|
|||
|
stm buf_ptr!, {r0-r3} // buf_ptr[0:16] = r0-r3; buf_ptr += 16 // 5
|
|||
|
ldr r0, [sgpio_data, #SLICE4] // r0 = SGPIO_REG_SS[SLICE4] // 10
|
|||
|
ldr r1, [sgpio_data, #SLICE5] // r1 = SGPIO_REG_SS[SLICE5] // 10
|
|||
|
ldr r2, [sgpio_data, #SLICE6] // r2 = SGPIO_REG_SS[SLICE6] // 10
|
|||
|
ldr r3, [sgpio_data, #SLICE7] // r3 = SGPIO_REG_SS[SLICE7] // 10
|
|||
|
stm buf_ptr!, {r0-r3} // buf_ptr[0:16] = r0-r3; buf_ptr += 16 // 5
|
|||
|
|
|||
|
// Update counts.
|
|||
|
update_counts // update_counts() // 4
|
|||
|
|
|||
|
// Jump to next mode if threshold reached, or back to RX loop start.
|
|||
|
jump_next_mode rx // jump_next_mode() // 12
|
|||
|
|
|||
|
rx_shortfall:
|
|||
|
|
|||
|
// Run common shortfall handling and jump back to RX loop.
|
|||
|
handle_shortfall rx // handle_shortfall() // 24
|
|||
|
|
|||
|
// The linker will put a literal pool here, so add a label for clearer objdump output:
|
|||
|
constants:
|