Joining and Rejoining #
The most intricate part of a device’s interaction with a LoRaWAN network occurs when it joins. The join process is important because it is the only time that a device must receive information from the network (in the form of the join accept message). It is also timing-critical: after the device transmits the join request, the receive windows allow only two opportunities to receive the join accept. Timing matters even more because devices are likely to be shipped and activated in batches, so many devices, possibly thousands, will try to join the network within a short period.
You need a join strategy that gets the most devices onto the network while consuming the least power. To accomplish this, make sure that devices randomly use all available frequencies. Some regions specify the join frequency selection explicitly (for example, EU863); other regions leave the join frequencies open to the available spectrum. Designers must therefore choose a strategy that maximizes the chance of success in the shortest period, which minimizes radio operation time and saves power. This is most important in Class A operations. In addition to random frequency selection, it is also important to randomly select among the data rates (DRs) available in your region.
Note: It is important to vary the DR. If you always choose a low DR, join requests will take much more time on air. Join requests will also have a much higher chance of interfering with other join attempts as well as with regular message traffic from other devices. Conversely, if you always use a high DR and the device trying to join the network is far away from the LoRaWAN gateway or sitting in an RF-obstructed or null region, the gateway may not receive a device’s join request. Given these realities, randomly vary the DR and frequency to defend against low signals while balancing against on-air time for join requests.
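As a minimal sketch of the random selection described above, the following Python snippet draws a join frequency and DR uniformly at random for each attempt. The channel list, the DR list, and the function name `pick_join_parameters` are illustrative assumptions, not values taken from this document; substitute the join channels and DRs that your regional parameters actually allow.

```python
import random

# Hypothetical join channel plan (MHz) and join data rates.
# These values are placeholders for illustration only; use the
# plan defined for your region.
JOIN_CHANNELS_MHZ = [868.1, 868.3, 868.5]
JOIN_DATA_RATES = [0, 1, 2, 3, 4, 5]


def pick_join_parameters(rng=random):
    """Return a (frequency_mhz, data_rate) pair chosen uniformly at random."""
    frequency = rng.choice(JOIN_CHANNELS_MHZ)
    data_rate = rng.choice(JOIN_DATA_RATES)
    return frequency, data_rate


if __name__ == "__main__":
    for attempt in range(3):
        freq, dr = pick_join_parameters()
        print(f"join attempt {attempt}: {freq} MHz, DR{dr}")
```

Drawing both values independently on every attempt means that a device stuck at a weak-signal location will eventually try a low DR, while most attempts across a batch of devices still use shorter, higher-DR transmissions.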
ALOHA #
LoRaWAN transmission is related to the very first shared-frequency radio packet transmission network. This first network, called ALOHAnet[1], was developed at the University of Hawaii in 1971 to send data between the different facilities in Hawaii. Since all the stations in the network shared a single uplink and single downlink frequency, they needed a way for the various radios to manage their packets over the air. The mechanism employed to solve this coordination problem was transmit with ACK. An uplink packet was sent and, if received, an ACK packet was sent back in the subsequent downlink. If the ACK was received, the uplink station knew its outgoing message had been received. Otherwise the uplink station would wait a random period and retransmit until successful. Later enhancements of this technique varied the random retry period by linear, multiplicative or exponential factors[2]. The goal of the network is to settle on the maximum rate of data transmission (that is, the fewest retries at the shortest retry intervals). The LoRaWAN joining process shares this structure in that join requests must be acknowledged before data packets can be sent over the channel. However, if many devices are trying to join simultaneously, they can jam each other. As a result, the devices will need to retry joining. By randomly selecting a retry period, devices can automatically “spread out” their attempts in a short period even when they might have been mostly synchronous at some time in the past.
Retransmissions Back-off #
A suggested “back-off” schedule is 15 seconds, 30 seconds, one minute, five minutes, 30 minutes and 60 minutes (with 60 minutes repeating). In this schedule, each back-off is the maximum time, and the retry attempts occur between 15 seconds and the set back-off time. The application and the LoRaWAN specification drive this schedule. The LoRaWAN specification[3] has two sections dedicated to the retransmission of messages (including join requests): one limits the total on-air time (Section 7) and the other covers DR back-off for retransmissions (Section 18.4). The on-air limits for retransmission are provided in the specification as follows:
For those frame retransmissions, the interval between the end of the RX2 slot and the next uplink retransmission SHALL be random and follow a different sequence for every device (For example using a pseudo-random generator seeded with the device’s address). The transmission duty-cycle of such message SHALL respect the local regulation and the following limits, whichever is more constraining:
This means that the total air time will be limited to 36 seconds in the first hour, 36 seconds in the next 10 hours, and then 8.7 seconds each subsequent 24-hour period after the first 11 hours. Unless the local on-air duty-cycle regulations are more stringent, the requirements in the specification must be followed.
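The back-off schedule and the air-time limits described above can be sketched together in Python. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the caller is assumed to track how much air time it has already used in the current window, and any stricter local duty-cycle regulation would still have to be checked separately.

```python
import random

# Maximum back-off per attempt, in seconds, from the suggested schedule:
# 15 s, 30 s, 1 min, 5 min, 30 min, 60 min (60 min repeats).
BACKOFF_MAX_S = [15, 30, 60, 300, 1800, 3600]
MIN_DELAY_S = 15


def next_retry_delay(attempt, rng=random):
    """Random delay in seconds before retry number `attempt` (0-based).

    The delay is drawn between 15 s and the back-off ceiling for this
    attempt; attempts past the end of the schedule reuse the last ceiling.
    """
    ceiling = BACKOFF_MAX_S[min(attempt, len(BACKOFF_MAX_S) - 1)]
    return rng.uniform(MIN_DELAY_S, ceiling)


def airtime_budget_s(hours_since_reset):
    """Return (window_length_hours, max_airtime_seconds) for the current window."""
    if hours_since_reset < 1:
        return 1, 36.0       # first hour
    if hours_since_reset < 11:
        return 10, 36.0      # next 10 hours
    return 24, 8.7           # each 24-hour period after the first 11 hours


def may_transmit(hours_since_reset, airtime_used_s, frame_airtime_s):
    """True if sending the frame keeps the current window within its budget.

    `airtime_used_s` is the air time already spent in the current window;
    tracking it is left to the caller (an assumption of this sketch).
    """
    _, budget = airtime_budget_s(hours_since_reset)
    return airtime_used_s + frame_airtime_s <= budget
```

A device would typically call `may_transmit` before each retry and, if the budget is exhausted, sleep until the next window opens rather than until the next back-off expires.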
We recommend varying the DR randomly during the join process. If you use a fixed DR during joining, or if you send regular uplink messages with a “confirmed” frame, we suggest that you decrease the DR as the number of retries increases. As stated in the LoRaWAN specification, Section 18.4:
It is strongly recommended to adopt the following retransmission strategy. The first transmission of the confirmed frame happens with a DR.
The DR max(a,b) stands for maximum of a and b values. If the frame has not been acknowledged after a recommended eight transmissions, the MAC layer should return an error code to the application layer.
Note: For each retransmission, the frequency channel is selected randomly as for normal transmissions.
Any further transmission uses the last DR used.
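The per-transmission DR table that the quoted text refers to does not appear above, so the following Python sketch rests on an assumption: that the DR is lowered by one step after every two transmissions and never drops below DR0, which is consistent with the `max(a,b)` notation and the recommended limit of eight transmissions. The function name is hypothetical; check the schedule against Section 18.4 of the specification before relying on it.

```python
MAX_TRANSMISSIONS = 8  # recommended limit before reporting an error


def retransmission_data_rate(initial_dr, transmission_number):
    """Return the DR for a 1-based transmission number.

    Assumed schedule: the DR drops by one step after every two
    transmissions and never goes below 0. Transmissions beyond the
    eighth reuse the last DR.
    """
    if transmission_number < 1:
        raise ValueError("transmission_number starts at 1")
    capped = min(transmission_number, MAX_TRANSMISSIONS)
    steps_down = (capped - 1) // 2
    return max(initial_dr - steps_down, 0)


if __name__ == "__main__":
    schedule = [retransmission_data_rate(5, n) for n in range(1, 9)]
    print(schedule)
```

Under this assumption, a frame first sent at DR5 is retried at DR5, DR4, DR4, DR3, DR3, DR2 and DR2, after which the MAC layer would report an error to the application.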
Rejoining Frequency #
Another important consideration in system design for device management is how often a device will join-check or just rejoin the network. The design of the LoRaWAN network for Class A devices enables very low power use thanks to its uplink-centric approach. This maximizes the amount of real data traffic while keeping the protocol overhead to a minimum; it does not waste power or bandwidth receiving synchronization messages from the network. The potential issue with this approach is that a device that is out of sync with the network will not be aware of the situation. The device could continue to send data to the gateway without knowing that its information is no longer being forwarded to the network.
The following approaches can help minimize the amount of time a device sends messages to the network that the network cannot receive.
Send Occasional Messages with an ACK Request
This does not mean sending an ACK request with every message, which would waste power and bandwidth. Instead, send the ACK request on a reasonable subset of messages, from 1 to 25 percent, depending on the update rate and the sensitivity to lost data if the device and network get out of sync. For example, if a device is programmed to send an update every three hours and the application can tolerate up to 24 hours of loss, only three out of 24 (or one out of eight) messages need to be sent with an ACK request. If the ACK is not received, the device should try to rejoin within the limitations described in section 3.1.1.
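The arithmetic in this example can be sketched as a small Python helper. The function names and the counter-based scheme are assumptions made for illustration; the idea is simply to request an ACK on every Nth uplink, where N is the number of update intervals that fit inside the tolerated loss period.

```python
import math


def confirmed_every_n(update_interval_h, max_loss_h):
    """Return N such that every Nth uplink should request an ACK.

    N is the number of update intervals that fit in the tolerated loss
    period, with a floor of 1 (every message confirmed).
    """
    if update_interval_h <= 0 or max_loss_h <= 0:
        raise ValueError("intervals must be positive")
    return max(1, math.floor(max_loss_h / update_interval_h))


def needs_ack(message_counter, n):
    """True if the uplink with this 0-based counter should request an ACK."""
    return message_counter % n == 0


if __name__ == "__main__":
    n = confirmed_every_n(update_interval_h=3, max_loss_h=24)
    print(n)  # one confirmed uplink out of every n
```

With a three-hour update interval and a 24-hour tolerance, `confirmed_every_n` returns 8, which matches the one-out-of-eight ratio in the text.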
Send a LinkCheckReq Message
This approach is like sending occasional messages with ACK requests and may be easier to implement in device-side logic. When the device occasionally sends a zero-byte LinkCheckReq message, the server responds, which confirms to the device that it is still joined to the network. If the device does not receive a response, it can assume that it is no longer on the network and can send another join request.
Start a new Join Request on a Regular Basis
A device using this approach starts a join request at a regular interval that is set at, or just below, the maximum tolerance for lost data in the application. This may produce significantly more join attempts and some additional network congestion. However, it proactively caps the time that a device may be offline before it tries to rejoin the network.