Residual energy-based clustering in UAV-aided wireless sensor networks for surveillance and monitoring applications

Aim: Unmanned aerial vehicle (UAV)-aided wireless sensor networks (WSNs) are effectively used for surveillance, monitoring, and rescue applications in military and commercial domains. In UAV-aided WSNs (UWSNs), efficient data gathering from sensor nodes is desired to enhance network performance. However, communication between the UAV and sensor nodes is challenging due to the high mobility of the UAV and the uneven energy consumption of the sensor nodes. We will come up with further analysis and validation of our work in the future.


INTRODUCTION
In wireless sensor networks (WSNs), a number of sensor nodes are deployed randomly in a geographical area. The sensor nodes are battery-operated in an unattended mode over time without being charged and replaced. Thus, energy efficiency is one of the most critical design issues in WSNs as it determines the network lifetime. The primary source of energy depletion in sensor nodes is the radio activities of transmitting and receiving. In WSNs, if the sensor nodes have to transfer their data to a base station (BS) in a multi-hop manner, a large amount of energy is wasted. Hence, mobile sinks for collecting data from the sensor nodes are getting popular due to their improved network coverage and energy utilization [1,2] . Usually, mobile sinks are used in two ways. To collect the sensed data, a mobile sink either goes for a random walk to find the scattered nodes or follows a predefined path.
Unmanned aerial vehicles (UAVs) can be employed in many different applications [3][4][5] . In a UAV-aided WSN (UWSN), the ground-to-air communication from sensor nodes to UAV is effectively utilized, in which the UAV plays the role of a mobile sink. With significant advancements in wireless communication and UAV technologies, many UWSN applications are found in the literature, which are targeted towards monitoring, surveillance, and tracking in disaster and time-critical environments [6][7][8][9] . UWSNs differ from both WSNs and UAV networks. Hence, the clustering algorithms designed for WSNs and UAV networks do not precisely address the issues of UWSNs. In UWSNs, communication from UAV to sensor nodes is typically performed in a broadcast or multicast manner. The number of UAVs is much smaller (i.e., one or a few) than that of deployed sensor nodes in a UWSN. On the other hand, communication from sensor nodes to UAV is performed in a unicast manner. Many sensor nodes are redundantly deployed in the target area, and thus, multiple sensor nodes may contain the same data. If all those redundantly deployed sensor nodes need to transfer their data to the UAV individually, then not only the transmission time but also the node energy is excessively consumed. Furthermore, concerning the high mobility of the UAV, it cannot stay long collecting data from a number of sensor nodes.
Consequently, there might be a large number of packet losses, which may directly affect the network efficiency and reliability. As an efficient solution, the clustering of sensor nodes helps improve network lifetime and reliability in UWSNs. Each cluster head (CH) aggregates the sensed data as in [10] from its cluster members (CMs) and transmits the aggregated data to the UAV. As a result, the collision between sensor nodes while communicating with UAV is significantly reduced. As collisions and retransmissions are decreased, less energy is consumed, and the delay of packet delivery is also markedly reduced. Therefore, clustering the sensor nodes into some logical groups is preferred for saving energy and prolonging network lifetime in UWSNs. Clustering algorithms proposed so far for UWSNs have not considered the uneven energy consumption in sensor nodes due to their roles in CH-to-UAV communication. This may lead to the early death of some sensor nodes, which degrades network performance and decreases network lifetime.
In this paper, we propose a residual energy-based clustering algorithm for sensor-to-UAV communication in UWSNs, which reduces the energy consumption of sensor nodes and improves network lifetime. The data delivery ratio is also improved due to the reduction of collisions and retransmissions. Our performance study shows that the proposed clustering algorithm is better than the conventional one in terms of network lifetime and data delivery ratio. This paper is an extended version of our preliminary work [11] . In the initial work [11] , the clustering algorithm was not perfect concerning its completeness, and the paper was presented with limited results of simulations in some scenarios. In the present paper, however, we not only refined the clustering algorithm thoroughly and more rigorously with new formalism but also performed extensive simulations in various scenarios.

Organization of the paper
The remainder of the paper is organized as follows. Related works are summarized and reviewed in the following section. In the third section, the proposed residual energy-based clustering is presented in detail with regard to the system model and cluster formation. In the fourth section, the performance of the proposed scheme is evaluated and compared to that of the conventional one through MATLAB simulation. Finally, the paper is concluded in the fifth section.
In this paper, several symbols and abbreviations are used. Table 1 shows the notations used in the article along with their meaning.

Related works
The UAV has been incorporated with different wireless technologies to enhance network lifetime and network coverage. In UWSNs, however, the high mobility of the UAV limits the communication duration between the UAV and sensor nodes [12,13]. Furthermore, a number of sensor nodes tend to communicate with the UAV once the UAV enters the transmission range of the sensor nodes, leading to collisions. Hence, the clustering technique has been widely used, and considerable research efforts have been made on this topic [14,15]. A distributed energy-efficient and fault-tolerant clustering algorithm is introduced in [16]. Low-energy adaptive clustering hierarchy (LEACH) is a well-known clustering mechanism that selects CHs using probability [17]. LEACH organizes the network into clusters based on received signal strength. In LEACH, nodes are either ordinary CMs or CHs. Every CM sends the sensed data to its CH after clustering.
A framework, where the UAV conveys information from sensor nodes to the sink and the sink selects a set of CHs, is presented in [18] . Using UAV for the selection of CHs helps to mitigate the threat of compromised CHs. Different security-driven CH selection schemes are categorized and compared with communication and computation overheads. The clustering approach in UWSNs, where the UAV flies to sensor nodes, collects data, and then transfers data to BS, is studied in [19] . It uses Kmeans++ for the uneven distributions of sensor nodes and the fuzzy logic method to dynamically choose CHs according to sensor node energy and storage. The selection of the CHs based on the maximum number of hops between a CM and the CH is proposed in [20] . Principally, if the maximum number of hops decreases, the number of clusters increases, and vice versa. Clustering based on particle swarm optimization is discussed in [21] to minimize energy consumption, bit error rate, and UAV travel time. A bio-inspired clustering that uses the dragonfly algorithm for cluster formation and management of UAVs is studied in [22] . CH selection is made on the basis of the connectivity with BS along with the fitness function which consists of the residual energy and position of UAVs.
A clustering scheme for UWSNs, where sensor nodes are grouped into clusters and where the sensor nodes communicate with the nearest CH only, is proposed in [23]. CHs then aggregate data and communicate with the UAV. Both the received signal strength indicator (RSSI) values of the radio signal received from the UAV and the remaining energy levels of the sensor nodes are used for the selection of CHs in [24]. CMs are also selected considering the RSSI values of the CHs. To provide an energy-efficient solution for a UAV-supported multilevel architecture, user equipments (UEs) are grouped into clusters in [25,26]. UEs then choose their role to be either CH or CM in a distributed fashion, following the theory of minority games. Subsequently, CMs use stochastic learning to select a CH. A cluster formation mechanism that considers physical proximity and communication interest is presented in [27]. CH selection is made on the basis of the UE's energy availability. A heuristic algorithm to minimize the energy consumption of sensor nodes in UWSN applications is presented in [28]. The UAV is supposed to fly over each of the clusters, get information about the sensor nodes, and select which node to contact. An energy-efficient swarm intelligence-based clustering algorithm, in which the particle fitness function is exploited for inter-cluster distance, intra-cluster distance, residual energy, and geographic location, is studied in [29]. Several hierarchical and energy-efficient clustering algorithms are analyzed and compared in [30,31]. In some clustering schemes [32], cluster heads are first elected. The cluster formation is then completed after a CH broadcasts a declaration message and after its members respond to it via a join message. A summary of the clustering approaches used in UWSNs is given in Table 2.

System model of residual energy-based clustering
The system model in our study is presented in this subsection, which includes the network model and the assumptions for the proposed clustering algorithm. A UWSN involves two subsystems: the UAV and the WSN. In our proposed work, a WSN consists of homogeneous sensor nodes distributed randomly in a ground area, with the BS located at a distance away from the sensor nodes. The sensor nodes are static, and each is identified with a distinct identifier. Sensor nodes use the same transmission power throughout the communication process. We considered a fixed-wing UAV for our work, which flies at constant height (h) and velocity (v) to collect data from CHs. The UAV is equipped with a directional antenna characterized by its flare angle. The UAV is responsible for collecting the sensed data from the sensor nodes along a predefined path and then transferring the data to the BS. We have narrowed down our work to the clustering approach in UWSNs, which is conducted after the deployment of sensor nodes and before the arrival of the UAV. Clustering is divided into CH selection and cluster formation. Sensor nodes are capable of sensing, storing the sensed data, and transmitting the data to their CH. However, the network topology may change because of the death of nodes due to energy depletion. According to a simplified energy consumption model in [32], the energy spent on transmitting and receiving k bits of data over distance d can be represented as (1) and (2), respectively, where E_elec is the electrical energy for the transmitter/receiver, E_fs is the free-space energy loss, E_amp is the energy for the amplifier, and d_o is the threshold value of distance. The framework of a cluster-based UWSN is shown in Figure 1.
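Equations (1) and (2) are not reproduced in this text. As a hedged illustration, the sketch below implements the widely used first-order radio model, whose terms match the symbols defined above (E_elec, E_fs, E_amp, d_o). The numeric constants and the choice d_o = sqrt(E_fs / E_amp) are illustrative assumptions and are not taken from the paper's parameter table.

```python
import math

# Illustrative constants (assumptions, not values from the paper).
E_ELEC = 50e-9       # J/bit, electronics energy for transmitter/receiver
E_FS = 10e-12        # J/bit/m^2, free-space amplifier coefficient
E_AMP = 0.0013e-12   # J/bit/m^4, multipath amplifier coefficient
D0 = math.sqrt(E_FS / E_AMP)  # threshold distance d_o (common choice, assumed)


def tx_energy(k_bits: int, d: float) -> float:
    """Energy to transmit k bits over distance d (first-order radio model)."""
    if d < D0:
        # Short links: free-space loss grows with d^2.
        return k_bits * E_ELEC + k_bits * E_FS * d ** 2
    # Long links: amplifier loss grows with d^4.
    return k_bits * E_ELEC + k_bits * E_AMP * d ** 4


def rx_energy(k_bits: int) -> float:
    """Energy to receive k bits; independent of distance in this model."""
    return k_bits * E_ELEC
```

With this choice of d_o the two transmit branches meet at d = d_o, so the transmit energy is continuous in distance.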

Table 2. Summary of clustering approaches used in UWSNs

| Protocol | Clustering type | Metrics addressed |
|---|---|---|
| Energy efficient fault-tolerant clustering algorithm [16] | Distributed energy-efficient and fault-tolerant clustering | Distance, energy, and cluster cardinality |
| WSN clustering for UAV-based data gathering [19] | Set-covering problem-based clustering | Minimum hops between CH and CMs |
| Energy-efficient and low packet loss clustering [20] | Kmeans++ and fuzzy logic-based clustering | Energy and storage of sensor nodes |
| Cluster-based communication topology selection [21] | PSO-based clustering | BER and UAV travel time |
| Bio-inspired clustering [22] | Dragonfly algorithm for cluster formation and management | Cluster building time and successful delivery probability |
| Clustering for sensor networks with mobile agents [23] | Optimal cluster size-based clustering | Energy consumption and latency |
| Distributed clustering in UWSNs [24] | Dynamic and distributed clustering | Energy consumption and stable and well-connected CHs |
| Self-adaptive energy-efficient operation [25,26] | Minority games and reinforcement learning-based clustering | Energy and UAV position |
| Socio-spatial resource management in wireless powered public safety networks [27] | Chinese restaurant process-based cluster formation | Communication interest, physical distance, and energy |
| Heuristic algorithm and cooperative relay for UWSNs [28] | Cooperative data relay and optimization scheme-based clustering | Average energy and flight distance |
| LEACH [17,32] | Distance-based distributed clustering | Network lifetime and latency |

Cluster formation
There is a set of n sensor nodes, S = {s_1, s_2, s_3, …, s_n}, and each sensor node s_i has z attributes (i.e., residual energy, energy consumption rate, etc.) represented by a vector A = {a_1, a_2, a_3, …, a_n}. A sensor node s_i can be in one of two states: CH or CM. The traffic load for the CH and each CM is different, leading to unbalanced energy consumption in sensor nodes. As a result, a few nodes may die very soon, which ultimately affects the overall network operation. Owing to the energy constraint, the number of clusters and the number of CMs in a cluster should be proportional to the residual energy in CHs.
To resolve the issue mentioned above in UWSNs, a residual energy-based clustering scheme is proposed in this paper, in which each cluster has a different cluster size. Note that the cluster size means the number of sensor nodes in a cluster in this paper. After the sensor nodes are deployed in their application area, they start the cluster formation process. During the process, every node having energy higher than the minimum threshold energy (E_thres) transmits its status to all other nodes within its transmission range. Nodes whose energy is below E_thres do not compete in the CH selection process. The status message includes the node identifier, residual energy, and location coordinates. For any sensor node, the probability of being elected as CH for a particular round t is given by (3), where E_rem(t) is the remaining energy of the CH at time t and N_s is the number of nodes competing for CH; the expression also involves the aggregated network energy. After a CH is selected in a particular region, no other CH is selected within the competition range (Comp_Ri) of that CH, which is calculated as (4), where α is a constant whose value ranges between 0 and 1, Range_Ci is the communication range of the CH, Max(d) denotes the maximum distance to select another CH, and d(C_i, C_k) is the distance between two clusters.
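The election rule described above can be sketched as a greedy, centralized emulation of the distributed process. This is only a sketch under stated assumptions: the `Node` class and the function name are hypothetical, the competition range is passed in as a single fixed value rather than computed per CH from equation (4), and candidates are ranked directly by residual energy instead of by the election probability of equation (3).

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float  # residual energy in joules


def elect_cluster_heads(nodes, e_thres, comp_range):
    """Greedy emulation of CH election.

    Nodes with energy above e_thres compete. Candidates are visited in
    decreasing order of residual energy, and a candidate becomes CH only if
    no already-elected CH lies within comp_range of it.
    """
    candidates = sorted(
        (n for n in nodes if n.energy > e_thres),
        key=lambda n: (-n.energy, n.node_id),  # ties broken by node ID
    )
    heads = []
    for cand in candidates:
        if all(math.hypot(cand.x - h.x, cand.y - h.y) > comp_range for h in heads):
            heads.append(cand)
    return heads
```

For example, with four nodes at x = 0, 5, 50, and 52 m, a threshold that excludes the last node, and a 20 m competition range, the second node is suppressed by the first and the third becomes the only other CH.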
Based on the energy level of the CH, the size of the cluster (i.e., the number of sensor nodes in a cluster) is decided as (5). After calculating the Cluster_size value, the CH broadcasts the CH status message. Depending on the signal strength, other nodes estimate the channel quality of the node-to-CH link and their distance from the CH. The estimated values are used to find the communication cost between a sensor node (SN) and its CH as (6), where d_SN-CH is the distance from the node to the CH and SNR_SN-CH is the signal-to-noise ratio between the node and the CH, which indicates the channel quality. Each node requests to join the CH with the lowest communication cost. When the cluster size reaches the maximum, further requests are denied. The denied nodes then request to join the next best CH, i.e., the one with the next lowest communication cost. The cluster formation process is illustrated in Algorithm 1 and detailed with the help of Figure 2.
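The join procedure described above can be sketched as follows. This is a minimal sketch, not the paper's implementation: the function name and data layout are hypothetical, the communication costs and cluster-size limits are supplied by the caller rather than computed from equations (5) and (6), and join requests are processed sequentially in node-ID order, whereas in a real network the order would depend on message arrival.

```python
def form_clusters(costs, cluster_size):
    """Assign sensor nodes to CHs under per-CH capacity limits.

    costs: {sn_id: {ch_id: comm_cost}} for the CHs each SN can hear.
    cluster_size: {ch_id: maximum number of members}.
    Returns {ch_id: [sn_id, ...]}; SNs with no available CH are left out.
    """
    members = {ch: [] for ch in cluster_size}
    for sn in sorted(costs):
        # Candidate list kept in increasing order of communication cost.
        for ch in sorted(costs[sn], key=lambda c: (costs[sn][c], c)):
            if len(members[ch]) < cluster_size[ch]:
                members[ch].append(sn)  # join request accepted
                break
            # CH is full: fall through to the next-best candidate.
    return members
```

A node whose preferred CH is already full falls through to its next candidate, mirroring the "cluster limit status" exchange in the text.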
At the beginning of CH selection, every SN broadcasts its status (node ID, residual energy, and location coordinates) within its neighborhood. The SN with the highest residual energy is elected as CH for that region. The competition range of that CH is then calculated using equation (4). No other CH is then selected within that competition range, which helps to limit the number of CHs. The selected CHs compute their Cluster_size as in equation (5), which takes the residual energy of CHs into account. On the other hand, CMs calculate Comm_cost using equation (6) and maintain the CH candidate list in increasing order of Comm_cost. SNs send join requests to the CH having the lowest Comm_cost value. The CH accepts join requests until the cluster size limit is met. Once the limit is reached, the CH broadcasts the cluster limit status. An SN that receives the cluster limit status removes the CH from its candidate list and then sends a join request to the next CH. After clustering, every sensor node, including the CH, senses the attributes, and the CMs transmit them to their CH periodically. In every round of CH-to-UAV communication, the residual energy of the CH is checked. If the energy of the CH is found to be less than or equal to E_thres, a new CH is selected accordingly.
Algorithm 1
Input: A set of n sensor nodes, S = {s_1, s_2, s_3, …, s_n}, and the attributes of the n sensor nodes, A = {a_1, a_2, a_3, …, a_n}
Output: CH selection and cluster formation

// Cluster head election
Every SN having energy higher than the threshold value broadcasts its energy status within its neighborhood. // broadcast message includes node ID, residual energy, and location coordinates
SNs with the highest energy are elected as CHs in a distributed manner.
if SN is elected as CH then // node with the highest energy
    CH calculates Cluster_size and broadcasts CH status within its neighborhood.
else
    SN receives CH status from CHs and includes them in its CH candidate list.
end if

// Operation of cluster head
for every CH do
    if Cluster_size limit is reached then
        Broadcast Cluster_size limit status
    else if a join request is received from SN then
        Accept the join request message
        Send join reply message to the SN
    end if
end for

// Operation of cluster member
for every sensor node except CH do
    if receives Cluster_size limit status then
        Remove the CH from CH candidate list
        Find next CH with the lowest Comm_cost from CH candidate list
        Send join request to the CH
    else if receives join reply from CH then
        SN becomes a member of the CH.
    end if
end for

In each round of communication, CMs sense the environment and transfer the sensed data to their CH. The energy consumed by any CM during intra-cluster transmission is given as (7), where E_sense is the energy dissipated while sensing, d_CH is the distance between the CM and the CH, and K_ACK is the length in bits of the acknowledgment packet coming from the CH. After a CH receives information from its CMs, it aggregates the data and transmits the aggregated data to the UAV. If E_agg is the energy used for data aggregation, the energy consumed by the CH is given by (8), where d_CH-CM(i) is the distance between the CH and CM(i), d_CH-UAV is the distance between the UAV and the CH, and R_t is the number of re-attempts before transmitting the data to the UAV. Because the number of CMs in the cluster and the number of CHs in the network are defined and limited by the residual energy and the competition range of a node as in equations (4) and (5), the energy consumed by the nodes will be proportionate, which results in prolonged network lifetime.
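Equations (7) and (8) are not reproduced in this text, so the sketch below shows only one plausible per-round accounting built from the terms the paragraph names (E_sense, K_ACK, E_agg, R_t, and the CM-to-CH and CH-to-UAV distances). The way the terms are combined, the function names, and the constants are all assumptions made for illustration; the paper's actual expressions may differ.

```python
E_ELEC = 50e-9   # J/bit (illustrative assumption)
E_FS = 10e-12    # J/bit/m^2 (illustrative assumption)


def tx(k_bits, d):
    # Free-space branch only, for brevity (assumes d is below the threshold d_o).
    return k_bits * (E_ELEC + E_FS * d ** 2)


def rx(k_bits):
    return k_bits * E_ELEC


def cm_round_energy(k_bits, k_ack, d_ch, e_sense):
    """Assumed CM cost: sense, send k bits to the CH, receive the ACK."""
    return e_sense + tx(k_bits, d_ch) + rx(k_ack)


def ch_round_energy(k_bits, k_ack, d_members, d_uav, e_agg, retries):
    """Assumed CH cost: receive from and acknowledge each CM, aggregate,
    then send to the UAV once plus `retries` re-attempts."""
    intra = sum(rx(k_bits) + tx(k_ack, d) for d in d_members)
    return intra + e_agg + (1 + retries) * tx(k_bits, d_uav)
```

Under this accounting the CH cost grows with the number of members, which is the reason the scheme ties the cluster size to the CH's residual energy.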

RESULTS
In this section, we evaluate the performance of the proposed algorithm in terms of network lifetime and the number of delivered packets via MATLAB simulation. The proposed scheme is compared with the conventional clustering scheme LEACH [32], in which the CH is selected based on the Euclidean distance to other nodes. LEACH is a widely used clustering algorithm for wireless communication. In LEACH, the CH is chosen in random rotation periods, but the node energy is not considered for CH selection, leading to the early death of some sensor nodes.

Simulation environment
We conducted the simulation in an area of 100 m × 100 m. We assume that 50-200 sensor nodes are randomly deployed in the network area. We also consider that all the sensor nodes have the same communication range and the same initial energy. We consider the fixed-wing UAV with a constant speed of 10 m/s and a height of 15 m. The flight path and schedule of UAV are predefined. Table 3 summarizes the simulation parameters used in our performance study.
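To give a feel for how the UAV height and speed constrain data collection, the sketch below estimates the ground footprint of the directional antenna and the longest time a CH can stay inside it. It assumes a downward-pointing antenna whose flare angle is the full apex angle of a cone and a CH lying directly under the flight path. The 60-degree flare angle in the usage note is a hypothetical value, since the angle is not stated in this text.

```python
import math


def footprint_radius(height_m, flare_angle_deg):
    """Radius of the ground footprint of a downward-pointing conical beam."""
    return height_m * math.tan(math.radians(flare_angle_deg) / 2.0)


def max_contact_time(height_m, speed_mps, flare_angle_deg):
    """Longest time a ground node directly under the path stays in the beam."""
    return 2.0 * footprint_radius(height_m, flare_angle_deg) / speed_mps
```

With the height of 15 m and speed of 10 m/s used in the simulation and an assumed 60-degree flare angle, `footprint_radius(15, 60)` is about 8.66 m and `max_contact_time(15, 10, 60)` is about 1.73 s.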

Performance metrics
WSNs are particularly used in remote and inaccessible places where limited energy always remains a major challenge. The proposed algorithm targets UWSNs for surveillance and monitoring applications, where maximum data collection with the minimum possible energy is highly desired. The performance metrics considered for our simulation are as follows:
Network lifetime: Interest in energy-efficient communication is attributed to the limitations imposed by battery-powered sensor nodes. The performance of the network is significantly degraded with the death of the sensor nodes. To help extend the sensor nodes' lifetime, a clustering algorithm that balances the rate of energy consumption in CH and CM is proposed. Residual energy-based clustering prevents the early death of the nodes by adjusting the amount of traffic to the nodes. In our simulation study, the network lifetime is measured by the number of alive nodes, as shown in Figure 3.
Number of delivered packets: Data collection efficiency is another essential metric that defines the performance of UWSNs. If the network experiences high packet loss due to the nodes' death, the BS may not make good decisions. This may result in an enormous loss of life and property in some critical applications. In our work, we have increased the number of collected packets by suppressing the early death of the sensor nodes due to energy depletion. The number of delivered packets can therefore also be used as a metric for data collection efficiency.

Simulation results and discussion
In this subsection, the simulation results are summarized and comparatively discussed. In Figure 3, UAV-based data collection from the ground sensor nodes is shown. For simplicity, the height and velocity of the UAV are kept constant. The UAV follows a simple predefined path and collects data from CHs during its flight. In the simulation, we varied the number of rounds and observed the results. Figure 4 represents the total number of alive nodes versus simulation time in rounds. Three different cases of network density are simulated; i.e., 50, 100, and 200 nodes are deployed in the same network area. From Figure 4, we can infer that fewer sensor nodes are dead in the proposed clustering scheme compared to the conventional one [32] for the three different cases of network density. As a result, network lifetime is significantly prolonged with our residual energy-based clustering algorithm. Figure 5 shows the number of packets that are delivered to the UAV versus simulation time in rounds. As in the network lifetime evaluation, the three different network densities (50, 100, and 200 nodes) are simulated for the same network area. As the number of dead nodes in a network decreases, the number of delivered packets (i.e., packet delivery ratio) is significantly improved. In other words, packet loss is markedly reduced. By comparing the performance of the proposed algorithm with that of the conventional one [32], it is easily inferred that the packet delivery ratio is significantly improved in the proposed clustering scheme. In summary, the proposed clustering algorithm outperforms the conventional one in terms of network lifetime and data delivery ratio. This is mainly because the residual energy is effectively exploited in clustering sensor nodes for energy-efficient sensor-to-UAV communication in the proposed clustering algorithm.

DISCUSSION
In UWSNs, the energy consumption is unbalanced due to the variable traffic load in the sensor nodes. As a result, some nodes die earlier, and thus, the network may not function properly. In this paper, this issue is addressed and resolved. That is, we proposed a residual energy-based clustering algorithm for sensor-to-UAV communication in UWSNs, in which cluster size is determined according to the available residual energy level. Our simulation results show that the proposed clustering scheme outperforms the conventional one in terms of network lifetime and data delivery ratio.
In this work, we narrowed our research to CH selection and cluster formation in UWSNs, focusing on surveillance and monitoring applications, but an in-depth theoretical analysis and its validation are not addressed. We will provide the in-depth analysis and its validation in future work. UAV path planning for efficient data collection and other issues such as secure communications and fault-tolerant mechanisms are also left for future work. Moreover, proper synchronization and cooperation