Affiliations: [a] School of Computing, DIT University, Uttarakhand, India | [b] School of Engineering & Computing, Dev Bhoomi Uttarakhand University, Uttarakhand, India
Abstract: Twitter and Facebook are widely recognized as crucial sources of situational information during disasters. However, classifying disaster-related tweets is computationally challenging because textual data is high-dimensional and contains many redundant and irrelevant features. For optimal feature selection (FS) and classification of disaster tweets, this work therefore utilizes the binary salp swarm algorithm (BSSA) and proposes two enhancements to it, yielding PBcSSA. First, the commensalism phase from symbiotic organisms search (SOS) is integrated with BSSA to enhance its ability to search the feature space. Second, the resulting algorithm is parallelized using the Apache Spark framework to reduce execution time. The experiments were performed in a cross-disaster setting on nine groups of datasets: biological, earthquake, flood, hurricane, industrial, societal, transportation, wildfire, and environmental. The proposed PBcSSA is combined with the Naive Bayes (NB) classifier in wrapper mode, and its performance is compared with that of standard BSSA, the binary sine cosine algorithm (BSCA), binary particle swarm optimization (BPSO), binary grey wolf optimization (BGWO), and the binary whale optimization algorithm (BWOA). The experimental results reveal that the proposed PBcSSA outperforms the other algorithms in disaster tweet classification, achieving the highest average F1-score with the smallest feature subset and a reduced execution time.
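To illustrate what evaluating a candidate feature subset "in wrapper mode" involves, the following minimal Python sketch scores a binary feature mask by training a Naive Bayes classifier on the selected features only. It is not the authors' implementation. The Bernoulli NB variant, the weighted fitness (classification error plus a penalty on subset size), the weight alpha = 0.99, and all function names are assumptions made for illustration. The BSSA search, the commensalism phase, and the Spark parallelization are omitted.

```python
import math

def train_nb(X, y):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing (assumed variant)."""
    model = {}
    n_features = len(X[0])
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        log_prior = math.log(len(rows) / len(X))
        # P(feature = 1 | label), smoothed so no probability is exactly 0 or 1
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[label] = (log_prior, probs)
    return model

def predict_nb(model, x):
    """Return the label with the highest log-posterior for binary vector x."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for xj, p in zip(x, probs):
            score += math.log(p if xj else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def wrapper_fitness(mask, X_train, y_train, X_val, y_val, alpha=0.99):
    """Score a binary feature mask; lower is better.

    Assumed fitness: alpha * error + (1 - alpha) * (selected / total features).
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty subset gets the worst possible score

    def project(rows):
        # Keep only the columns chosen by the mask
        return [[r[j] for j in selected] for r in rows]

    model = train_nb(project(X_train), y_train)
    preds = [predict_nb(model, x) for x in project(X_val)]
    error = sum(p != t for p, t in zip(preds, y_val)) / len(y_val)
    return alpha * error + (1 - alpha) * len(selected) / len(mask)

# Toy data: feature 0 determines the label, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
for mask in ([1, 0], [1, 1], [0, 1]):
    print(mask, wrapper_fitness(mask, X, y, X, y))
```

In a swarm-based search, each candidate solution would be one such mask, and the optimizer would keep the masks with the lowest fitness. Because each mask is scored independently of the others, this per-candidate evaluation is the kind of step that lends itself to parallel execution.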