Abu Zitar, Raed
An enhanced binary Rat Swarm Optimizer based on local-best concepts of PSO and collaborative crossover operators for feature selection
2022, Abu Zitar, Raed, Awadallah, Mohammed A., Al-Betar, Mohammed Azmi, Braik, Malik Shehadeh, Hammouri, Abdelaziz I., Abu Doush, Iyad
In this paper, an enhanced binary version of the Rat Swarm Optimizer (RSO) is proposed to deal with Feature Selection (FS) problems. FS is an important data reduction step in data mining that finds the most representative features in the entire dataset. Many swarm intelligence algorithms have been used to tackle FS. However, the door is still open for further investigation since no FS method gives cutting-edge results in all cases. In this paper, a recent swarm intelligence metaheuristic called RSO, which is inspired by the social and hunting behavior of a group of rats, is enhanced and explored for FS problems. The binary enhanced RSO is built on three successive modifications: i) an S-shape transfer function is used to develop binary RSO algorithms; ii) the local search paradigm of particle swarm optimization is used within the iterative loop of RSO to boost its local exploitation; iii) three crossover mechanisms are used and controlled by a switch probability to improve diversity. Based on these enhancements, three versions of RSO are produced, referred to as Binary RSO (BRSO), Binary Enhanced RSO (BERSO), and Binary Enhanced RSO with Crossover operators (BERSOC). To assess the performance of these versions, a benchmark of 24 datasets from various domains is used. The proposed methods are assessed in terms of fitness value, number of selected features, classification accuracy, specificity, sensitivity, and computational time. The best performance is achieved by BERSOC, followed by BERSO and then BRSO. These proposed versions are also compared against 25 well-regarded metaheuristic methods and five filter-based approaches. The obtained results underline their superiority by producing new best results for some datasets.
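As a rough illustration of modification i), the sketch below shows how an S-shaped transfer function can turn a continuous search position into a binary feature mask. The abstract does not state which S-shaped function the authors use; the standard sigmoid is assumed here, and the names `s_shaped` and `binarize` as well as the sample position are hypothetical.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid transfer function (assumed choice): maps a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng=random):
    """Turn a continuous position into a 0/1 feature mask.

    Bit j becomes 1 (feature j selected) with probability s_shaped(position[j]).
    """
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    continuous = [-3.0, -0.5, 0.0, 0.5, 3.0]  # hypothetical position vector
    print(binarize(continuous, rng))
```

Strongly positive components are almost always mapped to 1 and strongly negative ones to 0, while components near zero are selected about half the time, which keeps some randomness in the search.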
Multiclass feature selection with metaheuristic optimization algorithms: a review
2022, Abu Zitar, Raed, Akinola, Olatunji O., Ezugwu, Absalom E., Agushaka, Jeffrey O., Abualigah, Laith
Selecting relevant feature subsets is vital in machine learning, and multiclass feature selection is harder to perform since most classifications are binary. The feature selection problem aims at reducing the dimension of the feature set while maintaining the model's accuracy. Datasets can be classified using various methods. Nevertheless, metaheuristic algorithms attract substantial attention for solving different optimization problems. For this reason, this paper presents a systematic survey of the literature on solving multiclass feature selection problems with metaheuristic algorithms, which can help classifiers select optimal or near-optimal features faster and more accurately. Metaheuristic algorithms are presented in four primary behavior-based categories, i.e., evolutionary-based, swarm-intelligence-based, physics-based, and human-based, even though some works in the literature propose further categories. Further, lists of metaheuristic algorithms are given for each of these categories. Only articles published from 2000 to 2022 on metaheuristic algorithms applied to multiclass feature selection problems were reviewed, with respect to their categories and detailed descriptions. We considered some application areas for some of the metaheuristic algorithms applied to multiclass feature selection, along with their variations. Popular multiclass classifiers for feature selection were also examined. Moreover, we presented the challenges of metaheuristic algorithms for feature selection and identified gaps for further research.
Arabic Text Classification Using Modified Artificial Bee Colony Algorithm for Sentiment Analysis: The Case of Jordanian Dialect
2023, Abu Zitar, Raed, Habeeb, Abdallah, Otair, Mohammed A, Abualigah, Laith, Alsoud, Anas Ratib, Elminaam, Diaa Salama Abd, Ezugwu, Absalom E, Jia, Heming
Arab customers post comments and opinions daily, and the volume of online reviews of companies' products and services is growing dramatically, in both Arabic and its dialects. Such text expresses the user's satisfaction or dissatisfaction, and the evaluation has either negative or positive polarity. This paper addresses the Arabic text sentiment analysis problem for the case of the Jordanian dialect. Its main purpose is to classify text into two classes, negative or positive, which may help a business maintain a report about a service or product. The first phase uses natural language processing tools (stemming, stop-word removal, and tokenization) to filter the text. The second phase modifies the Artificial Bee Colony (ABC) algorithm with the Upper Confidence Bound (UCB) algorithm to promote its exploitation ability and obtain a minimal set of optimal features, then applies a forward feature selection strategy with four machine learning classifiers: K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Naïve Bayes (NB), and Polynomial Neural Networks (PNN). The proposed model has been applied to a Jordanian dialect database, which contains comments from customers of a Jordanian telecom company. Based on the sentiment analysis results, suggestions can be made about which products or services to discontinue or upgrade. Moreover, the proposed model is applied to an Algerian dialect database, which contains long Arabic texts, in order to assess its efficiency on both short and long texts. Four performance evaluation criteria were used: precision, recall, F1-score, and accuracy. The experimental results show that the proposed model achieves high accuracy, up to 99% on the Jordanian dialect and 82% on the Algerian dialect, which provides a basis for future work on the classification of Arabic dialects.
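The four evaluation criteria named in this abstract have standard definitions for a two-class (positive/negative) task, which the following minimal sketch computes from confusion-matrix counts. The function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the paper's results.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, F1-score, and accuracy from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

Reporting all four together matters for sentiment data, where one polarity often dominates and accuracy alone can look high even when the minority class is poorly recognized.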
Comparative Study on Arabic Text Classification: Challenges and Opportunities
2023, Abu Zitar, Raed, Abualigah, Laith, Oliva, Diego, Hussien, Abdelazim G., Melhem, Mohammed K. Bani
Web technology has improved greatly over the past years, loading the Internet with digital content from many fields. This has made it difficult for researchers to find text classification algorithms that fit a specific language or set of languages. Text classification, or categorization, is the practice of assigning a given text document to one or more predefined labels or categories; it aims to obtain valuable information from unstructured text documents. This paper presents a comparative study of selected published papers that focus on improving Arabic text classification. It highlights the proposed models and the classifiers used, discusses the challenges faced in this type of research, and proposes research opportunities in the field of text classification. Based on the reviewed studies, SVM and Naive Bayes were the most widely used classifiers for Arabic text classification, while more effort is needed to develop and implement flexible Arabic text classification methods and classifiers.