Statistical Comparison of Simple and Machine Learning Based Land Use and Land Cover Classification Algorithms: A Case Study

Abstract
This study used three classification models, namely Support Vector Machine (SVM), Random Forest (RF), and Maximum Likelihood (ML), to classify Landsat (5 & 8) and Sentinel-2A data sets. In each case, the area of interest (AOI), fixed to the Chennai district boundary, and the number of training sets were kept equal. Land use class change driven by rapid urbanization and associated developmental activities was observed, and land use/land cover (LULC) was monitored using the ArcGIS Pro platform for 2005, 2010, 2015, and 2020. The overall accuracy (OA) for 2005, 2010, 2015, and 2020 was 89%, 88%, 82%, and 80% under RF, and 87%, 85%, 79%, and 80% under SVM, whereas the ML classifier gave an OA of 82%, 77%, 76%, and 66%, respectively. The Kappa coefficient (K) for the same years was 84%, 79%, 75%, and 72% under RF, and 80%, 78%, 71%, and 67% under SVM, whereas the ML classifier gave K values of 77%, 67%, 67%, and 57%. Based on the quantitative assessments, the RF classifier achieved the highest accuracy, followed by SVM and ML, when classifying a fixed AOI with fixed training sets.
1 Introduction
Land use/land cover (LULC) classification assigns pixels, manually or automatically, to particular land cover classes. If the classes are known in advance, the classification is supervised; otherwise it is unsupervised (Anderson 1976; Rajendran et al. 2020). LULC plays a vital role in the planning, development, and management of local or regional land areas (Singh et al. 2015). Earth observation data are freely available and widely used for LULC change analysis. Many techniques exist for LULC classification with earth observation data sets, and several methods or algorithms have been developed to generate accurate LULC databases from Landsat and Sentinel-2 imagery with improved classification accuracy. In the present research work, different classification models were applied to Landsat (5 & 8) and Sentinel-2 (A & B) data for LULC database generation. Owing to the high frequency and free availability of remotely sensed data from Sentinel-2A and Landsat (5 & 8), different LULC classifiers can be used to understand natural problems at local, national, continental, and global levels, such as floods, droughts, urbanization, agricultural expansion or change, and many others (Basommi et al. 2016; Kumar et al. 2018; Kushwaha et al. 2021). Among the available classification algorithms, Maximum Likelihood Classification (MLC) is a commonly used classifier because of the results it produces (Yu et al. 2014; Singh et al. 2017; Yinga et al. 2022).
The MLC model, a parametric classifier, assumes that the data follow a normal distribution, whereas real-world data often do not. Machine learning based models are now widely used for classifying satellite data in LULC mapping because of their higher precision relative to the MLC model. Unlike parametric classifiers, non-parametric classifiers make no assumption about the distribution of the data. Non-parametric, machine learning based classifiers such as Random Forest (RF) and Support Vector Machine (SVM) provide accurate LULC information from earth observation data (Foody 2009; Lamine et al. 2018). The RF method is an ensemble technique built from multiple CART-like trees, each of which acts as an independent classifier. Because the ensemble combines the outputs of these individual trees, RF results tend to be more accurate for LULC classification. Similarly, SVM is another machine learning based classifier that provides good LULC classification accuracy because it builds a hyperplane that separates the classes with few misclassified (or mixed) pixels during the training process (Singh et al. 2014; Meshram et al. 2020; Mahanta and Rawat 2020a; Maitima et al. 2009; Ndegwa Mundia and Murayama 2009). When building the optimal hyperplane, SVM relies on the kernel trick, and the choice of kernel affects the accuracy of LULC classification, which is a major drawback of SVM. Despite this, SVM remains a popular classifier for LULC classification of remote sensing data. This research focuses on identifying the most suitable classifier among the RF, SVM, and MLC models for regional level LULC information in terms of accuracy and sampling designs for training data.
Land use and land cover (LULC) classification is a fundamental task in remote sensing and geographical information systems (GIS) that plays a crucial role in various applications such as urban planning, environmental monitoring, natural resource management, and disaster assessment. Accurate classification of land use and land cover categories is essential for understanding the dynamics of landscapes and their changes over time. Traditionally, simple classification algorithms such as maximum likelihood, spectral angle mapper, and decision trees have been widely used for this purpose. However, with the advent of machine learning techniques, there has been a growing interest in exploring their potential for improving the accuracy and efficiency of LULC classification. The rapid advancements in machine learning algorithms, particularly deep learning techniques, have led to significant improvements in various fields, including computer vision, natural language processing, and remote sensing. In recent years, machine learning-based approaches have gained popularity in LULC classification tasks due to their ability to automatically learn complex patterns and features from large datasets, thereby potentially outperforming traditional methods. Nevertheless, the question remains whether these advanced machine learning techniques truly provide superior performance compared to conventional approaches in the context of LULC classification.
This research paper aims to conduct a comprehensive statistical comparison between simple and machine learning-based LULC classification algorithms using a case study approach. The primary objective is to evaluate and compare the performance of these algorithms in terms of accuracy, efficiency, robustness, and scalability. By rigorously analyzing the strengths and limitations of both types of algorithms, this study seeks to provide insights into their relative effectiveness and identify the most suitable approaches for different types of LULC classification tasks.
The remainder of this paper is structured as follows: Section 2 provides an overview of the importance of LULC classification and its applications. Section 3 reviews the existing literature on simple and machine learning-based classification algorithms in the context of remote sensing and GIS. Section 4 outlines the research objectives and the methodology adopted for conducting the comparative analysis. Finally, Section 5 discusses the significance of this study and the potential implications of its findings for various stakeholders, including researchers, practitioners, and policymakers.
Land use and land cover classification is a critical component of environmental monitoring and management, providing valuable information about the spatial distribution and dynamics of different land cover types. Understanding LULC patterns is essential for addressing a wide range of societal and environmental challenges, including urbanization, deforestation, agricultural expansion, habitat loss, and climate change. By accurately delineating and characterizing land cover categories, researchers and policymakers can make informed decisions to support sustainable development and conservation efforts. One of the key applications of LULC classification is urban planning and management. Rapid urbanization and population growth have led to the expansion of cities and towns, resulting in changes in land use patterns and the conversion of natural landscapes into built-up areas. Accurate mapping of urban areas and infrastructure can help urban planners identify suitable locations for new developments, assess the impact of urbanization on the environment, and optimize resource allocation for infrastructure projects.
In addition to urban planning, LULC classification plays a crucial role in natural resource management and conservation. By monitoring changes in land cover over time, conservationists can identify areas at risk of deforestation, degradation, or habitat loss, and implement measures to protect biodiversity hotspots and endangered species. Moreover, LULC data can inform land-use policies and land-use planning initiatives aimed at preserving ecologically sensitive areas and promoting sustainable land management practices. Furthermore, LULC classification is essential for assessing the impact of climate change on terrestrial ecosystems.
Changes in land cover, such as the loss of forest cover or the expansion of agricultural land, can alter the local climate and hydrological cycle, leading to adverse effects on ecosystem services such as water regulation, carbon sequestration, and soil erosion control. By monitoring LULC changes using remote sensing data, scientists can better understand the drivers of land-use change and develop strategies to mitigate its negative consequences on ecosystem health and resilience. Overall, accurate and up-to-date information on land use and land cover is indispensable for addressing various environmental, social, and economic challenges facing the planet. By leveraging advances in remote sensing technology and computational methods, researchers can improve the accuracy and efficiency of LULC classification techniques, thereby enhancing their ability to monitor and manage terrestrial ecosystems effectively. The classification of land use and land cover from remotely sensed imagery has been the subject of extensive research in the fields of remote sensing, GIS, and environmental science. Over the years, a wide range of classification algorithms have been developed, ranging from traditional statistical methods to more advanced machine learning techniques.
In this section, we provide an overview of the existing literature on simple and machine learning-based classification algorithms for LULC mapping, highlighting their strengths, weaknesses, and applications. Traditional classification algorithms such as maximum likelihood, minimum distance, and decision trees have been widely used for LULC mapping due to their simplicity, interpretability, and computational efficiency. Maximum likelihood classification, for example, is based on statistical probability theory and assumes that pixel values are normally distributed within each class. Despite its simplicity, maximum likelihood classification can achieve reasonably good results when the spectral signatures of land cover classes are well-separated and distinct. However, it may perform poorly in cases where classes overlap or exhibit complex spectral variations, leading to misclassification errors. Decision tree classifiers, on the other hand, partition the feature space into hierarchical decision rules based on the spectral characteristics of training samples. Decision trees are easy to interpret and visualize, making them suitable for generating classification rules that can be easily understood by non-experts. However, decision tree classifiers are prone to overfitting, especially when applied to high-dimensional feature spaces or noisy datasets. Moreover, decision trees may struggle to capture complex relationships between spectral bands and land cover classes, limiting their accuracy and generalization ability.
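The maximum likelihood decision rule described above can be illustrated with a minimal Python sketch. It is not the ArcGIS Pro implementation used in this study: it assumes equal class priors and, as a simplification, treats the bands as independent (per-band variances rather than a full covariance matrix). The two-band reflectance values and all function names are hypothetical.

```python
import math


def fit_class_stats(training):
    """Estimate per-band mean and sample variance for each class.

    training maps a class name to a list of pixel vectors (one value per band).
    """
    stats = {}
    for label, pixels in training.items():
        n = len(pixels)
        bands = list(zip(*pixels))
        means = [sum(b) / n for b in bands]
        variances = [
            sum((v - m) ** 2 for v in b) / (n - 1) for b, m in zip(bands, means)
        ]
        stats[label] = (means, variances)
    return stats


def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood of a pixel, assuming independent bands."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        for x, m, var in zip(pixel, means, variances)
    )


def classify(pixel, stats):
    """Assign the class whose Gaussian model gives the highest likelihood."""
    return max(stats, key=lambda label: log_likelihood(pixel, *stats[label]))


# Hypothetical two-band reflectance samples, not taken from the study data.
training = {
    "water": [(0.05, 0.02), (0.06, 0.03), (0.04, 0.02), (0.05, 0.04)],
    "vegetation": [(0.04, 0.45), (0.05, 0.50), (0.06, 0.40), (0.05, 0.48)],
}
stats = fit_class_stats(training)
label = classify((0.05, 0.44), stats)
print(label)
```

When class signatures overlap, the likelihoods of competing classes become similar and the rule is more likely to misclassify, which is the weakness noted above.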
In recent years, machine learning-based classification algorithms have emerged as promising alternatives to traditional methods for LULC mapping tasks. Machine learning techniques such as support vector machines (SVM), random forests, and convolutional neural networks (CNNs) have demonstrated superior performance in various remote sensing applications, including object detection, image segmentation, and land cover classification. SVM classifiers, for example, are based on the concept of finding an optimal hyperplane that separates different classes in the feature space with the maximum margin. SVMs are effective for handling non-linear decision boundaries and can generalize well to unseen data, making them suitable for LULC classification tasks with complex class distributions.
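One common way for an SVM to handle non-linear decision boundaries is to replace the dot product between pixel vectors with a kernel function. The sketch below shows the radial basis function (RBF) kernel as an example; the choice of RBF, the `gamma` values, and the sample vectors are assumptions for illustration, since the kernel used in this study is not stated here.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum similarity of 1.0.
same = rbf_kernel((0.2, 0.4), (0.2, 0.4))

# Distant vectors give a similarity close to 0; gamma controls how fast it decays.
far = rbf_kernel((0.0, 0.0), (3.0, 4.0), gamma=0.1)
print(same, far)
```

A larger `gamma` makes the similarity fall off faster with distance, which tends to produce more flexible decision boundaries at a higher risk of overfitting.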
Random forest classifiers, which are ensembles of decision trees, have gained popularity in remote sensing due to their robustness, scalability, and ability to handle high-dimensional feature spaces. Random forests combine multiple decision trees trained on bootstrapped samples of the training data, thereby reducing the risk of overfitting and improving classification accuracy. By aggregating the predictions of individual trees, random forests can achieve robust performance across a wide range of LULC mapping scenarios, including those involving imbalanced or noisy datasets.
Deep learning techniques, particularly CNNs, have revolutionized the field of image analysis and computer vision, leading to significant improvements in object recognition, scene understanding, and semantic segmentation tasks. CNNs are hierarchical architectures composed of multiple layers of convolutional, pooling, and fully connected units that learn hierarchical representations of input data. CNNs have been successfully applied to LULC classification tasks, leveraging their ability to automatically extract spatial and spectral features from remote sensing imagery. By learning feature representations directly from raw pixel values, CNNs can capture complex spatial patterns and contextual relationships between neighboring pixels, thereby achieving state-of-the-art classification results.
The transformation brought about by urbanization and industrialization over the past two centuries has altered land use and land cover (LULC), contributing to the deterioration of sustainable conditions for the future. Notably, this urbanization does not follow a linear trend (Bose and Chowdhury 2020). LULC time-series classification, which tracks how land is used and covered over time, is an important but challenging task in remote sensing. It uses a large number of labeled multi-temporal images to train a model that predicts the land use or land cover class of unlabeled images (Manzanarez et al. 2022).
As Earth Observation (EO) technology improves, more images of the Earth's surface become available. This has led to the development and use of different methods for detecting changes over time once the images have been classified (Minale et al. 2013). By the year 2030, small- and medium-sized cities in India are projected to become more significant. This is expected to improve their infrastructure and economic conditions, enabling these cities to cater to larger geographical areas. Satellite data with high and medium spatial resolution, featuring multispectral and multi-temporal capabilities, has become an essential tool for evaluating factors such as vegetation cover, forest degradation, and urban expansion (Rawat et al. 2015). Remote sensing and GIS technology provide a framework for examining changes in landscapes across the Earth's surface. Traditional approaches to mapping rely on existing records, on-site surveys, and maps, making them both time-consuming and costly (Thakkar and Chaudhari 2021). Furthermore, maps generated through conventional methods quickly become obsolete in rapidly evolving environments. In contrast, remotely sensed data provide valuable information in a manner that is both efficient and cost-effective (Soergal et al. 2012). The examination of LULC changes in major urban areas benefits significantly from high-resolution satellite images or aerial photographs.
Precise information regarding LULC changes is crucial for comprehending the primary causes and environmental implications associated with such changes. Furthermore, an analysis of the driving forces behind LULC changes is imperative for gaining insights into ongoing transformations and predicting future modifications (Poyatos et al. 2003). The examination of the spatial and temporal aspects of LULC changes, along with their driving factors, served as the foundation for ensuring the sustainability of natural resource systems by reflecting the watershed's current state.
Despite growing concerns about the impacts of LULC changes on global environmental shifts and sustainable development, research on LULC change in the Talcher Region remains limited. The world's ecosystems have been significantly affected by human activity, mostly through changes in LULC dynamics. In recent decades, LULC modeling has become a focus area, addressing challenges arising from the alteration and transformation of LULC. Accurate land cover assessment and tracking require large amounts of habitat and Earth surface data. LULC changes are frequently triggered by anthropogenic influences, which are then followed by natural processes (Naikoo et al. 2020).
Urbanization, a prominent global trend that has gained momentum since the 20th century, stands out as a main driver reshaping landscape structure and functions. In turn, land use/land cover change (LULCC) is a pivotal variable influencing resource planning, control measures, and contamination of vital resources such as soil and water (Shahfahad et al. 2022). Researchers have identified diverse landscape pattern scenarios, encompassing factors such as climate change, land use alteration, and the establishment of new road networks, as contributors to urbanization. Remote Sensing (RS) is a crucial tool, providing synoptic and continuous data that are used extensively in LULCC studies. Satellite datasets such as Landsat, IRS, and IKONOS provide invaluable inputs for predictive research. Geospatial methods are widely used for the management and planning of natural resources (Kafy et al. 2021). Cellular Automata (CA), a prominent method based on a uniform grid, has been used in urban growth simulations to emulate spatial phenomena. The main advantage of CA is its ability to represent complex systems through compact sets of rules and states, which benefits urbanization research. Notably, CA operates on hypothetical scenarios, which makes it well suited to planning activities. The combination of CA with a Markov chain model (CAMCM) provides a robust technique for modeling the spatial and temporal dynamics of LULCC.
In this study, the methodology was applied to (i) analyze the temporal and spatial changes in the study region over the 2000–2010–2015 intervals, and (ii) simulate and predict land use changes for the year 2025. The findings are of significant value to urban planners, resource managers, and policymakers, with potential applicability in diverse geographical contexts. The evaluation of historical LULC patterns and the projection of LULC for 2025 with the Cellular Automata-Markov Chain (CA-Markov) model support the adoption of efficient and sustainable practices in natural resource management, spatial land governance, decision-making frameworks, and policy formulation (Shafizadeh-Moghadam et al. 2017; Singh et al. 2015). Consequently, this study focused on evaluating LULC change trends, comprehensively assessing and analyzing LULC change dynamics, and predicting the future trajectory of LULC. The predictive outcomes were tested and validated using the traditional Kappa coefficient for location statistics.
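The Markov chain part of such a projection can be illustrated with a short Python sketch. It covers only the aggregate step, in which class area shares are multiplied by a transition probability matrix, and omits the cellular automata step that allocates the change spatially. The class shares, transition probabilities, and function name are hypothetical and are not results of this study.

```python
def project_shares(shares, transition, steps=1):
    """Advance class area shares through `steps` Markov transitions."""
    classes = list(shares)
    current = dict(shares)
    for _ in range(steps):
        # New share of each destination class = sum over source classes of
        # (current share of source) * P(source -> destination).
        current = {
            dst: sum(current[src] * transition[src][dst] for src in classes)
            for dst in classes
        }
    return current


# Hypothetical shares and transition probabilities, for illustration only.
shares_start = {"urban": 0.50, "vegetation": 0.30, "water": 0.20}
transition = {
    "urban": {"urban": 1.00, "vegetation": 0.00, "water": 0.00},
    "vegetation": {"urban": 0.20, "vegetation": 0.80, "water": 0.00},
    "water": {"urban": 0.05, "vegetation": 0.00, "water": 0.95},
}
shares_next = project_shares(shares_start, transition)
print(shares_next)
```

Each row of the transition matrix sums to one, so the projected shares also sum to one. In practice the matrix would be estimated by cross-tabulating two classified maps from different dates.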
LULC change analysis is a crucial component of environmental management and planning, offering valuable perspectives into the dynamics and consequences of human activities on the environment. It enables us to comprehend the alterations in LULC, which describes the human-induced transformations occurring on the land, encompassing activities like agriculture, urbanization, and forestry, as well as the characteristics of the terrain, including vegetation, bodies of water, and exposed soil. The examination of LULC change is instrumental in understanding the underlying causes driving change, including population growth, economic development, and environmental degradation. Changes in LULC have the potential to diminish the accessibility of vital resources such as water, food, and energy, while simultaneously amplifying the susceptibility of communities to environmental hazards like floods, landslides, and droughts. Monitoring of LULC changes has evolved into a pivotal and indispensable element within modern approaches geared towards monitoring shifts in the environment, and efficiently overseeing the stewardship of natural resources (Rawat et al. 2013; Kumar et al. 2018). Understanding the patterns of LULC change is crucial for efficient environmental management, encompassing the implementation of effective water management practices (Twisa and Buchroithner 2019). The extensive changes to the Earth's land surface resulting from escalating anthropogenic activities across the biosphere are diminishing the effectiveness of global systems (Lambin et al. 2001).
LULC is of paramount importance in the strategic management and monitoring of natural resources, as it responds to growing human demands within the current ecosystem (Vivekananda et al. 2021). Romshoo et al. (2018) investigated the spatio-temporal variability of land surface temperature and lapse rate across the Kashmir region. Urban expansion has led to significant LULC changes, resulting in a dramatic transformation of the landscape over time, as revealed by multi-temporal Landsat satellite data. Various GIS techniques were used to analyze LULC changes in Uttarakhand, a state in northern India known for its varied topography (Rawat et al. 2015). Another study used remote sensing and modeling techniques to evaluate LULC alterations across arid land in Europe (Halmy et al. 2015). Combining S-1 and S-2 data yields highly accurate image classification results for LULC mapping (Zen El-Dein et al. 2023). Gaur and Singh (2023) used MLC to detect LULC patterns. LULC information can provide valuable insights into how human activities, including agriculture, urbanization, and deforestation, affect the environment and the region's natural resources (Liping et al. 2018). Evaluating and predicting LULC changes through spatio-temporal data analysis plays a pivotal role in safeguarding the environment and optimizing land use planning and management (Lukas et al. 2023). Expansion of horticulture leads to a rise in black carbon emissions because of increased biomass, ultimately leading to elevated warming across a broad area.
The potential factors contributing to the shrinking of the Kolahoi glacier may include a decline in snow depth attributed to elevated black carbon levels, rising temperatures, and diminished precipitation (Rafiq et al. 2016). Barakat et al. (2021) found that population growth and economic development resulted in urbanization, which in turn had a notable effect on LULC, leading to a reduction in agricultural land and, consequently, a decrease in soil organic carbon (SOC) sequestration. Assessing river basins requires the detection of LULC changes to evaluate hydrological and ecological conditions, thereby enabling the sustainable use of their resources. A spectral indices method was used to generate the LULC change detection map (Sahu et al. 2024). Alterations in LULC have a substantial impact on the spatial distribution of precipitation within a region experiencing climatic shifts, and this influence is further compounded by evapotranspiration (ET).
2 Study area and data
The Chennai district of Tamil Nadu (Figure 1) was selected as the study area because of good prior knowledge of the area (Saravanan et al. 2018a; Rawat et al. 2018; Saravanan et al. 2018b). The average elevation of the study area is 6.7 m, and the average annual temperature is 27.9 °C.
Figure 1 Study area location maps.
In the present research, Landsat (5 & 8) and Sentinel-2A/B data from different years were downloaded from the USGS web page (Table 1). Training data sets for classification were generated with the help of Google Earth.
Table 1 General description of used data sets.
Name of Sensor | Date of Acquisition | Cloud Coverage % | Product Level |
Landsat 5 | 11/05/2005 | 0.00 | C1 |
Landsat 5 | 10/05/2010 | 0.00 | C1 |
Landsat 8 | 14/10/2015 | 0.66 | C1 |
Sentinel-2 | 02/04/2020 | 0.00 | C1 |
3 Materials and methods
This research work comprises two steps: the first is the classification of LULC using three different schemes (ML, SVM, and RF), and the second is the accuracy assessment of the classified LULC images using Kappa coefficient analysis. For each year (2005, 2010, 2015, and 2020), the number of training sets is the same for each classification scheme, and all parameters under each scheme were kept the same across the four years. The three classification schemes were applied to Landsat-5 (2005), Landsat-5 (2010), Landsat-8 (2015), and Sentinel-2 (2020) data.
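The two accuracy measures used in the second step can be computed from a confusion matrix, as in the following minimal Python sketch. Overall accuracy is the share of correctly classified samples, and Cohen's Kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the row and column totals. The 3-class matrix and the function names are hypothetical and are not the study's results.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal totals of each class.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical confusion matrix (rows: classified, columns: reference).
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]
oa = overall_accuracy(example)
kappa = cohen_kappa(example)
print(f"OA = {oa:.3f}, Kappa = {kappa:.3f}")
```

For this matrix the script prints OA = 0.847 and Kappa = 0.770. Kappa is lower than OA because it discounts the agreement that would be expected by chance.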
The objectives of the present research were to (i) map LULC changes within six classes (water body, grass land, natural vegetation, barren land, agriculture, urban area) at five-year intervals from 2005 to 2020; and (ii) assess the accuracy of the LULC maps generated by the SVM and RF (supervised, machine learning based) and ML (simple supervised) algorithms. The models were run on the ArcGIS Pro 2.5 (Student Version) platform. A flow chart of the adopted methodology is shown in Figure 2.
Figure 2 Schematic of workflow for LULC and accuracy assessment.
3.1 Random Forest classification
Random Forest (RF) is a versatile and powerful machine learning algorithm that has gained popularity for its effectiveness in classification tasks across various domains. Introduced by Leo Breiman and Adele Cutler in 2001, Random Forest has emerged as a robust ensemble learning method capable of handling complex data structures, high-dimensional feature spaces, and non-linear relationships. This section outlines the principles, methodology, advantages, limitations, and applications of Random Forest classification. At its core, Random Forest is an ensemble learning technique that combines the predictions of multiple individual decision trees to improve predictive accuracy and robustness. Each tree in the ensemble is trained independently on a random subset of the training data and a random subset of features. This ensemble approach helps mitigate the risk of overfitting and improves generalization performance.
During the training phase, Random Forest randomly selects a subset of the training data with replacement, creating multiple bootstrapped datasets of equal size to the original training set. This process, known as bootstrap sampling, introduces diversity among the individual decision trees in the ensemble. At each node of the decision tree, Random Forest randomly selects a subset of features from the total feature set. This random feature selection helps de-correlate the trees and reduces the risk of overfitting by ensuring that different trees are trained on different subsets of features. During the prediction phase, each decision tree in the ensemble independently assigns a class label to the input sample. The final prediction is then determined by aggregating the individual predictions through a majority voting scheme, where the class label with the most votes across all decision trees is selected as the final prediction.
For each bootstrapped dataset, Random Forest constructs a decision tree using a recursive partitioning algorithm such as CART (Classification and Regression Trees). Each tree is typically grown to its maximum depth without pruning, so individual trees may overfit. For classification tasks, the final prediction is the class label with the most votes across all decision trees; for regression tasks, it is the average of the individual tree predictions. The ensemble nature of the algorithm makes Random Forest relatively robust to overfitting: aggregating multiple decision trees reduces the variance associated with individual trees, leading to improved generalization performance. Random Forest also provides a measure of feature importance, indicating the contribution of each feature to the overall predictive performance. This information can help identify the most relevant features for classification tasks and guide feature selection or dimensionality reduction efforts.
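The bootstrap sampling and majority voting steps can be sketched in a few lines of Python. This is a simplified illustration, not a full Random Forest: tree construction and random feature selection are omitted, and the training samples, vote list, and function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement (one bootstrapped training set)."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(labels):
    """Return the most frequent label among the individual tree predictions."""
    return Counter(labels).most_common(1)[0][0]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical training samples: (pixel id, class label).
training = [("pixel_%d" % i, "urban" if i % 2 else "water") for i in range(10)]
bag = bootstrap_sample(training, rng)

# Hypothetical votes from five trees for a single pixel.
votes = ["urban", "water", "urban", "urban", "barren land"]
winner = majority_vote(votes)
print(len(bag), winner)
```

Because sampling is done with replacement, each bootstrapped set has the same size as the original but usually repeats some samples and leaves others out, which is what gives the trees their diversity.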
Some Random Forest implementations can handle missing data without requiring imputation, for example by ignoring missing values during training and predicting from the available features, which simplifies the data preprocessing pipeline; other implementations require missing values to be filled in beforehand. Random Forest can capture complex non-linear relationships between input features and class labels. By combining multiple decision trees, Random Forest can learn intricate decision boundaries that may not be easily modeled by linear classifiers, making it suitable for a wide range of classification tasks. Random Forest can efficiently handle large datasets with high-dimensional feature spaces. Because the trees are trained independently, the algorithm can be parallelized, making it suitable for scalable implementations on multicore processors and distributed computing frameworks.
Random Forest classification has been successfully applied to a wide range of domains and applications. Random Forest is widely used for land cover classification, crop mapping, deforestation detection, and land use change analysis using satellite imagery and aerial photographs. The ability of Random Forest to handle high-dimensional remote sensing data makes it a popular choice for analyzing Earth observation data. Random Forest is employed for gene expression analysis, protein structure prediction, and disease diagnosis based on genetic data. The ability of Random Forest to handle noisy and high-dimensional biological data makes it well-suited for analyzing genomic and proteomic datasets.
Random Forest is utilized for credit risk assessment, fraud detection, stock price prediction, and customer segmentation in banking and financial services. The ability of Random Forest to handle large and complex financial datasets makes it a valuable tool for risk management and investment decision making. Random Forest can also be used for disease diagnosis, patient outcome prediction, medical image analysis, and drug discovery in healthcare and biomedicine. The ability of Random Forest to handle heterogeneous healthcare data, including electronic health records, medical images, and genomic data, makes it a versatile tool for improving healthcare outcomes. Random Forest is employed for customer segmentation, churn prediction, market basket analysis, and recommendation systems in marketing and retail industries. The ability of Random Forest to handle large and diverse marketing datasets makes it a valuable tool for improving customer engagement and optimizing marketing strategies.
While Random Forest provides accurate predictions, the ensemble nature of the algorithm can make it less interpretable than simpler models such as single decision trees. The presence of multiple decision trees in the ensemble makes it challenging to understand the underlying decision-making process. Training a Random Forest model can be computationally intensive, especially for large datasets with numerous features. The need to grow multiple decision trees and perform feature selection at each node can lead to increased computational overhead, requiring efficient implementation strategies and computational resources. Random Forest has several hyperparameters that need to be tuned, such as the number of trees, the maximum depth of each tree, and the number of features to consider at each split. Careful tuning of these hyperparameters is essential to optimize model performance and prevent overfitting. Random Forest may struggle with imbalanced datasets, where one class is significantly more prevalent than others. Techniques such as class weighting or resampling can help address this issue and improve the predictive performance of Random Forest.
3.2 Support Vector Machine
Support Vector Machine (SVM) is a versatile and powerful supervised machine learning algorithm that has gained significant popularity for its effectiveness in classification and regression tasks across various domains. Introduced by Vladimir Vapnik and his colleagues in the 1990s, SVM has emerged as a fundamental tool in the field of machine learning, offering robust solutions for complex data analysis problems. In this comprehensive overview, we delve into the principles, methodology, advantages, limitations, and applications of Support Vector Machine, providing insights into its inner workings and practical considerations.
Support Vector Machine operates on the principle of finding the optimal hyperplane that separates different classes in the feature space with the maximum margin. The key idea behind SVM is to transform the input data into a higher-dimensional space using a kernel function, where the classes become linearly separable. The optimal hyperplane is then identified as the one that maximizes the margin of separation between classes while minimizing classification errors. This margin maximization approach not only leads to better generalization performance but also enhances the algorithm's robustness to noisy data and overfitting.
The methodology of the Support Vector Machine can be described in several key steps. First, the input data is implicitly mapped into a higher-dimensional space using a kernel function, such as a polynomial, radial basis function (RBF), or sigmoid kernel (a linear kernel leaves the feature space unchanged). This transformation enables the creation of linear decision boundaries in the higher-dimensional space, even for non-linearly separable data. Next, SVM identifies the optimal hyperplane that separates the classes by maximizing the margin, which is defined by the support vectors, the data points closest to the decision boundary. The optimization objective of SVM is to find the hyperplane that not only separates the classes but also maximizes the margin while minimizing classification errors. This optimization problem is typically formulated as a convex quadratic programming problem, which can be solved efficiently with quadratic programming solvers or specialized methods such as sequential minimal optimization (SMO).
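As a hedged illustration of the kernel-based decision rule described above, the sketch below evaluates an RBF kernel and the resulting SVM decision function for a single two-band pixel. The support vectors, coefficients, bias, and `gamma` are hand-picked, hypothetical values, not parameters fitted in this study.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel SVM decision function: sum_i coef_i * K(sv_i, x) + bias.

    dual_coefs[i] stands for alpha_i * y_i, the signed weight of
    support vector i.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical, hand-picked values (not fitted to any data in this study).
support_vectors = [(0.2, 0.2), (0.8, 0.8)]
dual_coefs = [-1.0, 1.0]  # negative class near (0.2, 0.2), positive near (0.8, 0.8)
bias = 0.0
gamma = 2.0

score = decision_value((0.7, 0.9), support_vectors, dual_coefs, bias, gamma)
predicted_class = 1 if score >= 0 else -1  # the sign of the score gives the class
```

The kernel is only ever evaluated between pairs of points, so the higher-dimensional mapping never has to be computed explicitly.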
One of the key advantages of Support Vector Machine is its ability to handle high-dimensional data and non-linear decision boundaries effectively. By using kernel functions to transform the input data into a higher-dimensional space, SVM can capture complex relationships between features and class labels, leading to improved classification performance. Moreover, SVM is robust to overfitting, thanks to its margin maximization approach, which focuses on finding the hyperplane that maximizes the margin of separation between classes. This property makes SVM particularly well-suited for tasks with small training datasets or high-dimensional feature spaces, where overfitting is a common concern. Another advantage of Support Vector Machine is its versatility and flexibility in handling different types of data and problem settings. SVM can be applied to both binary and multi-class classification problems, as well as regression tasks. Furthermore, SVM supports various kernel functions, allowing users to choose the most appropriate kernel for their specific data and problem requirements.
Whether the data is linearly separable or non-linearly separable, SVM can adapt to different scenarios and provide accurate predictions. SVM is widely used for image classification tasks, such as object recognition, facial recognition, and scene understanding. By leveraging its ability to handle high-dimensional data and non-linear decision boundaries, SVM can accurately classify images into different categories based on their visual features. SVM is employed for text categorization and document classification tasks, such as spam email detection, sentiment analysis, and topic modeling. By transforming text data into a higher-dimensional space using techniques like term frequency-inverse document frequency (TF-IDF) or word embeddings, SVM can effectively classify text documents into predefined categories. SVM is utilized for various bioinformatics applications, including protein classification, gene expression analysis, and disease diagnosis. By analyzing biological data such as DNA sequences, protein structures, and gene expression profiles, SVM can assist researchers in identifying patterns and making predictions related to molecular biology and genetics. SVM is applied in finance for tasks such as stock price prediction, credit risk assessment, and fraud detection. By analyzing financial data such as historical stock prices, credit card transactions, and market indicators, SVM can help investors make informed decisions and financial institutions mitigate risks. SVM is used in remote sensing for land cover classification, crop mapping, and environmental monitoring. By analyzing satellite imagery and aerial photographs, SVM can accurately classify different land cover types and detect changes in the Earth's surface over time.
Despite its many strengths, Support Vector Machine also has some limitations and considerations. One limitation is its computational complexity, especially for large-scale datasets with numerous features. The training time of SVM can be significant, particularly when using kernel functions that involve computing pairwise similarities between data points in the higher-dimensional space. Additionally, SVM requires careful selection of hyperparameters, such as the choice of kernel function, regularization parameter, and kernel parameters. Tuning these hyperparameters can be time-consuming and computationally intensive, requiring cross-validation or grid search techniques to find the optimal settings. Moreover, SVM may not perform well on datasets with imbalanced class distributions, where one class is significantly more prevalent than others. In such cases, techniques like class weighting or resampling may be necessary to address the imbalance and improve classification performance. Support Vector Machine stands as a versatile, powerful, and widely used algorithm in the realm of machine learning, offering effective solutions for a diverse range of classification and regression problems. By leveraging its margin maximization approach, kernel-based transformations, and robust optimization techniques, SVM can accurately classify data points into different categories and make predictions with high accuracy. Whether it's image classification, text categorization, bioinformatics, finance, or remote sensing, SVM has demonstrated its efficacy in various domains and applications. As machine learning continues to advance and new algorithms emerge, Support Vector Machine remains a cornerstone method, providing valuable insights and solutions to complex data analysis problems.
3.3 Maximum likelihood classification
Maximum Likelihood Classification (MLC) is a widely used method in remote sensing and image processing for classifying pixel values into different land cover or land use categories based on statistical principles. Introduced in the late 1960s, MLC has since become one of the fundamental techniques for analyzing and interpreting remotely sensed imagery. In this comprehensive overview, we delve into the principles, methodology, advantages, limitations, and applications of Maximum Likelihood Classification, providing insights into its inner workings and practical considerations. At its core, Maximum Likelihood Classification is based on statistical principles and probability theory. The key idea behind MLC is to assign each pixel in an image to the class that has the highest probability of generating the observed pixel values. This probability is estimated based on the statistical properties of each land cover class, such as its mean vector and covariance matrix. MLC assumes that the pixel values within each class follow a multivariate normal distribution and uses the maximum likelihood estimation method to determine the parameters of these distributions. The class with the highest likelihood of generating the observed pixel values is then assigned to the pixel, making MLC a decision-theoretic approach to classification.
The first step in MLC is to collect training data for each land cover class of interest. This training data typically consists of samples or pixels that are known to belong to each class, either through ground truth information or manual interpretation of the imagery. The training data should be representative of the variability within each class and cover a wide range of conditions. Once the training data is collected, the next step is to estimate the statistical parameters of each land cover class, including the mean vector and covariance matrix. These parameters are estimated using maximum likelihood estimation, which involves finding the parameter values that maximize the likelihood function of the observed data given the model parameters. With the parameters estimated, MLC then calculates the likelihood of each pixel value belonging to each class based on the multivariate normal distribution model. The class with the highest likelihood for each pixel is then assigned as the classification result. In other words, each pixel is assigned to the class that it is most likely to belong to, given its observed spectral values. In cases where the classes are not equally likely or the costs of misclassification vary, MLC can incorporate prior probabilities and misclassification costs into the decision rule using Bayesian decision theory. This allows MLC to make more informed classification decisions based on the relative importance of different classes and the consequences of misclassification.
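The steps above (estimate a mean vector and covariance matrix per class, then assign each pixel to the class with the highest likelihood) can be sketched for a two-band case. The training pixels are hypothetical, equal prior probabilities are assumed, and the closed-form 2x2 inverse is used only to keep the example free of external libraries.

```python
import math


def fit_gaussian(pixels):
    """Estimate the mean vector and 2x2 covariance of two-band training pixels."""
    n = len(pixels)
    m0 = sum(p[0] for p in pixels) / n
    m1 = sum(p[1] for p in pixels) / n
    # Maximum likelihood covariance estimate (divides by n, not n - 1).
    c00 = sum((p[0] - m0) ** 2 for p in pixels) / n
    c11 = sum((p[1] - m1) ** 2 for p in pixels) / n
    c01 = sum((p[0] - m0) * (p[1] - m1) for p in pixels) / n
    return (m0, m1), (c00, c01, c11)


def log_likelihood(pixel, mean, cov):
    """Log density of a bivariate normal distribution at `pixel`."""
    c00, c01, c11 = cov
    det = c00 * c11 - c01 ** 2
    d0, d1 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (c11 * d0 * d0 - 2 * c01 * d0 * d1 + c00 * d1 * d1) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)


def classify(pixel, class_models):
    """Assign the class whose Gaussian gives the pixel the highest likelihood."""
    return max(class_models,
               key=lambda name: log_likelihood(pixel, *class_models[name]))


# Hypothetical two-band training pixels for two classes.
training = {
    "WB": [(0.05, 0.02), (0.07, 0.03), (0.06, 0.05), (0.04, 0.04)],
    "NV": [(0.04, 0.40), (0.06, 0.45), (0.05, 0.50), (0.07, 0.42)],
}
models = {name: fit_gaussian(pixels) for name, pixels in training.items()}

label_a = classify((0.05, 0.04), models)
label_b = classify((0.06, 0.44), models)
```

Unequal prior probabilities could be incorporated by adding the logarithm of each class prior to its log-likelihood before taking the maximum.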
MLC is based on rigorous statistical principles and probability theory, making it a theoretically sound approach to classification. By modeling the probability distributions of pixel values within each class, MLC can make informed decisions about class assignments based on the observed data. MLC uses the maximum likelihood decision rule, which is known to be the optimal decision rule under certain conditions. By maximizing the likelihood of the observed data given the model parameters, MLC ensures that each pixel is assigned to the class that is most likely to have generated its observed spectral values.
MLC tolerates moderate noise in the data, thanks to its probabilistic framework. By modeling the pixel values within each class as multivariate normal distributions, MLC can make accurate classification decisions even in the presence of variability and uncertainty. MLC is a flexible classification method that can accommodate a wide range of data types and distributions. Whether the data is continuous or discrete, MLC can be applied with appropriate modifications to the likelihood function and decision rule. MLC assumes that the pixel values within each class follow a multivariate normal distribution. While this assumption holds for many natural phenomena, it may not always be valid, especially for classes with non-Gaussian distributions or outliers. The performance of MLC depends heavily on the quality and representativeness of the training data. If the training data is biased or unrepresentative of the true class distributions, MLC may produce inaccurate classification results. MLC may struggle with class imbalance, where some classes are significantly more prevalent than others in the training data. In such cases, the parameters of the rare classes are estimated from few samples, and the classification may be biased towards the dominant classes. Because each class is modeled by a single Gaussian distribution, MLC produces quadratic decision boundaries in feature space (linear when the classes share a covariance matrix). In cases where the classes are multimodal or overlap strongly in feature space, MLC may fail to capture the underlying class distributions accurately.
MLC is widely used for land cover mapping and classification in remote sensing and GIS applications. By analyzing satellite imagery and aerial photographs, MLC can classify different land cover types such as forests, water bodies, urban areas, and agricultural fields. MLC is employed for vegetation analysis and monitoring, including forest health assessment, crop yield estimation, and habitat mapping. By analyzing spectral signatures from remote sensing data, MLC can identify and classify different vegetation types and monitor changes in vegetation cover over time. MLC is used for environmental monitoring and assessment, including pollution detection, wetland mapping, and habitat conservation. By analyzing multispectral and hyperspectral imagery, MLC can detect environmental changes and assess the health of ecosystems. MLC is applied in urban planning and management for tasks such as land use zoning, infrastructure planning, and transportation network analysis. By classifying urban land cover types and analyzing spatial patterns, MLC can support decision-making processes in urban development and sustainability. MLC is utilized for natural resource management and conservation, including wildlife habitat modeling, biodiversity assessment, and ecosystem services mapping. By classifying land cover and land use categories, MLC can identify priority areas for conservation and management actions.
Maximum Likelihood Classification is a powerful and widely used method for classifying pixel values into different land cover or land use categories based on statistical principles. By modeling the probability distributions of pixel values within each class and using the maximum likelihood decision rule, MLC can make informed classification decisions that are robust to noise and uncertainty. Despite its limitations and assumptions, MLC remains a valuable tool in remote sensing, GIS, and image processing, offering insights into the spatial and spectral characteristics of Earth's surface and supporting a wide range of applications in environmental monitoring, urban planning, natural resource management, and beyond. As technology continues to advance and new methods emerge, Maximum Likelihood Classification stands as a cornerstone technique in the field of remote sensing and image analysis, providing valuable solutions to complex classification problems.
3.4 Accuracy assessment
The accuracy assessment formulas are given in Equations 1–4. First, ground truth (GT) reference data were generated using Google Earth Pro, a widely used tool for remote sensing applications. These GT data provide an open-access reference for land use/land cover (LULC) analysis. Next, a confusion matrix was constructed with the classes (WB, GL, NV, BL, Ag, and UA) determined by GT values along the x-axis and the classes determined from remote sensing (RS) data classification along the y-axis. In this tabular arrangement, correctly classified values lie along the diagonal (Ndegwa Mundia and Murayama 2009), while erroneously classified values lie off the diagonal (Morgado et al. 2014).
![]() |
(1) |
![]() |
(2) |
![]() |
(3) |
![]() |
(4) |
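For reference, the four accuracy metrics used in this section are conventionally defined from a k-class confusion matrix as shown below. This is a sketch of the standard textbook forms, not a transcription of Equations 1–4, whose exact form and ordering may differ. The notation is introduced here for illustration: n_ij is the number of pixels classified as class i whose GT class is j, n_i+ and n_+j are the row and column totals, and N is the total number of reference pixels.

```latex
\begin{align*}
\mathrm{OA}   &= \frac{\sum_{i=1}^{k} n_{ii}}{N} \\
\mathrm{PA}_j &= \frac{n_{jj}}{n_{+j}} \\
\mathrm{UA}_i &= \frac{n_{ii}}{n_{i+}} \\
K &= \frac{N \sum_{i=1}^{k} n_{ii} - \sum_{i=1}^{k} n_{i+}\, n_{+i}}
          {N^{2} - \sum_{i=1}^{k} n_{i+}\, n_{+i}}
\end{align*}
```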
The validation process for the generated LULC maps spanning different years (2005, 2010, 2015, and 2020) involves the acquisition of data from Google Earth Pro, enabling a comparative analysis of land use dynamics over time. Consequently, varying degrees of accuracy are achieved using the same confusion matrix arrangement, as evidenced by studies conducted by Mahanta et al. (2022); Negi et al. (2021); Mahanta and Rawat (2020a); Rawat et al. (2013, 2019); and Sawai et al. (2020), highlighting the robustness and adaptability of the assessment methodology across different temporal scales.
Accuracy assessment is a critical component of remote sensing analysis, particularly in the context of LULC mapping. It involves evaluating the reliability and precision of the classified imagery by comparing it to ground truth (GT) reference data. Accuracy assessment enables researchers and stakeholders to gauge the level of agreement between the classified imagery and actual conditions on the ground, providing insights into the quality and validity of the classification results. In this explanation, we will delve into the methodology and procedures involved in accuracy assessment, including the generation of ground truth data, the construction of confusion matrices, and the validation of LULC maps across different time periods.
The accuracy assessment process begins with the generation of ground truth (GT) reference data, which serves as a benchmark for evaluating the accuracy of classified imagery. Ground truth data provide information about the true LULC classes at specific locations on the ground, enabling researchers to assess the performance of the classification algorithm. In the context of LULC mapping, ground truth data are typically generated through field surveys or by using high-resolution satellite imagery and geographic information systems (GIS) software. However, field surveys can be time-consuming and expensive, especially for large study areas. Therefore, researchers often utilize remote sensing tools such as Google Earth Pro to generate ground truth data more efficiently. Google Earth Pro provides high-resolution satellite imagery and interactive mapping tools that allow users to identify and label LULC classes directly on the imagery. This approach streamlines the data collection process and ensures consistency in the ground truth data across different study areas.
Once the ground truth data have been generated, the next step is to construct a confusion matrix, also known as an error matrix or classification matrix. A confusion matrix is a tabular representation of the relationship between the classified imagery and the ground truth data, showing the correspondence between the predicted and observed LULC classes. The confusion matrix arranges the LULC classes determined by the ground truth data along the columns of the matrix and the classes determined from the classified imagery along the rows. Each cell in the matrix holds the number of pixels with a given combination of classified class and ground truth class, with the diagonal cells representing correctly classified pixels and the off-diagonal cells representing erroneously classified pixels. The values in the confusion matrix can be used to calculate various accuracy metrics, such as overall accuracy, producer's accuracy, user's accuracy, and Kappa coefficient, which provide quantitative measures of the classification performance. In the construction of the confusion matrix, it is essential to ensure consistency in the classification scheme and the spatial resolution of the imagery. The LULC classes determined by the ground truth data should align with the classes used in the classification algorithm, allowing for a direct comparison between the two datasets. Additionally, the spatial resolution of the classified imagery should match the resolution of the ground truth data to avoid discrepancies in the classification results. Furthermore, the interpretation of the confusion matrix should account for errors or uncertainties associated with the classification process, such as misclassifications due to spectral confusion, mixed pixels, or misregistration between the imagery and the ground truth data. The arrangement of the confusion matrix is critical for accurately assessing the classification performance.
The correct values, representing pixels that are accurately classified, should be positioned along the diagonal of the matrix, indicating a match between the classified imagery and the ground truth data. Conversely, the erroneously classified values, representing pixels that are misclassified, should be situated off-diagonally, highlighting the discrepancies between the two datasets. By visually inspecting the confusion matrix, researchers can identify patterns of misclassification and assess the overall accuracy of the classification algorithm. Several accuracy metrics can be derived from the confusion matrix to quantify the classification performance. Overall accuracy is the most straightforward metric and represents the percentage of correctly classified pixels out of the total number of reference pixels. Producer's accuracy measures the probability that a pixel belonging to a specific LULC class in the ground truth data is correctly identified in the classified imagery. Similarly, user's accuracy measures the probability that a pixel classified as a specific LULC class in the imagery actually belongs to that class according to the ground truth data.
The Kappa coefficient is a statistical measure that assesses the agreement between the classified imagery and the ground truth data, accounting for the possibility of agreement occurring by chance. These accuracy metrics provide valuable insights into the strengths and weaknesses of the classification algorithm and help researchers evaluate the reliability and validity of the classification results. The accuracy assessment process is not limited to a single point in time but can be extended to validate LULC maps across different time periods. In this context, the accuracy assessment methodology remains consistent, but the ground truth data and classified imagery are collected and analyzed for multiple time points to track changes in land use and land cover over time. For example, researchers may validate LULC maps for different years (e.g., 2005, 2010, 2015, and 2020) to assess the temporal dynamics of land use change and monitor the impact of environmental factors such as urbanization, deforestation, and agricultural expansion. By comparing LULC maps for different time periods, researchers can identify trends, patterns, and hotspots of land use change and inform policy decisions and land management strategies accordingly. The validation of LULC maps across different time periods involves the acquisition of ground truth data and classified imagery for each time point, followed by the construction of confusion matrices and calculation of accuracy metrics as described earlier. By applying the same accuracy assessment methodology to LULC maps for different years, researchers can ensure consistency in the evaluation process and facilitate meaningful comparisons between the classification results. Furthermore, the use of consistent accuracy metrics allows researchers to quantify the degree of accuracy achieved for each LULC map and assess the reliability of the classification results over time.
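The calculation of these metrics from paired labels can be sketched as follows. The reference and classified labels are hypothetical, only three of the six classes are used to keep the example small, and the matrix orientation (classified labels in rows, ground truth labels in columns) is the arrangement chosen for this sketch.

```python
from collections import Counter


def confusion_matrix(classified, reference, classes):
    """Rows are classified labels, columns are ground truth labels."""
    counts = Counter(zip(classified, reference))
    return [[counts[(row, col)] for col in classes] for row in classes]


def accuracy_metrics(matrix):
    """Return overall accuracy, producer's and user's accuracies, and Kappa."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(k))
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    overall = diagonal / total
    # User's accuracy: correct pixels divided by all pixels classified as the class.
    users = [matrix[i][i] / row_totals[i] if row_totals[i] else None
             for i in range(k)]
    # Producer's accuracy: correct pixels divided by all reference pixels of the class.
    producers = [matrix[j][j] / col_totals[j] if col_totals[j] else None
                 for j in range(k)]
    # Agreement expected by chance, from the row and column marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)
    return overall, producers, users, kappa


# Hypothetical labels for ten validation pixels.
classes = ["WB", "Ag", "UA"]
reference = ["WB", "WB", "UA", "UA", "UA", "Ag", "Ag", "Ag", "Ag", "UA"]
classified = ["WB", "WB", "UA", "UA", "Ag", "Ag", "Ag", "Ag", "UA", "UA"]

matrix = confusion_matrix(classified, reference, classes)
overall, producers, users, kappa = accuracy_metrics(matrix)
```

For these hypothetical labels the overall accuracy is 0.80 while Kappa is lower (about 0.69), because part of the agreement is attributed to chance.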
This iterative approach to accuracy assessment enables researchers to track changes in land use and land cover systematically and provides valuable insights into the drivers and implications of land use change at regional and global scales. Several studies have applied the accuracy assessment methodology to validate LULC maps for different years and investigate land use dynamics in various regions. For example, Mahanta et al. (2022) evaluated LULC maps for different years in the Brahmaputra River Basin using ground truth data from Google Earth Pro and found varying degrees of accuracy across time periods, highlighting the importance of temporal validation for monitoring land use change. Similarly, Negi et al. (2021) conducted an accuracy assessment of LULC maps for different years in the Himalayan region using field surveys and satellite imagery and observed significant land use changes over time, particularly in response to deforestation and urbanization pressures. Mahanta and Rawat (2020b) assessed the accuracy of LULC maps for different years in the Northeastern region of India using ground truth data from Google Earth Pro and observed significant changes in land use patterns, particularly in response to infrastructure development and industrialization. Rawat et al. (2013, 2019) conducted accuracy assessments of LULC maps for different years in the Gangetic Plains using satellite imagery and GIS techniques and identified trends of urban sprawl and agricultural intensification. Similarly, Sawai et al. (2020) evaluated LULC maps for different years in the Western Himalayas using ground truth data.
Accuracy assessment is the final and essential step in evaluating the performance of any classification system. To assess the class-level performance of a given classifier, we calculated the confusion matrix (CM), producer's accuracy (P_Accuracy), user's accuracy (U_Accuracy), and an overall Kappa statistic index of agreement for each year and each classifier algorithm. Overall Accuracy (OA) is the most extensively used metric for estimating the accuracy of LULC maps derived from remote sensing data (Sawai et al. 2020; Mahanta et al. 2022; Negi et al. 2021; Rawat et al. 2013, 2019; Mahanta and Rawat 2020a). The ArcGIS Pro platform also provides a tool that reports the producer's accuracy, user's accuracy, and overall Kappa statistic index for different classifiers.
Another way to compare the relative accuracy of the RF, SVM, and ML classifiers is the Z-Score test (Equation 5). This test shows whether any significant difference exists between the performance of two classifiers.
![]() |
(5) |
Where:
p1 and p2 = proportions of the correctly classified test data of any two classifiers, and
s1 and s2 = standard deviations (SD) of their samples.
Under the null hypothesis H0: |p1 – p2| = 0 and the alternative hypothesis H1: |p1 – p2| ≠ 0, the Z value is evaluated with a two-tailed Z test at significance level α, and H0 is rejected if Z ≥ Zα/2. At a 95% confidence level, Equation 5 is used to compare the output maps from any two classifiers. If Z > 1.96, the first classifier is considered better than the second at the 95% confidence level. In the present study, the training and test data were the same for the inter-comparison of all the classifiers' results, which limits bias arising from differing samples.
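The test can be sketched in a few lines. The sketch assumes the commonly used form Z = (p1 – p2) / sqrt(s1² + s2²), which is consistent with the symbols defined for Equation 5 but is not transcribed from it, and it assumes a binomial standard deviation for each proportion. The accuracies and test-set size are hypothetical and are not results of this study.

```python
import math


def z_statistic(p1, s1, p2, s2):
    """Assumed form: Z = (p1 - p2) / sqrt(s1^2 + s2^2)."""
    return (p1 - p2) / math.sqrt(s1 ** 2 + s2 ** 2)


def binomial_sd(p, n):
    """Standard deviation of a proportion from n test samples (assumed form)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical accuracies of two classifiers on the same 200 test pixels.
n_test = 200
p_a, p_b = 0.90, 0.80

z = z_statistic(p_a, binomial_sd(p_a, n_test), p_b, binomial_sd(p_b, n_test))
significant = abs(z) >= 1.96  # two-tailed test at the 95% confidence level
```

With these hypothetical inputs Z is about 2.83, so the null hypothesis of equal accuracy would be rejected at the 95% confidence level.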
After the accuracy assessment, it is essential to know the status of LULC change, as well as the magnitude rate of change for each five-year interval from 2005 to 2020. The magnitude of change (MC), percentage of change (PC), and the annual rate of change (ARC) of the classified LULCs were calculated using Equations 6 to 9:
![]() |
(6) |
![]() |
(7) |
![]() |
(8) |
![]() |
(9) |
Where:
Af = area of the selected class at the final time (in our study, f is the year 2020),
Ai = area of the class at the initial time of study (in our study, i is the year 2005), and
n = number of years (in our study, n is 15 years).
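The sketch below shows one possible reading of MC and ARC. It is inferred rather than taken from Equations 6 to 9, and was chosen because it reproduces the RF values listed in Table 2b for WB (2005 to 2010) and BL (2010 to 2015) from the areas in Table 2a. Under this reading, MC is the change in a class's percentage share of the total area, and ARC is the change relative to the initial area divided by the number of years, both computed as initial minus final. The function name and the sign convention are assumptions, and PC is not covered.

```python
TOTAL_AREA_KM2 = 128.4219  # total study area from Table 2a


def change_metrics(area_initial, area_final, years, total_area=TOTAL_AREA_KM2):
    """Return (MC, ARC) under the assumed initial-minus-final convention."""
    difference = area_initial - area_final
    mc = difference / total_area * 100       # change in percentage share of total area
    arc = difference / area_initial / years  # fractional change per year
    return round(mc, 2), round(arc, 2)


# Areas (km2) taken from Table 2a for the RF classifier.
wb_2005_2010 = change_metrics(4.5954, 4.0401, 5)  # compare with Table 2b: 0.43, 0.02
bl_2010_2015 = change_metrics(0.6543, 1.7262, 5)  # compare with Table 2b: -0.83, -0.33
```

Under this convention a positive value indicates that the class lost area over the interval, and a negative value indicates a gain.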
4 Results and discussion
4.1 Performance of SVM, RF, and ML
Figure 3 illustrates the OA of the three classifiers (RF, SVM, and ML) with a fixed number of training datasets. The RF classifier achieved a higher OA than the SVM and ML algorithms for the year 2005 using Landsat-5. The performance of RF follows a similar pattern when applied to 2010 Landsat-5 data for the same study area with the same number of training datasets. From Figure 3, it can be seen that for the Landsat-5 data, the ML performance is low with respect to RF and SVM. The effect of the constant training datasets on each algorithm, in each case and each class, can be assessed through Producer Accuracy (PA) and User Accuracy (UA), as demonstrated in Figure 4 and Figure 5. Larger classes, such as UA and Ag, have much better P and U Accuracy under RF and SVM than under ML (Figures 4 and 5). BL and GL, which are fluctuating classes in the study area, have good P and U Accuracy with respect to SVM and ML, while SVM sometimes has good accuracy in comparison with ML.
Figure 3 Overall Accuracy (OA) for different classifier algorithms, for different years.
Figure 4 Producer Accuracy (PA) for different classifier algorithms, for different years.
Figure 5 User Accuracy (UA) for different classifier algorithms, for different years.
For each classifier in the ArcGIS Pro platform, the output of the accuracy assessment is an error matrix (EM). Each EM provides the number of pixels correctly and incorrectly classified by that classifier. ArcGIS Pro generates the PA, UA, and OA from these counts according to Equations 1 to 4, so no additional computation is needed. Each class's accuracy is reported in the EM in terms of PA (%), UA (%), and Kappa value (%). For all six classes, the PA is arranged in a row and the UA in a column, while the Kappa value is shown in a separate column of the EM.
Based on fixed training sampling points, the results show that RF produces an OA of 89% (for year 2005 with Landsat-5 data), 85% (for year 2010 with Landsat-5 data), 82% (for year 2015 with Landsat-8 data), and 80% (for year 2020 with Sentinel-2 data); for the same data and the same years, SVM has an OA of 87%, 87%, 79%, and 80%, while the ML values were lower than those of the RF and SVM classifiers (Figure 3). Consistent with its OA, the Kappa coefficient (k) values for RF (0.84, 0.79, 0.75, and 0.72) were also higher than those of the SVM and ML classifiers from 2005 to 2020, as shown in Figure 6.
Figure 6 Kappa coefficient (k) from the dataset with different classifiers, for different years.
NV, Ag, WB, and UA have similarly high producer and user accuracy values, which do not fall below 65% (Figures 4 and 5). In the present study, low or fluctuating producer and user accuracy values were found for the GL and BL classes with the ML classifier. This indicates that ML often misclassifies GL and BL pixels into other classes. SVM sometimes displayed the same behaviour.
Four LULC maps, for 2005, 2010, 2015, and 2020, were developed using each classifier (Figures 8a, 8b, and 8c) to determine the LULC change information (Tables 2b, 3b, and 4b). This change information is presented in two formats: i) at five-year intervals (2005 to 2010, 2010 to 2015, and 2015 to 2020), and ii) over the full fifteen-year interval (2005 to 2020). It is vital because it indicates the conversion of one land use to another (in percentage and km²).
Table 2a RF based areal changes in each class during the years 2005 to 2020.
Classes | 2005 (km²) | 2010 (km²) | 2015 (km²) | 2020 (km²) |
WB | 4.5954 | 4.0401 | 3.3948 | 4.6296 |
GL | 13.2021 | 11.5703 | 1.7109 | 5.4639 |
NV | 24.3606 | 15.202 | 13.8328 | 12.4356 |
BL | 0.927 | 0.6543 | 1.7262 | 1.6675 |
Ag | 16.308 | 16.1325 | 14.8187 | 6.0015 |
UA | 69.0288 | 80.8227 | 92.9385 | 98.2238 |
Total Area | 128.4219 | 128.4219 | 128.4219 | 128.4219 |
Table 2b MC and ARC, based on RF classifier.
Classes | MC (2005 to 2010) | ARC (2005 to 2010) | MC (2010 to 2015) | ARC (2010 to 2015) | MC (2015 to 2020) | ARC (2015 to 2020) | MC (2005 to 2020) | ARC (2005 to 2020) |
WB | 0.43 | 0.02 | 0.50 | 0.03 | -0.96 | -0.07 | -0.03 | 0.00 |
GL | 1.27 | 0.02 | 7.68 | 0.17 | -2.92 | -0.44 | 6.03 | 0.03 |
NV | 7.13 | 0.08 | 1.07 | 0.02 | 1.09 | 0.02 | 9.29 | 0.02 |
BL | 0.21 | 0.06 | -0.83 | -0.33 | 0.05 | 0.01 | -0.58 | -0.04 |
Ag | 0.14 | 0.00 | 1.02 | 0.02 | 6.87 | 0.12 | 8.03 | 0.03 |
UA | 6.09 | 0.02 | -9.43 | -0.03 | -4.12 | -0.01 | -7.46 | -0.01 |
Table 2c MC% and ARC%, based on RF classifier.
Classes | MC% (2005 to 2010) | ARC% (2005 to 2010) | MC% (2010 to 2015) | ARC% (2010 to 2015) | MC% (2015 to 2020) | ARC% (2015 to 2020) | MC% (2005 to 2020) | ARC% (2005 to 2020) |
WB | 12.08 | 2.42 | 15.97 | 3.19 | -36.37 | -7.27 | -0.74 | -0.04 |
GL | 12.36 | 2.47 | 85.21 | 17.04 | -219.36 | -43.87 | 58.61 | 2.93 |
NV | 37.60 | 7.52 | 9.01 | 1.80 | 10.10 | 2.02 | 48.95 | 2.45 |
BL | 29.42 | 5.88 | -163.82 | -32.76 | 3.40 | 0.68 | -79.88 | -3.99 |
Ag | 1.08 | 0.22 | 8.14 | 1.63 | 59.50 | 11.90 | 63.20 | 3.16 |
UA | 8.83 | 1.77 | -14.99 | -3.00 | -5.69 | -1.14 | -10.80 | -0.54 |
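The percentage entries in Table 2c can be related to the class areas in Table 2a with a short calculation. The Python sketch below assumes that MC% is the change in class area relative to the area at the start of the interval, with a positive value indicating a loss of area, and that ARC% is MC% divided by the interval length in years. These definitions are inferred from the tabulated values rather than taken from a stated formula, and the function names are hypothetical. Under these assumptions, the sketch reproduces the WB and GL entries for the three five-year intervals.

```python
# Class areas in km2 for the RF classifier, copied from Table 2a.
YEARS = [2005, 2010, 2015, 2020]
AREAS_RF = {
    "WB": [4.5954, 4.0401, 3.3948, 4.6296],
    "GL": [13.2021, 11.5703, 1.7109, 5.4639],
    "NV": [24.3606, 15.202, 13.8328, 12.4356],
    "BL": [0.927, 0.6543, 1.7262, 1.6675],
    "Ag": [16.308, 16.1325, 14.8187, 6.0015],
    "UA": [69.0288, 80.8227, 92.9385, 98.2238],
}
TOTAL_AREA = 128.4219  # km2, "Total Area" row of Table 2a


def mc_percent(area_start, area_end):
    """Assumed definition: area change as a percentage of the starting area
    (positive when the class loses area)."""
    return (area_start - area_end) / area_start * 100.0


def arc_percent(area_start, area_end, n_years):
    """Assumed definition: MC% spread evenly over the interval length."""
    return mc_percent(area_start, area_end) / n_years


# Sanity check: the six class areas add up to the total area in every year.
for i, year in enumerate(YEARS):
    class_sum = sum(values[i] for values in AREAS_RF.values())
    assert abs(class_sum - TOTAL_AREA) < 1e-6, year

# Print MC% and ARC% for the WB and GL classes in each five-year interval.
for cls in ("WB", "GL"):
    areas = AREAS_RF[cls]
    for i in range(len(YEARS) - 1):
        mc = mc_percent(areas[i], areas[i + 1])
        arc = arc_percent(areas[i], areas[i + 1], YEARS[i + 1] - YEARS[i])
        print(f"{cls} {YEARS[i]} to {YEARS[i + 1]}: MC% = {mc:.2f}, ARC% = {arc:.2f}")
# First printed line: WB 2005 to 2010: MC% = 12.08, ARC% = 2.42
```

The same two functions could be applied to the areas in Tables 3a and 4a, subject to the same assumptions.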
Table 3a SVM based areal changes in each class, during five year intervals, from 2005 to 2020.
Classes | 2005 (km²) | 2010 (km²) | 2015 (km²) | 2020 (km²) |
WB | 4.6017 | 4.7775 | 3.3372 | 4.7565 |
GL | 7.326 | 5.091 | 2.8998 | 3.7539 |
NV | 17.2656 | 28.3345 | 38.3796 | 50.121 |
BL | 3.3039 | 1.3257 | 2.2248 | 3.627 |
Ag | 5.4405 | 4.0025 | 3.6144 | 2.3436 |
UA | 90.4842 | 84.8907 | 77.9661 | 63.8199 |
Total Area | 128.4219 | 128.4219 | 128.4219 | 128.4219 |
Table 3b MC and ARC, based on SVM classifier.
Classes | MC (2005 to 2010) | ARC (2005 to 2010) | MC (2010 to 2015) | ARC (2010 to 2015) | MC (2015 to 2020) | ARC (2015 to 2020) | MC (2005 to 2020) | ARC (2005 to 2020) |
WB | -0.14 | -0.01 | 1.12 | 0.06 | -1.11 | -0.09 | -0.12 | 0.00 |
GL | 1.74 | 0.06 | 1.71 | 0.09 | -0.67 | -0.06 | 2.78 | 0.02 |
NV | -8.62 | -0.13 | -7.82 | -0.07 | -9.14 | -0.06 | -25.58 | -0.10 |
BL | 1.54 | 0.12 | -0.70 | -0.14 | -1.09 | -0.13 | -0.25 | 0.00 |
Ag | 1.12 | 0.05 | 0.30 | 0.02 | 0.99 | 0.07 | 2.41 | 0.03 |
UA | 4.36 | 0.01 | 5.39 | 0.02 | 11.02 | 0.04 | 20.76 | 0.01 |
Table 3c MC% and ARC%, based on SVM classifier.
Classes | MC% (2005 to 2010) | ARC% (2005 to 2010) | MC% (2010 to 2015) | ARC% (2010 to 2015) | MC% (2015 to 2020) | ARC% (2015 to 2020) | MC% (2005 to 2020) | ARC% (2005 to 2020) |
WB | -3.82 | -0.76 | 30.15 | 6.03 | -42.53 | -8.51 | -3.36 | -0.17 |
GL | 30.51 | 6.10 | 43.04 | 8.61 | -29.45 | -5.89 | 48.76 | 2.44 |
NV | -64.11 | -12.82 | -35.45 | -7.09 | -30.59 | -6.12 | -190.29 | -9.51 |
BL | 59.87 | 11.97 | -67.82 | -13.56 | -63.03 | -12.61 | -9.78 | -0.49 |
Ag | 26.43 | 5.29 | 9.70 | 1.94 | 35.16 | 7.03 | 56.92 | 2.85 |
UA | 6.18 | 1.24 | 8.16 | 1.63 | 18.14 | 3.63 | 29.47 | 1.47 |
Table 4a MLH based areal changes in each class, during five year intervals, from 2005 to 2020.
Classes | 2005 (km²) | 2010 (km²) | 2015 (km²) | 2020 (km²) |
WB | 3.9672 | 2.4894 | 1.1061 | 3.8052 |
GL | 8.4051 | 5.6674 | 4.2259 | 3.7728 |
NV | 15.3828 | 25.9713 | 41.9958 | 45.0848 |
BL | 2.9799 | 0.4311 | 1.6308 | 3.8403 |
Ag | 43.8624 | 5.895 | 6.3765 | 1.6587 |
UA | 53.8245 | 87.9677 | 73.0868 | 70.2601 |
Total Area | 128.4219 | 128.4219 | 128.4219 | 128.4219 |
Table 4b MC and ARC, based on MLH classifier.
Classes | MC (2005 to 2010) | ARC (2005 to 2010) | MC (2010 to 2015) | ARC (2010 to 2015) | MC (2015 to 2020) | ARC (2015 to 2020) | MC (2005 to 2020) | ARC (2005 to 2020) |
WB | 1.15 | 0.07 | 1.08 | 0.11 | -2.10 | -0.49 | 0.13 | 0.00 |
GL | 2.13 | 0.07 | 1.12 | 0.05 | 0.35 | 0.02 | 3.61 | 0.03 |
NV | -8.25 | -0.14 | -12.48 | -0.12 | -2.41 | -0.01 | -23.13 | -0.10 |
BL | 1.98 | 0.17 | -0.93 | -0.56 | -1.72 | -0.27 | -0.67 | -0.01 |
Ag | 29.56 | 0.17 | -0.37 | -0.02 | 3.67 | 0.15 | 32.86 | 0.05 |
UA | -26.59 | -0.13 | 11.59 | 0.03 | 2.20 | 0.01 | -12.80 | -0.02 |
Table 4c MC% and ARC%, based on MLH classifier.
Classes | MC% (2005 to 2010) | ARC% (2005 to 2010) | MC% (2010 to 2015) | ARC% (2010 to 2015) | MC% (2015 to 2020) | ARC% (2015 to 2020) | MC% (2005 to 2020) | ARC% (2005 to 2020) |
WB | 37.25 | 7.45 | 55.57 | 11.11 | -244.02 | -48.80 | 4.08 | 0.20 |
GL | 32.57 | 6.51 | 25.43 | 5.09 | 10.72 | 2.14 | 55.11 | 2.76 |
NV | -68.83 | -13.77 | -61.70 | -12.34 | -7.36 | -1.47 | -193.09 | -9.65 |
BL | 85.53 | 17.11 | -278.29 | -55.66 | -135.49 | -27.10 | -28.87 | -1.44 |
Ag | 86.56 | 17.31 | -8.17 | -1.63 | 73.99 | 14.80 | 96.22 | 4.81 |
UA | -63.43 | -12.69 | 16.92 | 3.38 | 3.87 | 0.77 | -30.54 | -1.53 |
4.2 Classifier effects over magnitude rate of change
The EM results show that selecting an appropriate classifier is important for accurate classification. The RF classifier yields higher OA and Kappa coefficient values (Figure 6) than SVM and ML. Because the choice of classifier affects the resulting LULC maps, the study estimated the magnitude and rate of change for each class using RF, SVM, and ML. After the accuracy assessment, the LULC change was characterized by applying each classifier and computing the magnitude and rate of change for each five-year interval from 2005 to 2020.
Figure 7 shows the MC percentage for each five-year interval, as well as over the full fifteen years (2005 to 2020), for RF, SVM, and ML. LULC maps of the study area were generated with each classifier at five-year intervals (Figures 8a, 8b, and 8c). The accuracy of an LULC classification cannot be assessed from the final classified maps alone; statistical measures (OA, PA, and UA) are also required. These measures were therefore computed, and RF was found to be a better supervised classifier than SVM and ML. Consequently, the LULC maps developed with the RF classifier were accepted because of their agreement with the statistical tests. Overall, the accuracy of the classifiers follows the pattern RF > SVM > ML.
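The ordering of the classifiers can be checked directly against the reported Kappa coefficients. The Python sketch below uses the RF values given in the text and the SVM and ML values stated in the abstract, and ranks the classifiers for each year and by mean Kappa. The variable names are illustrative, and averaging across years is an assumption made here for illustration, not a step described in the study.

```python
# Kappa coefficients for 2005, 2010, 2015, and 2020.
# RF values are from the results text; SVM and ML values are from the abstract.
YEARS = [2005, 2010, 2015, 2020]
KAPPA = {
    "RF": [0.84, 0.79, 0.75, 0.72],
    "SVM": [0.80, 0.78, 0.71, 0.67],
    "ML": [0.77, 0.67, 0.67, 0.57],
}

# Rank classifiers by their mean Kappa over the four years (highest first).
mean_kappa = {name: sum(values) / len(values) for name, values in KAPPA.items()}
ranking = sorted(mean_kappa, key=mean_kappa.get, reverse=True)

# Rank classifiers separately within each year (highest Kappa first).
per_year_ranking = [
    sorted(KAPPA, key=lambda name, i=i: KAPPA[name][i], reverse=True)
    for i in range(len(YEARS))
]

print("Mean Kappa ranking:", " > ".join(ranking))
for year, order in zip(YEARS, per_year_ranking):
    print(year, " > ".join(order))
```

With these values, the per-year ordering and the mean ordering agree with the pattern RF > SVM > ML.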
Figure 7 LULC changes in the study area between 2005 and 2020, at each 5-year interval.
Figure 8a LULC maps for different years, based on the Random Forest (RF) classifier scheme.
Figure 8b LULC maps of different years, based on the Support Vector Machine (SVM) classifier scheme.
Figure 8c LULC maps of different years, based on the Maximum Likelihood (MLH) classifier scheme.
5 Conclusions
LULC information in digital map format is vital for many applications and is derived from earth observation data, which have been an active area of research for many decades. Previously, only a few classification techniques were available, so researchers were often dissatisfied with LULC accuracy, and only the Kappa coefficient was considered when finalizing a classification. Today, many more classifiers and earth observation datasets are available than in the past. Therefore, before finalizing an LULC map from any classifier, its accuracy must be examined carefully. The present study clearly demonstrated that, for the same data and the same number of training sets or sampling points, accuracy differs between classifiers. Even an advanced machine learning technique such as the SVM classifier was found to have some accuracy issues. Pandey et al. (2013) also found that RF, which is regarded as a soft classifier, is better than SVM, which is considered a hard classifier. Nevertheless, the SVM classifier clearly has good accuracy when compared with an older classifier such as MLC. This study is important for researchers seeking to identify the best classification algorithm for generating the LULC of any AOI. The present study also demonstrated that temporal variability in datasets is important because it provides key information on MC (km²), ARC (km² year⁻¹), PC (%), and ARC (%) for a particular class over a given time interval; however, determining which classifier is most suitable for which LULC class is a separate research question. In this study, urbanization was found to affect agriculture, green cover, and water bodies, while built-up land increased continuously from 2005 to 2020.
References
-
Anderson, J.R. 1976. “A Land Use and Land Cover Classification System for Use with Remote Sensor Data.” USGS Professional Paper 964.
-
Barakat, M.A., H. AlSalamat, F. Jirjees, H. Al-Obaidi, Z.K. Hussain, S.E. Hadidi, S. Mansour, D. Malaeb, et al. 2021. "Factors Associated with Knowledge and Awareness of Stroke Among the Jordanian Population: A Cross-Sectional Study." F1000 Research 10, 1242. https://doi.org/10.12688/f1000research.74492.2
-
Basommi, L.P., Q-f. Guan, D-d. Cheng, and S.K. Singh. 2016. “Dynamics of Land Use Change in a Mining Area: A Case Study of Nadowli District, Ghana.” Journal of Mountain Science 13 (4): 633–42. https://doi.org/10.1007/s11629-015-3706-4
-
Bose, A., and I.R. Chowdhury. 2020. "Monitoring and modeling of spatio-temporal urban expansion and land-use/land-cover change using Markov chain model: A case study in Siliguri Metropolitan area, West Bengal, India." Modeling Earth Systems and Environment 6, 2235–2249.
-
Foody, G.M. 2009. “Sample Size Determination for Image Classification Accuracy Assessment and Comparison.” International Journal of Remote Sensing 30 (20): 5273–91. https://doi.org/10.1080/01431160903130937
-
Gaur, S., and R. Singh. 2023. "A Comprehensive Review on Land Use/Land Cover (LULC) Change Modeling for Urban Development: Current Status and Future Prospects." Sustainability 15 (2): 903. https://doi.org/10.3390/su15020903
-
Halmy, M.W.A., P.E. Gessler, J.A. Hicke, and B.B. Salem. 2015. "Land use/land cover change detection and prediction in the north-western coastal desert of Egypt using Markov-CA." Applied Geography 63, 101–112.
-
Kafy, A.A., M.S. Rahman, M. Islam, A. Al Rakib, M.A. Islam, M.H.H. Khan, M.S. Sikdar, et al. 2021. "Prediction of seasonal urban thermal field variance index using machine learning algorithms in Cumilla, Bangladesh." Sustainable Cities and Society 64, 102542.
-
Kumar, M., D.M. Denis, S.K. Singh, S. Szabó, and S. Suryavanshi. 2018. “Landscape Metrics for Assessment of Land Cover Change and Fragmentation of a Heterogeneous Watershed.” Remote Sensing Applications: Society and Environment 10, 224–33. https://doi.org/10.1016/j.rsase.2018.04.002
-
Kushwaha, K., M.M. Singh, S.K. Singh, and A. Patel. 2021. “Urban Growth Modeling Using Earth Observation Datasets, Cellular Automata-Markov Chain Model and Urban Metrics to Measure Urban Footprints.” Remote Sensing Applications: Society and Environment 22, 100479. https://doi.org/10.1016/j.rsase.2021.100479
-
Lambin, E.F., B.L. Turner, H.J. Geist, S.B. Agbola, A. Angelsen, J.W. Bruce, O.T. Coomes, et al. 2001. "The causes of land-use and land-cover change: moving beyond the myths." Global Environmental Change 11 (4): 261–269.
-
Lamine, S., G.P. Petropoulos, S.K. Singh, S. Szabó, N. El Islam Bachari, P.K. Srivastava, and S. Suman. 2018. “Quantifying Land Use/Land Cover Spatio-Temporal Landscape Pattern Dynamics from Hyperion Using SVMs Classifier and FRAGSTATS®.” Geocarto International 33 (8): 862–78. https://doi.org/10.1080/10106049.2017.1307460
-
Liping, C., S. Yujun, and S. Saeed. 2018. "Monitoring and predicting land use and land cover changes using remote sensing and GIS techniques—A case study of a hilly area, Jiangle, China." PLoS ONE 13 (7): e0200493.
-
Lukas, P., A.M. Melesse, and T.T. Kenea. 2023. "Prediction of future land use/land cover changes using a coupled CA-ANN model in the Upper Omo–Gibe River Basin, Ethiopia." Remote Sensing 15 (4): 1148.
-
Mahanta, A.R., and K.S. Rawat. 2020a. “Land Use and Land Cover Monitoring using Multi-temporal Earth Observational Date: A Case Study of Kancheepuram Peninsular of India.” International Journal of Advanced Research in Engineering and Technology (IJARET) 11, 5, 087: 835–841.
-
Mahanta, A.R., and K.S. Rawat. 2020b. “Predicting and Analyzing Water Quality using Machine Learning Based Model: A Case Study for Kancheepuram Watershed.” International Journal of Advanced Research in Engineering and Technology (IJARET) 11, 5 (088): 842–51.
-
Mahanta, A.R., K.S. Rawat, S.K. Singh, S. Sanjeevi, and A.K. Mishra. 2022. “Evaluation of Long-Term Nitrate and Electrical Conductivity in Groundwater System of Peninsula, India.” Applied Water Science 12 (2): 17. https://doi.org/10.1007/s13201-021-01568-1
-
Maitima, J.M., S.M. Mugatha, R.S. Reid, L.N. Gachimbi, A. Majule, H. Lyaruu, D. Pomery, et al. 2009. “The Linkages between Land Use Change, Land Degradation and Biodiversity across East Africa.” African Journal of Environmental Science and Technology 3 (10): 310–325.
-
Manzanarez, S., V. Manian, and M. Santos. 2022. "Land Use Land Cover Labeling of GLOBE Images Using a Deep Learning Fusion Model." Sensors 22 (18): 6895. https://doi.org/10.3390/s22186895
-
Meshram, P., K.S. Rawat, S. Kumar, and D.S. Kandar. 2020. “Mapping Forest Cover and Deforestation using LANDSAT-8 Earth Observation Time Series Satellite Data–A Case Study of Central India.” IJARET 11, 04, (075): 717– 722.
-
Minale, A.S. 2013. "Retrospective analysis of land cover and use dynamics in Gilgel Abbay Watershed by using GIS and remote sensing techniques, Northwestern Ethiopia." International Journal of Geosciences 4, 07: 1003.
-
Morgado, P., E. Gomes, and N. Costa. 2014. "Competing visions? Simulating alternative coastal futures using a GIS-ANN web application." Ocean and Coastal Management 101, 79–88.
-
Naikoo, M.W., M. Rihan, and M. Ishtiaque. 2020. "Analyses of land use land cover (LULC) change and built-up expansion in the suburb of a metropolitan city: Spatio-temporal analysis of Delhi NCR using Landsat datasets." Journal of Urban Management 9 (3): 347–359.
-
Ndegwa Mundia, C., and Y. Murayama. 2009. “Analysis of Land Use/Cover Changes and Animal Population Dynamics in a Wildlife Sanctuary in East Africa.” Remote Sensing 1 (4): 952–70. https://doi.org/10.3390/rs1040952
-
Negi, A., K.S. Rawat, A. Nainwal, M.C. Shah, and V. Kumar. 2021. “Quality Analysis of Statistical and Data-Driven Rainfall-Runoff Models for a Mountainous Catchment.” Materials Today: Proceedings 46 (20): 10376–83. https://doi.org/10.1016/j.matpr.2020.12.544
-
Pandey, R., S. Naik, and R. Marfatia. 2013. “Image processing and machine learning for automated fruit grading system: A technical review.” International Journal of Computer Applications 81 (16): 29–39. https://doi.org/10.5120/14209-2455
-
Poyatos, F. 2003. "La comunicación no verbal: algunas de sus perspectivas de estudio e investigación." Revista de investigación lingüística 6, 2.
-
Rafiq, S., R. Salim, and I. Nielsen. 2016. "Urbanization, openness, emissions, and energy intensity: A study of increasingly urbanized emerging economies." Energy Economics 56, 20–28.
-
Rajendran, G.B., U.M. Kumarasamy, C. Zarro, P.B. Divakarachari, and S.L. Ullo. 2020. “Land-Use and Land-Cover Classification Using a Human Group-Based Particle Swarm Optimization Algorithm with an LSTM Classifier on Hybrid Pre-Processing Remote-Sensing Images.” Remote Sensing 12 (24): 4135.
-
Rawat, K.S., A.K. Mishra, V.K. Sehgal, N. Ahmed, and V.K. Tripathi. 2013. “Comparative Evaluation of Horizontal Accuracy of Elevations of Selected Ground Control Points from ASTER and SRTM DEM with Respect to CARTOSAT-1 DEM: A Case Study of Shahjahanpur District, Uttar Pradesh, India.” Geocarto International 28 (5): 439–52. https://doi.org/10.1080/10106049.2012.724453
-
Rawat, K.S., T.G.A. Jacintha, and S.K. Singh. 2018. “Hydro-Chemical Survey and Quantifying Spatial Variations in Groundwater Quality in Coastal Region of Chennai, Tamilnadu, India–a Case Study.” Indonesian Journal of Geography 50 (1): 57–69. https://doi.org/10.22146/ijg.27443
-
Rawat, K.S., S.K. Singh, M.I. Singh, and B.L. Garg. 2019. “Comparative Evaluation of Vertical Accuracy of Elevated Points with Ground Control Points from ASTERDEM and SRTMDEM with Respect to CARTOSAT-1DEM.” Remote Sensing Applications: Society and Environment 13, 289–97. https://doi.org/10.1016/j.rsase.2018.11.005
-
Rawat, S., A.K. Gupta, S.J. Sangode, P. Srivastava, and H.C. Nainwal. 2015. "Late Pleistocene–Holocene vegetation and Indian summer monsoon record from the Lahaul, northwest Himalaya, India." Quaternary Science Reviews 114, 167–181.
-
Romshoo, S.A., S. Altaf, I. Rashid, and R.A. Dar. 2018. "Climatic, geomorphic and anthropogenic drivers of the 2014 extreme flooding in the Jhelum basin of Kashmir, India." Geomatics, Natural Hazards and Risk 9 (1): 224–248.
-
Sahu, S.R., K.S. Rawat, S.K. Singh, and K.K. Gupta. 2024. "Analysis of drainage morphometry and spectral indices using earth observation datasets in Palar River basin, India." Discover Geoscience 2, 41. https://doi.org/10.1007/s44288-024-00038-w
-
Saravanan, J., K.S. Rawat, and S.K. Singh. 2018a. "Sub-Surface Investigation Using Vertical Electrical Sounding: Chennai Metropolitan Area." Current World Environment 13, 3. https://www.cwejournal.org/vol13no3/sub-surface-investigation-using-vertical-electrical-sounding--chennai-metropolitan-area
-
Saravanan, J., K.S. Rawat, and S.K. Singh. 2018b. “Groundwater Quality of Coastal Aquifer Evaluation Using Spatial Analysis Approach.” Oriental Journal of Chemistry 34 (6): 2902. http://dx.doi.org/10.13005/ojc/340630
-
Sawai, S., K.S. Rawat, S.K. Singh, and S. Kumar. 2020. “Statistical Investigation of Accuracy of Satellite Elevation Data: A Case Study.” Journal of Critical Reviews 17, 15. Preprint.
-
Shafizadeh-Moghadam, H., A. Asghari, A. Tayyebi, and M. Taleai. 2017. "Coupling machine learning, tree-based and statistical models with cellular automata to simulate urban growth." Computers, Environment and Urban Systems 64, 297–308.
-
Shahfahad, S. Talukdar, M. Rihan, T.H. Hoang, S. Bhaskaran, and A. Rahman. 2022. "Modelling urban heat island (UHI) and thermal field variation and their relationship with land use indices over Delhi and Mumbai metro cities." Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development 24 (3): 3762–3790.
-
Singh, S.K., P.K. Srivastava, M. Gupta, J.K. Thakur, and S. Mukherjee. 2014. “Appraisal of Land Use/Land Cover of Mangrove Forest Ecosystem Using Support Vector Machine.” Environmental Earth Sciences 71 (5): 2245–55.
-
Singh, S.K., Sk. Mustak, P.K. Srivastava, S. Szabó, and T. Islam. 2015. “Predicting Spatial and Decadal LULC Changes Through Cellular Automata Markov Chain Models Using Earth Observation Datasets and Geo-Information.” Environmental Processes 2 (1): 61–78. https://doi.org/10.1007/s40710-015-0062-x
-
Singh, S.K., P.K. Srivastava, S. Szabó, G.P Petropoulos, M. Gupta, and T. Islam. 2017. “Landscape Transform and Spatial Metrics for Mapping Spatiotemporal Land Cover Dynamics Using Earth Observation Data-Sets.” Geocarto International 32 (2): 113–27.
-
Soergel, D.A., N. Dey, R. Knight, and S.E. Brenner. 2012. "Selection of primers for optimal taxonomic classification of environmental 16S rRNA gene sequences." ISME Journal 6 (7): 1440–1444.
-
Thakkar, A., and K. Chaudhari. 2021. "A comprehensive survey on deep neural networks for stock market: The need, challenges, and future directions." Expert Systems with Applications 177, 114800.
-
Twisa, S., and M.F. Buchroithner. 2019. "Land-use and land-cover (LULC) change detection in Wami River Basin, Tanzania." Land 8 (9): 136.
-
Vivekananda, U., D. Bush, J.A. Bisby, S. Baxendale, R. Rodionov, B. Diehl, and N. Burgess. 2021. "Theta power and theta‐gamma coupling support long‐term spatial memory retrieval." Hippocampus 31 (2): 213–220.
-
Yinga, O.E., K.S. Kumar, M. Chowlani, S.K. Tripathi, V.P. Khanduri, and S.K. Singh. 2022. “Influence of Land-Use Pattern on Soil Quality in a Steeply Sloped Tropical Mountainous Region, India.” Archives of Agronomy and Soil Science 68 (6): 852–72. https://doi.org/10.1080/03650340.2020.1858478
-
Yu, L., L. Liang, J. Wang, Y. Zhao, Q. Cheng, L. Hu, S. Liu, et al. 2014. “Meta-Discoveries from a Synthesis of Satellite-Based Land-Cover Mapping Research.” International Journal of Remote Sensing 35 (13): 4573–88. https://doi.org/10.1080/01431161.2014.930206
-
Zen El-Dein, A.A.M., M.H.M. Koriem, and S.A. Ibrahim. 2023. "Effect of Intercropping Sunflower Cultivars and Defoliation Time on Sugar Beet Yield and Quality." Journal of Plant Production 14 (6): 303–311.