Related papers: Regional Development Classification Model using De…
Determination quadrant development has an important role in order to determine the achievement of the development of a district, in terms of the sector's gross regional domestic product (GDP). The process of determining the quadrant…
Nowadays, agricultural field is experiencing problems related to climate change that result in the changing patterns in cropping season, especially for paddy and coarse grains, pulses roots and Tuber (CGPRT/Palawija) crops. The cropping…
The identification of regional development gaps is an effort to see how far the development conducted in every District in a Province. By seeing the gaps occurred, it is expected that the Policymakers are able to determine which region that…
We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…
Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…
The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in…
Real-life machine learning problems exhibit distributional shifts in the data from one time to another or from one place to another. This behavior is beyond the scope of the traditional empirical risk minimization paradigm, which assumes…
Decision tree is an important method for both induction research and data mining, which is mainly used for model classification and prediction. ID3 algorithm is the most widely used algorithm in the decision tree so far. In this paper, the…
Data mining has been applied in various areas because of its ability to rapidly analyze vast amounts of data. This study is to build the Graduates Employment Model using classification task in data mining, and to compare several of…
Decision trees and random forest remain highly competitive for classification on medium-sized, standard datasets due to their robustness, minimal preprocessing requirements, and interpretability. However, a single tree suffers from high…
Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy.…
In order to speed-up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, this structure being then used as a way to speed-up the prediction…
Student repetition in secondary education imposes significant resource burdens, particularly in resource-constrained contexts. Addressing this challenge, this study introduces a unified machine learning framework that simultaneously…
Credit ratings are one of the primary keys that reflect the level of riskiness and reliability of corporations to meet their financial obligations. Rating agencies tend to take extended periods of time to provide new ratings and update…
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…
The automated classification of objects from large catalogues or survey projects is an important task in many astronomical surveys. Faced with various classification algorithms, astronomers should select the method according to their…
In this work, we propose a novel node splitting method for regression trees and incorporate it into the regression forest framework. Unlike traditional binary splitting, where the splitting rule is selected from a predefined set of binary…
Differential performance debugging is a technique to find performance problems. It applies in situations where the performance of a program is (unexpectedly) different for different classes of inputs. The task is to explain the differences…
Data mining involves the systematic analysis of large data sets, and data mining in agricultural soil datasets is exciting and modern research area. The productive capacity of a soil depends on soil fertility. Achieving and maintaining…
The transparency nature of Open Data is beneficial for citizens to evaluate government work performance. In Indonesia, each government bodies or ministry have their own standard operating procedure on data treatment resulting in incoherent…