Subject: Random Forest Classifier

Title: Application of the forest classifier method for description of movements of an oscillator forced by a stochastic series of impulses
Authors: Sulewski, Marek
Ozga, Agnieszka
Subjects: machine learning
random forest classifier
stochastic series of impulses
Publisher: Polskie Towarzystwo Mechaniki Teoretycznej i Stosowanej
Link: https://bibliotekanauki.pl/articles/59316045.pdf
Description: The article analyzes the motion of an oscillator forced by a sequence of stochastic impulses, using decision tree algorithms and a random forest classifier. The aim is to verify how accurately the distributions can be distinguished within a given time period and to check whether the length of the time interval affects classification accuracy. The statistical parameters that directly influence the classification of distributions are also presented. The analysis was performed in Python on data obtained from computer simulation. Classification results are reported for two algorithms and for two divisions of the data into test and training sets. The decision tree classifier achieves high accuracy for every time interval, but it selects different statistics for each interval, which makes it impossible to determine unequivocally which statistic drives the recognition of a distribution. For the random forest classifier, the importance and influence of the parameters in distinguishing among the three distributions are the same for 5-minute and 10-minute intervals; the differences in parameter importance between interval lengths are not significant.
Content provider: Biblioteka Nauki

Article

Title: Predicting immunogenicity in murine hosts with use of Random Forest classifier
Przewidywanie immunogenności u myszy przy użyciu klasyfikatora Random Forest
Authors: Marciniak, Anna
Tarczewska, Martyna
Kloska, Sylwester
Subjects: Random Forest Classifier
immunogenicity
machine learning
entropy
Gini index
Publisher: Politechnika Bydgoska im. Jana i Jędrzeja Śniadeckich. Wydawnictwo PB
Link: https://bibliotekanauki.pl/articles/2016293.pdf
Description: Biomedical data are difficult to interpret because of their sheer volume. One way to cope with this problem is machine learning, which can capture previously unnoticed dependencies. The authors applied a random forest classifier with the entropy and Gini index criteria to immunogenicity data. The input data consisted of three columns: epitope (a peptide 8-11 amino acids long), major histocompatibility complex (MHC), and immune response. The presented model predicts the immune response from the epitope-MHC complex. It achieved an accuracy of 84% with entropy and 83% with the Gini index. The results are not fully satisfactory but are a fair start for more complex experiments and could serve as an indicator for further research.
Content provider: Biblioteka Nauki

Article
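
The record above names two split criteria, entropy and the Gini index. As a minimal sketch of what these impurity measures compute (the function names and the toy labels are illustrative assumptions, not taken from the paper), in plain Python:

```python
from collections import Counter
from math import log2


def class_proportions(labels):
    """Return the fraction of samples belonging to each class."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: sum of -p * log2(p) over the classes."""
    return sum(-p * log2(p) for p in class_proportions(labels))


# Hypothetical immune-response labels at one tree node (1 = response, 0 = none).
node_labels = [1, 1, 0, 0]
print(gini_index(node_labels))  # 0.5
print(entropy(node_labels))     # 1.0
```

Both measures are zero for a node containing a single class and reach their maximum when the classes are evenly mixed, so a tree built with either criterion prefers splits that produce purer child nodes.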

Title: Semantic Segmentation of Diseases in Mushrooms using Enhanced Random Forest
Authors: Yacharam, Rakesh Kumar
Sekhar, Dr. V. Chandra
Subjects: mushroom diseases
semantic segmentation
computer aided
Machine Learning
significant feature extraction
Random Forest classifier
Publisher: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Instytut Informatyki Technicznej
Link: https://bibliotekanauki.pl/articles/31339414.pdf
Description: Mushrooms are rich in antioxidants and nutrients. Edible mushrooms, however, are susceptible to various diseases and pests such as dry bubble, wet bubble, cobweb, bacterial blotches, and mites, and farmers face significant production losses as a result. Manual detection of these diseases relies on expertise, knowledge of the diseases, and human effort, so there is a need for computer-aided methods to detect and segment them. In this paper, we propose a semantic segmentation approach based on the Random Forest machine learning technique for the detection and segmentation of mushroom diseases. We extract a combination of features, including Gabor, Bouda, Kayyali, Gaussian, Canny edge, Roberts, Sobel, Scharr, Prewitt, Median, and Variance. We employ constant mean-variance thresholding and the Pearson correlation coefficient to select significant features, aiming to increase computational speed and reduce the complexity of training the Random Forest classifier. Our results indicate that semantic segmentation based on Random Forest is more accurate than other methods such as Support Vector Machine (SVM), Naïve Bayes, K-means, and Region of Interest. It also achieves higher precision, recall, and F1 score than SVM. Deep learning-based semantic segmentation methods were not considered because few images of diseased mushrooms are available.
Content provider: Biblioteka Nauki

Article
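
The record above mentions the Pearson correlation coefficient as one tool for keeping only significant features. A minimal sketch of one way such a filter could work, assuming a feature is dropped when it is highly correlated with a feature already kept (the 0.9 threshold, the function names, and the toy values are illustrative assumptions, not the authors' procedure):

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return list(kept)


# Hypothetical per-pixel filter responses; "scharr" is a scaled copy of "sobel".
features = {
    "sobel": [1.0, 2.0, 3.0, 4.0],
    "scharr": [2.0, 4.0, 6.0, 8.0],
    "median": [4.0, 1.0, 3.0, 2.0],
}
print(drop_correlated(features))  # ['sobel', 'median']
```

Removing near-duplicate features in this way shrinks the input to the classifier, which is the kind of reduction in training cost the description refers to.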

Title: Decision trees with R
Drzewa klasyfikacyjne z użyciem pakietu R
Authors: Bryła, Jakub
Description: The main topic of this thesis is the theory of decision trees and its practical use in data mining with R. It contains three chapters. The first chapter introduces the kind of data set we work with and formulates the definitions of a classifier and of the classification problem. The second chapter presents the theory of building decision trees. A method of interpreting them using graph theory is also introduced. The concept of an impurity measure is defined, along with two examples: entropy and the Gini index. Stopping criteria and pruning criteria for classification trees are also provided. The chapter ends with a description of the concept of a random forest. In the third chapter, the theory of classification trees and random forests is used to build a classification model in R. The analysis was carried out on banking and medical data.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Other

Title: Attribute selection for stroke prediction
Authors: Zdrodowska, Małgorzata
Subjects: data mining
classifier
J48 (C4.5)
CART
PART
naive Bayes classifier
random forest
support vector machine
multilayer perceptron
haemorrhagic stroke
ischemic stroke
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Link: https://bibliotekanauki.pl/articles/386466.pdf
Description: Stroke is the third most common cause of death and the most common cause of long-term disability among adults around the world. Stroke prediction and diagnosis are therefore very important. Data mining techniques help determine the correlations between individual patient characteristics, that is, they extract from the medical information system the knowledge needed to predict and treat various diseases. The study analysed the data of patients with stroke using eight known classification algorithms (J48 (C4.5), CART, PART, naive Bayes classifier, Random Forest, Support Vector Machine, and Multilayer Perceptron neural networks), which made it possible to build an exploration model with an accuracy of over 88%. Patient features that may increase the risk of stroke were also indicated.
Content provider: Biblioteka Nauki

Article

Title: Determining students online academic performance using machine learning techniques
Ocena wydajności akademickiej studentów w nauce online za pomocą technik uczenia maszynowego
Authors: Islam, Atika
Bukhari, Faisal
Sattar, Muhammad Awais
Kashif, Ayesha
Subjects: educational data mining
learning analytics
random forest
support vector classifier
Publisher: Politechnika Lubelska. Wydawnictwo Politechniki Lubelskiej
Link: https://bibliotekanauki.pl/articles/58907963.pdf
Description: Predicting students' academic performance during online learning was considered a major task during the pandemic period. Online learning affected academic activities to such an extent that the management of educational institutions planned to design support systems for predicting students' performance, in order to reduce the dropout ratio and improve academic activities. During COVID-19, the main challenge is maintaining students' grades by predicting their academic performance using techniques such as Educational Data Mining and Learning Analytics. Various features related to the teaching mechanisms of online learning have been identified that have a great impact on improving academic performance. A high-quality dataset helps generate productive results, which in turn support effective decisions for promoting high-quality education. In this research, five models for predicting academic performance are proposed, based on an imbalanced dataset collected from 350 students in the same computer science domain. After pre-processing techniques were applied to clean the data, machine learning models were applied, including K-Nearest Neighbor Classifier, Decision Tree, Random Forest, Support Vector Classifier, and Gaussian Naive Bayes. Results were obtained for the imbalanced and the balanced dataset after feature selection. The Support Vector Classifier produced the best results on the balanced dataset with selected features, with an accuracy of 96.89%.
Content provider: Biblioteka Nauki

Article

Title: Analiza danych z rozgrywek piłkarskich i opracowanie modeli uczenia maszynowego dla przewidywania ich wyników
Analysis of data about football games and development of machine learning models to predict their results
Authors: Turkowski, Kacper
Description: The goal of this thesis is to analyze football matches and to attempt to build a model that predicts match results from match statistics. Matches were assigned to three groups: home team victory, draw, and away team victory. Various match statistics were analyzed to find out which of them tell us the most about the course of a match. The thesis used three methods of selecting input information for prediction and several classifiers: Decision Tree, Random Forest, KNeighbours, SVM, and Logistic Regression. Each method of selecting information was tested with all classifiers in order to find the best approach to predicting match results. All calculations were made using Python and methods from the scikit-learn libraries.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Other
