Subject: mining data - Prolib Integro

Title: Data mining tasks and methods – implementations in R
Authors: Figielska, Ewa
Subjects: data mining
R programming language
classification
prediction
clustering
association
Publisher: Warszawska Wyższa Szkoła Informatyki
Links: https://bibliotekanauki.pl/articles/1397482.pdf
Description: The aim of the paper is to present how some data mining tasks can be solved using the R programming language. Full R scripts are provided for preparing data sets, solving the tasks, and analyzing the results.
Content provider: Biblioteka Nauki

Article

Title: Data mining
Authors: Morzy, Tadeusz
Subjects: data mining
data analysis
evolution of information technology
association analysis
classification
clustering
Web mining
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links: https://bibliotekanauki.pl/articles/703139.pdf
Description: Recent advances in data capture, data transmission and data storage technologies have resulted in a growing gap between more powerful database systems and users' ability to understand and effectively analyze the information collected. Many companies and organizations gather gigabytes or terabytes of business transactions, scientific data, web logs, satellite pictures and text reports, which are simply too large and too complex to support a decision-making process. Traditional database and data warehouse querying models are not sufficient to extract trends, similarities and correlations hidden in very large databases. The value of existing databases and data warehouses can be significantly enhanced with the help of data mining. Data mining is a new research area which aims at the nontrivial extraction of implicit, previously unknown and potentially useful information from large databases and data warehouses. Data mining, also referred to as database mining or knowledge discovery in databases, can help answer business questions that were too time-consuming to resolve with traditional data processing techniques. The process of mining the data can be perceived as a new way of querying, with questions such as "which clients are likely to respond to our next promotional mailing, and why?". The aim of this paper is to present an overall picture of the data mining field and to briefly present a few data mining methods. Finally, we summarize the concepts presented in the paper and discuss some problems related to data mining technology.
Content provider: Biblioteka Nauki

Article

Title: A survey of big data classification strategies
Authors: Banchhor, Chitrakant
Srinivasu, N.
Subjects: big data
data mining
MapReduce
classification
machine learning
evolutionary intelligence
deep learning
Publisher: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Links: https://bibliotekanauki.pl/articles/2050171.pdf
Description: Big data nowadays plays a major role in finance, industry, medicine, and various other fields. In this survey, 50 research papers are reviewed with regard to the different big data classification techniques presented and/or used in the respective studies. The classification techniques are categorized into machine learning, evolutionary intelligence, fuzzy-based approaches, deep learning and so on. The research gaps and the challenges of big data classification faced by the existing techniques are also listed and described, which should help researchers enhance the effectiveness of their future work. The research papers are analyzed for different techniques with respect to software tools, datasets used, publication year, classification techniques, and performance metrics. It can be concluded from the survey that the most frequently used big data classification methods are based on machine learning techniques and that the most commonly used data source for big data classification appears to be the UCI repository. The most frequently used performance metrics are accuracy and execution time.
Content provider: Biblioteka Nauki

Article

Title: Data mining approach in diagnosis and treatment of chronic kidney disease
Authors: Turiac, Andreea S.
Zdrodowska, Małgorzata
Subjects: feature selection
classification
classification rules
action rules
data mining
chronic kidney disease
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Links: https://bibliotekanauki.pl/articles/2105985.pdf
Description: Chronic kidney disease is a general definition of kidney dysfunction that lasts more than 3 months. When chronic kidney disease is advanced, the kidneys are no longer able to cleanse the blood of toxins and harmful waste products and can no longer support the proper function of other organs. The disease can begin suddenly or develop latently over a long period of time without the presence of characteristic symptoms. The most common causes are other chronic diseases: diabetes and hypertension. Therefore, it is very important to diagnose the disease in its early stages and opt for a suitable treatment (medication, diet and exercise) to reduce its side effects. The purpose of this paper is to analyse and select those patient characteristics that may influence the prevalence of chronic kidney disease, as well as to extract classification rules and action rules that can help medical professionals diagnose patients with chronic kidney disease efficiently and accurately. The first step of the study was feature selection and evaluation of its effect on classification results. The study was repeated for four models: one containing all available patient data, one containing features identified by doctors as major factors in chronic kidney disease, and two containing features selected using Correlation Based Feature Selection and the Chi-Square Test. Sequential Minimal Optimization (SMO) and Multilayer Perceptron had the best performance in all four cases, with an average accuracy of 98.31% for SMO and 98.06% for Multilayer Perceptron. These results were confirmed by the F1-Score, which was above 0.98 for both algorithms. Classification rules were extracted for all of these models. The final step was action rule extraction. The paper shows that appropriate data analysis allows for building models that can support doctors in diagnosing a disease and support their decisions on treatment. Action rules can be important guidelines for doctors. They can reassure the doctor in his diagnosis or indicate new, previously unseen ways to cure the patient.
Content provider: Biblioteka Nauki

Article

Title: Heterogeneous distance functions for prototype rules: influence of parameters on probability estimation
Authors: Blachnik, M.
Duch, W.
Wieczorek, T.
Subjects: prototype rules
probability estimation
heterogeneous distance functions
similarity-based methods
classification
data mining
Publisher: Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Links: https://bibliotekanauki.pl/articles/92882.pdf
Description: An interesting and little explored way to understand data is based on prototype rules (P-rules). The goal of this approach is to find optimal similarity (or distance) functions and positions of the prototypes to which unknown vectors are compared. In real applications, similarity functions frequently involve different types of attributes, such as continuous, discrete, binary or nominal. Heterogeneous distance functions that can handle such diverse information are usually based on probability distance measures, such as the Value Difference Metric (VDM). For continuous attributes, calculating probabilities requires estimating probability density functions. This process requires careful selection of several parameters that may have an important impact on the overall classification accuracy. In this paper, various heterogeneous distance functions based on the VDM measure are presented, among them some new heterogeneous distance functions based on different types of probability estimation. Results of many numerical experiments with such distance functions are presented on artificial and real datasets, and quite simple P-rules are extracted for several heterogeneous databases.
Content provider: Biblioteka Nauki

Article

Title: Application of selected supervised classification methods to bank marketing campaign
Authors: Grzonka, D.
Borowik, B.
Suchacka, G.
Subjects: classification
supervised learning
data mining
decision trees
bagging
boosting
random forests
bank marketing
R project
Publisher: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Wydawnictwo Szkoły Głównej Gospodarstwa Wiejskiego w Warszawie
Links: https://bibliotekanauki.pl/articles/94739.pdf
Description: Supervised classification covers a number of data mining methods based on training data. These methods have been successfully applied to solve complex multi-criteria classification problems in many domains, including economic issues. In this paper we discuss features of some supervised classification methods based on decision trees and apply them to the direct marketing campaign data of a Portuguese banking institution. We discuss and compare the following classification methods: decision trees, bagging, boosting, and random forests. The classification problem in our approach is defined in a scenario where a bank's clients make decisions about the activation of their deposits. The obtained results are used for evaluating the effectiveness of the classification rules.
Content provider: Biblioteka Nauki

Article

Title: Inefficiency of data mining algorithms and its architecture: with emphasis to the shortcoming of data mining algorithms on the output of the researches
Authors: Tesema, Workineh
Subjects: data mining
classification
clustering
association
regression
algorithms bottleneck
Publisher: Polskie Towarzystwo Promocji Wiedzy
Links: https://bibliotekanauki.pl/articles/118221.pdf
Description: This review paper presents shortcomings associated with data mining algorithms for classification, clustering, association and regression, which are widely used as tools in different research communities. Data mining research has successfully handled large datasets to solve problems. However, the increase in data sizes has created a bottleneck for algorithms that retrieve hidden knowledge from large volumes of data, and data mining algorithms have been unable to keep pace with that rate of growth. Data mining algorithms must be efficient and supported by a visual architecture in order to effectively extract information from huge amounts of data in many data repositories or in dynamic data streams. Data visualization researchers believe in the importance of giving users an overview of, and insight into, the data distributions. Combining these algorithms with a graphical interface makes it possible to navigate the complexity of statistical and data mining techniques to create powerful models. Therefore, there is an increasing need to understand the bottlenecks associated with data mining algorithms in modern architectures and in the research community. This review paper is intended to help researchers identify the shortcomings of data mining techniques in the domain area of the problems they will explore. It also points out research areas, particularly multimedia (where data can be sequential, audio signals, video signals, spatio-temporal, temporal, time series, etc.), in which data mining algorithms have not yet been used.
Content provider: Biblioteka Nauki

Article

Title: Selection of data mining method for multidimensional evaluation of the manufacturing process state
Authors: Rogalewicz, M.
Piłacińska, M.
Kujawińska, A.
Subjects: quality control
production process
data mining
method
classification
process state evaluation
data mining methods
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links: https://bibliotekanauki.pl/articles/407333.pdf
Description: The article deals with the issues involved in evaluating the process state on the basis of many measures, including process parameters, diagnostic signals and events occurring during the process. These measures, as well as the measurements traditionally used in the evaluation of process capability, offer a relevant source of information about the manufacturing process, and the authors attempt to ascertain the most suitable method, or group of methods, for using it. They present the main criteria for categorizing the methods of manufacturing process state evaluation and, among the methods identified, distinguish the traditional from the Data Mining methods. The authors then specify some basic requirements regarding the desired method or group of methods and focus on the classification problem. The methods are divided, classified and briefly described. Finally, the authors specify the criteria for selecting the Data Mining method type as the most appropriate for the evaluation of the manufacturing process state and, from within this type, propose the most suitable groups of methods. Some directions for further research are discussed at the end of the article.
Content provider: Biblioteka Nauki

Article

Title: A Data Mining Approach for Analysis of a Wire Electrical Discharge Machining Process
Authors: Dandge, Shruti Sudhakar
Chakraborty, Shankar
Subjects: wire electrical discharge machining
data mining
classification and regression tree
chi-squared
automatic interaction detection
classification
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links: https://bibliotekanauki.pl/articles/2023974.pdf
Description: Wire electrical discharge machining (WEDM) is a non-conventional material-removal process in which a continuously travelling, electrically conductive wire is used as an electrode to erode material from a workpiece. To explore its full machining potential, it is necessary to examine the effects of its varied input parameters on the responses and determine the best parametric setting. This paper proposes a parametric analysis of a WEDM process by applying non-parametric decision tree algorithms to a past experimental dataset. Two decision tree-based classification methods, i.e. classification and regression tree (CART) and Chi-squared automatic interaction detection (CHAID), are considered here as the data mining tools to examine the influences of six WEDM process parameters on four responses and to identify the most preferred parametric mix for achieving the desired response values. The developed decision trees identify pulse-on time as the most indicative WEDM process parameter, impacting almost all the responses. Furthermore, a comparative analysis of the classification performance of the CART and CHAID algorithms demonstrates the superiority of CART, with higher overall classification accuracy and lower prediction risk.
Content provider: Biblioteka Nauki

Article

Title: Comparative evaluation of the different data mining techniques used for the medical database
Authors: Kasperczuk, A.
Dardzińska, A.
Subjects: data mining
classification
WEKA
J48
MLP
apriori
association rules
medical knowledge base
classification algorithm
Publisher: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Links: https://bibliotekanauki.pl/articles/386432.pdf
Description: Data mining is an emerging research area for solving various problems. Classification and association discovery are two main steps in the field of data mining. In this paper, we use three classification algorithms available in the Weka interface: J48 (an open source Java implementation of the C4.5 algorithm), Multilayer Perceptron - MLP (a modification of the standard linear perceptron) and Naïve Bayes (based on Bayes' rule and a set of conditional independence assumptions). These classifiers were compared in order to choose the best algorithm for the voice disorders database. To find association rules in a transactional medical database, we first use the Apriori algorithm for frequent itemset mining. These two initial steps of analysis will help to create the medical knowledge base. The ultimate goal is to build a model that can improve the way existing and future data in the medical database are read and interpreted.
Content provider: Biblioteka Nauki

Article

Information

You are searching for the phrase "mining data" by criterion: Subject