Subject: Clustering - Prolib Integro

Title:: A new method for automatic determining of the DBSCAN parameters
Authors:: Starczewski, Artur
Goetzen, Piotr
Er, Meng Joo
Subjects:: clustering algorithms
DBSCAN
data mining
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Links:: https://bibliotekanauki.pl/articles/1837535.pdf
Description:: Clustering is an attractive technique used in many fields to deal with large-scale data. Many clustering algorithms have been proposed so far. The most popular algorithms include density-based approaches. These kinds of algorithms can identify clusters of arbitrary shapes in datasets. The most common of them is the Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The original DBSCAN algorithm has been widely applied in various applications and has many different modifications. However, there is a fundamental issue of the right choice of its two input parameters, i.e., the eps radius and the MinPts density threshold. The choice of these parameters is especially difficult when the density variation within clusters is significant. In this paper, a new method that determines the right values of the parameters for different kinds of clusters is proposed. This method uses detection of sharp distance increases generated by a function which computes a distance between each element of a dataset and its k-th nearest neighbor. Experimental results have been obtained for several different datasets and they confirm a very good performance of the newly proposed method.
Content provider:: Biblioteka Nauki

Article
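The description above mentions a function that gives the distance from each element to its k-th nearest neighbor, and the detection of sharp increases in those distances. A minimal sketch of that general idea follows. The function names, the toy data, and the "largest single jump" rule are assumptions made for illustration; the paper's actual detection rule is not given in this record and is not reproduced here.

```python
import math


def k_dist(points, k):
    """Return the ascending list of distances from each point to its k-th nearest neighbor."""
    dists = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dists.append(others[k - 1])
    return sorted(dists)


def eps_from_sharpest_increase(sorted_kdist):
    """Return the k-dist value just before the largest jump in the sorted curve.

    Illustrative heuristic only; it stands in for whatever rule the paper uses.
    """
    jumps = [b - a for a, b in zip(sorted_kdist, sorted_kdist[1:])]
    i = max(range(len(jumps)), key=jumps.__getitem__)
    return sorted_kdist[i]


# Toy data (hypothetical): two tight groups plus one distant outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (50.0, 50.0)]
curve = k_dist(points, k=2)
eps = eps_from_sharpest_increase(curve)
print(eps)  # 1.0
```

In this toy case every clustered point has its second-nearest neighbor at distance 1.0, while the outlier's is above 55, so the jump separates the two regimes and the candidate eps is 1.0.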

Title:: Ant colony metaphor in a new clustering algorithm
Authors:: Boryczka, U.
Subjects:: data mining
cluster analysis
ant clustering algorithm
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Links:: https://bibliotekanauki.pl/articles/969824.pdf
Description:: Among the many bio-inspired techniques, ant clustering algorithms have received special attention, especially because they still require much investigation to improve performance, stability and other key features that would make such algorithms mature tools for data mining. Clustering with swarm-based algorithms is emerging as an alternative to more conventional clustering methods, such as the k-means algorithm. The proposed approach mimics the clustering behavior observed in real ant colonies. As a case study, this paper focuses on the behavior of clustering procedures in this new approach. The proposed algorithm is evaluated on a number of well-known benchmark data sets. Empirical results clearly show that the ant clustering algorithm (ACA) performs well when compared to other techniques.
Content provider:: Biblioteka Nauki

Article

Title:: A novel grid-based clustering algorithm
Authors:: Starczewski, Artur
Scherer, Magdalena M.
Książek, Wojciech
Dębski, Maciej
Wang, Lipo
Subjects:: data mining
grid-based clustering
grid structure
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Links:: https://bibliotekanauki.pl/articles/2031101.pdf
Description:: Data clustering is an important method used to discover naturally occurring structures in datasets. One of the most popular approaches is the grid-based concept of clustering algorithms. This kind of method is characterized by a fast processing time and it can also discover clusters of arbitrary shapes in datasets. These properties allow these methods to be used in many different applications. Researchers have created many versions of the clustering method using the grid-based approach. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm which uses a method for automatically determining the number of grid cells. This method is based on the kdist function, which computes the distance between each element of a dataset and its k-th nearest neighbor. Experimental results have been obtained for several different datasets and they confirm a very good performance of the newly proposed method.
Content provider:: Biblioteka Nauki

Article
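To make the grid-based idea in the description above concrete, here is a rough sketch of a generic grid clustering pass for 2-D points: points are binned into cells, sufficiently populated cells are kept, and adjacent kept cells are merged by flood fill. The function name, the `cells_per_dim` and `min_points` parameters, and the 8-neighbor adjacency are assumptions for illustration. The number of cells is passed in by hand here, whereas the paper derives it automatically from the kdist function, which this sketch does not attempt.

```python
from collections import defaultdict, deque


def grid_cluster(points, cells_per_dim, min_points=1):
    """Label 2-D points by flood-filling adjacent, sufficiently populated grid cells."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Cell side lengths; fall back to 1.0 if all points share a coordinate.
    wx = (max(xs) - x0) / cells_per_dim or 1.0
    wy = (max(ys) - y0) / cells_per_dim or 1.0

    def cell_of(p):
        # Clamp so that points on the upper boundary fall into the last cell.
        cx = min(int((p[0] - x0) / wx), cells_per_dim - 1)
        cy = min(int((p[1] - y0) / wy), cells_per_dim - 1)
        return cx, cy

    members = defaultdict(list)
    for idx, p in enumerate(points):
        members[cell_of(p)].append(idx)
    dense = {c for c, idxs in members.items() if len(idxs) >= min_points}

    labels = [-1] * len(points)  # -1 marks points in sparse cells (treated as noise)
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            # Visit the 8 neighboring cells (and the cell itself, which is already labeled).
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in cell_label:
                        cell_label[nb] = next_label
                        queue.append(nb)
        next_label += 1

    for c, lab in cell_label.items():
        for idx in members[c]:
            labels[idx] = lab
    return labels


# Toy data (hypothetical): two well-separated groups of four points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
labels = grid_cluster(points, cells_per_dim=4)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

The result depends strongly on `cells_per_dim`: too few cells merge distinct groups, too many fragment them, which is the parameter-selection problem the description refers to.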

Title:: Supporting investment decisions using data mining methods
Authors:: Sysiak, W.
Trajer, J.
Janaszek, M.
Subjects:: data mining
decision support
k-means clustering
neural networks
Publisher:: Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Links:: https://bibliotekanauki.pl/articles/93017.pdf
Description:: This paper presents an application of k-means clustering in the preliminary data analysis that preceded the choice of input variables for a system supporting decisions about stock purchase or sale on capital markets. The model forecasting the prices of shares issued by companies in the food-processing sector quoted on the Warsaw Stock Exchange was created in STATISTICA 7.1. It was based on neural modeling; it allowed the direction of changes in securities values (increase, decrease) to be assessed and generated a quantitative forecast of their future price.
Content provider:: Biblioteka Nauki

Article

Title:: Data mining tasks and methods – implementations in R
Authors:: Figielska, Ewa
Subjects:: data mining
R programming language
classification
prediction
clustering
association
Publisher:: Warszawska Wyższa Szkoła Informatyki
Links:: https://bibliotekanauki.pl/articles/1397482.pdf
Description:: The aim of the paper is to present how some data mining tasks can be solved using the R programming language. Full R scripts are provided for preparing data sets, solving the tasks and analyzing the results.
Content provider:: Biblioteka Nauki

Article

Title:: Life Insurance Customers segmentation using fuzzy clustering
Authors:: Jandaghi, Gholamreza
Moazzez, Hashem
Moradpour, Zahra
Subjects:: Market segmentation
customer segmentation
data mining
fuzzy clustering
life insurance
Publisher:: Przedsiębiorstwo Wydawnictw Naukowych Darwin / Scientific Publishing House DARWIN
Links:: https://bibliotekanauki.pl/articles/1193938.pdf
Description:: One of the important issues in service organizations is to identify the customers, understand their differences, and rank them. Recently, customer value as a quantitative parameter has been used for segmenting customers. A practical solution for analytical development is using analytical techniques such as dynamic clustering algorithms and programs to explore the dynamics in consumer preferences. The aim of this research is to understand current customer behavior and suggest a suitable policy for new customers in order to attain the highest benefits and customer satisfaction. To identify such a market among life insurance customers, we used the FKM.pf.noise fuzzy clustering technique to classify the customers based on demographic and behavioral data of 1071 people in the period April to October 2014. Results show that the optimal number of clusters is 3. These three clusters can be named: investment, security of life, and a combination of both. Some suggestions are presented to improve the performance of the insurance company.
Content provider:: Biblioteka Nauki

Article

Title:: Możliwości zastosowania data mining w sektorze opieki zdrowotnej [Possibilities of applying data mining in the health care sector]
Authors:: Sala, Karolina
Selwon, Adam
Subjects:: health care
data mining
data analysis
clustering
regression
prevention of errors
Publisher:: Instytut Studiów Międzynarodowych i Edukacji Humanum
Links:: https://bibliotekanauki.pl/articles/2148180.pdf
Description:: Health care is a dynamically developing sector of the economy, which generates a large amount of useful data about the health of the inhabitants of the country and of individual regions. These include information on the incidence of selected diseases, data on medical facilities and employees, as well as expenditure on health care. In recent years, many scientific articles about data mining in health care have been published. This article presents a review of the literature on health analytics and the data mining techniques used in this field. Based on the information gathered, the current developments in this field and possibilities that can be used in the future are indicated.
Content provider:: Biblioteka Nauki

Article

Title:: Abridged Symbolic Representation of Time Series for Clustering
Skrócona reprezentacja symboliczna szeregów czasowych dla analizy skupień
Authors:: Korzeniewski, Jerzy
Subjects:: cluster analysis
data mining
clustering
time series
symbolic representation
Publisher:: Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Links:: https://bibliotekanauki.pl/articles/658783.pdf
Description:: In recent years, several methods for the symbolic representation of time series have been introduced or developed. This activity is mainly justified by practical considerations such as memory savings or fast database searching. However, some results suggest that, for time series clustering, symbolic representation can even improve the clustering results. The article proposes a new algorithm for the abridged symbolic representation of time series, with an emphasis on efficient time series clustering. The proposal is based on the PAA (piecewise aggregate approximation) technique followed by segmentwise correlation analysis. The primary goal of the article is to improve the PAA technique with respect to subsequent time series clustering (its speed and quality). We also tried to answer the following questions: Is the task of clustering time series in their original form reasonable? How much memory can we save using the new algorithm? The efficiency of the new algorithm was investigated on empirical time series data sets. The results show that the new proposal is quite effective while requiring very little parameter setting from the user.
Content provider:: Biblioteka Nauki

Article
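The description above builds on PAA (piecewise aggregate approximation). As a reminder of what plain PAA does, here is a minimal sketch: the series is cut into equal-width frames and each frame is replaced by its mean. The function name and toy series are assumptions for illustration, and the sketch requires the series length to be a multiple of the number of segments. The segmentwise correlation analysis proposed in the article is not reproduced.

```python
def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each of `segments` equal-width frames."""
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("this sketch assumes len(series) is a positive multiple of segments")
    width = n // segments
    return [sum(series[i:i + width]) / width for i in range(0, n, width)]


# Toy series (hypothetical): 8 values reduced to 4 frame means.
series = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 11.0, 13.0]
reduced = paa(series, 4)
print(reduced)  # [2.0, 3.0, 11.0, 12.0]
```

Storing 4 values instead of 8 halves the memory for this series, which is the kind of saving the description asks about; how much clustering quality survives the reduction is the question the article investigates.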

Title:: Center-based l₁-clustering method
Authors:: Sabo, K.
Subjects:: l1 clustering
data mining
optimization
weighted median problem
clustering method
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Links:: https://bibliotekanauki.pl/articles/330910.pdf
Description:: In this paper, we consider the l₁-clustering problem for a finite data-point set which should be partitioned into k disjoint nonempty subsets. In that case, the objective function does not have to be either convex or differentiable, and generally it may have many local or global minima. Therefore, it becomes a complex global optimization problem. A method of searching for a locally optimal solution is proposed in the paper, the convergence of the corresponding iterative process is proved and the corresponding algorithm is given. The method is illustrated and compared with some other clustering methods, especially with the l₂-clustering method, which is also known in the literature as a smooth k-means method, in a few typical situations, such as the presence of outliers among the data and the clustering of incomplete data. Numerical experiments show in this case that the proposed l₁-clustering algorithm is faster and gives significantly better results than the l₂-clustering algorithm.
Content provider:: Biblioteka Nauki

Article
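For a fixed group of points, the center that minimizes the sum of l₁ distances is the coordinate-wise median, which is why outliers pull an l₁ center much less than a mean. The sketch below is a plain alternating scheme (assign each point to its nearest center in l₁ distance, then move each center to the coordinate-wise median), often called k-medians. It is an assumption-laden illustration with hypothetical names and data, not the algorithm or convergence analysis from the paper.

```python
from statistics import median


def l1(p, q):
    """l1 (Manhattan) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_medians(points, centers, iterations=10):
    """Alternate l1 assignment and coordinate-wise median updates."""
    centers = [tuple(c) for c in centers]
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: l1(p, centers[j]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        new_centers = [
            tuple(median(coord) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Toy data (hypothetical): the second group contains an outlier at (40, 40).
points = [(0, 0), (1, 1), (2, 2), (10, 10), (11, 11), (12, 12), (40, 40)]
centers, groups = k_medians(points, centers=[(0, 0), (10, 10)])
print(centers)  # [(1, 1), (11.5, 11.5)]
```

The second center lands at 11.5 per coordinate, whereas the mean of that group would be 18.25, which illustrates the robustness to outliers that the description attributes to the l₁ formulation.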

Title:: Data mining
Authors:: Morzy, Tadeusz
Subjects:: data mining
data analysis
evolution of information technology
association analysis
classification
clustering
Web mining
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links:: https://bibliotekanauki.pl/articles/703139.pdf
Description:: Recent advances in data capture, data transmission and data storage technologies have resulted in a growing gap between more powerful database systems and users' ability to understand and effectively analyze the information collected. Many companies and organizations gather gigabytes or terabytes of business transactions, scientific data, web logs, satellite pictures and text reports, which are simply too large and too complex to support a decision-making process. Traditional database and data warehouse querying models are not sufficient to extract trends, similarities and correlations hidden in very large databases. The value of the existing databases and data warehouses can be significantly enhanced with the help of data mining. Data mining is a new research area which aims at the nontrivial extraction of implicit, previously unknown and potentially useful information from large databases and data warehouses. Data mining, also referred to as database mining or knowledge discovery in databases, can help answer business questions that were too time-consuming to resolve with traditional data processing techniques. The process of mining the data can be perceived as a new way of querying – with questions such as "which clients are likely to respond to our next promotional mailing, and why?". The aim of this paper is to present an overall picture of the data mining field and to briefly present a few data mining methods. Finally, we summarize the concepts presented in the paper and discuss some problems related to data mining technology.
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "Clustering" by criterion: Subject