Subject: DBSCAN - Prolib Integro

Jump to item: 1.

Title: Application of Density Based Clustering to Microarray Data Analysis
Authors: Raczynski, L.
Wozniak, K.
Rubel, T.
Zaremba, K.
Subjects: microarrays
cluster analysis
DBSCAN
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links: https://bibliotekanauki.pl/articles/226804.pdf
Description: In just a few years, gene expression microarrays have become a standard experimental tool in biological and medical research. Microarray experiments are increasingly carried out to address a wide range of problems, including cluster analysis. Estimating the number of clusters in a dataset is one of the main problems in clustering microarray data. As a supplement to the existing methods, we suggest the use of the density-based clustering technique DBSCAN, which determines the number of clusters automatically. DBSCAN and other existing methods were compared using microarray data from two datasets used for the diagnosis of leukemia and lung cancer.
Content provider: Biblioteka Nauki

Article
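
The description above notes that DBSCAN determines the number of clusters by itself. A minimal sketch of the textbook algorithm in plain Python may make that concrete. It is not the code used in the article; the sample points and the `eps` and `min_pts` values are made up for illustration.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # a new cluster starts at every unvisited core point
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep growing
                queue.extend(j_neighbors)
    return labels

# Illustrative data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
n_clusters = len(set(labels) - {NOISE})
print(labels, n_clusters)
```

The cluster count is never passed in; it falls out of how many separate dense regions the loop discovers.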


Jump to item: 2.

Title: A new method for automatic determining of the DBSCAN parameters
Authors: Starczewski, Artur
Goetzen, Piotr
Er, Meng Joo
Subjects: clustering algorithms
DBSCAN
data mining
Publisher: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Links: https://bibliotekanauki.pl/articles/1837535.pdf
Description: Clustering is an attractive technique used in many fields to deal with large-scale data. Many clustering algorithms have been proposed so far. The most popular algorithms include density-based approaches, which can identify clusters of arbitrary shape in datasets. The most common of them is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The original DBSCAN algorithm has been widely applied and has many modifications. However, there is a fundamental issue of choosing its two input parameters correctly, i.e., the eps radius and the MinPts density threshold. The choice of these parameters is especially difficult when the density variation within clusters is significant. In this paper, a new method that determines the right values of the parameters for different kinds of clusters is proposed. The method detects sharp distance increases generated by a function that computes the distance between each element of a dataset and its k-th nearest neighbor. Experimental results have been obtained for several different datasets, and they confirm a very good performance of the newly proposed method.
Content provider: Biblioteka Nauki

Article
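
The description above relies on the distance from each element to its k-th nearest neighbor and on sharp increases in that function. The sketch below shows only the general idea and is not the authors' method: it sorts the k-distances and picks the value just before the largest consecutive jump. The helper names and the sample data are hypothetical.

```python
from math import dist

def k_distances(points, k):
    """Sorted distances from each point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result)

def eps_before_largest_jump(kdist):
    """Simplified heuristic: the k-distance just before the sharpest increase."""
    jumps = [b - a for a, b in zip(kdist, kdist[1:])]
    return kdist[jumps.index(max(jumps))]

# Illustrative data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
kdist = k_distances(points, k=2)
eps = eps_before_largest_jump(kdist)
print(kdist, eps)
```

A single global jump like this is exactly what becomes unreliable when density varies within clusters, which is the case the paper says it targets.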


Jump to item: 3.

Title: Segmentation of the melanoma lesion and its border
Authors: Surówka, Grzegorz
Ogorzałek, Maciej
Description: Segmentation of the border of human pigmented lesions has a direct impact on the diagnosis of malignant melanoma. In this work, we examine the performance of (i) morphological segmentation of a pigmented lesion by region growing with an adaptive threshold and by the density-based DBSCAN clustering algorithm, and (ii) morphological segmentation of the pigmented lesion border by region growing of the lesion and the background skin. Research tasks (i) and (ii) are evaluated by a human expert and tested on two data sets, A and B, of different origins, resolution, and image quality. The preprocessing step consists of removing the black frame around the lesion and reducing noise and artifacts. The halo is removed by cutting out the dark circular region and filling it with an average skin color. Noise is reduced by a family of Gaussian filters, from 3×3 to 7×7, to improve the contrast and smooth out possible distortions. Some other filters are also tested. Artifacts such as dark thick hair or ruler/ink markers are removed from the images by using the DullRazor closing images for all RGB colors for a hair brightness threshold below a value of 25 or, alternatively, by the BTH transform. For the segmentation, the JFIF luminance representation is used. In analysis (i), a lesion segmentation mask is produced from each dermoscopy image. For region growing, we get a sensitivity of 0.92/0.85, a precision of 0.98/0.91, and a border error of 0.08/0.15 for data sets A/B, respectively. For the density-based DBSCAN algorithm, we get a sensitivity of 0.91/0.89, a precision of 0.95/0.93, and a border error of 0.09/0.12 for data sets A/B, respectively. In analysis (ii), a series of lesion, background, and border segmentation images is derived from each dermoscopy image. We get a sensitivity of about 0.89, a specificity of 0.94 and an accuracy of 0.91 for data set A, and a sensitivity of about 0.85, a specificity of 0.91 and an accuracy of 0.89 for data set B. Our analyses show that the improved methods of region growing and density-based clustering, performed after proper preprocessing, may be good tools for computer-aided melanoma diagnosis.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Article
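
The description above reports sensitivity, precision, specificity and accuracy for segmentation masks. As a hedged illustration, the sketch below computes these four scores from two binary masks using their standard pixel-wise definitions. The tiny 4×4 masks are invented, and the border error is left out because the description does not define it.

```python
def confusion_counts(predicted, truth):
    """Pixel-wise (tp, fp, fn, tn) for two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def segmentation_scores(predicted, truth):
    """Standard definitions; assumes both classes occur in both masks."""
    tp, fp, fn, tn = confusion_counts(predicted, truth)
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical expert mask and automatic mask (1 = lesion, 0 = skin).
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 1],
             [0, 0, 1, 0],
             [0, 0, 0, 0]]
print(segmentation_scores(predicted, truth))
```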


Jump to item: 4.

Title: Clustering based on poverty indicator data using K-Means cluster with Density-Based Spatial Clustering of Application with Noise
Authors: Rasyid, Sapriadi
Siswanto, Siswanto
Sahriman, Sitti
Subjects: Cluster
DBSCAN
poverty
K-Means
Silhouette Coefficient
Publisher: Główny Urząd Statystyczny
Links: https://bibliotekanauki.pl/articles/61791783.pdf
Description: The Indonesian government has implemented poverty alleviation programs, including assistance programs for the poor. Despite these efforts, the number of impoverished individuals in South Sulawesi continues to rise. To address this issue, a statistical method is needed to cluster the poor based on error indicators for each region, serving as a reference for providing assistance. The appropriate statistical method is cluster analysis, which minimizes differences between objects within one cluster and maximizes differences between clusters. This study employs two methods, K-Means and Density-Based Spatial Clustering of Application with Noise (DBSCAN), and compares their effectiveness based on the Silhouette Coefficient. The data used for the analysis included eight poverty indicators for the South Sulawesi province in 2022. The K-Means method yielded two optimal clusters, with cluster 1 comprising 23 regencies and cities and cluster 2 comprising only Makassar City. Further analysis of cluster 1 yielded eight new clusters and a Silhouette Coefficient of 0.507. In contrast, the DBSCAN method yielded one cluster encompassing 23 regencies and cities, with Makassar City identified as noise. Further analysis of that cluster yielded one cluster with three noise points and a Silhouette Coefficient of 0.318. The study concludes that K-Means provides a higher Silhouette Coefficient and a more accurate representation of poverty clusters in South Sulawesi, which makes it a more effective tool for targeted poverty alleviation efforts.
Content provider: Biblioteka Nauki

Article
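
The comparison described above rests on the Silhouette Coefficient. A minimal sketch of its standard definition follows, using Euclidean distance and toy one-dimensional data rather than the poverty indicators from the study. It treats every distinct label as a cluster, so how DBSCAN noise points should be handled is left as an open choice.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two distinct labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [(0,), (1,), (10,), (11,)]
print(silhouette(points, [0, 0, 1, 1]))  # well separated grouping
print(silhouette(points, [0, 1, 0, 1]))  # poor grouping
```

Values close to 1 indicate compact, well separated clusters; negative values indicate points that sit closer to another cluster than to their own.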


Jump to item: 5.

Title: Research on ship trajectory extraction based on multiattribute DBSCAN optimisation algorithm
Authors: Xu, Xiaofeng
Cui, Deqaing
Li, Yun
Xiao, Yingjie
Subjects: clustering algorithm
abnormal route
DBSCAN
feature trajectory extraction
fitting analysis
Publisher: Politechnika Gdańska. Wydział Inżynierii Mechanicznej i Okrętownictwa
Links: https://bibliotekanauki.pl/articles/1551877.pdf
Description: With the vigorous development of maritime traffic, the importance of maritime navigation safety is increasing day by day. Ship trajectory extraction and analysis play an important role in ensuring navigation safety. At present, the DBSCAN (density-based spatial clustering of applications with noise) algorithm is the most common method in research on ship trajectory extraction, but it has shortcomings such as missing ship trajectories during trajectory division. The improved multi-attribute DBSCAN algorithm avoids trajectory division and greatly reduces the probability of missing sub-trajectories. By introducing the position, speed and heading of the ship track point, dividing the complex water area and vectorising the ship track, track integrity can be guaranteed and a better ship clustering effect can be achieved. The results show that the cluster fitting effect reaches up to 99.83%, which indicates that the multi-attribute DBSCAN algorithm and the cluster analysis algorithm have higher reliability and provide better theoretical guidance for the analysis of abnormal ship behaviour.
Content provider: Biblioteka Nauki

Article
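
The description above says that position, speed and heading of each track point are taken into account. One way such attributes could be folded into a single distance for a density-based method is sketched below. The weighted-sum form, the weights and the function names are assumptions made for illustration, not the formula from the article.

```python
from math import hypot

def heading_difference(h1, h2):
    """Smallest angle between two headings in degrees, in [0, 180]."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)

def multi_attribute_distance(a, b, w_pos=1.0, w_speed=1.0, w_heading=1.0):
    """a and b are (x, y, speed, heading_deg); the weights are illustrative."""
    position = hypot(a[0] - b[0], a[1] - b[1])
    speed = abs(a[2] - b[2])
    heading = heading_difference(a[3], b[3]) / 180.0  # scaled to [0, 1]
    return w_pos * position + w_speed * speed + w_heading * heading

# Two hypothetical track points: headings 350 and 10 degrees are only
# 20 degrees apart once the wrap-around at 360 is taken into account.
p = (0.0, 0.0, 10.0, 350.0)
q = (3.0, 4.0, 12.0, 10.0)
print(heading_difference(p[3], q[3]), multi_attribute_distance(p, q))
```

With a distance like this, two vessels at the same position but sailing in opposite directions are no longer treated as neighbors, which a purely positional distance cannot express.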


Jump to item: 6.

Title: GrDBSCAN: A granular density-based clustering algorithm
Authors: Suchy, Dawid
Siminski, Krzysztof
Subjects: granular computing
DBSCAN
clustering algorithm
GrDBSCAN
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Links: https://bibliotekanauki.pl/articles/15548018.pdf
Description: Density-based spatial clustering of applications with noise (DBSCAN) is a commonly known and used algorithm for data clustering. It applies a density-based approach and can produce clusters of any shape. However, it has a drawback: its worst-case computational complexity is O(n²) with regard to the number of data items n. The paper presents GrDBSCAN, a granular modification of DBSCAN with reduced complexity. The proposed GrDBSCAN first granulates data into fuzzy granules and then runs density-based clustering on the resulting granules. The complexity of GrDBSCAN is linear with regard to the input data size and higher only with regard to the number of granules. That number is, however, a parameter of the GrDBSCAN algorithm and is (significantly) lower than the number of input data items. This results in a shorter clustering time than in the case of DBSCAN. The paper is accompanied by numerical experiments. The implementation of GrDBSCAN is freely available from a public repository.
Content provider: Biblioteka Nauki

Article
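
The description above outlines a two-stage idea: reduce the data to a small number of granules in linear time, then run the quadratic step on the granules only. The sketch below illustrates just the first stage with crisp square grid cells, which is a deliberate simplification and not the fuzzy granulation used by GrDBSCAN. The function name, cell size and data are hypothetical.

```python
from collections import defaultdict

def granulate(points, cell_size):
    """Group 2-D points into square grid cells; return {cell: centroid}.

    One pass over the data, so the cost is linear in the number of points.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    return {
        cell: (sum(p[0] for p in members) / len(members),
               sum(p[1] for p in members) / len(members))
        for cell, members in cells.items()
    }

points = [(0.1, 0.2), (0.3, 0.4), (0.9, 0.8), (5.1, 5.2), (5.3, 5.4)]
granules = granulate(points, cell_size=1.0)
print(len(points), "points ->", len(granules), "granules")
```

Any pairwise-distance step that follows then operates on the granule centroids, whose count is controlled by the cell size rather than by the size of the input.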


Jump to item: 7.

Title: Implementation and evaluation of clustering algorithm for streaming data
Implementacja i ewaluacja algorytmu strumieniowej klasteryzacji danych
Authors: Grochal, Anna
Description: The problem addressed in this thesis is how to implement and evaluate an algorithm that achieves accurate results for streaming data clustering. When working with data streams, clusters may migrate over time, overlap, and then return to their previous positions. New clusters may appear and others may fade. The proposed algorithm should provide a way to remember previously detected clusters so that they are recognized as separate clusters throughout their journey. The implemented algorithm uses DBSCAN in the offline phase. The thesis explains what clustering is, gives some examples of batch clustering algorithms, and describes what data streams are and what challenges arise when clustering streaming data. The following chapters explain the idea of the new algorithm, DBSCANStream, and present its code and examples of clustering with it. The algorithm was evaluated using the Rand Index and V-measure metrics.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Other
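
The description above names the Rand Index as one of the evaluation metrics. A small sketch of its standard pair-counting definition is given here for orientation. It is unrelated to the thesis code, and the label lists are invented.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair agrees when both labelings put it in the same cluster,
    or both put it in different clusters.
    """
    agree = total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        agree += same_true == same_pred
        total += 1
    return agree / total

print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, renamed labels
print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))  # one cluster split in two
```

Because only pair membership is compared, the score does not depend on which numeric label each cluster happens to receive.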


Jump to item: 8.

Title: Automatic clustering of unidentified data using unsupervised machine learning algorithms
Automatyczna klasteryzacja niezidentyfikowanych danych z wykorzystaniem algorytmów nienadzorowanego uczenia maszynowego
Authors: Mełech, Małgorzata
Description: The research problem described in the thesis is the clustering of data that lack labels. The aim was to analyze cluster analysis techniques based on four different algorithms: k-means, hierarchical clustering, DBSCAN and meanshift. In addition, a web application was created as part of the thesis to cluster unidentified data automatically using these unsupervised machine learning algorithms. The first chapter is an introduction that gives an overview of the thesis and the thesis project, together with their primary objectives. It also presents the motivation for choosing the topic and the possible uses of the thesis project. The second chapter describes in detail all issues related to the topic. Among other things, it defines unsupervised machine learning. It characterizes the process of data preparation, especially scaling and standardization of data, and the methods that can be applied when working with sets that have missing values. It also describes dimensionality reduction and defines the metrics used in the cluster analysis process. The individual clustering algorithms are then described in turn. The section on the k-means method additionally presents auxiliary methods for determining the appropriate number of clusters and the k-means++ method, which makes it possible to obtain more favorable and correct clustering results than the classic k-means algorithm. The next section describes the hierarchical method, especially its agglomerative approach to clustering. The sections on DBSCAN and meanshift give an overview of how each algorithm works, as well as the most important parameters and issues for each method. Each description is additionally enriched with visualizations. The third chapter describes the detailed assumptions for the thesis project and presents UML diagrams depicting the flow of data in the application and the actions available to the user. The fourth chapter describes the structure of the service: it presents the main page of the web application and the subsequent individual actions of the thesis project, along with a description of the available functions and choices. It also describes in detail the sources and tools used in developing the application. The last chapter presents the results and conclusions of the research work. The presented methods differ in their approach to clustering, so different clustering results can be obtained on one and the same set. The resulting web application can facilitate the clustering of different datasets, and by selecting appropriate parameters, a suitable cluster analysis result can be obtained.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Other
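
The description above singles out scaling and standardization as part of data preparation. For distance-based methods such as k-means or DBSCAN this matters because a feature with a large numeric range would otherwise dominate the distance. A minimal z-score sketch follows. It is an illustration only, not code from the thesis, and the sample rows are made up.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its std."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has std 0; dividing by 1.0 instead leaves it at 0.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# The second feature is 100 times larger than the first before scaling.
rows = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
for row in standardize(rows):
    print(row)
```

After standardization both columns carry the same values, so neither dominates a Euclidean distance.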


Jump to item: 9.

Title: Segmentation of the melanoma lesion and its border
Authors: Surówka, Grzegorz
Ogorzałek, Maciej
Subjects: computer aided diagnosis
DBSCAN
malignant melanoma
region growing
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Links: https://bibliotekanauki.pl/articles/2172123.pdf
Description: Segmentation of the border of human pigmented lesions has a direct impact on the diagnosis of malignant melanoma. In this work, we examine the performance of (i) morphological segmentation of a pigmented lesion by region growing with an adaptive threshold and by the density-based DBSCAN clustering algorithm, and (ii) morphological segmentation of the pigmented lesion border by region growing of the lesion and the background skin. Research tasks (i) and (ii) are evaluated by a human expert and tested on two data sets, A and B, of different origins, resolution, and image quality. The preprocessing step consists of removing the black frame around the lesion and reducing noise and artifacts. The halo is removed by cutting out the dark circular region and filling it with an average skin color. Noise is reduced by a family of Gaussian filters, from 3×3 to 7×7, to improve the contrast and smooth out possible distortions. Some other filters are also tested. Artifacts such as dark thick hair or ruler/ink markers are removed from the images by using the DullRazor closing images for all RGB colors for a hair brightness threshold below a value of 25 or, alternatively, by the BTH transform. For the segmentation, the JFIF luminance representation is used. In analysis (i), a lesion segmentation mask is produced from each dermoscopy image. For region growing, we get a sensitivity of 0.92/0.85, a precision of 0.98/0.91, and a border error of 0.08/0.15 for data sets A/B, respectively. For the density-based DBSCAN algorithm, we get a sensitivity of 0.91/0.89, a precision of 0.95/0.93, and a border error of 0.09/0.12 for data sets A/B, respectively. In analysis (ii), a series of lesion, background, and border segmentation images is derived from each dermoscopy image. We get a sensitivity of about 0.89, a specificity of 0.94 and an accuracy of 0.91 for data set A, and a sensitivity of about 0.85, a specificity of 0.91 and an accuracy of 0.89 for data set B. Our analyses show that the improved methods of region growing and density-based clustering, performed after proper preprocessing, may be good tools for computer-aided melanoma diagnosis.
Content provider: Biblioteka Nauki

Article


Jump to item: 10.

Title: Clustering algorithms in scikit library
Algorytmy klastrowania w bibliotece scikit
Authors: Kaczyńska, Aneta
Description: The aim of this paper is to describe in detail several clustering algorithms and their implementations in the scikit-learn library. After a brief explanation of what cluster analysis is and a few examples of its use, I thoroughly describe five chosen algorithms: k-means, agglomerative clustering, DBSCAN, OPTICS and spectral clustering. For each of them, I discuss the underlying idea, present how it works step by step, and describe its implementation in scikit-learn. Finally, I compare my own implementations with the scikit-learn implementations on toy data sets and assess how well the algorithms performed on real data.
Content provider: Repozytorium Uniwersytetu Jagiellońskiego

Other


Information

You are searching for the phrase "DBSCAN" by criterion: Subject