Subject: MapReduce - Prolib Integro

Title:: Recommendation systems based on Slope One algorithm
Systemy rekomendacji oparte o algorytm Slope One
Authors:: Dudek, Łukasz
Description:: The thesis presents a set of improvements and optimizations of the Slope One algorithm. It describes how the algorithm can be implemented using the MapReduce approach, including an implementation in the Microsoft Azure environment using the Daytona library. The thesis also reports research on the algorithm's performance, which revealed problems with the classic implementation, and proposes modifications of the algorithm that avoid the cold-start problem.
Content provider:: Repozytorium Uniwersytetu Jagiellońskiego

Other

Title:: Big Data – znaczenie, zastosowania i rozwiązania technologiczne
Big Data – meaning, applications and technology solutions
Authors:: Racka, Katarzyna
Subjects:: Big Data
NoSQL
MapReduce
Hadoop
Publisher:: Mazowiecka Uczelnia Publiczna w Płocku
Links:: https://bibliotekanauki.pl/articles/446789.pdf
Description:: Big Data technologies and their application to business processes are growing rapidly. Analytical and consulting firms specializing in the strategic use of IT report that the number of companies implementing or planning to implement Big Data solutions increases every year. Many companies believe that the analysis of unstructured data will be the key to a deeper understanding of customer behavior. They consider analytics absolutely essential or very important for conducting the overall business strategy and improving operational results. The purpose of the article is to define Big Data and to explain what unstructured data are and how they can be applied. Furthermore, I present the results of reports on the implementation of Big Data technologies and discuss example technologies associated with Big Data.
Content provider:: Biblioteka Nauki

Article

Title:: Big data - modern challenges. Introduction to Hadoop platform
Big data - wyzwania współczesności. Wprowadzenie do platformy Hadoop
Authors:: Dudzik, Piotr
Description:: Introduction to Hadoop in HDInsight: Big-data analysis and processing in the cloud.
An introduction to the Hadoop platform, using the Microsoft Azure cloud as an example. Apache Hadoop is an open implementation of the MapReduce paradigm. It enables building distributed applications that perform computations on large amounts of data, following the principle that it is faster to move the program used for the computation than the data itself. Big Data refers to sets of information of large volume, high variability, or high variety that require new forms of processing to support decision-making, the discovery of new phenomena, and process optimization.
Content provider:: Repozytorium Uniwersytetu Jagiellońskiego

Other

Title:: MapReduce and semantics enabled event detection using social media
Authors:: Yan, P.
Subjects:: event detection
social media
semantic relatedness
MapReduce
Publisher:: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Links:: https://bibliotekanauki.pl/articles/91751.pdf
Description:: Social media is playing an increasingly important role in reporting major events happening in the world. However, detecting events from social media is challenging due to the huge magnitude of the data and the complex semantics of the language being processed. This paper proposes MASEED (MapReduce and Semantics Enabled Event Detection), a novel event detection framework that effectively addresses the following problems: 1) traditional data mining paradigms cannot work for big data; 2) data preprocessing requires significant human effort; 3) domain knowledge must be gained before the detection; 4) semantic interpretation of events is overlooked; 5) detection scenarios are limited to specific domains. In this work, we overcome these challenges by embedding semantic analysis into temporal analysis to capture the salient aspects of social media data, and by parallelizing the detection of potential events using the MapReduce methodology. We evaluate the performance of our method using real Twitter data. The results demonstrate that the proposed system outperforms most of the state-of-the-art methods in terms of accuracy and efficiency.
Content provider:: Biblioteka Nauki

Article

Title:: High Frequency Rule Synthesis in a Large Scale Multiple Database with MapReduce
Authors:: Bisoyi, Sudhanshu Shekhar
Mishra, Pragnyaban
Mishra, Saroja Nanda
Subjects:: multiple database
frequent itemset
association rule
rule synthesis
MapReduce
HDFS
Publisher:: Polska Akademia Nauk. Czytelnia Czasopism PAN
Links:: https://bibliotekanauki.pl/articles/2055260.pdf
Description:: Increasing development in information and communication technology leads to the generation of large amounts of data from various sources. The data collected from multiple sources grow exponentially and may not be structurally uniform. In general, they are heterogeneous and distributed across multiple databases. Because of the large volume, high velocity, and variety of the data, mining knowledge in this environment becomes a big data challenge. Distributed Association Rule Mining (DARM) in these circumstances becomes a tedious task for an effective global Decision Support System (DSS). DARM algorithms generate a large number of association rules and frequent itemsets in the big data environment. In this situation, synthesizing high-frequency rules from the big database becomes more challenging. Many algorithms for synthesizing association rules have been proposed in multiple database mining environments. They face enormous challenges in terms of high availability, scalability, efficiency, the high cost of storing and processing large intermediate results, and multiple redundant rules. In this paper, we first propose a model to collect data from multiple sources into a big data storage framework based on HDFS. Secondly, we propose a weighted multi-partitioned method for synthesizing high-frequency rules using the MapReduce programming paradigm. Experiments have been conducted in a parallel and distributed environment using commodity hardware. We ensure the efficiency, scalability, high availability, and cost-effectiveness of our proposed method.
Content provider:: Biblioteka Nauki

Article

Title:: Massive simulations using MapReduce model
Model MapReduce w wielokrotnych obliczeniach numerycznych
Authors:: Krupa, A.
Sawicki, B.
Subjects:: mapreduce
cloud computing
platform performance
hadoop
Publisher:: Politechnika Lubelska. Wydawnictwo Politechniki Lubelskiej
Links:: https://bibliotekanauki.pl/articles/952714.pdf
Description:: In the last few years, cloud computing has become a dominant solution for large-scale numerical problems. It is most often based on the MapReduce programming model, which provides high scalability and flexibility and also optimizes the cost of computing infrastructure. This paper studies the feasibility of the MapReduce model for scientific problems consisting of many independent simulations. An experiment based on variability analysis for a simple electromagnetic problem with over 10,000 scenarios shows that the platform scales nearly linearly and reaches over 80% of the theoretical maximum performance.
Content provider:: Biblioteka Nauki

Article

Title:: A survey of big data classification strategies
Authors:: Banchhor, Chitrakant
Srinivasu, N.
Subjects:: big data
data mining
MapReduce
classification
machine learning
evolutionary intelligence
deep learning
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Links:: https://bibliotekanauki.pl/articles/2050171.pdf
Description:: Big data nowadays plays a major role in finance, industry, medicine, and various other fields. In this survey, 50 research papers are reviewed with regard to the big data classification techniques presented and/or used in the respective studies. The classification techniques are categorized into machine learning, evolutionary intelligence, fuzzy-based approaches, deep learning, and so on. The research gaps and the challenges of big data classification faced by the existing techniques are also listed and described, which should help researchers enhance the effectiveness of their future work. The research papers are analyzed with respect to software tools, datasets used, publication year, classification techniques, and performance metrics. The survey concludes that the most frequently used big data classification methods are based on machine learning techniques and that the dataset apparently most commonly used for big data classification is the UCI repository dataset. The most frequently used performance metrics are accuracy and execution time.
Content provider:: Biblioteka Nauki

Article

Title:: Big problems with big data
Authors:: Goczyła, Krzysztof
Subjects:: big data
MapReduce
NoSQL database
data science
Publisher:: Politechnika Gdańska
Links:: https://bibliotekanauki.pl/articles/1954610.pdf
Description:: The article presents an overview of the most important issues related to the phenomenon called big data. The characteristics of big data concerning the data itself and the data sources are presented. Then, the big data life cycle concept is formulated. The next sections focus on two big data technologies: MapReduce for big data processing and NoSQL databases for big data storage.
Content provider:: Biblioteka Nauki

Article

Title:: Towards a flexible author name disambiguation framework
Authors:: Bolikowski, Łukasz
Dendek, Piotr Jan
Publisher:: Masaryk University Press
Citation:: Bolikowski, Łukasz & Dendek, Piotr. (2011). Towards a Flexible Author Name Disambiguation Framework. DML 2011 - Towards a Digital Mathematics Library, Proceedings. 27-37.
Description:: In this paper we propose a flexible, modular framework for author name disambiguation. Our solution consists of a core that orchestrates the disambiguation process and replaceable modules that perform concrete tasks. The approach is suitable for distributed computing; in particular, it maps well to the MapReduce framework. We describe each component in detail and discuss possible alternatives. Finally, we propose procedures for calibration and evaluation of the described system.
Competitiveness and Innovation Programme (Information and Communications Technologies Policy Support Programme, “Open access to scientific information”, Grant Agreement no. 250,503)
Łukasz Bolikowski
Content provider:: Repozytorium Centrum Otwartej Nauki

Article

Title:: Performance evaluation of MapReduce using full virtualisation on a departmental cloud
Authors:: González-Vélez, H.
Kontagora, M.
Subjects:: MapReduce
server virtualization
cloud computing
algorithmic skeletons
structured parallelism
parallel computing
Publisher:: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Links:: https://bibliotekanauki.pl/articles/907802.pdf
Description:: This work analyses the performance of Hadoop, an implementation of the MapReduce programming model for distributed parallel computing, executing on a virtualisation environment comprising 1+16 nodes running the VMware Workstation software. A set of experiments using the standard Hadoop benchmarks has been designed to determine whether significant reductions in the execution time of computations are experienced when using Hadoop on this virtualisation platform on a departmental cloud. Our findings indicate that a significant decrease in computing times is observed under these conditions. They also highlight how overheads and virtualisation in a distributed environment hinder the possibility of achieving the maximum (peak) performance.
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "MapReduce" by criterion: Subject