Subject: OpenCL - Prolib Integro

Title:: 3D objects cut effect in real-time
Efekt cięcia obiektów trójwymiarowych w czasie rzeczywistym
Authors:: Szlęk, Maciej
Description:: The aim of this study was to develop and implement an algorithm for arbitrary (non-predefined) real-time cutting of three-dimensional objects represented as triangle meshes. Because this effect is used mainly in video games, one of the basic assumptions was that the operation should take as little time as possible. For that reason, and because of the parallel nature of the problem itself, additional attempts were made to move all possible computations to graphics processors (GPUs).
Content provider:: Repozytorium Uniwersytetu Jagiellońskiego

Other

Title:: Performance comparison of CUDA and OpenCL platforms
Porównanie wydajności środowisk CUDA i OpenCL
Authors:: Ewak, Grzegorz
Description:: The goal of this master's thesis was to compare the performance of the CUDA and OpenCL platforms. A series of benchmarks was implemented whose results are the latencies and throughputs of basic arithmetic operations and mathematical functions. The thesis presents the benchmark results together with their analysis.
Content provider:: Repozytorium Uniwersytetu Jagiellońskiego

Other

Title:: Parallel implementation of neural networks with the use of GPGPU technology OpenCL
Authors:: Kłyś, M.
Szymczyk, M.
Szymczyk, P.
Gajer, M.
Subjects:: OpenCL
Artificial Neural Networks
GPGPU
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Related links:: https://bibliotekanauki.pl/articles/114679.pdf
Description:: The article discusses the possibilities of implementing a neural network in a parallel way. The implementation issues are illustrated with the example of a non-linear neural network. The parallel implementation of this network is written with the OpenCL library, which is representative of software supporting general-purpose computing on graphics processing units (GPGPU). The results demonstrate that some groups of algorithms can be computed faster if they are implemented in a parallel way and run on a multi-core processor (CPU) or a graphics processing unit (GPU). On a GPU, the implemented algorithm should be divided into many threads in order to perform computations faster than on a multi-core CPU. In general, computations should be performed on a GPU when a large amount of data has to be processed with an algorithm that is very well suited to parallel implementation.
Content provider:: Biblioteka Nauki

Article

Title:: Assessment of various GPU acceleration strategies in text categorization processing flow
Authors:: Korduła, Ł.
Wielgosz, M.
Karwatowski, M.
Pietroń, M.
Żurek, D.
Wiatr, K.
Subjects:: GPU
NLP
text categorization
OpenCL
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Related links:: https://bibliotekanauki.pl/articles/114132.pdf
Description:: Automatic text categorization presents many difficulties. Modern algorithms are getting better at extracting meaningful information from human language. However, they often significantly increase the complexity of computations. This increased demand for computational capabilities can be met by using hardware accelerators such as general-purpose graphics cards. In this paper we present a full processing flow for a document categorization system. Gram-Schmidt process signatures calculation up to 12-fold decrease in computing time of system components.
Content provider:: Biblioteka Nauki

Article

Title:: Design Exploration of AES Accelerators on FPGAs and GPUs
Authors:: Conti, V.
Vitabile, S.
Subjects:: AES
accelerators
FPGA prototyping
GPGPU
OpenCL
Publisher:: Instytut Łączności - Państwowy Instytut Badawczy
Related links:: https://bibliotekanauki.pl/articles/308701.pdf
Description:: Embedded systems are increasingly becoming a key technological component of all kinds of complex technical systems, and an exhaustive analysis of the state of the art of current performance with respect to architectures, design methodologies, testing and applications could be very interesting. The Advanced Encryption Standard (AES), based on the well-known Rijndael algorithm, is designed to be easily implemented on hardware and software platforms. General-purpose computing on graphics processing units (GPGPU) is an alternative to reconfigurable accelerators based on FPGA devices. This paper presents a direct comparison between FPGAs and GPUs used as accelerators for the AES cipher. The results achieved on both platforms and their analysis have been compared with several others in order to establish which device is best suited to the role of hardware accelerator, with each solution yielding interesting considerations in terms of throughput, speedup factor, and resource usage. This analysis suggests that, while hardware design on FPGAs remains the natural choice for consumer-product design, GPUs are nowadays the preferable choice for PC-based accelerators, especially when the processing routines are highly parallelizable.
Content provider:: Biblioteka Nauki

Article

Title:: Zastosowanie bibliotek numerycznych w obliczeniach MEB
Numerical library usage in BEM
Authors:: Król, K.
Pańczyk, M.
Subjects:: BEM
numerical libraries
CUDA
OpenCL
Publisher:: Politechnika Lubelska. Wydawnictwo Politechniki Lubelskiej
Related links:: https://bibliotekanauki.pl/articles/408740.pdf
Description:: Using numerical libraries significantly reduces computation time and makes program code easier to write. The popular BLAS and LAPACK libraries have mature implementations for multi-core processors and distributed computing environments, PBLAS and ScaLAPACK respectively. A similar development is currently under way for GPU computing in the two main GPGPU implementations: NVIDIA CUDA and Khronos/ATI OpenCL. In parallel, hybrid CPU-GPU versions of these libraries are being intensively developed, of which MAGMA is an excellent example. The article presents the results of applying several selected libraries of this kind to solve a two-dimensional planar capacitor model by the boundary element method with constant boundary elements.
Content provider:: Biblioteka Nauki

Article

Title:: Heterogeneous GPU&CPU cluster for High Performance Computing in cryptography
Authors:: Marks, M.
Jantura, J.
Niewiadomska-Szynkiewicz, E.
Strzelczyk, P.
Góźdź, K.
Subjects:: parallel computing
HPC
clusters
GPU computing
OpenCL
cryptography
cryptanalysis
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Related links:: https://bibliotekanauki.pl/articles/305288.pdf
Description:: This paper addresses issues associated with distributed computing systems and the application of mixed GPU&CPU technology to data encryption and decryption algorithms. We describe a heterogeneous cluster, HGCC, formed by two types of nodes: an Intel processor with an NVIDIA graphics processing unit, and an AMD processor with an AMD (formerly ATI) graphics processing unit. We also describe a novel software framework that hides the heterogeneity of our cluster and provides tools for solving complex scientific and engineering problems. Finally, we present the results of numerical experiments. The case study considered concerns parallel implementations of selected cryptanalysis algorithms. The main goal of the paper is to show the wide applicability of GPU&CPU technology to large-scale computation and data processing.
Content provider:: Biblioteka Nauki

Article

Title:: A Novel GPU-Enabled Simulator for Large Scale Spiking Neural Networks
Authors:: Szynkiewicz, P.
Subjects:: GPU computing
OpenCL programming technology
parallel simulation
spiking neural networks
Publisher:: Instytut Łączności - Państwowy Instytut Badawczy
Related links:: https://bibliotekanauki.pl/articles/307680.pdf
Description:: The understanding of the structural and dynamic complexity of neural networks is greatly facilitated by computer simulations. An ongoing challenge for simulating realistic models is, however, computational speed. This paper describes a framework for modeling and parallel simulation of biologically inspired large-scale spiking neural networks on high-performance graphics processors. The tool is implemented in the OpenCL programming technology. It enables simulation studies with three neuron models: Integrate-and-fire, Hodgkin-Huxley and Izhikevich. The results of extensive simulations are provided to illustrate the operation and performance of the presented software framework. Particular attention is paid to the computational speed-up factor.
Content provider:: Biblioteka Nauki

Article

Title:: Widmowa i falkowa analiza prądu silnika LSPMSM z wykorzystaniem OpenCL
Spectral and wavelet analysis of phase current of LSPMSM machine using OpenCL
Authors:: Pietrowski, W.
Wiśniewski, G. D.
Górny, K.
Subjects:: spectral analysis
wavelet analysis
LSPMSM motor
OpenCL
parallel computing
sequential computing
Publisher:: Politechnika Poznańska. Wydawnictwo Politechniki Poznańskiej
Related links:: https://bibliotekanauki.pl/articles/376228.pdf
Description:: The article presents the application of parallel computing algorithms and functions from the OpenCL library to the harmonic and wavelet analysis of the phase current of an LSPMSM motor. It describes the OpenCL programming interface and the software developed in C++, which implements both sequential algorithms executed by the CPU and parallel algorithms executed by the GPU. A comparison of the computation times of the sequential and parallel algorithms is presented.
The article compares the computing times of a parallel and a sequential algorithm in the spectral and wavelet analysis of an LSPMSM motor current. The test calculations were made on two different computer setups for different numbers of signal samples. The results of the harmonic-analysis tests show that the parallel algorithm reduces signal processing time several times compared with the sequential algorithm. The more signal samples are processed, the greater the advantage of the parallel algorithm.
Content provider:: Biblioteka Nauki

Article

Title:: Wyznaczanie równoległości pętli programowych w aplikacjach dedykowanych dla procesorów graficznych
Parallelizing program loops for graphics processing in general purpose computing
Authors:: Bielecki, W.
Pałkowski, M.
Subjects:: automatic loop parallelization
code fragments
GPU
CUDA
OpenCL
high-performance computing
loop parallelization
slices
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Related links:: https://bibliotekanauki.pl/articles/155271.pdf
Description:: Extracting parallelism in the form of independent code fragments makes it possible to generate parallel program loops automatically. Such code can exploit the computing power of parallel machines, including multiprocessor graphics cards. This article analyzes the use of algorithms for determining code fragments in applications dedicated to graphics processors. The speed-up and efficiency of the computations and the scalability of the generated parallel code were examined.
Extracting synchronization-free slices allows parallel loops to be generated automatically. The code can be executed on multiprocessor machines in a reduced period of time. Slicing techniques also enable the generation of parallel code for graphics processing in general-purpose computing. Nowadays, graphics cards support the execution of multi-threaded applications. GPU systems consist of tens or hundreds of processors. CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. Graphics processing units (GPUs) are accessible to software developers through variants of industry-standard programming languages. Using CUDA, the latest NVIDIA GPUs become accessible for computation like CPUs. The model for GPU computing is to use a CPU and a GPU together in a heterogeneous co-processing computing model. The sequential part of the application runs on the CPU and the computationally intensive part is accelerated by the GPU. From the user's perspective, the application simply runs faster because it uses the high performance of the GPU. In this paper, slicing algorithms for generating parallel code for graphics cards are examined. A short example of the code is presented. CUDA statements and techniques are explained. Memory cost and data transfer are considered. The speed-up, efficiency and scalability of the code are analyzed.
Content provider:: Biblioteka Nauki

Article

Title:: Zastosowanie sztucznych sieci neuronowych oraz architektury OPENCL w spektralnej i falkowej analizie prądu silnika LSPMSM
Application of artificial neural networks and OpenCL in spectral and wavelet analysis of phase current of LSPMSM machine
Authors:: Pietrowski, W.
Wiśniewski, G. D.
Górny, K.
Subjects:: spectral analysis
wavelet analysis
LSPMSM motor
OpenCL
parallel computing
artificial neural networks
error backpropagation algorithm
Publisher:: Politechnika Poznańska. Wydawnictwo Politechniki Poznańskiej
Related links:: https://bibliotekanauki.pl/articles/377377.pdf
Description:: The article presents the authors' own parallel computing algorithms, which have been used in software for diagnosing an LSPMSM motor. The software enables spectral and wavelet analysis of the machine's phase current and has built-in artificial neural network (ANN) mechanisms that can serve as the decision-making element of the diagnostic system. The article also outlines the structure of the neural network used, the learning algorithms for artificial neural networks, and the OpenCL standard.
Content provider:: Biblioteka Nauki

Article

Title:: The GPU performance in coordination of parallel tasks in access to resource groups without conflicts
Authors:: Smoliński, M.
Subjects:: resource conflict elimination
conflict free task execution
mutual exclusion
deadlock avoidance
cooperative concurrency control
GPU massively parallel processing
SIMD control SISD
GPGPU using OpenCL
Publisher:: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Wydawnictwo Szkoły Głównej Gospodarstwa Wiejskiego w Warszawie
Related links:: https://bibliotekanauki.pl/articles/94883.pdf
Description:: In high-contention environments with a limited number of shared resources, resource conflicts between tasks processed in parallel must be eliminated. Execution of all tasks without resource conflicts can be achieved by preparing a proper overall schedule for all of them. An effective calculation of a conflict-free execution plan for tasks is provided by the conflictless scheduling algorithm, which is dedicated to GPU massively parallel processing. The conflictless scheduling algorithm is based on rapid resource conflict detection, used for mutual exclusion of conflicting tasks in access to global resources, and is an alternative to other task synchronization methods. This article presents the performance of a modern GPU in calculating an adaptive conflictless task schedule. The performance analysis also takes into account all data transfers to and from GPU memory in the various phases of the conflictless task scheduling algorithm.
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "OpenCL" by criterion: Subject