AMD - Programmer Writer

The service

Working with the GPU Development, Certification & Technical Documentation team to certify all components of the AMD ROCm product, and documenting all findings as a Programmer Writer for the ROCm toolset, ecosystem, and GPU developer tools. End-to-end tasks handled include:


1) Deep Learning for Frameworks Guide: Prepared user and developer guides covering machine learning and deep learning from the AMD ROCm point of view. Tested and documented hardware installation, ROCm installation, and deep learning frameworks, including installing and testing PyTorch and TensorFlow in Docker on GPUs with the CIFAR and Inception datasets. Also covered deep learning training, optimization, and inference, including MIGraphX, a graph compiler that accelerates machine learning inference on AMD GPUs.


2) Navi21 GPU Deep Learning Guide: Prepared a developer and user guide covering PyTorch deployment on the Navi21 GPU, installation and testing of PyTorch on Navi21 using Docker, and running ImageNet models on ROCm using TorchVision.


3) ROCm Profiler Guide: Prepared a developer and user guide for the GPU profiling tools. It includes an introduction to ROCProfiler, ROCTracer, rocTX, and rocprof; the profiler architecture and runtime environment; viewing JSON trace files and collected metrics in tools such as Perfetto and Grafana; and finding MPI ranks for parallel processing using Docker. The guide also covers hardware counters, the profiling API, and the rocprof command-line tool.


4) UIF Project (Unified Inference Frontend): Prepared a user and developer guide for Xilinx inference tools integrated with AMD MIGraphX and MIOpen, with an optional precompiled kernels package that reduces startup latency. These precompiled kernels cover a select set of popular input configurations, and coverage will expand in future releases.


5) Used Doxygen to generate API documentation in collaboration with developers. Prepared how-to guides for environment variables across ROCm for high-performance computing.
