A Deep Reinforcement Learning-based Transcoder Selection Framework for Blockchain-Enabled Wireless D2D Transcoding

The boom of the video streaming industry has increased the demand for transcoding services from heterogeneous users. Recent advances in blockchain technology have allowed some startups to realize decentralized collaborative transcoding over device-to-device (D2D) networks, where a group of transcoders is selected to perform transcoding cooperatively. For blockchain-enabled D2D transcoding systems, it is imperative to jointly design the transcoder selection, task scheduling, and resource allocation schemes in order to provide efficient and trustworthy transcoding services. In this paper, considering the multi-dimensional factors involved and the channel fluctuation, we propose a novel deep reinforcement learning (DRL) based transcoder selection framework for blockchain-enabled D2D transcoding systems that captures both the platform dynamics and the channel statistics. To reduce the size of the action space, we adopt a two-stage decision approach: the transcoders are first selected through a conventional DRL-based framework, and the optimal task scheduling, power control, and resource allocation scheme is then obtained by solving a stochastic optimization problem with the constrained stochastic successive convex approximation (CSSCA) approach. Simulation results show that the proposed framework achieves high transcoding revenue while meeting the quality of service (QoS) requirements, and that it handles dynamic scenarios well.
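The action-space argument behind the two-stage approach can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes, purely for illustration, that a fixed-size group of `group_size` transcoders is chosen from `n_devices` candidates and that each selected transcoder's continuous control variable (for example, transmit power) would have to be discretized into `levels_per_transcoder` values if a single agent chose everything jointly. All function and parameter names are hypothetical.

```python
from math import comb


def selection_only_action_count(n_devices: int, group_size: int) -> int:
    """Discrete actions when the agent only picks which transcoders to use.

    Assumption (not from the paper): exactly `group_size` of the
    `n_devices` candidates are selected.
    """
    return comb(n_devices, group_size)


def joint_action_count(n_devices: int, group_size: int,
                       levels_per_transcoder: int) -> int:
    """Discrete actions when one agent also picks a discretized control
    value (e.g. a power level) for every selected transcoder.

    Assumption (not from the paper): each selected transcoder has one
    control variable with `levels_per_transcoder` discrete values.
    """
    return comb(n_devices, group_size) * levels_per_transcoder ** group_size


if __name__ == "__main__":
    n, k, levels = 20, 4, 10  # hypothetical sizes, chosen only for illustration
    print(selection_only_action_count(n, k))   # 4845
    print(joint_action_count(n, k, levels))    # 48450000
```

Under these assumed sizes, leaving the continuous variables to a separate optimization stage shrinks the learning agent's action space by a factor of `levels_per_transcoder ** group_size`.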

Journal Article
Monday, June 15, 2020