Deep Learning in Physical Layer Communications

Author Topic: Deep Learning in Physical Layer Communications  (Read 801 times)

Offline khalid

Deep Learning in Physical Layer Communications
« on: June 27, 2019, 09:49:40 PM »
Deep learning (DL) has shown great potential to revolutionize communication systems. This article
provides an overview of recent advancements in DL-based physical layer communications. DL can
improve the performance of each individual block in a communication system or optimize the whole
transmitter/receiver. Therefore, we categorize the applications of DL in physical layer communications
into systems with and without block structures. For DL-based communication systems with the block
structure, we demonstrate the power of DL in signal compression and signal detection. We also discuss
recent endeavors in developing DL-based end-to-end communication systems. Finally, potential
research directions are identified to advance intelligent physical layer communications.
Index Terms
Deep learning, end-to-end communications, physical layer communications, signal processing.
The idea of using neural networks (NNs) to make machines intelligent can be traced to 1942, when a simple
model was proposed to simulate the status of a single neuron. Deep learning (DL) adopts a deep neural
network (DNN) to find data representations at each layer, and it can be built using different types
of machine learning (ML) techniques, including supervised ML, unsupervised ML, and reinforcement
learning. In recent years, DL has shown clear advantages in many areas, such as computer
vision, robotics, and natural language processing, owing to its advanced algorithms and tools for learning
complicated models.
Zhijin Qin is with Queen Mary University of London, London E1 4NS, U.K.
Hao Ye, Geoffrey Ye Li, and Biing-Hwang Fred Juang are with Georgia Institute of Technology, Atlanta, GA 30332 USA.
arXiv:1807.11713v3 [cs.IT] 19 Feb 2019
Unlike the aforementioned DL applications, where it is normally difficult to find a concrete
mathematical model for feature representation, communication systems are described by various
well-developed theories and models, from information theory to channel modelling [1]. However, the
gap between theory and practice motivates us to work on intelligent communications. In particular, the
following challenges have been identified in existing physical layer communications:
• Mathematical model versus practical imperfection: Conventional communication systems rely on
mathematically expressed models for each block. In real-world applications, however, complex
systems may contain unknown effects that are difficult to express analytically. For example,
it is hard to model underwater acoustic channels or molecular communications. Therefore, a more
adaptive framework is required to handle these challenges.
• Block structures versus global optimality: Traditional communication systems consist of several
processing blocks, such as channel encoding, modulation, and signal detection, each of which is designed
and optimized locally. Thus, global optimality cannot be guaranteed. Moreover,
the optimal communication system structure varies with the environment. As a result, communication
systems that are optimal or robust across different scenarios are highly desirable.
DL can be a purely data-driven method, where the networks/systems are optimized over a large training
data set and a mathematically tractable model is unnecessary. This feature motivates us to exploit
DL in communication systems in order to address the aforementioned challenges. In this situation,
communication systems can be optimized for a specific hardware configuration and channel to address
the imperfection issues. On the other hand, many models in physical layer communications have been
established by researchers and engineers during the past several decades. Those models can be combined
with DL to design model-driven DL-based communication systems, which can take advantage of both
model-based algorithms and DL [2].
There is evidence that “learned” algorithms can be executed faster and with lower power consumption
than their existing manually “programmed” counterparts, as NNs can be highly parallelized on concurrent
architectures and implemented with low-precision data types. Moreover, manufacturers' enthusiasm for
developing artificial intelligence-powered devices, such as the Intel® Movidius™ Neural Compute
Stick, has also fueled the boom of DL-based wireless communications.
This article will identify the gains that DL can bring to wireless physical layer communications,
including systems with the block structure and the end-to-end structure that merges those blocks. The rest
of this article is organized as follows. Section II introduces the basics of DNNs and illustrates
DL-based communication systems. Section III discusses how to apply DL to block-structured communication
systems. Section IV demonstrates DL-based end-to-end communication systems, where individual blocks
for specific functions, such as channel estimation or decoding, disappear. Section V concludes this
article with potential research directions in the area of DL-based physical layer communications.
In this section, we will first introduce the basics of the DNN, the generative adversarial network (GAN),
the conditional GAN, and the Bayesian optimal estimator, which are widely used in DL-based communication
systems. Then we will discuss intelligent communication systems with DL.
A. Deep Neural Networks
1) Deep Neural Network Basics: As mentioned above, research on NNs started from the single neuron.
As shown in Fig. 1 (a), the inputs of the NN are {x1, x2, . . . , xn} with the corresponding weights,
{w1, w2, . . . , wn}. The neuron can be represented by a non-linear activation function, σ (•), that takes
the sum of the weighted inputs. The output of the neuron can be expressed as
$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right)$,
where b is the shift of the neuron. An NN can be established by connecting multiple neuron elements
to generate multiple outputs to construct a layered architecture. In the training process, the labelled
data, i.e., a set of input and output vector pairs, is used to adjust the weight set, W, by minimizing a
loss function. In the NN with single neuron element, W = {b, w1, w2, . . . , wn}. The commonly-used
loss functions include mean-squared error (MSE) and categorical cross-entropy. To train the model for a
specific scenario, the loss function can be revised by introducing the l1- or l2-norm of W or of the
activations. The l1- or l2-norm of W can also be introduced in the loss function as a regularizer to
improve generalization. Stochastic gradient descent (SGD) is one of the most popular algorithms to optimize W.
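The single-neuron model and its training procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the article: the logistic sigmoid as the activation, the toy AND data set, the hyperparameters (lr, l2, epochs, seed), and all function names are assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, one common choice for sigma (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, bias, inputs):
    """Compute y = sigma(sum_i w_i * x_i + b) for a single neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def train_neuron(data, n_inputs, lr=0.5, l2=0.0, epochs=2000, seed=0):
    """Fit W = {b, w_1, ..., w_n} by SGD on squared error plus an optional l2 penalty.

    data: list of (inputs, target) pairs, i.e. the labelled training set.
    lr, l2, epochs, seed: illustrative hyperparameters, not values from the article.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit samples in random order, one update per sample
        for inputs, target in samples:
            y = neuron_output(weights, bias, inputs)
            # Gradient of 0.5 * (y - target)^2 with respect to the pre-activation,
            # using d(sigmoid)/dz = y * (1 - y).
            delta = (y - target) * y * (1.0 - y)
            # The l2 term adds l2 * w to each weight gradient; the bias is not penalized.
            weights = [w - lr * (delta * x + l2 * w) for w, x in zip(weights, inputs)]
            bias -= lr * delta
    return weights, bias


def mse(data, weights, bias):
    """Mean-squared error of the neuron over a labelled data set."""
    return sum((neuron_output(weights, bias, x) - t) ** 2 for x, t in data) / len(data)


# Toy labelled data (hypothetical): the logical AND of two binary inputs.
AND_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
```

Calling train_neuron(AND_DATA, 2) returns a weight list and a bias that can be passed to neuron_output or mse. Setting l2 to a small positive value shrinks the learned weights, which is the regularizing effect the paragraph refers to.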
With the layered architecture, a DNN includes multiple fully connected hidden layers, each
of which represents a different feature of the input data. Fig. 1 (b) and (c) show two typical DNN
models: the feedforward neural network (FNN) and the recurrent neural network (RNN). In FNNs, each neuron
is connected to the adjacent layers, while the neurons in the same layer are not connected to each other.
The deep convolutional network (DCN) is developed from the fully connected FNN by keeping only
some of the connections between neurons and their adjacent layers. As a result, a DCN can significantly
reduce the number of parameters to be trained [3]. Recently, DL has boosted many applications owing to its
powerful algorithms and tools. The DCN has shown great potential for signal compression and recovery
problems, which will be demonstrated in Section III-A.
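To make the parameter saving concrete, the following sketch counts trainable parameters for one layer under three connectivity patterns. It is a back-of-the-envelope illustration, not taken from the article: the 1-D layout, the layer sizes, and the function names are assumptions, and the weight-sharing case reflects the usual convolutional-layer convention rather than anything stated in the text above.

```python
def dense_params(n_in, n_out):
    """Fully connected layer: every output sees every input, plus one bias per output."""
    return n_in * n_out + n_out


def local_params(n_in, kernel):
    """Locally connected 1-D layer: each output keeps only `kernel` adjacent inputs."""
    n_out = n_in - kernel + 1  # stride 1, no padding
    return n_out * kernel + n_out


def conv_params(kernel, channels_out=1):
    """1-D convolution: the same `kernel` weights and one bias are reused at every position."""
    return channels_out * (kernel + 1)


# Hypothetical sizes: 1024 inputs and a window of 9 adjacent inputs per output.
N_IN, KERNEL = 1024, 9
N_OUT = N_IN - KERNEL + 1  # 1016 outputs

print(dense_params(N_IN, N_OUT))  # 1041400
print(local_params(N_IN, KERNEL))  # 10160
print(conv_params(KERNEL))         # 10
```

Dropping distant connections alone cuts the count by roughly two orders of magnitude in this example, and sharing the remaining weights across positions reduces it to a handful of values per output channel.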
