
Journal of Central South University (English Edition)

Vol. 28    No. 6    June 2021


End-to-end dilated convolution network for document image semantic segmentation
XU Can-hui(许灿辉)1, 2, SHI Cao(史操)1, CHEN Yi-nong(陈以农)2

1. School of Information Sciences and Technology, Qingdao University of Science and Technology, Qingdao 266061, China;
2. School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287-8809, USA

Abstract: Semantic segmentation is a crucial step in document understanding. In this paper, an NVIDIA Jetson Nano-based platform is used to implement semantic segmentation as a vehicle for teaching artificial intelligence concepts and programming. To extract semantic structures from document images, we present an end-to-end dilated convolution network architecture. Dilated convolutions are well known for extracting multi-scale context information without losing spatial resolution. Our model combines dilated convolutions with a residual network to represent image features and predict pixel labels. The convolutional part works as a feature extractor that obtains multidimensional, hierarchical image features. The subsequent deconvolution produces a full-resolution segmentation prediction. The predicted probability of each pixel determines its predefined semantic class label. To understand segmentation granularity, we compare performance at three different levels. From fine-grained to coarse class levels, the proposed dilated convolution network architecture is evaluated on three document datasets. The experimental results show that both imbalance in the semantic data distribution and network depth are important factors influencing document semantic segmentation performance. The research aims to offer an educational resource for teaching artificial intelligence concepts and techniques.

 

Key words: semantic segmentation; document images; deep learning; NVIDIA Jetson Nano
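
The abstract's claim that dilated convolutions gather multi-scale context without losing spatial resolution can be illustrated with a minimal pure-Python sketch. This is not the authors' network: it works on a 1D signal for brevity, and the kernel size, the dilation rates (1, 2, 4), and all function names are assumptions chosen only for illustration. The last helper shows the per-pixel labelling step, where each pixel takes the class with the highest predicted probability.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded ('same') 1D dilated cross-correlation, stride 1, odd kernel length."""
    k = len(kernel)
    half = (k // 2) * dilation  # padding needed on each side to keep the length
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def label_pixels(prob_maps):
    """Assign each pixel the class index with the highest probability.

    prob_maps[c][i] is the probability of class c at pixel i.
    """
    n_pixels = len(prob_maps[0])
    return [
        max(range(len(prob_maps)), key=lambda c: prob_maps[c][i])
        for i in range(n_pixels)
    ]


if __name__ == "__main__":
    # Three stacked 3-tap layers: dilation widens the context at the same depth.
    print(receptive_field(3, [1, 1, 1]))  # plain convolutions
    print(receptive_field(3, [1, 2, 4]))  # hypothetical dilation schedule

    # Output length equals input length, so resolution is preserved.
    out = dilated_conv1d([1, 2, 3, 4, 5, 6, 7, 8], [1, 1, 1], dilation=2)
    print(len(out), out)

    # Three classes, three pixels.
    print(label_pixels([[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]]))
```

With the same number of layers and weights, the dilated stack covers a wider context than the plain one, while the output keeps one value per input position.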

Journal of Central South University (English Edition)
  ISSN 2095-2899
CN 43-1516/TB
JCSTFT