Recently I wanted to use Google's Attention OCR for Chinese text recognition. The project is hosted on GitHub, and a Chinese introduction is available on a CSDN blog.
I found that the model expects its training data in FSNS format, but the official project provides no documentation for it, only a link to a stackoverflow answer, which I could not follow. So, drawing on some methods found online, I wrote a small script that generates tfrecord files in FSNS format. The script is on my GitHub.
The exact FSNS format is specified in the FSNS paper.
However, we only need to care about Table 4 of that paper, which lists the following fields:
image/format: The format of the picture, 'png'. If you generate the tfrecord from jpg pictures, this can be changed to 'raw'.
image/encoded: The actual content of the picture. It occupies one string, encoded in the 'png' format.
image/class: The ground-truth class ids of the picture: 37 int64 values, each corresponding to one character code. The mapping is defined in the file charset_size=134.txt. To generate your own data you need to create your own dictionary; for example, I created dic.txt, which contains 5400 Chinese characters.
image/unpadded_class: The ground-truth class ids of the picture before padding.
image/width: The width of the picture in pixels.
image/orig_width: The width of the picture in pixels before padding.
image/height: The height of the picture in pixels. The TensorFlow code does not write this field, because the image height is fixed at 150.
image/text: Occupies one string: the ground-truth text of the picture, encoded as UTF-8.
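To make the relation between image/class, image/unpadded_class and the dictionary more concrete, here is a minimal sketch in plain Python. The helper names (build_charset, charset_to_lines, encode_label) are hypothetical and not part of the Attention OCR project. The "<id><TAB><character>" line layout and the choice of the first unused id as the padding id are assumptions; they should be checked against your own dic.txt and model configuration.

```python
# Hypothetical helpers for illustration; not part of the Attention OCR project.

def build_charset(labels):
    # Give every distinct character an id 0..N-1, in sorted (code point) order.
    chars = sorted({ch for label in labels for ch in label})
    return {ch: idx for idx, ch in enumerate(chars)}


def charset_to_lines(char_to_id):
    # Render the mapping as "<id><TAB><char>" lines (assumed dic.txt layout).
    pairs = sorted(char_to_id.items(), key=lambda kv: kv[1])
    return ["{}\t{}".format(idx, ch) for ch, idx in pairs]


def encode_label(text, char_to_id, length, null_char_id):
    # Return (padded ids, unpadded ids) for one label string.
    if len(text) > length:
        raise ValueError("label longer than {} characters: {!r}".format(length, text))
    unpadded = [char_to_id[ch] for ch in text]
    padded = unpadded + [null_char_id] * (length - len(unpadded))
    return padded, unpadded


labels = ["北京路", "南京路"]
char_to_id = build_charset(labels)
# Assumption: the padding (null) id is the first id not used by a character.
null_char_id = len(char_to_id)
dic_lines = charset_to_lines(char_to_id)
padded, unpadded = encode_label("北京路", char_to_id, length=6,
                                null_char_id=null_char_id)
print(dic_lines)
print(padded, unpadded)
```

With length=37 the padded list plays the role of image/class and the unpadded list that of image/unpadded_class. A label longer than the fixed length raises an error here instead of being silently truncated.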
Here is the code. (The uploaded version stores the jpg pictures directly in the tfrecord as raw pixel data, which is faster. Readers who want to generate a png-encoded tfrecord can refer to my GitHub.)

import glob
import sys

import numpy as np
import PIL.Image as Image
import tensorflow as tf


def encode_utf8_string(text, length, dic, null_char_id=5462):
    # Map every character to its id; the padded list is filled up to
    # `length` with null_char_id, the unpadded list has len(text) entries.
    char_ids_padded = [null_char_id] * length
    char_ids_unpadded = [null_char_id] * len(text)
    for i in range(len(text)):
        hash_id = dic[text[i]]
        char_ids_padded[i] = hash_id
        char_ids_unpadded[i] = hash_id
    return char_ids_padded, char_ids_unpadded


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


# dic.txt holds one "<id>\t<character>" pair per line.
char_to_id = {}
with open('dic.txt', encoding="utf-8") as dict_file:
    for line in dict_file:
        (key, value) = line.strip().split('\t')
        char_to_id[value] = int(key)
print(char_to_id)

# Sort both lists so that image j and label j belong to the same sample
# (this assumes an image and its label share the same base file name).
image_path = 'data/*/*.jpg'
addrs_image = sorted(glob.glob(image_path))
label_path = 'data/*/*.txt'
addrs_label = sorted(glob.glob(label_path))
print(len(addrs_image))
print(len(addrs_label))

# tf.python_io is the TF 1.x name; in TF 2.x use tf.io.TFRecordWriter.
tfrecord_writer = tf.python_io.TFRecordWriter("tfexample_train")
for j in range(len(addrs_image)):
    # Print the progress of the write operation.
    print('Train data: {}/{}'.format(j, len(addrs_image)))
    sys.stdout.flush()

    img = Image.open(addrs_image[j])
    # Image.ANTIALIAS was removed in Pillow 10; use Image.LANCZOS there.
    img = img.resize((600, 150), Image.ANTIALIAS)
    np_data = np.array(img)
    image_data = img.tobytes()

    # Assumption: each label file holds the text on its first line.
    # The line break is removed because it is not in the dictionary.
    with open(addrs_label[j], encoding="utf-8") as label_file:
        text = label_file.readline().rstrip('\r\n')
    char_ids_padded, char_ids_unpadded = encode_utf8_string(
        text=text, dic=char_to_id, length=37, null_char_id=5462)

    example = tf.train.Example(features=tf.train.Features(
        feature={
            'image/encoded': _bytes_feature(image_data),
            'image/format': _bytes_feature(b"raw"),
            'image/width': _int64_feature([np_data.shape[1]]),
            'image/orig_width': _int64_feature([np_data.shape[1]]),
            'image/class': _int64_feature(char_ids_padded),
            'image/unpadded_class': _int64_feature(char_ids_unpadded),
            'image/text': _bytes_feature(bytes(text, 'utf-8')),
            # 'height': _int64_feature([crop_data.shape[0]]),
        }
    ))
    tfrecord_writer.write(example.SerializeToString())
tfrecord_writer.close()
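
Before writing many records, it can help to check that the values of a single sample are consistent with the field descriptions above. The following is only a sketch in plain Python: check_fsns_fields is a hypothetical helper, it works on ordinary Python values rather than on a tf.train.Example, and the default of 3 channels assumes RGB pictures.

```python
# Hypothetical sanity check for one sample; not part of the original script.

def check_fsns_fields(fields, seq_length=37, height=150, channels=3):
    # Return a list of human-readable problems; an empty list means "looks fine".
    problems = []
    padded = fields["image/class"]
    unpadded = fields["image/unpadded_class"]
    text = fields["image/text"].decode("utf-8")

    if len(padded) != seq_length:
        problems.append("image/class must hold {} ids, got {}".format(
            seq_length, len(padded)))
    if len(unpadded) != len(text):
        problems.append("image/unpadded_class length differs from the text length")
    if list(padded[:len(unpadded)]) != list(unpadded):
        problems.append("image/class does not start with the unpadded ids")
    if fields["image/format"] == b"raw":
        # Raw pixels: width * height * channels bytes (assumes 8-bit RGB).
        expected = fields["image/width"] * height * channels
        if len(fields["image/encoded"]) != expected:
            problems.append("raw image has {} bytes, expected {}".format(
                len(fields["image/encoded"]), expected))
    return problems


sample = {
    "image/encoded": bytes(600 * 150 * 3),  # dummy all-black 600x150 RGB image
    "image/format": b"raw",
    "image/width": 600,
    "image/class": [7, 8] + [5462] * 35,
    "image/unpadded_class": [7, 8],
    "image/text": "北京".encode("utf-8"),
}
print(check_fsns_fields(sample))
```

If a picture is grayscale or has an alpha channel, the raw byte count will not match; converting it to RGB before calling tobytes() is one way to keep the size predictable.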