By Jeff Blackadar
Reposted from Digital History Learning Journal, February 28, 2021.
I would like to work with Arabic-language maps, and this post sets up transcription of one map tile using Google Cloud Vision.
I am grateful to Dr. Kristen Hopper and Dr. Dan Lawrence of Durham University and Dr. Hector Orengo of the University of Cambridge for sending me a set of georeferenced digital maps to work with. Thank you!
I’m working with a map titled Djeble, dated 1942, which is centered on Jableh, Syria.
Set up Google Cloud Vision
The steps to set up the project for Google Cloud Vision are here: https://cloud.google.com/vision/docs/setup. I have repeated the information below, based on the steps I took, in case it’s useful. Skip to the next section if you followed all of the instructions in the setup.
In the Dashboard of Google Cloud Platform:
Create Project and give it a name.
Check that Billing is enabled.
Enable the API.




Download the credentials as a .json file. Upload the file to a secure directory on Google Drive, separate from your code. Keep this private.
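Once the credentials file is in place, the client library needs to be told where to find it. Below is a minimal sketch of one way to do that, assuming the standard GOOGLE_APPLICATION_CREDENTIALS environment variable is used. The helper name use_credentials and the path in the usage comment are hypothetical, not taken from the notebook.

```python
import os
from pathlib import Path


def use_credentials(json_path):
    """Point Google client libraries at a service-account key file.

    Hypothetical helper: it checks that the file exists, then sets the
    GOOGLE_APPLICATION_CREDENTIALS environment variable to its path.
    """
    key_file = Path(json_path)
    if not key_file.is_file():
        raise FileNotFoundError(f"Credentials file not found: {key_file}")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_file)
    return key_file


# Usage in Colab, after mounting Google Drive (placeholder path, an assumption):
# use_credentials("/content/drive/MyDrive/private_keys/my_key.json")
```

Failing early with a clear message when the file is missing can be easier to debug than an authentication error raised later by the client.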
Results

The program I used to do this is here: https://github.com/jeffblackadar/CRANE-CCAD-maps/blob/main/maps_ocr_google_cloud_vision_1_tile.ipynb
The result has errors and some transcriptions are missing. Still, this looks promising.
Notes about the program
In Google Colab I need to install google-cloud-vision to transcribe text and the other 3 packages to plot Arabic text.
!pip install --upgrade google-cloud-vision
!pip install arabic_reshaper
!pip install bidi
!pip install python-bidi
To transcribe Arabic text, Cloud Vision uses language_hints = “ar”. See https://cloud.google.com/vision/docs/languages.
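import io
from google.cloud import vision

client = vision.ImageAnnotatorClient()
# path is the location of the map tile image
with io.open(path, 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)
# The language hint tells Cloud Vision to expect Arabic text
response = client.text_detection(
    image=image,
    image_context={"language_hints": ["ar"]},
)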
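The response then has to be turned into pieces of text with positions before anything can be plotted. The sketch below shows one way this might be done. It assumes each item should be a (text, (x, y)) pair, which matches how lines_of_text is indexed in the plotting code but may not be how the notebook builds it. The function name annotations_to_lines is hypothetical, and the fake response at the end only stands in for a real Cloud Vision response so the example runs without credentials.

```python
from types import SimpleNamespace


def annotations_to_lines(response):
    """Return [(text, (x, y)), ...] from a text_detection response.

    The first entry of text_annotations holds the full detected text,
    so it is skipped; the rest are individual pieces of text. (x, y) is
    the top-left corner of each bounding polygon.
    """
    if response.error.message:
        raise RuntimeError(response.error.message)
    lines = []
    for annotation in response.text_annotations[1:]:
        vertices = annotation.bounding_poly.vertices
        xs = [v.x for v in vertices]
        ys = [v.y for v in vertices]
        lines.append((annotation.description, (min(xs), min(ys))))
    return lines


def _fake_annotation(text, corners):
    # Stand-in for a Cloud Vision annotation, for illustration only.
    vertices = [SimpleNamespace(x=x, y=y) for x, y in corners]
    return SimpleNamespace(
        description=text,
        bounding_poly=SimpleNamespace(vertices=vertices),
    )


box = [(10, 20), (60, 20), (60, 40), (10, 40)]
fake_response = SimpleNamespace(
    error=SimpleNamespace(message=""),
    text_annotations=[_fake_annotation("جبلة", box), _fake_annotation("جبلة", box)],
)
lines = annotations_to_lines(fake_response)
```

With a real response, the same function would be called as annotations_to_lines(response).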
To plot Arabic text, I used a font and the code below. Thanks to Stack Overflow. https://stackoverflow.com/questions/59896297/how-to-draw-arabic-text-on-the-image-using-cv2-puttextcorrectly-pythonopenc
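from PIL import Image, ImageDraw, ImageFont
import arabic_reshaper
from bidi.algorithm import get_display

fontpath = "/content/drive/MyDrive/crane_font/arial.ttf" # <== https://www.freefontspro.com/14454/arial.ttf
font_pil = ImageFont.truetype(fontpath, 32)
# img is the map tile as an image array
img_pil = Image.fromarray(img)
draw = ImageDraw.Draw(img_pil)
for l in lines_of_text:
    print(l[0])
    pts = l[1]
    # This is needed to handle the Arabic text:
    # join the letters into their connected forms, then order them right-to-left for display
    reshaped_text = arabic_reshaper.reshape(l[0])
    bidi_text = get_display(reshaped_text)
    draw.text((int(pts[0]), int(pts[1])), bidi_text, font=font_pil, fill=(255, 0, 0, 255))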
The next steps are to process all of the tiles on the map. I also intend to process the tiles to remove some of the non-text elements on the map that confuse transcription.
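As a rough sketch of how processing every tile might start, the function below lists the pixel boxes that cover an image of a given size. It assumes the map is one large image that still needs to be cut into tiles, which may not match how the georeferenced maps are stored. The name tile_boxes and the overlap parameter are illustrative assumptions.

```python
def tile_boxes(width, height, tile_size, overlap=0):
    """Return (left, upper, right, lower) boxes covering a width x height image.

    Tiles at the right and bottom edges are clipped to the image size.
    overlap is the number of pixels shared between neighbouring tiles,
    which may help when text is cut by a tile edge.
    """
    if tile_size <= 0 or not 0 <= overlap < tile_size:
        raise ValueError("tile_size must be positive and overlap smaller than tile_size")
    step = tile_size - overlap
    boxes = []
    for top in range(0, height, step):
        for left in range(0, width, step):
            right = min(left + tile_size, width)
            lower = min(top + tile_size, height)
            boxes.append((left, top, right, lower))
    return boxes


boxes = tile_boxes(120, 50, 50)
```

Each box is in the (left, upper, right, lower) order that Pillow's Image.crop expects, so a tile could be cut out with img_pil.crop(box) and passed on to transcription.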