Quintes van Aswegen: AWS Rekognition Text detection limited to 50 words

Tuesday, 21 August 2018

AWS Rekognition Text detection limited to 50 words

Feb 2022: this post has permanently moved to my new site, https://architectfwd.com, and can be found at https://architectfwd.com/architecture/cloud/amazon-web-services-aws/rekognition/2022/01/23/aws-rekognition-text-detection-limited/. Please bookmark that site for all of my future content.

I quickly built an API that uses AWS Rekognition, just to test its text detection capability.

Approach

I created a Flask API and used boto3.

import boto3
rek = boto3.client('rekognition', region_name="us-east-1")

After that I passed the image bytes directly to a detect_text call. Not too tough.
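The call described above can be sketched roughly as follows. This is a minimal illustration, not the original API code: the helper names `build_client` and `detect_text_in_image` are made up for this example, and the client is passed in as a parameter so the helper can be tried with a stub instead of a real AWS connection.

```python
from typing import Any, Dict, List


def build_client(region_name: str = "us-east-1") -> Any:
    """Create a Rekognition client (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper below also works with a stub client

    return boto3.client("rekognition", region_name=region_name)


def detect_text_in_image(rek: Any, image_bytes: bytes) -> List[Dict[str, Any]]:
    """Send raw image bytes to Rekognition and return its TextDetections list."""
    if not image_bytes:
        raise ValueError("image_bytes is empty")
    # detect_text accepts the image inline as bytes under the "Bytes" key.
    response = rek.detect_text(Image={"Bytes": image_bytes})
    return response.get("TextDetections", [])
```

In a Flask handler the bytes would typically come from the uploaded file, for example `detect_text_in_image(build_client(), uploaded_file.read())`, where `uploaded_file` stands for whatever file object the request provides.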

Detect text result

I uploaded an image with a small number of words and was pleased with the result. However, when uploading an image containing a paragraph, I found that only a subset of the words was returned.

The limit is 50 words - "DetectText can detect up to 50 words in an image."[0]
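One way to notice when an image has probably hit this cap is to count the WORD items in the response. The sketch below assumes the list of detections has already been pulled out of the response; the function name `may_be_truncated` is hypothetical, and the limit is a parameter with a default taken from the documentation quote above, since the service limit may change.

```python
from typing import Any, Dict, List


def may_be_truncated(detections: List[Dict[str, Any]], word_limit: int = 50) -> bool:
    """Return True when the number of WORD items has reached the documented cap."""
    # Only WORD items count towards the limit; LINE items are groupings of words.
    word_count = sum(1 for d in detections if d.get("Type") == "WORD")
    return word_count >= word_limit
```

A True result does not prove that text was dropped, only that the response is as full as the limit allows and is worth a second look.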

Text result response

The response splits items up by Type, either "LINE" or "WORD", and each word carries a ParentId pointing to the line it belongs to, so I filtered just the lines like this:

if label['Type'] == "LINE":
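To make the LINE/WORD relationship concrete, here is a small sketch that keeps the lines and attaches each line's words by matching ParentId to the line's Id. The function name `lines_with_words` and the sample data are invented for illustration; only the field names (Type, Id, ParentId, DetectedText) come from the Rekognition response.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def lines_with_words(detections: List[Dict[str, Any]]) -> List[Tuple[str, List[str]]]:
    """Return (line text, [word texts]) pairs, in the order the lines appear."""
    # Collect every WORD under the Id of the LINE it belongs to.
    words_by_parent: Dict[Any, List[str]] = defaultdict(list)
    for d in detections:
        if d["Type"] == "WORD":
            words_by_parent[d.get("ParentId")].append(d["DetectedText"])

    # Keep only LINE items and look up their words by Id.
    return [
        (d["DetectedText"], words_by_parent.get(d["Id"], []))
        for d in detections
        if d["Type"] == "LINE"
    ]


# Hand-written sample in the shape of a TextDetections list.
sample = [
    {"Id": 0, "Type": "LINE", "DetectedText": "Hello world"},
    {"Id": 1, "Type": "WORD", "ParentId": 0, "DetectedText": "Hello"},
    {"Id": 2, "Type": "WORD", "ParentId": 0, "DetectedText": "world"},
]
print(lines_with_words(sample))
```

If only the line text is needed, the first element of each pair is enough and the word grouping can be ignored.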

It works and the result is great, but for images with a larger number of words I am inclined to just run them through Tesseract OCR.

[0] - https://docs.aws.amazon.com/rekognition/latest/dg/text-detection.html (Last paragraph)

Cheers

Quintes

Connect with me on LinkedIn or Twitter
