Currently, Taggun can recognize and extract the following properties from a receipt:
- Total amount
- Tax amount
- Merchant name
- Merchant address
- Merchant type
- Line amounts (a list of amounts detected for each line item)
- Line descriptions (a list of descriptions detected for each line item) *alpha
- Amounts (a list of all extractable amounts found on a receipt)
- Numbers (a list of reference numbers found on a receipt)
- Invoice number *alpha
- IBAN bank account number *alpha
Supported file types
Currently, Taggun supports the following image and file types:
- PDF (only first 3 pages will be scanned)
Receipt countries test coverage
Taggun currently supports English, French, Japanese and Hebrew. Please request support for any additional languages you need. Currently, Taggun benchmarks and tests receipts from the following countries:
- New Zealand
- United Kingdom
Caveats: When entering a new non-English market, the extraction and accuracy rates will be lower. The accuracy rate improves as more data becomes available to train the algorithm. Please request that other countries be included in the benchmark.
Taggun calculates a confidence level for each property. This provides a "proxy" accuracy level for that property. An aggregated confidenceLevel for all properties of the receipt is also provided at the root level of the result. The maximum confidence level is 0.99. The minimum confidence level is 0.
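As an illustration of how a caller might use these values, the following minimal Python sketch flags properties whose confidence level falls below a chosen threshold. The result shape is an assumption made for this example: the property names (totalAmount, taxAmount, merchantName), the per-property "confidenceLevel" key, and the sample values are hypothetical and are not taken from an actual Taggun response. Only the root-level confidenceLevel is named in the text above.

```python
from typing import Any, Dict, List


def low_confidence_properties(result: Dict[str, Any], threshold: float = 0.5) -> List[str]:
    """Return the names of properties whose confidence level is below the threshold.

    Assumes (hypothetically) that each extracted property is a dict that
    carries its own "confidenceLevel" value between 0 and 0.99.
    """
    flagged = []
    for name, value in result.items():
        # Skip the root-level aggregate and any non-object fields.
        if isinstance(value, dict) and "confidenceLevel" in value:
            if value["confidenceLevel"] < threshold:
                flagged.append(name)
    return sorted(flagged)


# Made-up sample data, for illustration only.
sample = {
    "confidenceLevel": 0.62,
    "totalAmount": {"data": 23.5, "confidenceLevel": 0.91},
    "taxAmount": {"data": 3.07, "confidenceLevel": 0.34},
    "merchantName": {"data": "Example Cafe", "confidenceLevel": 0.0},
}

flagged = low_confidence_properties(sample)
```

A caller could route the flagged properties to manual review while accepting the rest automatically. The threshold of 0.5 is arbitrary and would need tuning against real receipts.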
Merchant name, address and type
Taggun uses Google Places to extract and validate the recognized merchant name and address of the receipt. It biases the result toward the places closest to the bias location (the geolocation of the user's IP address or the caller's IP address).
Receipt location vs bias location
What happens when the original location of the receipt is not the same as the bias location of the user or caller? For example, a user in Australia scans receipts from an overseas trip to the USA. Think of IP address geolocation as a mere "suggestion" that influences the result. It is not deterministic. The Taggun algorithm is robust enough to extract information without any IP address information.
User's IP address
When possible, include the user's IP address so that Taggun can look up the bias location of the receipt. Include the ipAddress request parameter to improve the accuracy of receipt transcription. Taggun uses GeoLite2 data created by MaxMind, available from
Caller's IP address
The caller's IP address is the IP address of the server that makes the API request. When the user's IP address is not available, Taggun uses the caller's IP address to look up the bias location of the receipt.
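The choice between the two addresses can be sketched on the client side as follows. This is a minimal Python sketch under stated assumptions: the helper name build_bias_params is hypothetical, and skipping private or loopback addresses is a design choice of this example rather than documented Taggun behavior. Only the ipAddress parameter name comes from the text above. When the helper returns no parameter, the request carries no user IP address, so the caller's IP address is used as described.

```python
import ipaddress
from typing import Dict, Optional


def build_bias_params(user_ip: Optional[str]) -> Dict[str, str]:
    """Return the extra request parameter for the user's IP address, if usable.

    An empty dict means "send nothing", leaving the bias location to be
    derived from the caller's IP address instead.
    """
    if not user_ip:
        return {}
    try:
        parsed = ipaddress.ip_address(user_ip.strip())
    except ValueError:
        # Not a syntactically valid IPv4 or IPv6 address.
        return {}
    # Private and loopback addresses carry no useful geolocation
    # (assumption of this sketch), so they are not forwarded.
    if parsed.is_private or parsed.is_loopback:
        return {}
    return {"ipAddress": str(parsed)}


params = build_bias_params("8.8.8.8")
```

The returned dict can be merged into whatever request fields the client already sends with the receipt file.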
Taggun recognizes dates in any format. The bias location is used when a date is ambiguous between little-endian (DD-MM-YYYY) and middle-endian (MM-DD-YYYY) order. For example, a request with a bias location of New York, USA will recognize 07-12-2017 as the 12th of July, but the same request with a bias location of Auckland, New Zealand will recognize it as the 7th of December.
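The disambiguation rule can be illustrated with a short Python sketch. It is not Taggun's implementation: the function name, the use of country codes to stand in for the bias location, and the set of month-first countries are all assumptions made for this example.

```python
from datetime import date

# Countries assumed, for this sketch only, to write dates month-first.
MONTH_FIRST_COUNTRIES = {"US"}


def parse_numeric_date(text: str, country_code: str) -> date:
    """Parse a date written as NN-NN-YYYY, using the country only when ambiguous."""
    first, second, year = (int(part) for part in text.split("-"))
    if first > 12:
        # Only a day can exceed 12, so the order must be DD-MM.
        day, month = first, second
    elif second > 12:
        # Likewise, the order must be MM-DD.
        month, day = first, second
    elif country_code in MONTH_FIRST_COUNTRIES:
        # Ambiguous: fall back to the middle-endian convention.
        month, day = first, second
    else:
        # Ambiguous: fall back to the little-endian convention.
        day, month = first, second
    return date(year, month, day)


new_york = parse_numeric_date("07-12-2017", "US")   # 12th of July 2017
auckland = parse_numeric_date("07-12-2017", "NZ")   # 7th of December 2017
```

Note that the country is consulted only when both numbers are 12 or less. A date such as 25-12-2017 is unambiguous and parses the same way for either country.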
Number format and currency
Taggun recognizes both the decimal point (.) and the decimal comma (,) as decimal separators in order to extract the correct amount.
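One common way to handle both conventions, shown here as a minimal Python sketch rather than as Taggun's actual method, is to treat the rightmost separator as the decimal separator and the other one as a grouping separator. The function name parse_amount is hypothetical. The sketch assumes that any amount containing a separator has a fractional part, so a grouped whole number such as 1,234 would be misread as 1.234.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse an amount written with either '.' or ',' as the decimal separator."""
    cleaned = text.strip().replace(" ", "")
    last_dot = cleaned.rfind(".")
    last_comma = cleaned.rfind(",")
    if last_dot == -1 and last_comma == -1:
        # No separator at all: a plain whole number.
        return Decimal(cleaned)
    # The rightmost separator is taken to be the decimal separator.
    if last_comma > last_dot:
        decimal_sep, group_sep = ",", "."
    else:
        decimal_sep, group_sep = ".", ","
    # Drop grouping separators, then normalize the decimal separator to '.'.
    normalized = cleaned.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(normalized)


point_style = parse_amount("1,234.56")
comma_style = parse_amount("1.234,56")
```

Decimal is used instead of float so that currency amounts are not subject to binary rounding error.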