Free Download - AI Handwritten Text Recognition (HTR) Tool

Batch transcribe historical handwritten documents!

The Genealogy Assistant AI Handwritten Text Recognition Tool is a free cross-platform application designed to transcribe collections of historical documents into an easier-to-read format. It serves as a front-end to multiple AI APIs (OpenAI, Anthropic’s Claude, OpenRouter, and Google Gemini), allowing you to convert image files and PDFs into searchable, transcribed documents with or without the source image attached, as well as plain text (TXT) or CSV files. You can also create easily printable and shareable merged PDF files.

This tool can transcribe thousands of images in a single batch without user intervention. It is not meant to provide 100% accurate transcriptions, as AI transcription is still imperfect, but it is designed to make large collections of documents more readable at a glance.

In our testing, transcribing 500 pages of the Upper Canada Sundries cost between $2 and $4 USD using OpenAI’s o4-mini-high model, which offers considerable savings over other paid services when dealing with large numbers of images. Many free models are also supported.
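As a rough illustration of the arithmetic behind these figures, the sketch below converts the test result into a per-page cost and scales it to another batch size. It assumes cost grows linearly with page count, which is a simplification: actual charges depend on the model, the size of each image, and the provider’s current pricing. The function names are illustrative and not part of the application.

```python
from typing import Tuple


def cost_per_page(total_cost_usd: float, pages: int) -> float:
    """Return the average cost of one page for a batch."""
    if pages <= 0:
        raise ValueError("pages must be a positive number")
    return total_cost_usd / pages


def estimate_batch_cost(
    pages: int, low_per_page: float, high_per_page: float
) -> Tuple[float, float]:
    """Scale a per-page cost range to a batch (assumes linear scaling)."""
    return pages * low_per_page, pages * high_per_page


# Figures from the test above: 500 pages cost between $2 and $4 USD.
LOW_PER_PAGE = cost_per_page(2.00, 500)
HIGH_PER_PAGE = cost_per_page(4.00, 500)

if __name__ == "__main__":
    low, high = estimate_batch_cost(2000, LOW_PER_PAGE, HIGH_PER_PAGE)
    print(f"Estimated cost for 2000 pages: ${low:.2f} to ${high:.2f} USD")
```

At these rates a page costs roughly $0.004 to $0.008, so a 2,000-page collection would come to about $8 to $16 USD under the same assumptions.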

From within the application you can customize the API you would like to use, as well as the prompt, model and parameters, enabling you to fine-tune how your images are processed. You can also enable multi-threading to have the tool work on more than one image at a time.

Download for Windows | Download for macOS | Download for Linux | GitHub Source + Python Version

IMPORTANT NOTE: These open-source compiled applications are unsigned PyInstaller binaries. You may need to allow them manually in Windows Defender/SmartScreen or in macOS Privacy & Security settings, and third-party security software may flag them as a false positive. If you have trouble opening the binaries, consider using the Python version.

Key Features:

Simple Drag-and-Drop Interface: No need for complicated command-line tools. Simply drag and drop your image files directly into the application window to add them to the queue.
Efficient Batch Processing: This tool can transcribe multiple documents simultaneously, so you can speed up the process of working on large batches of images.
Multiple AI Providers: Choose from OpenAI, Anthropic’s Claude, Google Gemini, or any OpenRouter model, with automatic fallback to a secondary model if the primary model refuses or fails.
Many Output Formats: You can export your transcriptions as PDFs with or without the source image attached, plain text (TXT) files, or a merged CSV file.
Highly Customizable Settings: Fine-tune the AI’s transcription behaviour to match your specific transcription needs with adjustable prompts and parameters such as temperature and token limits.
Instantly Searchable: Your documents are converted into searchable transcriptions that are more easily readable.

Getting Started:

Getting an API Key

You will need an API key from the provider of your choice to use this application. You can get an API key directly from the provider websites listed below. Remember to store it in a safe location and set a budget you are comfortable with.

OpenRouter: https://openrouter.ai/keys
Google: https://makersuite.google.com/app/apikey
OpenAI: https://platform.openai.com/api-keys
*Note: You must verify your OpenAI account under Organization > General on their API Dashboard.
Anthropic: https://console.anthropic.com/
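
If you also use your key in your own scripts, one common way to keep it out of source files is to read it from an environment variable. The sketch below shows that general pattern; it is not how the application stores keys, and the variable name HTR_API_KEY is hypothetical.

```python
import os


def load_api_key(env_var: str = "HTR_API_KEY") -> str:
    """Read an API key from the environment.

    HTR_API_KEY is a hypothetical variable name chosen for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key is safer to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Masking the key before printing it helps avoid leaking the full value into logs or screenshots.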

First-Time Configuration

Set Your API Key: Enter the API key for your chosen provider (OpenAI, Anthropic, Google, or OpenRouter) by clicking the “API Settings” button in the top right corner. You can select which provider to use from the drop-down menu at the top of the application.
Personalize Your Settings (Optional): The default settings work well for most users. To make detailed adjustments, click “API Settings” to open the advanced configuration options.

Transcribing Your Document

Add Files: Drag any number of images that you would like to transcribe directly into the application’s main window, or click the folder icon to browse and select your files manually.
Review and Manage Files: Your chosen files will display in a list in the main application window under the section Selected Files. Use the buttons below to remove any files you no longer want or clear the entire selection at once.
Choose Output Format: Select from individual PDFs with or without images, merged PDF files, plain text (TXT) or CSV files from the Output Format dropdown menu located at the top of the application in the Transcription Configuration section.
Choose Output Folder: By default, your transcribed files are saved to the same folder as the source images. You can change where files are saved by clicking the Choose Output Location button in the main application window.
Start Transcription: Click the green Process Files button to begin your transcription. A progress indicator shows real-time status, and you can use the buttons to view detailed logs or cancel at any time.
Access Your Results: Upon completion, a summary of successfully transcribed files appears. Your transcribed files are saved to the source folder of your images, or to the output location you selected.
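
To illustrate what a merged CSV of transcriptions might look like, the sketch below writes one row per source image. The column names (filename, transcription) and the sample rows are assumptions made up for this example; the application’s actual CSV layout may differ.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple


def write_merged_csv(rows: Iterable[Tuple[str, str]], out: TextIO) -> int:
    """Write (filename, transcription) pairs as CSV and return the row count."""
    writer = csv.writer(out)
    writer.writerow(["filename", "transcription"])  # hypothetical header
    count = 0
    for filename, text in rows:
        writer.writerow([filename, text])
        count += 1
    return count


# Made-up sample data; the second entry contains a line break,
# which the csv module quotes automatically.
sample = [
    ("page_001.jpg", "First page of sample text."),
    ("page_002.jpg", "Line one\nLine two"),
]
buffer = io.StringIO()
rows_written = write_merged_csv(sample, buffer)
```

When writing to a real file instead of an in-memory buffer, open it with newline="" as the csv module documentation recommends, so that line breaks inside transcriptions are preserved.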

Advanced Settings

Configure the primary and secondary prompts, models and parameters for each provider via the API Settings button in the main application window.

Primary Models: Each provider uses optimized primary models by default (OpenAI: o4-mini-high, Claude: claude-3-5-sonnet-20241022, Google: gemini-2.5-flash-lite-preview-06-17, OpenRouter: google/gemini-2.5-flash-lite-preview-06-17), with a maximum of 8000 tokens per image.
Fallback Models: Each provider has configured fallback models for reliability (OpenAI: gpt-4o, Claude: claude-3-5-haiku-20241022, Google: gemini-2.0-flash-lite, OpenRouter: gpt-4o), with a maximum of 8000 tokens per image and optimized parameters.
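
The primary and fallback arrangement above can be pictured as a simple loop that tries each model in order. This is a minimal sketch of the idea, not the application’s actual code: transcribe_with_fallback and fake_transcribe are hypothetical names, and the stand-in function simulates a failure instead of calling any real API.

```python
from typing import Callable, Optional, Sequence, Tuple


def transcribe_with_fallback(
    image_path: str,
    models: Sequence[str],
    transcribe: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model, text) from the first success."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            text = transcribe(image_path, model)
        except Exception as exc:  # request failed; move on to the next model
            last_error = exc
            continue
        if text.strip():
            return model, text
        # An empty reply is treated as a failure as well.
        last_error = RuntimeError(f"{model} returned an empty transcription")
    raise RuntimeError(f"All models failed for {image_path}") from last_error


def fake_transcribe(image_path: str, model: str) -> str:
    """Stand-in for a real API call: the primary model always fails here."""
    if model == "o4-mini-high":
        raise RuntimeError("simulated refusal")
    return f"[{model}] transcription of {image_path}"


used_model, text = transcribe_with_fallback(
    "page_001.jpg", ["o4-mini-high", "gpt-4o"], fake_transcribe
)
```

In this run the simulated primary model fails, so the result comes from the second model in the list.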

Process more than one image at once by changing the Number of threads in the Transcription Configuration section.

1 thread: Safest method, processing one file at a time.
2-3 threads: Recommended balance for speed and reliability.
4+ threads: Faster but may risk hitting API rate limits.
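
For readers curious how a thread count translates into parallel work, here is a minimal sketch using Python’s standard ThreadPoolExecutor. It is an illustration under assumptions, not the application’s implementation: transcribe_batch and placeholder_transcribe are hypothetical names, and the placeholder returns a fixed string where a real tool would call a provider’s API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence


def transcribe_batch(
    paths: Sequence[str],
    transcribe_one: Callable[[str], str],
    threads: int = 2,
) -> Dict[str, str]:
    """Process files with a fixed number of worker threads.

    With threads=1 the files are handled one at a time; higher values
    send more requests in parallel and are more likely to hit rate limits.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the same order as the input paths.
        texts = list(pool.map(transcribe_one, paths))
    return dict(zip(paths, texts))


def placeholder_transcribe(path: str) -> str:
    """Stand-in for a real API request."""
    return f"transcription of {path}"


results = transcribe_batch(
    ["a.jpg", "b.jpg", "c.jpg"], placeholder_transcribe, threads=3
)
```

Because each request spends most of its time waiting on the network, threads can overlap that waiting, which is why a small number of threads often speeds up a batch noticeably.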

Tips and Troubleshooting

Quality counts: Higher quality images improve transcription accuracy. Ensure you are using the best available files for optimal results.
Check your credits: Always verify your API provider account balance and the validity of your API key before you start, to avoid interruptions while processing.
Start with small batches: Begin with a few images to ensure everything is working as expected before tackling larger batches.

Command Line Version

If you would prefer to use the command-line Python version, clone or download the source from GitHub and follow the included README file.

Report a Bug or Suggest a Feature

If you are experiencing an issue with the application or you have a feature to suggest, use our request form.
