How Paperless-ngx Simplified My Workflow

In 2019, I wrote a Python tool to extract text from images. To improve the extraction, the tool tried to rotate each image so that as much text as possible ran horizontally. I built it because decent duplex document scanners are expensive and because I wanted to avoid organizing hard-copy documents manually. A major challenge with organizing paper is that a single document often belongs to several categories but can only sit in one physical folder.
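
The rotation step described above can be sketched with a projection profile. This is not the code of the original tool, only a minimal illustration of the idea: it assumes the image has already been reduced to a list of dark-pixel coordinates, and the function names `row_profile_score` and `estimate_rotation` are made up for this example.

```python
import math


def row_profile_score(points, angle_deg):
    """Score how well dark pixels fall into horizontal rows after rotating by angle_deg."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    rows = {}
    for x, y in points:
        # Only the rotated y coordinate matters for a horizontal projection.
        rotated_y = round(x * sin_t + y * cos_t)
        rows[rotated_y] = rows.get(rotated_y, 0) + 1
    # Few, densely filled rows give a larger sum of squares than many sparse ones.
    return sum(count * count for count in rows.values())


def estimate_rotation(points, max_angle=15.0, step=0.5):
    """Return the candidate angle in degrees with the highest row profile score."""
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda angle: row_profile_score(points, angle))
```

Rotating the image by the returned angle should bring the text lines close to horizontal. The search range and step size are arbitrary defaults that would need tuning for real scans.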

My solution was to stamp each document with a numbering stamp and store all documents in the same folder. Whenever I needed something, I would search for it digitally and quickly find the document’s number.
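
As a rough sketch of that lookup, assume the extracted text of every document is kept in a dictionary keyed by its stamp number. The index contents and the name `find_stamp_numbers` are hypothetical and only illustrate why a document no longer needs a single category.

```python
def find_stamp_numbers(index, query):
    """Return the stamp numbers of all documents whose text contains every query word."""
    words = query.lower().split()
    return sorted(
        number
        for number, text in index.items()
        if all(word in text.lower() for word in words)
    )


# Hypothetical index: stamp number -> text extracted from the scan.
index = {
    17: "Car insurance policy, annual premium 2019",
    18: "Invoice for winter tires, car workshop",
    19: "Health insurance statement",
}

car_documents = find_stamp_numbers(index, "car")
insurance_documents = find_stamp_numbers(index, "insurance")
```

Document 17 shows up in both searches, which is exactly the case a single physical folder per category cannot handle.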

Then, I began self-hosting and stumbled across Paperless-ngx.

Paperless-ngx

Paperless-ngx is an open-source document management system. It performs text recognition and lets you search for documents based on their content. Additionally, you can assign titles, tags, and even an archive serial number (ASN) to documents, which makes it a perfect fit for my numbering-stamp approach.

One important aspect of self-hosting Paperless-ngx is that the uploaded data remains on my server. Each document is stored in both its original and its enhanced version, which is crucial if a better tool becomes available or this one becomes obsolete.

For a full list of features, check out the official website.

Workflow

A typical workflow involves scanning a document, uploading it to the consume folder of Paperless, waiting for it to process, and then using the web UI for document classification. The title, ASN, tags, correspondent, etc., can be set there. Depending on the configuration, new documents receive a default tag indicating that they are still in the inbox and need to be handled. Once everything for a document is set, this tag can be removed.
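
The upload step can be as simple as copying a file into the consume folder. The following is a minimal sketch, assuming the consume folder is reachable as a local or mounted directory. The helper name `send_to_consume` and the rename scheme for name collisions are my own choices for this example.

```python
import shutil
from pathlib import Path


def send_to_consume(scan, consume_dir):
    """Copy a scanned file into the consume folder without overwriting an existing file."""
    scan = Path(scan)
    consume_dir = Path(consume_dir)
    if not scan.is_file():
        raise FileNotFoundError(scan)
    if not consume_dir.is_dir():
        raise NotADirectoryError(consume_dir)

    target = consume_dir / scan.name
    counter = 1
    # If a file with the same name is still waiting to be consumed, pick a new name.
    while target.exists():
        target = consume_dir / f"{scan.stem}_{counter}{scan.suffix}"
        counter += 1

    shutil.copy2(scan, target)
    return target
```

A call could look like `send_to_consume("scan_0042.pdf", "/path/to/paperless/consume")`, where both paths are placeholders for your own setup.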

Plugins

To improve the workflow of Paperless, several interesting plugins are available. With the rise of AI, I came across two tools that utilize AI for Paperless. A key requirement for such plugins is the ability to run the LLM locally, as Paperless-ngx hosts sensitive data that I do not want to expose.

For a detailed comparison of these tools, check out this Reddit thread, where the creator of Paperless-GPT compares it to Paperless-AI. The creator of Paperless-AI also comments in the thread.

Paperless-GPT

Paperless-GPT aims to improve OCR results using LLMs. The LLM can correct simple OCR mistakes, such as turning "New Yark" into "New York". With this improved text, the plugin can automatically set the title and tags for the document. Another useful feature is that processing can be triggered automatically for new documents with one tag or manually with another. In manual mode, you must approve the suggested changes before they are applied.

Paperless-AI

This plugin also analyzes documents and uses LLMs to automatically assign tags. Its standout feature is the ability to chat with your documents, which is handy for quickly looking up information, such as details about your car. Like Paperless-GPT, it supports both manual and automatic modes.

QuickScan

Since I do not own a scanner and document scanners are expensive, I explored alternatives and discovered QuickScan. I decided to try it and was impressed. It can export documents via WebDAV directly to the server and offers many customization options. One of its most convenient features is export favorites, which let you define recurring tasks, such as uploading documents to Paperless. Since Paperless also stores documents for my kids, I have separate export favorites for them, which automatically store their documents in a different path within the Paperless system.

Another aspect I like is that the app is not subscription-based. I do not mind paying for software, but I prefer a one-time purchase over increasing my monthly expenses. QuickScan is free to use but accepts donations. Donors receive rewards, such as the option to change the app’s color, but all core functionality is available to everyone.

Final Thoughts

I am glad I found a successor to my self-written Python app. The web UI is excellent, and because Paperless-ngx is a popular self-hosted service, plenty of help is available online. The workflow I have developed is efficient and makes the scanning process very convenient. Here are my typical steps:

  1. Gather a batch of documents to scan.
  2. Stamp the next document with a numbering stamp.
  3. Scan it and send it to the consume folder using QuickScan.
  4. Add the document to a binder if it needs to be kept as a physical copy. Otherwise, discard it.
  5. Repeat steps 2 to 4 until all documents are in Paperless-ngx.
  6. Use the Paperless-ngx web UI to configure the documents by adding titles, ASNs, etc.
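
The last step can also be scripted instead of done in the web UI. The sketch below only builds the HTTP request and does not send it. It assumes the document endpoint of the Paperless-ngx REST API (`/api/documents/<id>/`), token authentication, and the fields `title` and `archive_serial_number`, so check the API documentation of your installed version before relying on it. The base URL, token, and document ID are placeholders, and `build_metadata_request` is a name invented for this example.

```python
import json
import urllib.request


def build_metadata_request(base_url, token, document_id, title, asn):
    """Build a PATCH request that sets the title and ASN of one document."""
    payload = {"title": title, "archive_serial_number": asn}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/documents/{document_id}/",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            # Assumed auth scheme: an API token created for your Paperless-ngx user.
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, replace them with your own server, token, and document ID.
request = build_metadata_request(
    "https://paperless.example.org", "YOUR_TOKEN", 123, "Car insurance 2025", 42
)
```

Passing `request` to `urllib.request.urlopen` would send it to the server. Here the ASN would be the number from the stamp in step 2.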

When I discovered the plugins, I was excited to use them to further improve the digitization process. However, since I do not have a powerful computer and my initial local testing did not meet my expectations, I decided to wait. There is a lot of ongoing development in AI tools, and better models are becoming increasingly available.

Enjoyed the post? Have questions or feedback? I'd love to hear from you! Feel free to drop me an email at blog@jerey.at.

How Paperless-ngx Simplified My Workflow
https://jerey.at/posts/paperless-ngx/
Author
Anton A. Jerey
Published at
2025-04-02