We had an old CanoScan 5600F scanner lying around and wanted it to scan to a file share at the press of a button.
With a Raspberry Pi connected to the scanner via USB, scanbd, and a small script, we were able to achieve this.
graph LR
A[Press Scan button] --> B[Scan to Image]
B --> C[Convert Image to PDF]
C --> D[Run OCR]
D --> E[Output Document]
Required Packages
We need to install scanbd (Scanner Button Daemon) to act on button presses. Additionally, we need the SANE packages to detect the scanner.
sudo apt install sane sane-utils scanbd
Configuration
Copy the sane configuration to the scanbd configuration.
cp -r /etc/sane.d/* /etc/scanbd/sane.d/
Modify /etc/sane.d/dll.conf so that net is the only backend that is not commented out.
# genesys
net
# canon
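The edit to dll.conf can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name only_backend is hypothetical and not part of SANE or scanbd. It uncomments the chosen backend if needed and comments out every other active line, leaving existing comments and blank lines alone.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SANE): leave exactly one backend active
# in a dll.conf-style file. Assumes GNU sed (-E and in-place editing).
# The four sed expressions, in order:
#   1. uncomment the wanted backend if it is commented out
#   2. skip lines that are already comments or blank
#   3. skip the wanted backend itself
#   4. comment out everything else
only_backend() {
    local conf="$1" keep="$2"
    sed -E -i \
        -e "s/^[[:space:]]*#[[:space:]]*(${keep})[[:space:]]*\$/\1/" \
        -e "/^[[:space:]]*(#|\$)/b" \
        -e "/^[[:space:]]*${keep}[[:space:]]*\$/b" \
        -e "s/^/# /" \
        "$conf"
}

# Usage (as root, ideally after backing up the file):
#   only_backend /etc/sane.d/dll.conf net
```

Keeping the logic in a function makes it easy to try on a copy of the file before touching the real configuration.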
Test if the scanner is detected
root@scanner:/opt/insaned# SANE_CONFIG_DIR=/etc/scanbd scanimage -L
device 'genesys:libusb:001:004' is a Canon CanoScan 5600F flatbed scanner
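If other scripts need the detected device name, it can be parsed from the output of scanimage -L. This is a minimal sketch; the function name first_sane_device is hypothetical, and the pattern assumes the output format shown above (some SANE versions open the name with a backtick instead of a quote, which is handled as well).

```bash
#!/usr/bin/env bash
# Hypothetical helper: read the output of "scanimage -L" on stdin and print
# the name of the first listed device.
first_sane_device() {
    # Capture the text between the opening quote (or backtick) and the
    # closing quote of each "device ..." line, then keep the first match.
    sed -n "s/^device [\`']\([^']*\)'.*/\1/p" | head -n 1
}

# Usage on the Pi (requires the scanner to be attached):
#   SANE_CONFIG_DIR=/etc/scanbd scanimage -L | first_sane_device
```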
Start & enable the scanbd service
sudo systemctl enable --now scanbd
Edit the button configuration in /etc/scanbd/scanbd.conf.
The scan action runs the script. You can change either the script path or the script's content.
action scan {
    filter = "^scan.*"
    numerical-trigger {
        from-value = 1
        to-value = 0
    }
    desc = "Scan to file"
    script = "/usr/local/bin/scan-to-share"
}
At the bottom of the file, include the scanner's device configuration:
# devices
# each device can have actions and functions, you can disable not relevant devices
include(scanner.d/canon.conf)
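Before wiring up the real scan script, it can help to confirm that scanbd reacts to the button at all. The sketch below logs each trigger. It assumes the environment block of the stock scanbd.conf is unchanged, so that scanbd exports SCANBD_ACTION and SCANBD_DEVICE to the script; the function name log_trigger and the log path in the usage comment are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append one line per button press to a log file.
# SCANBD_ACTION and SCANBD_DEVICE are assumed to be set by scanbd (names as
# configured in the environment block of the default scanbd.conf).
log_trigger() {
    local log_file="$1"
    printf '%s action=%s device=%s\n' \
        "$(date +'%Y-%m-%d %H:%M:%S')" \
        "${SCANBD_ACTION:-unknown}" \
        "${SCANBD_DEVICE:-unknown}" >> "$log_file"
}

# Usage inside a temporary script referenced by "script = ...":
#   log_trigger /tmp/scanbd-trigger.log
```

If nothing is logged after a button press, running scanbd in the foreground is the next thing to check.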
Debugging
systemctl stop scanbd
SANE_CONFIG_DIR=/etc/scanbd scanbd -f
More verbose:
systemctl stop scanbd
SANE_CONFIG_DIR=/etc/scanbd scanbd -f -d7
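Scan script
Create the script at /usr/local/bin/scan-to-share, the path configured in scanbd.conf, and make it executable.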
#!/usr/bin/env bash
set -xeo pipefail
log_file="/var/scans/scan.log"
echo "Starting script" | tee -a "$log_file"
# Set the image scanning parameters
resolution=300
file_ending=jpg
format=jpeg
mode=color
file_data=$(date +'%Y_%m_%d-%H_%M_%S')
filename="$file_data.$file_ending"
temp_path="/tmp/$filename"
dest_path="/var/scans/scanned/$file_data.pdf"
echo "Destination path \"$dest_path\"" | tee -a "$log_file"
echo "Starting scan with resolution $resolution, format $format & mode $mode" | tee -a "$log_file"
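# Point SANE at the scanbd configuration directory
export SANE_CONFIG_DIR=/etc/scanbd
# Scan to a temporary image, then wrap it in a PDF
scanimage --format "$format" --resolution="$resolution" --mode "$mode" -v -p > "$temp_path"
img2pdf "$temp_path" -o "$dest_path"
rm "$temp_path"
# Make the PDF accessible to every user of the share
chmod 777 "$dest_path"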
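Because the OCR job processes every PDF in /var/scans/scanned, a file that is still being written could in principle be picked up too early. One way to avoid this, sketched here on the assumption that the race matters in your setup, is to write the file under a hidden temporary name in the destination directory and rename it once it is complete. The function name publish_file is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy a finished file to its destination under a hidden
# temporary name, then rename it. A rename within one filesystem is atomic,
# so a job globbing for *.pdf never sees a partially written file.
publish_file() {
    local src="$1" dest="$2"
    local dest_dir dest_name tmp
    dest_dir=$(dirname "$dest")
    dest_name=$(basename "$dest")
    tmp="$dest_dir/.$dest_name.part"
    cp "$src" "$tmp"
    mv "$tmp" "$dest"
}

# Usage in the scan script (the intermediate PDF path is an assumption):
#   img2pdf "$temp_path" -o "/tmp/$file_data.pdf"
#   publish_file "/tmp/$file_data.pdf" "$dest_path"
```

The temporary name starts with a dot and ends in .part, so the *.pdf pattern used by the OCR script does not match it.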
OCR Script
We want to separate the scanning script from the OCR script because scanbd waits for the scan script to finish before it can run the next script. While the script is running, the scanner is blocked.
Create a file at /usr/local/bin/scan-ocr.
#!/usr/bin/env bash
set -xeo pipefail
log_file="/var/scans/ocr.log"
local_scans_dir="/var/scans/scanned"
local_ocr_dir="/var/scans/ocr"
tesseract_language="deu"
if [ ! -d "$local_scans_dir" ]; then
    echo "Error: Local scans directory $local_scans_dir does not exist."
    exit 1
fi
if [ ! -d "$local_ocr_dir" ]; then
    echo "Error: Local OCR directory $local_ocr_dir does not exist."
    exit 1
fi
ls -la "$local_scans_dir"
for file in "$local_scans_dir"/*.pdf; do
    # With no matching files the pattern stays literal, so skip it
    [ -e "$file" ] || continue
    name=$(basename "$file")
    new_path="$local_ocr_dir/$name"
    if ! [ -f "$new_path" ]; then
        echo "Starting OCR on $file to $new_path" | tee -a "$log_file"
        ocrmypdf -l "$tesseract_language" --force-ocr "$file" "$new_path" && rm "$file"
    fi
done
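To see which files the OCR script would process without actually running ocrmypdf, the selection logic of the loop can be isolated. A minimal dry-run sketch; the function name pending_scans is hypothetical, and it applies the same rule as the script: a scan is pending when no file with the same name exists in the OCR directory.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper: print every scanned PDF that has no OCR
# result with the same name yet.
pending_scans() {
    local scans_dir="$1" ocr_dir="$2" file
    for file in "$scans_dir"/*.pdf; do
        # Without matches the pattern stays literal, so skip missing paths.
        [ -e "$file" ] || continue
        [ -f "$ocr_dir/$(basename "$file")" ] || printf '%s\n' "$file"
    done
}

# Usage:
#   pending_scans /var/scans/scanned /var/scans/ocr
```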
systemd Service and Timer
To run the OCR script periodically, we can use a systemd timer.
Service
Create a new service file at /etc/systemd/system/scan-ocr.service
[Unit]
Description=OCR for Scans
[Service]
Type=simple
ExecStart=/usr/local/bin/scan-ocr
Timer / Cron
Create a new timer file at /etc/systemd/system/scan-ocr.timer:
[Unit]
Description=Runs the OCR every minute
[Timer]
OnBootSec=1min
OnUnitActiveSec=1min
[Install]
WantedBy=timers.target
Enable & Start the timer
sudo systemctl daemon-reload
sudo systemctl enable --now scan-ocr.timer
Verify
sudo systemctl status scan-ocr.timer
sudo systemctl status scan-ocr.service
To start the OCR service manually, use sudo systemctl start scan-ocr.service; to stop it, replace start with stop.
Access the logs with journalctl -u scan-ocr