Add "How To Use Your Email Client For Physical Mail" post
parent
faf474273e
commit
1adf37a91d
@ -1,2 +1,3 @@
|
|||||||
|
posts/use-your-mail-client-for-physical-mail/index.html
|
||||||
posts/simple-static-blog/index.html
|
posts/simple-static-blog/index.html
|
||||||
posts/introduction/index.html
|
posts/introduction/index.html
|
||||||
|
@ -0,0 +1,128 @@
|
|||||||
|
<!DOCTYPE HTML>
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>Use Your Email Client For Physical Mail</title>
|
||||||
|
<meta charset="UTF-8">
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<style type="text/css">
|
||||||
|
html {
|
||||||
|
font-family: Helvetica, Arial, sans-serif;
|
||||||
|
color: #5b4636;
|
||||||
|
background-color: #f4ecd8;
|
||||||
|
}
|
||||||
|
|
||||||
|
body {
|
||||||
|
padding: 1em;
|
||||||
|
margin: auto;
|
||||||
|
}
|
||||||
|
|
||||||
|
@media only all and (pointer: coarse), (pointer: none) {
|
||||||
|
body {
|
||||||
|
font-size: 5.5vmin;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@media only all and (pointer: fine) {
|
||||||
|
body {
|
||||||
|
font-size: calc(16px + 0.6vmin);
|
||||||
|
min-width: 500px;
|
||||||
|
max-width: 50em;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
aside {
|
||||||
|
width: 30%;
|
||||||
|
min-width: 10em;
|
||||||
|
background-color: rgba(0,0,0, 0.1);
|
||||||
|
float: right;
|
||||||
|
padding: 1em;
|
||||||
|
margin: 1em;
|
||||||
|
}
|
||||||
|
</style>
|
||||||
|
|
||||||
|
<body>
|
||||||
|
<a href="../../blog.html">Home</a>
|
||||||
|
<article>
|
||||||
|
<h1>How To Use Your Email Client For Physical Mail</h1>
|
||||||
|
<p>
|
||||||
|
Whether it's to re-read a conversation, find a plane ticket I ordered or check
|
||||||
|
when a meeting was planned, I often find myself looking up old emails. It's
|
||||||
|
usually easy to do so because email clients are designed for the task: many of
|
||||||
|
them support full-text search and some even complement that with advanced
|
||||||
|
tagging and categorization systems. To be honest, I have become completely
|
||||||
|
dependent on those features for my day-to-day work. Having full-text
|
||||||
|
search and some sort of categorization for mail can be a huge time
|
||||||
|
saver. Wouldn't it be nice if we had all of that functionality to deal with
|
||||||
|
physical mail as well? I thought it would, so I set out to find a way to
|
||||||
|
achieve just that. Turns out it's pretty simple!
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The main objective here is to transform our physical mail into an email
|
||||||
|
that can be received, indexed and read by our email client of choice. Now,
|
||||||
|
one way to do that would be to type the contents of our mail into an email
|
||||||
|
by hand, but <i>ain't nobody got time for that!</i> The (more appealing)
|
||||||
|
alternative is to use a document scanner. I have a single-purpose scanner
|
||||||
|
unit from Canon that I hook up to my laptop for just this purpose.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
It isn't as simple as just emailing a scanned document to ourselves
|
||||||
|
though: email clients are smart, but they can't understand a word of text
|
||||||
|
in our PDF or JPEG of a physical document. They need content to be in
|
||||||
|
plain text form in order to provide us with some of their best features
|
||||||
|
like full-text search. We'll have to somehow transform our scanned
|
||||||
|
documents into plain text that we can include in our email. To do this, we
|
||||||
|
can use tesseract. Tesseract is an optical character recognition (OCR)
|
||||||
|
engine, meaning that it can recognize text in images and extract it for
|
||||||
|
us. Installing it should be easy on Debian-derived distros like
|
||||||
|
Ubuntu. My laptop is running Debian unstable so I just ran <code>apt
|
||||||
|
install tesseract-ocr</code> and started using it. Using it is as easy as
|
||||||
|
opening a terminal and typing <code>tesseract FILE.jpg
|
||||||
|
OUTPUT</code>. That command will save all the text that tesseract is able
|
||||||
|
to recognize in the image FILE.jpg to a file called OUTPUT.txt.
|
||||||
|
</p>
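<p>
As a minimal sketch of that step, assuming tesseract is installed and on the
PATH: passing <code>stdout</code> as the output name makes tesseract print the
recognized text instead of writing OUTPUT.txt, which is convenient for
redirecting or piping. The function name <code>ocr_scan</code> is made up for
this example.
</p>

```shell
# ocr_scan is a hypothetical helper name, not part of tesseract.
# It prints the text recognized in each image given as an argument.
ocr_scan() {
    local image
    for image in "$@"; do
        # "stdout" as the output base makes tesseract print the text
        # instead of creating a .txt file.
        tesseract "$image" stdout || return $?
    done
}
```

<p>
Something like <code>ocr_scan page1.jpg page2.jpg &gt; letter.txt</code> would
then collect the text of a multi-page letter in a single file.
</p>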
|
||||||
|
|
||||||
|
<aside>
|
||||||
|
<i>
|
||||||
|
Side note: I am Dutch, so most of my physical mail is in Dutch. To
|
||||||
|
make tesseract better understand my mail I installed the
|
||||||
|
tesseract-ocr-nld package using <code>apt install
|
||||||
|
tesseract-ocr-nld</code>. You can check what other language packs are
|
||||||
|
available by using <code>apt search tesseract-ocr</code>.
|
||||||
|
</i>
|
||||||
|
</aside>
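<p>
Installing a language pack is not enough on its own; tesseract also has to be
told which language to use with the <code>-l</code> option. The sketch below
assumes the Dutch data (<code>nld</code>) is the one wanted, and both function
names are invented for illustration.
</p>

```shell
# has_ocr_language is a hypothetical helper: it succeeds when the given
# language code appears on its own line in `tesseract --list-langs`.
# Some tesseract versions print that list on stderr, hence the 2>&1.
has_ocr_language() {
    tesseract --list-langs 2>&1 | grep -qx "$1"
}

# ocr_scan_lang is also a made-up name: OCR one image in the given
# language and print the recognized text to standard output.
ocr_scan_lang() {
    local lang="$1" image="$2"
    if ! has_ocr_language "$lang"; then
        printf 'language data for "%s" is not installed\n' "$lang" >&2
        return 1
    fi
    tesseract "$image" stdout -l "$lang"
}
```

<p>
For a Dutch letter that would be <code>ocr_scan_lang nld letter.jpg</code>.
</p>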
|
||||||
|
|
||||||
|
<p>
|
||||||
|
All we have to do from there is copy-paste the contents of that file into
|
||||||
|
an email and send it to ourselves! Depending on the formatting of the
|
||||||
|
input document, the output may not always be pleasant to read. We can
|
||||||
|
account for this by including the original document as an attachment to
|
||||||
|
the email. That way we get the best of both worlds: we can use the search
|
||||||
|
functionality of our email client to find the document, and then read it
|
||||||
|
in its original form by opening the attachment.
|
||||||
|
</p>
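<p>
The same thing can be done from a terminal. This is only a sketch: it assumes
the <code>mail</code> command from GNU mailutils (other <code>mail</code>
implementations take different options) and a system that can already deliver
mail. The name <code>mail_scan</code> and its argument order are made up.
</p>

```shell
# mail_scan is a hypothetical helper built on GNU mailutils' `mail`.
# Usage: mail_scan RECIPIENT SUBJECT TEXT_FILE ATTACHMENT
mail_scan() {
    local recipient="$1" subject="$2" text_file="$3" attachment="$4"
    # The OCR text becomes the searchable body of the email;
    # the original scan is attached unchanged.
    mail --subject="$subject" --attach="$attachment" "$recipient" < "$text_file"
}
```

<p>
A call could look like <code>mail_scan user@example.com 'Tax letter' letter.txt
letter.jpg</code>. Without an explicit content type the attachment is sent with
a generic one, which most clients still open fine.
</p>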
|
||||||
|
|
||||||
|
<p>
|
||||||
|
This is all easy enough, but I'm lazy. I didn't feel like opening up my
|
||||||
|
email client and doing manual copy-pasting, so I decided to automate the
|
||||||
|
process a little further. I have postfix set up on my system to relay to my
|
||||||
|
mail server, so I can simply use the <code>mail</code> command to send emails without a
|
||||||
|
GUI mail client. I combined that with tesseract in a little bash
|
||||||
|
script. The script iterates through all of its arguments and interprets
|
||||||
|
them as filenames of scanned documents. It calls tesseract to extract text
|
||||||
|
from them, concatenates the results, attaches the files to an email and
|
||||||
|
sends it to my personal email address. Now all I have to do is run the
|
||||||
|
script with filenames of some documents and my job is done. If anyone is
|
||||||
|
interested in an actual program that does the same thing and doesn't
|
||||||
|
require you to set up postfix, let me know! I might consider authoring one
|
||||||
|
if it's useful to more people than just myself. The script I'm currently
|
||||||
|
using can be found <a href="scan-to-mailpile.bash.html">here (pretty)</a>
|
||||||
|
and <a href="scan-to-mailpile.bash">here (raw)</a>, but I don't recommend
|
||||||
|
using it if you don't fully understand its contents; it's not a polished
|
||||||
|
user experience 🤓.
|
||||||
|
</p>
|
||||||
|
</article>
|
||||||
|
</body>
|
||||||
|
</html>
|
@ -0,0 +1 @@
|
|||||||
|
Mon 17 Feb 2020 11:55:42 AM CET
|
@ -0,0 +1,57 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
if ! [[ $# -ge 1 ]]; then
|
||||||
|
echo 'Usage: scan-to-mailpile ...FILES' >&2
|
||||||
|
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
if ! type_output="$(type readlink mktemp pdftotext tesseract mail mimetype basename cat 2>&1)"; then
|
||||||
|
printf 'scan-to-mailpile: Some required commands are missing, lookup results:\n%s\n' \
|
||||||
|
"$type_output" >&2
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
tmpdir=$(mktemp -d) || exit $?
|
||||||
|
|
||||||
|
printf -v trap 'rm -vr %q' "$tmpdir"
|
||||||
|
trap "$trap" EXIT
|
||||||
|
|
||||||
|
printf 'Changing directory: '
|
||||||
|
pushd "$tmpdir" || exit $?
|
||||||
|
|
||||||
|
declare -a attachment_args=()
|
||||||
|
|
||||||
|
{
|
||||||
|
for file in "$@"; do
|
||||||
|
file="$(readlink -f "$file")" || exit $?
|
||||||
|
|
||||||
|
# Note: pdftotext will not work for scanned documents, so those should just be
|
||||||
|
# saved as image files before feeding them to this script.
|
||||||
|
##
|
||||||
|
# It will however work fine for other types of PDFs.
|
||||||
|
if [[ "$file" == *.pdf ]]; then
|
||||||
|
pdftotext "$file" /dev/fd/1 || exit $?
|
||||||
|
else
|
||||||
|
tesseract "$file" stdout || exit $?
|
||||||
|
fi
|
||||||
|
|
||||||
|
mime="$(mimetype -b "$file")" || exit $?
|
||||||
|
|
||||||
|
attachment_args+=(--content-type="$mime" --attach="$file")
|
||||||
|
done
|
||||||
|
} > ./outfile.txt
|
||||||
|
|
||||||
|
cat ./outfile.txt
|
||||||
|
|
||||||
|
file1="$(basename "$1")"
|
||||||
|
|
||||||
|
read -i "${file1%.*}" -rep 'What should the subject of the email be? ' subject
|
||||||
|
|
||||||
|
mail --subject="$subject" \
|
||||||
|
"${attachment_args[@]}" \
|
||||||
|
--content-type="text/plain" \
|
||||||
|
--content-filename="content.txt" \
|
||||||
|
user@example.com < ./outfile.txt
|
||||||
|
|
||||||
|
popd
|
@ -0,0 +1,129 @@
|
|||||||
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
|
||||||
|
<!-- Created by htmlize-1.56 in css mode. -->
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>scan-to-mailpile.bash</title>
|
||||||
|
<style type="text/css">
|
||||||
|
<!--
|
||||||
|
body {
|
||||||
|
color: #f6f3e8;
|
||||||
|
background-color: #242424;
|
||||||
|
}
|
||||||
|
.builtin {
|
||||||
|
/* font-lock-builtin-face */
|
||||||
|
color: #e5786d;
|
||||||
|
}
|
||||||
|
.comment {
|
||||||
|
/* font-lock-comment-face */
|
||||||
|
color: #99968b;
|
||||||
|
}
|
||||||
|
.comment-delimiter {
|
||||||
|
/* font-lock-comment-delimiter-face */
|
||||||
|
color: #99968b;
|
||||||
|
}
|
||||||
|
.flyspell-duplicate {
|
||||||
|
/* flyspell-duplicate */
|
||||||
|
text-decoration: underline;
|
||||||
|
}
|
||||||
|
.flyspell-incorrect {
|
||||||
|
/* flyspell-incorrect */
|
||||||
|
text-decoration: underline;
|
||||||
|
}
|
||||||
|
.keyword {
|
||||||
|
/* font-lock-keyword-face */
|
||||||
|
color: #8ac6f2;
|
||||||
|
font-weight: bold;
|
||||||
|
}
|
||||||
|
.negation-char {
|
||||||
|
}
|
||||||
|
.sh-escaped-newline {
|
||||||
|
/* sh-escaped-newline */
|
||||||
|
color: #95e454;
|
||||||
|
}
|
||||||
|
.sh-quoted-exec {
|
||||||
|
/* sh-quoted-exec */
|
||||||
|
color: #fa8072;
|
||||||
|
}
|
||||||
|
.string {
|
||||||
|
/* font-lock-string-face */
|
||||||
|
color: #95e454;
|
||||||
|
}
|
||||||
|
.variable-name {
|
||||||
|
/* font-lock-variable-name-face */
|
||||||
|
color: #cae682;
|
||||||
|
}
|
||||||
|
|
||||||
|
a {
|
||||||
|
color: inherit;
|
||||||
|
background-color: inherit;
|
||||||
|
font: inherit;
|
||||||
|
text-decoration: inherit;
|
||||||
|
}
|
||||||
|
a:hover {
|
||||||
|
text-decoration: underline;
|
||||||
|
}
|
||||||
|
-->
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<pre>
|
||||||
|
<span class="comment-delimiter"> #</span><span class="comment">!/bin/</span><span class="keyword">bash</span><span class="comment">
|
||||||
|
</span>
|
||||||
|
<span class="keyword"> if</span> <span class="negation-char">!</span> [[ $<span class="variable-name">#</span> -ge 1 ]]; <span class="keyword">then</span>
|
||||||
|
<span class="builtin">echo</span> <span class="string">'Usage: scan-to-mailpile ...FILES'</span> >&2
|
||||||
|
|
||||||
|
<span class="keyword">exit</span> 1
|
||||||
|
<span class="keyword"> fi</span>
|
||||||
|
|
||||||
|
<span class="keyword"> if</span> <span class="negation-char">!</span> <span class="variable-name">type_output</span>=<span class="string">"$(</span><span class="sh-quoted-exec">type</span><span class="string"> readlink mktemp pdftotext tesseract mail mimetype basename cat 2>&1)"</span>; <span class="keyword">then</span>
|
||||||
|
<span class="builtin">printf</span> <span class="string">'scan-to-mailpile: Some required commands are missing, lookup results:\n%s\n'</span> <span class="sh-escaped-newline">\</span>
|
||||||
|
<span class="string">"$type_output"</span> >&2
|
||||||
|
<span class="keyword">exit</span> 1
|
||||||
|
<span class="keyword"> fi</span>
|
||||||
|
|
||||||
|
<span class="variable-name"> tmpdir</span>=$(<span class="sh-quoted-exec">mktemp</span> -d) || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
|
||||||
|
<span class="builtin"> printf</span> -v trap <span class="string">'rm -vr %q'</span> <span class="string">"$tmpdir"</span>
|
||||||
|
<span class="keyword"> trap</span> <span class="string">"$trap"</span> EXIT
|
||||||
|
|
||||||
|
<span class="builtin"> printf</span> <span class="string">'Changing directory: '</span>
|
||||||
|
<span class="builtin"> pushd</span> <span class="string">"$tmpdir"</span> || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
|
||||||
|
<span class="builtin"> declare</span> -a <span class="variable-name">attachment_args</span>=()
|
||||||
|
|
||||||
|
{
|
||||||
|
<span class="keyword">for</span> file<span class="keyword"> in</span> <span class="string">"$@"</span>; <span class="keyword">do</span>
|
||||||
|
<span class="variable-name">file</span>=<span class="string">"$(</span><span class="sh-quoted-exec">readlink</span><span class="string"> -f "$file")"</span> || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
|
||||||
|
<span class="comment-delimiter"># </span><span class="comment">Note: </span><span class="comment"><span class="flyspell-duplicate">pdftotext</span></span><span class="comment"> will not work for scanned documents, so those should just be
|
||||||
|
</span> <span class="comment-delimiter"># </span><span class="comment">saved as image files before feeding them to this script.
|
||||||
|
</span> <span class="comment-delimiter">##</span><span class="comment">
|
||||||
|
</span> <span class="comment-delimiter"># </span><span class="comment">It will however work fine for other types of </span><span class="comment"><span class="flyspell-incorrect">PDFs</span></span><span class="comment">.
|
||||||
|
</span> <span class="keyword">if</span> [[ <span class="string">"$file"</span> == *.pdf ]]; <span class="keyword">then</span>
|
||||||
|
pdftotext <span class="string">"$file"</span> /dev/fd/1 || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
<span class="keyword">else</span>
|
||||||
|
tesseract <span class="string">"$file"</span> stdout || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
<span class="keyword">fi</span>
|
||||||
|
|
||||||
|
<span class="variable-name">mime</span>=<span class="string">"$(</span><span class="sh-quoted-exec">mimetype</span><span class="string"> -b "$file")"</span> || <span class="keyword">exit</span> $<span class="variable-name">?</span>
|
||||||
|
|
||||||
|
<span class="variable-name">attachment_args</span>+=(--content-type=<span class="string">"$mime"</span> --attach=<span class="string">"$file"</span>)
|
||||||
|
<span class="keyword">done</span>
|
||||||
|
} > ./outfile.txt
|
||||||
|
|
||||||
|
cat ./outfile.txt
|
||||||
|
|
||||||
|
<span class="variable-name"> file1</span>=<span class="string">"$(</span><span class="sh-quoted-exec">basename</span><span class="string"> "$1")"</span>
|
||||||
|
|
||||||
|
<span class="builtin"> read</span> -i <span class="string">"${file1%.*}"</span> -rep <span class="string">'What should the subject of the email be? '</span> subject
|
||||||
|
|
||||||
|
mail --subject=<span class="string">"$subject"</span> <span class="sh-escaped-newline">\</span>
|
||||||
|
<span class="string">"${attachment_args[@]}"</span> <span class="sh-escaped-newline">\</span>
|
||||||
|
--content-type=<span class="string">"text/plain"</span> <span class="sh-escaped-newline">\</span>
|
||||||
|
--content-filename=<span class="string">"content.</span><span class="string"><span class="flyspell-duplicate">txt</span></span><span class="string">"</span> <span class="sh-escaped-newline">\</span>
|
||||||
|
user@example.com < ./outfile.txt
|
||||||
|
|
||||||
|
<span class="builtin">popd</span>
|
||||||
|
</pre>
|
||||||
|
</body>
|
||||||
|
</html>
|