<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Hugot Blog</title>
<link>https://hugot.nl/blog.html</link>
<description>Hugo's personal blog</description>
<language>en-us</language>
<pubDate>Thu, 16 Apr 2020 08:36:12 +0200</pubDate>
<lastBuildDate>Thu, 16 Apr 2020 08:36:12 +0200</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<generator>Hugo's Custom Bash Script</generator>
<managingEditor>social@hugot.nl (Hugot)</managingEditor>
<webMaster>infra@hugot.nl (Hugot Infra)</webMaster>
<item><title> How To Use Your Email Client For Physical Mail </title><link>https://hugot.nl/posts/use-your-mail-client-for-physical-mail/index.html</link><description> <h1>
How To Use Your Email Client For Physical Mail
</h1>
<p>
Whether it&#39;s to re-read a conversation, find a plane ticket I ordered or
check when a meeting was planned, I often find myself looking up old
emails. It&#39;s usually easy to do so because email clients are designed for
the task: many of them support full-text search, and some even complement
that with neat tagging and categorization systems. To be honest, I have
become completely dependent on those features in my day-to-day
life. Having full-text search and some sort of categorization for email
can be a huge time saver. When it comes to physical mail, however, I still
have to browse through stacks of paper to (hopefully) find what I&#39;m
looking for. I figured that it&#39;d be nice to use my fancy email client to
deal with physical mail as well, so I found a way to do just that. Turns
out it&#39;s pretty simple!
</p>
<p>
The main objective here is to transform our physical mail into an email
that can be received, indexed and read by our email client of choice. Now,
one way to do that would be to type the contents of our mail into an email
by hand, but <i>ain&#39;t nobody got time for that!</i> The (more appealing)
alternative is to use a document scanner. I have a single-purpose scanner
unit from Canon that I hook up to my laptop for exactly this.
</p>
<p>
It isn&#39;t as simple as just emailing a scanned document to ourselves
though: email clients are smart, but they can&#39;t understand a word of the
text in a PDF or JPEG of a physical document. They need content to be in
plain text form in order to provide us with some of their best features,
like full-text search. We&#39;ll have to somehow transform our scanned
documents into plain text that we can include in our email. To do this, we
can use tesseract. Tesseract is an optical character recognition (OCR)
engine, meaning that it can recognize text in images and extract it for
us. Installing it should be easy on Debian and derivative distros like
Ubuntu. My laptop is running Debian unstable, so I just ran
<code>apt install tesseract-ocr</code> and started using it. Using it is as
easy as opening up a terminal and typing
<code>tesseract FILE.jpg OUTPUT</code>. That command will save all the text
that tesseract is able to recognize in the image FILE.jpg to a file called
OUTPUT.txt.
</p>
<aside>
<i>
Side note: I am Dutch, so most of my physical mail is in Dutch. To
help tesseract better understand my mail, I installed the
tesseract-ocr-nld package using <code>apt install tesseract-ocr-nld</code>.
You can check what other language packs are available by using
<code>apt search tesseract-ocr</code>.
</i>
</aside>
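<p>
As a small sketch of how an installed language pack is actually selected:
tesseract takes the language per run through its <code>-l</code> option and
defaults to English when none is given. The wrapper name
<code>ocr_lang</code> is made up for illustration and is not part of
tesseract.
</p>

```shell
# Hypothetical helper: ocr_lang LANG FILE OUTPUT
# Runs tesseract on FILE with language LANG and writes OUTPUT.txt.
ocr_lang() {
    tesseract "$2" "$3" -l "$1"
}

# Example (assumes tesseract-ocr-nld is installed):
#   ocr_lang nld FILE.jpg OUTPUT
# Languages tesseract can currently use:
#   tesseract --list-langs
```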
<p>
All we have to do from there is copy-paste the contents of that file into
an email and send it to ourselves! Depending on the formatting of the
input document, the output may not always be pleasant to read. We can
account for this by including the original document as an attachment to
the email. That way we get the best of both worlds: we can use the search
functionality of our email client to find the document, and then read it
in its original form by opening the attachment.
</p>
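<p>
The manual steps above (OCR each scan, put the text in a message, attach
the original) could be scripted roughly as follows. This is a hedged sketch
and not the script linked in this post: the function names are made up, and
it assumes a working local mail setup where <code>mail</code> is the GNU
Mailutils version, whose <code>-A</code> option attaches a file. Other
<code>mail</code> implementations use different attachment flags.
</p>

```shell
# Hypothetical sketch; function names are made up for illustration.
# Assumes tesseract is installed and `mail` is GNU Mailutils
# (-s sets the subject, -A attaches a file).

# Print the recognized text of each scan, with a header per file.
extract_text() {
    for scan in "$@"; do
        printf '=== %s ===\n' "$scan"
        # The output base "stdout" makes tesseract print the text
        # instead of writing it to OUTPUT.txt.
        tesseract "$scan" stdout
    done
}

# Usage: scan_to_mail RECIPIENT FILE...
scan_to_mail() {
    recipient=$1
    shift
    subject="Scanned mail: $*"
    body=$(extract_text "$@")
    # Turn the argument list FILE... into: -A FILE -A FILE ...
    for scan in "$@"; do
        set -- "$@" -A "$scan"
        shift
    done
    printf '%s\n' "$body" | mail -s "$subject" "$@" "$recipient"
}
```

<p>
Calling <code>scan_to_mail me@example.com letter-1.jpg letter-2.jpg</code>
would send a single message containing the text of both scans, with both
images attached. The address and filenames are placeholders.
</p>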
<p>
This is all easy enough, but I&#39;m lazy. I didn&#39;t feel like opening up my
email client and doing manual copy-pasting, so I decided to automate the
process a little further. I have postfix set up on my system to relay to my
mail server, so I can simply use the <code>mail</code> command to send
emails without a GUI mail client. I combined that with tesseract in a
little bash script. The script iterates through all of its arguments and
interprets them as filenames of scanned documents. It calls tesseract to
extract text from them, concatenates the results, attaches the files to an
email and sends it to my personal email address. Now all I have to do is
run the script with the filenames of some documents and my job is done. If
anyone is interested in an actual program that does the same thing and
doesn&#39;t require you to set up postfix, let me know! I might consider
authoring one if it&#39;s useful to more people than just myself. The script
I&#39;m currently using can be found
<a href="scan-to-mailpile.bash.html">here (pretty)</a> and
<a href="scan-to-mailpile.bash">here (raw)</a>, but I
don&#39;t recommend using it if you don&#39;t fully understand its contents: it&#39;s
not a polished user experience 🤓.
</p></description><pubDate>Mon, 17 Feb 2020 11:55:42 +0100</pubDate><guid isPermaLink="false"> How To Use Your Email Client For Physical Mail NDc2MDg1MjYxIDQxODUK</guid>
</item>
<item><title> Creating a Simple Static Blog </title><link>https://hugot.nl/posts/simple-static-blog/index.html</link><description></description><pubDate>Sat, 08 Feb 2020 12:14:16 +0100</pubDate><guid isPermaLink="false"> Creating a Simple Static Blog MjU5OTIyNDIwMyA2MTI5Cg==</guid>
</item>
<item><title> Introduction </title><link>https://hugot.nl/posts/introduction/index.html</link><description> <h1>
Introduction
</h1>
<p>
Hello, welcome to my blog! My name is Hugo. I am a 22-year-old Software Engineering
student from the Netherlands. Software development is a huge part of my life: I write a
lot of (weird) programs to scratch my own itch, and most software I create
is <a href="https://github.com/hugot">open source</a> by default. I also run a one-man
company that provides some IT services on the side.
</p>
<p>
Between working on projects and studying, I like to watch movies &amp; series, listen to music
&amp; podcasts, ride my road bike and go on hikes.
</p>
<h2>
What kind of blog is this?
</h2>
<p>
Because I&#39;m quite new to this and I want to keep myself interested, I won&#39;t be
limiting myself to a single topic. You can expect me to post about whatever
happens to interest, annoy or excite me at any given moment.
</p>
<p>
May my posts be interesting and my posting schedule be consistent 🤓🖖
</p>
<p>
I hope to see you around! - Hugo
</p></description><pubDate>Sat, 08 Feb 2020 09:30:06 +0100</pubDate><guid isPermaLink="false"> Introduction MzYzMzkyNDgwOCA5MDcK</guid>
</item>
</channel>
</rss>