<h1>How To Use Your Email Client For Physical Mail</h1>
<p>
 Whether it's to re-read a conversation, find a plane ticket I ordered or
 check when a meeting was planned, I often find myself looking up old
 emails. It's usually easy to do so because email clients are designed for
 the task: many of them support full-text search and some even complement
 that with neat tagging and categorization systems. To be honest, I have
 become completely dependent on those features in my day-to-day
 life. Having full-text search and some sort of categorization for email
 can be a huge time saver. When it comes to physical mail, however, I still
 have to browse through stacks of paper to (hopefully) find what I'm
 looking for. I figured that it'd be nice to use my fancy email client to
 deal with physical mail as well, so I found a way to do just that. Turns
 out it's pretty simple!
</p>

<p>
 The main objective here is to transform our physical mail into an email
 that can be received, indexed and read by our email client of choice. Now,
 one way to do that would be to type the contents of our mail into an email
 by hand, but <i>ain't nobody got time for that!</i> The (more appealing)
 alternative is to use a document scanner. I have a single-purpose scanner
 unit from Canon that I hook up to my laptop for exactly this.
</p>

<p>
 It isn't as simple as just emailing a scanned document to ourselves
 though: email clients are smart, but they can't understand a word of text
 in our PDF or JPEG of a physical document. They need content to be in
 plain text form in order to provide us with some of their best features
 like full-text search. We'll have to somehow transform our scanned
 documents into plain text that we can include in our email. To do this, we
 can use tesseract. Tesseract is an optical character recognition (OCR)
 engine, meaning that it can recognize text in images and extract it for
 us. Installing it should be easy on Debian-derived distros like
 Ubuntu. My laptop is running Debian unstable so I just ran <code class="nowrap">apt
  install tesseract-ocr</code> and started using it. Using it is as easy as
 opening up a terminal and typing <code class="nowrap">tesseract FILE.jpg
  OUTPUT</code>. That command will save all the text that tesseract is able
 to recognize in the image FILE.jpg to a file called OUTPUT.txt.
</p>
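<p>
 As a minimal sketch of that step, the helper below wraps the command so
 that the text file lands next to the scan. The name
 <code>ocr_scan</code> is made up for illustration, and it assumes the
 filename has an extension such as .jpg.
</p>

```shell
# Hypothetical helper: OCR one scan and print the path of the resulting text file.
# Assumes the image name has an extension (letter.jpg, not just letter).
ocr_scan() {
  image=$1
  base=${image%.*}   # strip the extension: letter.jpg -> letter
  # tesseract appends ".txt" to the output base on its own
  tesseract "$image" "$base" >/dev/null 2>&1 || return 1
  printf '%s\n' "$base.txt"
}
```

<p>
 Running <code>ocr_scan letter.jpg</code> should leave the recognized text
 in letter.txt and print that filename.
</p>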

<aside>
 <i>
  Side note: I am Dutch, so most of my physical mail is in Dutch. To
  make tesseract better understand my mail I installed the
  tesseract-ocr-nld package using <code>apt install
   tesseract-ocr-nld</code>. You can check what other language packs are
  available by using <code>apt search tesseract-ocr</code>.
 </i>
</aside>
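<p>
 Installing a language pack is only half of it: tesseract defaults to
 English unless it is told otherwise with the <code>-l</code> flag. A
 small sketch, where the function name <code>ocr_text</code> and the Dutch
 default are just illustrative choices:
</p>

```shell
# Hypothetical helper: OCR a scan in a given language (default: Dutch)
# and print the text to the terminal instead of writing a file.
ocr_text() {
  image=$1
  lang=${2:-nld}
  # "stdout" as the output base makes tesseract print the text;
  # -l selects the language model (combine several with "+", e.g. nld+eng)
  tesseract "$image" stdout -l "$lang" 2>/dev/null
}
```

<p>
 <code>tesseract --list-langs</code> shows which language models are
 actually installed.
</p>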

<p>
 All we have to do from there is copy-paste the contents of that file into
 an email and send it to ourselves! Depending on the formatting of the
 input document, the output may not always be pleasant to read. We can
 account for this by including the original document as an attachment to
 the email. That way we get the best of both worlds: we can use the search
 functionality of our email client to find the document, and then read it
 in its original form by opening the attachment.
</p>
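<p>
 For those who would rather do this from a terminal, here is a rough
 sketch of the same step. It assumes the <code>mail</code> command from
 GNU Mailutils, where <code>-A</code> adds an attachment (other
 <code>mail</code> implementations use different flags), and the function
 name <code>mail_scan</code> is invented for the example.
</p>

```shell
# Hypothetical helper: mail the OCR text to ourselves with the scan attached.
# Assumes GNU Mailutils' mail, where -A FILE adds an attachment.
mail_scan() {
  recipient=$1
  text_file=$2
  scan=$3
  # The OCR text becomes the searchable body; the scan rides along as-is.
  mail -s "Scanned mail: $scan" -A "$scan" "$recipient" < "$text_file"
}
```

<p>
 For example: <code>mail_scan me@example.com OUTPUT.txt FILE.jpg</code>.
</p>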

<p>
 This is all easy enough, but I'm lazy. I didn't feel like opening up my
 email client and doing manual copy-pasting, so I decided to automate the
 process a little further. I have postfix set up on my system to relay to my
 mail server, so I can simply use the <code>mail</code> command to send
 emails without a GUI mail client. I combined that with tesseract in a
 little bash script. The script iterates through all of its arguments and
 interprets them as filenames of scanned documents. It calls tesseract to
 extract text from them, concatenates the results, attaches the files to an
 email and sends it to my personal email address. Now all I have to do is
 run the script with filenames of some documents and my job is done. If
 anyone is interested in an actual program that does the same thing and
 doesn't require you to set up postfix, let me know! I might consider
 writing one if it's useful to more people than just myself. The script
 I'm currently using can be found <a href="scan-to-mailpile.bash.html">here
  (pretty)</a> and <a href="scan-to-mailpile.bash">here (raw)</a>, but I
 don't recommend using it if you don't fully understand its contents: it's
 not a polished user experience 🤓.
</p>
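<p>
 To give an idea of the moving parts, here is a minimal sketch of such a
 script. It is not the script linked above: the function name
 <code>scan_to_mail</code>, the subject line and the separator lines are
 made up, and it assumes GNU Mailutils' <code>mail</code> (where
 <code>-A</code> attaches a file) plus a working local mail setup.
</p>

```shell
#!/bin/sh
# Hypothetical sketch, not the script linked above.
# Assumes GNU Mailutils' mail (-A FILE attaches a file) and a working local MTA.

scan_to_mail() {
  if [ "$#" -lt 2 ]; then
    echo "usage: scan_to_mail RECIPIENT SCAN..." >&2
    return 2
  fi
  recipient=$1
  shift
  subject="Scanned mail: $*"
  body=$(mktemp) || return 1

  for scan in "$@"; do
    # Append the recognized text of this scan to the shared message body.
    printf '=== %s ===\n' "$scan" >> "$body"
    tesseract "$scan" stdout >> "$body" 2>/dev/null
    # Rotate the argument list: drop one filename, append "-A filename".
    set -- "$@" -A "$scan"
    shift
  done

  # "$@" now reads: -A first.jpg -A second.jpg ...
  mail -s "$subject" "$@" "$recipient" < "$body"
  status=$?
  rm -f "$body"
  return "$status"
}
```

<p>
 After sourcing the file, <code>scan_to_mail me@example.com page1.jpg
 page2.jpg</code> should send one email with the text of both pages in the
 body and both scans attached. Rotating the argument list keeps the sketch
 in plain <code>sh</code>; in bash an array would do the same job.
</p>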

<!-- Local Variables: -->
<!-- sgml-basic-offset: 1 -->
<!-- End: -->