diff --git a/blog.html b/blog.html index 0e40ce3..ffc5fde 100644 --- a/blog.html +++ b/blog.html @@ -25,23 +25,155 @@
Whether it's to re-read a conversation, find a plane ticket I ordered or check -when a meeting was planned, I often find myself looking up old emails. It's -usually easy to do so because email clients are designed for the task: Many of -them support full-text search and some even complement that with neat tagging -and categorization systems. To be honest I have become completely dependent on ... Continue reading
+<article> + <h1> + How To Use Your Email Client For Physical Mail + </h1> + <p> + Whether it's to re-read a conversation, find a plane ticket I ordered or + check when a meeting was planned, I often find myself looking up old + emails. It's usually easy to do so because email clients are designed for + the task: many of them support full-text search and some even complement + that with neat tagging and categorization systems. To be honest, I have + become completely dependent on those features for my day-to-day + life. Having full-text search and some sort of categorization for email + can be a huge time saver. When it comes to physical mail, however, I still + have to browse through stacks of paper to (hopefully) find what I'm + looking for. I figured that it'd be nice to use my fancy email client to + deal with physical mail as well, so I found a way to do just that. Turns + out it's pretty simple! + </p> + <p> + The main objective here is to transform our physical mail into an email + that can be received, indexed and read by our email client of choice. Now, + one way to do that would be to type the contents of our mail into an email + by hand, but + <i> + ain't nobody got time for that! + </i> + The (more appealing) + alternative is to use a document scanner. I have a single-purpose scanner + unit from Canon that I hook up to my laptop for exactly this. + </p> + <p> + It isn't as simple as just emailing a scanned document to ourselves + though: email clients are smart, but they can't understand a word of text + in our PDF or JPEG of a physical document. They need content to be in + plain text form in order to provide us with some of their best features + like full-text search. We'll have to somehow transform our scanned + documents into plain text that we can include in our email. To do this, we + can use tesseract. Tesseract is an optical character recognition (OCR) + engine, meaning that it can recognize text in images and extract it for + us. 
Installing it should be easy on Debian-derived distros like + Ubuntu. My laptop is running Debian unstable so I just ran + <code> + apt + install tesseract-ocr + </code> + and started using it. Using it is as easy as + opening up a terminal and typing + <code> + tesseract FILE.jpg + OUTPUT + </code> + . That command will save all the text that tesseract is able + to recognize in the image FILE.jpg to a file called OUTPUT.txt. + </p> + <aside> + <i> + Side note: I am Dutch, so most of my physical mail is in Dutch. To + make tesseract better understand my mail I installed the + tesseract-ocr-nld package using + <code> + apt install + tesseract-ocr-nld + </code> + . You can check what other language packs are + available by using + <code> + apt search tesseract-ocr + </code> + . + </i> + </aside> + <p> + All we have to do from there is copy-paste the contents of that file into + an email and send it to ourselves! Depending on the formatting of the + input document, the output may not always be pleasant to read. We can + account for this by including the original document as an attachment to + the email. That way we get the best of both worlds: we can use the search + functionality of our email client to find the document, and then read it + in its original form by opening the attachment. + </p> + <p> + This is all easy enough, but I'm lazy. I didn't feel like opening up my + email client and doing manual copy-pasting, so I decided to automate the + process a little further. I have postfix set up on my system to relay to my + mail server, so I can simply use the + <code> + mail + </code> + command to send + emails without a GUI mail client. I combined that with tesseract in a + little bash script. The script iterates through all of its arguments and + interprets them as filenames of scanned documents. It calls tesseract to + extract text from them, concatenates the results, attaches the files to an + email and sends it to my personal email address. 
Now all I have to do is + run the script with filenames of some documents and my job is done. If + anyone is interested in an actual program that does the same thing and + doesn't require you to set up postfix, let me know! I might consider + authoring one if it's useful to more people than just myself. The script + I'm currently using can be found + <a href="scan-to-mailpile.bash.html"> + here + (pretty) + </a> + and + <a href="scan-to-mailpile.bash"> + here (raw) + </a> + , but I + don't recommend using it if you don't fully understand its contents: it's + not a polished user experience 🤓. + </p> +</article> ... Continue reading
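To give a feel for the steps described above (OCR every file, concatenate the text, attach the originals, send), here is a minimal sketch in POSIX shell. It is not the linked scan-to-mailpile.bash script; the function names, the `RECIPIENT` and `OCR_LANG` variables and the placeholder address are made up for illustration. It assumes that `tesseract FILE stdout` prints the recognized text to standard output and that `mail` is the GNU mailutils variant, where `-A` attaches a file. Other `mail` implementations use a different flag, so check yours before relying on this.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the author's scan-to-mailpile.bash.
# Assumptions: `tesseract FILE stdout` writes recognized text to stdout,
# and `mail` is GNU mailutils, where -A FILE attaches a file.

RECIPIENT="${RECIPIENT:-me@example.com}"   # placeholder address
OCR_LANG="${OCR_LANG:-eng}"                # e.g. nld for Dutch mail

# Print the OCR text of every scan, each under a header naming its file.
ocr_all() {
    for scan in "$@"; do
        printf '=== %s ===\n' "$scan"
        tesseract "$scan" stdout -l "$OCR_LANG" 2>/dev/null
    done
}

# Mail the combined text to $RECIPIENT with every scan attached.
scan_to_mail() {
    if [ "$#" -eq 0 ]; then
        echo "usage: scan_to_mail FILE..." >&2
        return 1
    fi
    body=$(ocr_all "$@")
    subject="Scanned mail: $1"
    # Rewrite the argument list from "f1 f2 ..." into "-A f1 -A f2 ...".
    for scan in "$@"; do
        set -- "$@" -A "$scan"
        shift
    done
    printf '%s\n' "$body" | mail -s "$subject" "$@" "$RECIPIENT"
}
```

After sourcing the file, something like `OCR_LANG=nld scan_to_mail letter-page1.jpg letter-page2.jpg` would send one email whose body holds the text of both pages and whose attachments are the two scans. Keeping the OCR step in its own function makes it easy to inspect the extracted text before anything is sent.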
I love personal websites. It's amazing that people can share content with the -entire world just by writing some text and throwing it behind a web server. I -wanted to know what that is like, so I set out to create a personal website of -my own. As you can see I succeeded in doing so, but getting here wasn't as -straight forward as I initially thought it would be. I thought that, being a ... Continue reading
+... Continue reading
Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software -Engineering student from the Netherlands. Software development is a huge part -of my life, I write a lot of (weird) programs to scratch my own itch and most -software I create is open_source by default. I also run a one-man company that -provides some IT services on the side. ... Continue reading
+<article> + <h1> + Introduction + </h1> + <p> + Hello, welcome to my blog! My name is Hugo. I am a 22-year-old Software Engineering + student from the Netherlands. Software development is a huge part of my life: I write a + lot of (weird) programs to scratch my own itch and most software I create + is + <a href="https://github.com/hugot"> + open source + </a> + by default. I also run a one-man + company that provides some IT services on the side. + </p> + <p> + Between working on projects and studying I like to watch movies & series, listen to music + & podcasts, ride my road bike and take hikes. + </p> + <h2> + What kind of blog is this? + </h2> + <p> + Because I'm quite new to this and I want to keep myself interested, I won't be + limiting myself to a single topic. You can expect me to post about a variety of topics + that may interest/annoy/excite me at any given moment. + </p> + <p> + May my posts be interesting and my posting schedule be consistent 🤓🖖 + </p> + <p> + I hope to see you around! - Hugo + </p> +</article> ... Continue reading