diff --git a/blog.html b/blog.html index 0e40ce3..ffc5fde 100644 --- a/blog.html +++ b/blog.html @@ -25,23 +25,155 @@

Blog

-

How To Use Your Email Client For Physical Mail

Mon 17 Feb 2020 11:55:42 AM CET

Whether it's to re-read a conversation, find a plane ticket I ordered or check -when a meeting was planned, I often find myself looking up old emails. It's -usually easy to do so because email clients are designed for the task: Many of -them support full-text search and some even complement that with neat tagging -and categorization systems. To be honest I have become completely dependent on ... Continue reading

+

How To Use Your Email Client For Physical Mail

Mon 17 Feb 2020 11:55:42 AM CET

<article> + <h1> + How To Use Your Email Client For Physical Mail + </h1> + <p> + Whether it&#39;s to re-read a conversation, find a plane ticket I ordered or + check when a meeting was planned, I often find myself looking up old + emails. It&#39;s usually easy to do so because email clients are designed for + the task: Many of them support full-text search and some even complement + that with neat tagging and categorization systems. To be honest I have + become completely dependent on those features for my day to day + life. Having full-text search and some sort of categorization for email + can be a huge time saver. When it comes to physical mail however, I still + have to browse through stacks of paper to (hopefully) find what I&#39;m + looking for. I figured that it&#39;d be nice to use my fancy email client to + deal with physical mail as well, so I found a way to do just that. Turns + out it&#39;s pretty simple! + </p> + <p> + The main objective here is to transform our physical mail into an email + that can be received, indexed and read by our email client of choice. Now, + one way to do that would be to type the contents of our mail into an email + by hand, but + <i> + ain&#39;t nobody got time for that! + </i> + . The (more appealing) + alternative is to use a document scanner. I have a single purpose scanner + unit from Canon that I hook up to my laptop for just this purpose. + </p> + <p> + It isn&#39;t as simple as just emailing a scanned document to ourselves + though: email clients are smart, but they can&#39;t understand a word of text + in our PDF or JPEG of a physical document. They need content to be in + plain text form in order to provide us with some of their best features + like full-text search. We&#39;ll have to somehow transform our scanned + documents into plain text that we can include in our email. To do this, we + can use tesseract. 
Tesseract is an optical character recognition (OCR) + engine, meaning that it can recognize text in images and extract it for + us. Installing it should be easy on Debian derivative distros like + Ubuntu. My laptop is running Debian unstable so I just ran + <code> + apt + install tesseract-ocr + </code> + and started using it. Using it is as easy as + opening up a terminal and typing + <code> + tesseract FILE.jpg + OUTPUT + </code> + . That command will save all the text that tesseract is able + to recognize in the image FILE.jpg to a file called OUTPUT.txt. + </p> + <aside> + <i> + Side note: I am Dutch, so most of my physical mail is in Dutch. To + make tesseract better understand my mail I installed the + tesseract-ocr-nld package using + <code> + apt install + tesseract-ocr-nld + </code> + . You can check what other language packs are + available by using + <code> + apt search tesseract-ocr + </code> + . + </i> + </aside> + <p> + All we have to do from there is copy-paste the contents of that file into + an email and send it to ourselves! Depending on the formatting of the + input document, the output may not always be pleasant to read. We can + account for this by including the original document as an attachment to + the email. That way we get the best of both worlds: we can use the search + functionality of our email client to find the document, and then read it + in its original form by opening the attachment. + </p> + <p> + This is all easy enough, but I&#39;m lazy. I didn&#39;t feel like opening up my + email client and doing manual copy-pasting, so I decided to automate the + process a little further. I have postfix set up on my system to relay to my + mail server, so I can simply use the + <code> + mail + </code> + command to send + emails without a GUI mail client. I combined that with tesseract in a + little bash script. The script iterates through all of its arguments and + interprets them as filenames of scanned documents. 
It calls tesseract to + extract text from them, concatenates the results, attaches the files to an + email and sends it to my personal email address. Now all I have to do is + run the script with filenames of some documents and my job is done. If + anyone is interested in an actual program that does the same thing and + doesn&#39;t require you to set up postfix, let me know! I might consider + authoring one if it&#39;s useful to more people than just myself. The script + I&#39;m currently using can be found + <a href="scan-to-mailpile.bash.html"> + here + (pretty) + </a> + and + <a href="scan-to-mailpile.bash"> + here (raw) + </a> + , but I + don&#39;t recommend using it if you don&#39;t fully understand its contents; it&#39;s + not a polished user experience 🤓. + </p> +</article> ... Continue reading
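The workflow the post describes (OCR every scan, concatenate the text, attach the originals, send one email) could look roughly like the sketch below. This is not the linked scan-to-mailpile.bash script; the function names `ocr_all` and `send_scans` and the `OCR_CMD` / `MAIL_CMD` override variables are made up for illustration. It assumes a `mail` implementation that accepts `-A FILE` for attachments, as GNU Mailutils does; other implementations use a different flag.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the scan-to-email workflow from the post.
# NOT the author's script; all names here are illustrative assumptions.

# Print the recognized text of every file, each preceded by a small header.
# Passing "stdout" as the output base makes tesseract print the text
# instead of writing OUTPUT.txt. The body runs in a subshell so that
# "exit" only leaves this function.
ocr_all() (
    for file in "$@"; do
        printf '=== %s ===\n' "$file"
        "${OCR_CMD:-tesseract}" "$file" stdout || exit 1
    done
)

# Usage: send_scans RECIPIENT FILE...
# Mails the concatenated OCR text with every scan attached.
send_scans() (
    [ "$#" -ge 2 ] || { echo 'usage: send_scans RECIPIENT FILE...' >&2; exit 2; }
    recipient=$1
    shift

    body=$(ocr_all "$@") || exit 1
    subject="Scanned mail: $*"

    # Rewrite the argument list from "f1 f2" into "-A f1 -A f2".
    # Assumption: -A is the attachment flag (true for GNU Mailutils).
    for file in "$@"; do
        set -- "$@" -A "$file"
        shift
    done

    printf '%s\n' "$body" | "${MAIL_CMD:-mail}" -s "$subject" "$@" "$recipient"
)
```

With a working postfix relay, `send_scans me@example.org letter-page1.jpg letter-page2.jpg` would then send both pages as one searchable email (the address and file names are placeholders). For Dutch mail, the tesseract call would additionally need `-l nld`.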

-

Creating a Simple Static Blog

Sat 08 Feb 2020 12:14:16 PM CET

I love personal websites. It's amazing that people can share content with the -entire world just by writing some text and throwing it behind a web server. I -wanted to know what that is like, so I set out to create a personal website of -my own. As you can see I succeeded in doing so, but getting here wasn't as -straight forward as I initially thought it would be. I thought that, being a ... Continue reading

+

Creating a Simple Static Blog

Sat 08 Feb 2020 12:14:16 PM CET

... Continue reading

-

Introduction

Sat 08 Feb 2020 09:30:06 AM CET

Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software -Engineering student from the Netherlands. Software development is a huge part -of my life, I write a lot of (weird) programs to scratch my own itch and most -software I create is open_source by default. I also run a one-man company that -provides some IT services on the side. ... Continue reading

+

Introduction

Sat 08 Feb 2020 09:30:06 AM CET

<article> + <h1> + Introduction + </h1> + <p> + Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software Engineering + student from the Netherlands. Software development is a huge part of my life, I write a + lot of (weird) programs to scratch my own itch and most software I create + is + <a href="https://github.com/hugot"> + open source + </a> + by default. I also run a one-man + company that provides some IT services on the side. + </p> + <p> + Between working on projects and studying I like to watch movies &amp; series, listen to music + &amp; podcasts, ride my road bike and take hikes. + </p> + <h2> + What kind of blog is this? + </h2> + <p> + Because I&#39;m quite new to this and I want to keep myself interested, I won&#39;t be + limiting myself to a single topic. You can expect me to post about a variety of topics + that may interest/annoy/excite me at any given moment. + </p> + <p> + May my posts be interesting and my posting schedule be consistent 🤓🖖 + </p> + <p> + I hope to see you around! - Hugo + </p> +</article> ... Continue reading


diff --git a/feed.xml b/feed.xml index 8c09096..5f4c7d4 100644 --- a/feed.xml +++ b/feed.xml @@ -5,29 +5,161 @@ https://hugot.nl/blog.html Hugo's personal blog en-us - wo 15 apr 2020 15:32:13 CEST - wo 15 apr 2020 15:32:13 CEST + do 16 apr 2020 7:43:28 CEST + do 16 apr 2020 7:43:28 CEST http://blogs.law.harvard.edu/tech/rss Hugo's Custom Bash Script social@hugot.nl infra@hugot.nl - How To Use Your Email Client For Physical Mail https://hugot.nl/posts/use-your-mail-client-for-physical-mail/index.htmlWhether it's to re-read a conversation, find a plane ticket I ordered or check -when a meeting was planned, I often find myself looking up old emails. It's -usually easy to do so because email clients are designed for the task: Many of -them support full-text search and some even complement that with neat tagging -and categorization systems. To be honest I have become completely dependent onMon 17 Feb 2020 11:55:42 AM CET How To Use Your Email Client For Physical Mail NDc2MDg1MjYxIDQxODUK + How To Use Your Email Client For Physical Mail https://hugot.nl/posts/use-your-mail-client-for-physical-mail/index.html<article> + <h1> + How To Use Your Email Client For Physical Mail + </h1> + <p> + Whether it&#39;s to re-read a conversation, find a plane ticket I ordered or + check when a meeting was planned, I often find myself looking up old + emails. It&#39;s usually easy to do so because email clients are designed for + the task: Many of them support full-text search and some even complement + that with neat tagging and categorization systems. To be honest I have + become completely dependent on those features for my day to day + life. Having full-text search and some sort of categorization for email + can be a huge time saver. When it comes to physical mail however, I still + have to browse through stacks of paper to (hopefully) find what I&#39;m + looking for. 
I figured that it&#39;d be nice to use my fancy email client to + deal with physical mail as well, so I found a way to do just that. Turns + out it&#39;s pretty simple! + </p> + <p> + The main objective here is to transform our physical mail into an email + that can be received, indexed and read by our email client of choice. Now, + one way to do that would be to type the contents of our mail into an email + by hand, but + <i> + ain&#39;t nobody got time for that! + </i> + . The (more appealing) + alternative is to use a document scanner. I have a single purpose scanner + unit from Canon that I hook up to my laptop for just this purpose. + </p> + <p> + It isn&#39;t as simple as just emailing a scanned document to ourselves + though: email clients are smart, but they can&#39;t understand a word of text + in our PDF or JPEG of a physical document. They need content to be in + plain text form in order to provide us with some of their best features + like full-text search. We&#39;ll have to somehow transform our scanned + documents into plain text that we can include in our email. To do this, we + can use tesseract. Tesseract is an optical character recognition (OCR) + engine, meaning that it can recognize text in images and extract it for + us. Installing it should be easy on Debian derivative distros like + Ubuntu. My laptop is running Debian unstable so I just ran + <code> + apt + install tesseract-ocr + </code> + and started using it. Using it is as easy as + opening up a terminal and typing + <code> + tesseract FILE.jpg + OUTPUT + </code> + . That command will save all the text that tesseract is able + to recognize in the image FILE.jpg to a file called OUTPUT.txt. + </p> + <aside> + <i> + Side note: I am Dutch, so most of my physical mail is in Dutch. To + make tesseract better understand my mail I installed the + tesseract-ocr-nld package using + <code> + apt install + tesseract-ocr-nld + </code> + . 
You can check what other language packs are + available by using + <code> + apt search tesseract-ocr + </code> + . + </i> + </aside> + <p> + All we have to do from there is copy-paste the contents of that file into + an email and send it to ourselves! Depending on the formatting of the + input document, the output may not always be pleasant to read. We can + account for this by including the original document as an attachment to + the email. That way we get the best of both worlds: we can use the search + functionality of our email client to find the document, and then read it + in its original form by opening the attachment. + </p> + <p> + This is all easy enough, but I&#39;m lazy. I didn&#39;t feel like opening up my + email client and doing manual copy-pasting, so I decided to automate the + process a little further. I have postfix set up on my system to relay to my + mail server, so I can simply use the + <code> + mail + </code> + command to send + emails without a GUI mail client. I combined that with tesseract in a + little bash script. The script iterates through all of its arguments and + interprets them as filenames of scanned documents. It calls tesseract to + extract text from them, concatenates the results, attaches the files to an + email and sends it to my personal email address. Now all I have to do is + run the script with filenames of some documents and my job is done. If + anyone is interested in an actual program that does the same thing and + doesn&#39;t require you to set up postfix, let me know! I might consider + authoring one if it&#39;s useful to more people than just myself. The script + I&#39;m currently using can be found + <a href="scan-to-mailpile.bash.html"> + here + (pretty) + </a> + and + <a href="scan-to-mailpile.bash"> + here (raw) + </a> + , but I + don&#39;t recommend using it if you don&#39;t fully understand its contents; it&#39;s + not a polished user experience 🤓. 
+ </p> +</article>Mon 17 Feb 2020 11:55:42 AM CET How To Use Your Email Client For Physical Mail NDc2MDg1MjYxIDQxODUK - Creating a Simple Static Blog https://hugot.nl/posts/simple-static-blog/index.htmlI love personal websites. It's amazing that people can share content with the -entire world just by writing some text and throwing it behind a web server. I -wanted to know what that is like, so I set out to create a personal website of -my own. As you can see I succeeded in doing so, but getting here wasn't as -straight forward as I initially thought it would be. I thought that, being aSat 08 Feb 2020 12:14:16 PM CET Creating a Simple Static Blog MjU5OTIyNDIwMyA2MTI5Cg== + Creating a Simple Static Blog https://hugot.nl/posts/simple-static-blog/index.htmlSat 08 Feb 2020 12:14:16 PM CET Creating a Simple Static Blog MjU5OTIyNDIwMyA2MTI5Cg== - Introduction https://hugot.nl/posts/introduction/index.htmlHello, welcome to my blog! My name is Hugo. I am a 22 year old Software -Engineering student from the Netherlands. Software development is a huge part -of my life, I write a lot of (weird) programs to scratch my own itch and most -software I create is open_source by default. I also run a one-man company that -provides some IT services on the side.Sat 08 Feb 2020 09:30:06 AM CET Introduction MzYzMzkyNDgwOCA5MDcK + Introduction https://hugot.nl/posts/introduction/index.html<article> + <h1> + Introduction + </h1> + <p> + Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software Engineering + student from the Netherlands. Software development is a huge part of my life, I write a + lot of (weird) programs to scratch my own itch and most software I create + is + <a href="https://github.com/hugot"> + open source + </a> + by default. I also run a one-man + company that provides some IT services on the side. + </p> + <p> + Between working on projects and studying I like to watch movies &amp; series, listen to music + &amp; podcasts, ride my road bike and take hikes. 
+ </p> + <h2> + What kind of blog is this? + </h2> + <p> + Because I&#39;m quite new to this and I want to keep myself interested, I won&#39;t be + limiting myself to a single topic. You can expect me to post about a variety of topics + that may interest/annoy/excite me at any given moment. + </p> + <p> + May my posts be interesting and my posting schedule be consistent 🤓🖖 + </p> + <p> + I hope to see you around! - Hugo + </p> +</article>Sat 08 Feb 2020 09:30:06 AM CET Introduction MzYzMzkyNDgwOCA5MDcK diff --git a/generate-blog.bash b/generate-blog.bash index a7520c0..d847b12 100755 --- a/generate-blog.bash +++ b/generate-blog.bash @@ -5,8 +5,9 @@ # page. # Check if required executables can be found -if ! type readlink dirname html2text mv cat cksum base64; then +if ! type readlink dirname html2text mv cat cksum base64 pup; then echo 'One or more required executables are not present. Generation cancelled' >&2 + echo 'Note: You can install pup with "go get github.com/ericchiang/pup"' >&2 exit 1 fi @@ -130,7 +131,10 @@ while read -r post_html; do title="$(tail -n +3 <<<"$text" | head -n 1 | tr -d '*')" || exit $? # Use the first 5 lines after the title as post excerpt. - excerpt="$(tail -n +4 <<<"$text" | head -n 5)" || exit $? + # excerpt="$(tail -n +4 <<<"$text" | head -n 5)" || exit $? + + # Include full post content + excerpt="$(pup article < "$post_html" | escape-html)" # Escape the post html file name to safely use it in the generated html. href="$(escape-html <<<"$post_html")" || exit $? diff --git a/posts/introduction/index.html b/posts/introduction/index.html index 745264b..c05e68e 100644 --- a/posts/introduction/index.html +++ b/posts/introduction/index.html @@ -8,30 +8,32 @@ Home -

Introduction

+
+

Introduction

-

- Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software Engineering - student from the Netherlands. Software development is a huge part of my life, I write a - lot of (weird) programs to scratch my own itch and most software I create - is open source by default. I also run a one-man - company that provides some IT services on the side. -

+

+ Hello, welcome to my blog! My name is Hugo. I am a 22 year old Software Engineering + student from the Netherlands. Software development is a huge part of my life, I write a + lot of (weird) programs to scratch my own itch and most software I create + is open source by default. I also run a one-man + company that provides some IT services on the side. +

-

- Between working on projects and studying I like to watch movies & series, listen to music - & podcasts, ride my road bike and take hikes. -

+

+ Between working on projects and studying I like to watch movies & series, listen to music + & podcasts, ride my road bike and take hikes. +

-

What kind of blog is this?

-

- Because I'm quite new to this and I want to keep myself interested, I won't be - limiting myself to a single topic. You can expect me to post about a variety of topics - that may interest/annoy/excite me at any given moment. -

+

What kind of blog is this?

+

+ Because I'm quite new to this and I want to keep myself interested, I won't be + limiting myself to a single topic. You can expect me to post about a variety of topics + that may interest/annoy/excite me at any given moment. +

-

May my posts be interesting and my posting schedule be consistent 🤓🖖

+

May my posts be interesting and my posting schedule be consistent 🤓🖖

-

I hope to see you around! - Hugo

+

I hope to see you around! - Hugo

+
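The generate-blog.bash hunk in this diff pipes the extracted `<article>` element through an `escape-html` helper whose definition is not part of the diff. As a rough idea of what such a helper has to do, here is a minimal sketch using sed. The function name `escape_html_sketch` is hypothetical and the real helper may well be implemented differently.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the escape-html helper used by generate-blog.bash.
# Reads text on stdin and writes it with HTML special characters escaped.
escape_html_sketch() {
    # The ampersand rule must come first, otherwise the ampersands
    # introduced by the later rules would be escaped a second time.
    # In a sed replacement a bare "&" means "the matched text", so a
    # literal ampersand is written as "\&".
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e 's/"/\&quot;/g' \
        -e "s/'/\&#39;/g"
}
```

Used the same way as in the hunk, that would read `excerpt="$(pup article < "$post_html" | escape_html_sketch)" || exit $?`, where the trailing `|| exit $?` mirrors the error handling on the neighbouring `title` and `href` lines.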