Blog to Ebook Conversion

This is one way to make an ebook out of your blog. The details depend heavily on the markup and the style sheet (CSS) class names your blog template uses, so expect to make a number of adjustments before you get usable output.

Here we assume that you want to extract the date, the title, and the text body of each article, and that the HTML is laid out as follows.

<h2 class='date-header'>
<span>
31/01/2014
</span>
</h2>
...
<h3 class='post-title entry-title'>
<a href='http://yourFBblog.com/2014/01/QRP_and_QRS.html'>
QRP and QRS
</a>
</h3>
...
<div class='post-body entry-content'>
Sometimes you may wish to get all the titles of your previous articles in your or someone else’s blog.
</div>
<div class='post-footer'>
...

A short awk program such as the following will then do the extraction:

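BEGIN {
 # Sentinel values, so that NR never equals saved+2 before a tag is seen
 nrsave_date = -999
 nrsave_title = -999
}

# Print every line from the start of the post body to the post footer
/post-body entry-content/, /post-footer/ { print }

{
 # Remember the line numbers of the date and title opening tags
 if (/<h2 class='date-header'>/) nrsave_date = NR
 # No closing ">" here, so extra attributes such as itemprop='name' still match
 if (/<h3 class='post-title entry-title'/) nrsave_title = NR
 # The date and title text sit two lines below their opening tags
 if (NR == nrsave_date + 2) print "<h1>" $0 "</h1>"
 if (NR == nrsave_title + 2) print "<h2>" $0 "</h2>"
}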

Once you remove the unnecessary tags (the range pattern also prints the enclosing <div> lines), the output will look something like this:

<h1>31/01/2014</h1>
<h2>QRP and QRS</h2>
I love QRP operations, but some people ...
<h1>20/02/2014</h1>
<h2>Snow on my ANT</h2>
Last night, we had a heavy snow...
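
The tag removal can also be folded into the awk program itself. The following is only a sketch under some assumptions: the file names sample.html and extract.awk are made up for illustration, the markup is the same as in the sample above, and each tag is assumed to fit on a single line. Note that the gsub call strips every tag inside the body, including links and line breaks, so you may want to be more selective for your own blog.

```shell
#!/bin/sh
# Hypothetical input file that mirrors the sample markup shown earlier.
cat > sample.html <<'HTML'
<h2 class='date-header'>
<span>
31/01/2014
</span>
</h2>
<h3 class='post-title entry-title'>
<a href='http://yourFBblog.com/2014/01/QRP_and_QRS.html'>
QRP and QRS
</a>
</h3>
<div class='post-body entry-content'>
I love QRP operations.
</div>
<div class='post-footer'>
HTML

# Hypothetical awk script: same idea as the program above, plus tag stripping.
cat > extract.awk <<'AWK'
BEGIN { date_line = -999; title_line = -999 }

# Remember where the date and title opening tags are
/<h2 class='date-header'>/           { date_line = NR }
/<h3 class='post-title entry-title'/ { title_line = NR }

# The text itself sits two lines below each opening tag
NR == date_line + 2  { print "<h1>" $0 "</h1>"; next }
NR == title_line + 2 { print "<h2>" $0 "</h2>"; next }

# Inside the post body: drop all tags, then skip lines left empty
/post-body entry-content/, /post-footer/ {
    gsub(/<[^>]*>/, "")
    if ($0 != "") print
}
AWK

awk -f extract.awk sample.html
```

With the sample input above, this prints the date as an <h1> line, the title as an <h2> line, and then the body text without the surrounding <div> tags. To keep the result, redirect it to a file, for example by adding "> book.html" to the last command.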

Output of this form can be loaded into an ebook editor such as Sigil, which lets you turn it into your own ebook with little further work.