Jekyll and the Plain Text Corpus

Writing with Continuous Delivery

(*0*;)

A blog post about writing blog posts - frictionless blogging with Jekyll, Dropbox and plain text

Having set up a simple static blog with Jekyll, I wanted to minimise the time required to publish changes. Every time I save this file, this page gets updated.

The process takes around ten seconds, and it is a marked improvement over traditional posting. In fact it doesn’t feel like posting at all; it feels like writing notes which happen to be publicly visible - traditional ‘blogging’ is dead for me.

An advantage this has over blogging with GitHub Pages is that it does not require Git commits to mediate the process.

This workflow suits an incremental approach to writing: limited time and connectivity mean I can only add a sentence here and there. Over time, the post gains material and succinctness through frequent refinement. I doodle on trains, on park benches, at a coffee shop, during a work break. I don’t have to set aside a chunk of the day to write; I steal time when I can and collect my thoughts.

Another way to describe this way of writing could be ‘accretion’:

Accretion describes the way an oyster makes a pearl, by gradually adding small amounts of calcium carbonate … Other words closely related to accretion are “incremental”, “adaptive”, and “evolutionary” … You first make the simplest possible version of the system that will run … You add a little code at a time until you have a fully working system. - Steve McConnell

I find this easy to maintain, and the data is portable, as it is simply a folder of text files. If you don’t believe me, look at the plain text. All that is required is an Ubuntu box on DigitalOcean, a headless instance of Dropbox on the server and some simple bash scripts.

Here are the tools I am using for this workflow:

Setting up headless Dropbox

Install the official Dropbox client

The 64-bit Dropbox client for Ubuntu can be obtained with:

cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf - && ~/.dropbox-dist/dropboxd

Once installed, we can selectively sync folders by adding everything except the folders of interest to the exclude list. The initial sync can take a while.
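A hedged sketch of the exclusion step, using the `exclude` subcommand of the dropbox.py helper (assumption: the helper is on the PATH as `dropbox`). Run from the Dropbox root, it prints the exclude command for every top-level folder except the one you want to keep syncing:

```shell
# Dry-run: print an exclude command for each top-level folder except $1.
# Remove the leading `echo` to actually apply the exclusions.
exclude_all_but() {
  keep="$1"
  for d in */ ; do
    d=${d%/}
    [ "$d" = "$keep" ] && continue
    echo dropbox exclude add "$d"
  done
}
```

Piping the output through `sh` (after checking it) applies the exclusions in one go.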

Setup the server watch process

The watch process monitors my text file folder and publishes only those entries whose filenames begin with a time stamp.
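A minimal sketch of the publish step (the folder paths and function name are assumptions, not the original script): copy any note whose filename begins with a YYYY-MM-DD stamp into the Jekyll _posts folder.

```shell
NOTES="${NOTES:-$HOME/Dropbox/notes}"   # synced notes folder (assumed path)
POSTS="${POSTS:-$HOME/blog/_posts}"     # Jekyll posts folder (assumed path)

publish_stamped_notes() {
  for f in "$NOTES"/*.md; do
    [ -e "$f" ] || continue
    base=$(basename "$f")
    case "$base" in
      # only filenames beginning with a time stamp get published
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*)
        cp "$f" "$POSTS/$base" ;;
    esac
  done
}
```

In practice this would run under inotifywait or a sleep loop, followed by `jekyll build`.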

Writing Posts

Each text file needs Jekyll front matter, as specified in the Jekyll docs. Ordinarily this is inserted quickly with a TextExpander snippet.

The rest of the post is standard Markdown.
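A minimal front matter block might look like this (the values are illustrative; the exact keys depend on your configuration):

```yaml
---
layout: post
title: "Jekyll and the Plain Text Corpus"
date: 2016-03-24 23:47
---
```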

Operations

As with any personal information system, it is useful to be able to draw relationships between notes, as we can with hypertext and wikis, but using plain text. Over time the amount of text accumulates, and relationships can be drawn between disparate bits of text.

Fortunately, nvALT supports double-bracket notation for linking to other notes. Creating note references has never been so fast - the following Perl one-liner can assist with converting those nvALT links to Jekyll post links:

Consolidation of all the bits of text occurs at the end using Ulysses which helps with scaffolding together lots of plain text files into a structured whole. This post was originally a series of posts on the topic of Jekyll which have now been consolidated into a single one.

nvALT wiki preview

2016-03-24 11:47 pm

Perl substitution to convert double bracket links to working post_urls in jekyll

Now this has to be run as a post-build step on every item in the _posts directory.

Uptime Monitoring

In order to receive real-time event information, we can set up alerts and process restarts on the server.

Email Notifications

The Dropbox-triggered Jekyll publish sometimes fails due to parse errors, and I won’t know until I notice the publisher has stopped working.

The following if/else sendmail script emails an alert if the build fails.

if jekyll build; then
    echo "Subject: blog updated" | sendmail myemail
else
    echo "Subject: jekyll build failed" | sendmail myemail
fi

We can also access the process exit status with $?.
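The same check written with $? explicitly, as a function that takes the build command as its arguments and echoes the subject line that would be piped to sendmail (the function name is my own):

```shell
run_and_report() {
  "$@"                       # run the given build command
  status=$?                  # exit status of the last command
  if [ "$status" -eq 0 ]; then
    echo "Subject: blog updated"
  else
    echo "Subject: jekyll build failed"
  fi
}
```

Usage would be something like `run_and_report jekyll build | sendmail myemail`.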

( ´∀`)

The following one-liner can print all links on the main page of this blog:

for i in `curl shedali.co.uk | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq`; do echo http://www.shedali.co.uk$i; done

That provides a list of URLs on stdout. Now we can curl each of these links in a loop and grep the response code:

for i in `curl shedali.co.uk | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq`; do sh isup.sh http://www.shedali.co.uk$i; done
#!/bin/bash
# isup.sh - report whether a URL responds with HTTP 200
CURL=$(curl -s --head "$1")

if echo "$CURL" | grep -q " 200"
then
    echo "The HTTP server on $1 is up!"
else
    MESSAGE="This is an alert that your site $1 has failed to respond 200 OK."
    echo "$MESSAGE"
fi

UPDATE: @maliciousmind shared this elegant Ruby gist, which crawls the entire site.

Another gem is the link-checker gem by Ryan Porter.

Now - running the following script

bundle exec ruby status_crawler.rb http://www.shedali.co.uk

provides the following output:

On page: http://www.shedali.co.uk/contact
Broken link (999) to http://www.linkedin.com/in/sittampalam

On page: http://www.shedali.co.uk/tools/2014/03/05/screen-recording-tools/
Broken link (404) to https://www.dropbox.com/s/57vs1l9fic4l3db/record%20screen.alfredworkflow

I don’t know why I am receiving a 999 response code from LinkedIn (it appears to be how they answer automated requests), but thanks to the script I was able to find four broken links.

It would also be useful to catch broken links at blog generation time, during the Jekyll build.
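A rough sketch of such a build-time check (the function name and folder layout are assumptions): after `jekyll build`, scan the generated _site for root-relative hrefs and confirm each one resolves to a file or an index.html.

```shell
check_internal_links() {
  site="${1:-_site}"
  fail=0
  # note: assumes hrefs contain no spaces
  for href in $(grep -rhoE 'href="/[^"#]*"' "$site" | cut -d'"' -f2 | sort -u); do
    path="$site$href"
    if [ ! -e "$path" ] && [ ! -e "${path%/}/index.html" ]; then
      echo "Broken internal link: $href"
      fail=1
    fi
  done
  return $fail
}
```

External links would still need the curl loop above; this only covers links within the site itself.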

SEO

Sites.txt

Generate a list of every post on the blog with this txt file, which can be used as the STDIN of a crawling script.


---
layout: nil
---
{% for post in site.posts %}http://www.shedali.co.uk{{ post.url }}
{% endfor %}
{% for page in site.pages %}http://www.shedali.co.uk{{ page.url }}
{% endfor %}

Markdown Extras

Performance Optimisation

By using PageSpeed Insights or similar metrics, and by taking regular measurements from site inception through development cycles, we can pinpoint bottlenecks and load issues. [1]

I am tracking the steps used to ensure this site trends towards 100/100 on PageSpeed Insights.

PageSpeed Insights

2014-08-25-17-42

PageSpeed Insights is available as an online tool, a browser extension and a Node module.

General Optimisation

A few steps I have taken to keep this site optimised at this early stage include:

Automating optimisation metrics

Using Gulp and PSI, the following report can be generated (see sample code by Addy Osmani here):

----------------------------------------------------------------

Number Resources                                 | 8
Number Hosts                                     | 4
Total Request Bytes                              | 1871
Number Static Resources                          | 2
Html Response Bytes                              | 26930
Css Response Bytes                               | 1306
Image Response Bytes                             | 415
Javascript Response Bytes                        | 25556
Other Response Bytes                             | 798
Number Js Resources                              | 1
Number Css Resources                             | 1

----------------------------------------------------------------

Avoid Landing Page Redirects                     | 0
Enable Gzip Compression                          | 0
Leverage Browser Caching                         | 0.5
Main Resource Server Response Time               | 0
Minify Css                                       | 0
Minify HTML                                      | 0.06
Minify Java Script                               | 0
Minimize Render Blocking Resources               | 6
Optimize Images                                  | 0
Prioritize Visible Content                       | 0

----------------------------------------------------------------

The output reveals render-blocking resources above the fold which need to be addressed, in particular the non-asynchronous load of a Google font.

The other minor issue is that Google’s analytics.js does not specify content expiry, but this is by design.

- 2015-03-04 - have kept a local version of Google Analytics and will set up a cron job to pull in the latest frequently.
- 2014-10-18 - see the Navigation Timing API.
- 2015-02-27 - Desktop 90/100, Mobile 70/100.
- 2015-02-27 - switching to Cloudflare.
- 2015-02-04 - inline above-the-fold CSS with automation tooling; see critical.
- 2015-03-01 - current PageSpeed score 96/100 Mobile, 98/100 Desktop.
- 2015-03-01 11:57 pm - use jekyll compress layout to minify the HTML: http://jch.penibelst.de/
- 2015-03-02 12:28 am - site homepage is now 100/100 optimised. Google’s homepage is only 78/100.

Using Appcache

The application cache permits offline access to all blog pages once the visitor hits the main page.

Creating App Cache

Create a new file called manifest.appcache (the extension can be anything) and paste the following contents:

---
---
CACHE MANIFEST

# rev

CACHE:
{% asset_path all.css %}
{% for page in site.pages %}{{ page.url }}
{% endfor %}
{% for item in site.images %}{{ item }}
{% endfor %}
{% for item in site.scripts %}{{ item }}
{% endfor %}

NETWORK:
*
http://*
https://*

Then on the main index.html add the following attribute to the html tag:

<html manifest="http://shedali.co.uk/manifest.appcache">

This has to be served with the correct MIME type, which in the case of Node Express can be added with:

express.static.mime.define({'text/cache-manifest': ['appcache']});

You can check what has been cached by navigating to chrome://appcache-internals/

Cache Invalidation

The site utilises the appcache to work offline; appcache is now deprecated in favour of service workers - more on that later. For now, the following does the trick to invalidate the appcache on every update.

# name the manifest copy after a hash of the current time
name=$(date +%Y-%m-%d_%H.%M.%S | md5sum | cut -d " " -f 1)
cp manifest.appcache "appcache/$name.appcache"
# point the manifest attribute at the freshly named copy
perl -pi -e "s|manifest=\"[^\"]*\"|manifest=\"appcache/$name.appcache\"|" _layouts/default.html

Embedding Media

Being able to quickly reference a YouTube video at a specific point in its timeline aids the expression and transfer of ideas. I have been keeping references to lectures, news items and quotes on YouTube with timecode links, in the same way that I catalogue other hyperlinked assets. These links launch with the playhead positioned at the most relevant point. YouTube links are mature and ubiquitous enough to be used as bibliography entries alongside other web media types.

This post seeks to consider and improve the flow of timecode-linked posts.

Timecode Format

There are three ways to link to a specific timecode, by appending one of:

- #t=54 (seconds)
- #t=10m5s (minutes and seconds)
- ?start=605 (seconds)

Linking to a specific point can be achieved by right-clicking on the video and copying the URL at the current time.
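Those fragment forms can also be assembled with a tiny helper (the function name is hypothetical) from a video id and an offset in seconds:

```shell
# Build a time-coded YouTube watch URL using the #t= fragment form.
yt_at() {
  printf 'https://www.youtube.com/watch?v=%s#t=%s\n' "$1" "$2"
}
```

For example, `yt_at 0JCRXKiyDfE 87` prints a link that starts playback at 87 seconds.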

Furthermore, when embedding a video into a page, one is able to specify both start and end times using parameters of the form ?start=87&end=190.

The following URL with start and end parameters is a time-limited embed: https://www.youtube.com/embed/0JCRXKiyDfE?start=87&end=190. It links to a full-screen, time-coded YouTube video, or can be used as an iframe src, which would output a player as follows:

<div class="embed-container widescreen">
<iframe width="auto" height="auto" src="http://www.youtube.com/embed/0JCRXKiyDfE?start=87&end=190" frameborder="0" class="youtube-player" type="text/html"></iframe>
</div>

It is possible to get the timecode using the YouTube JS API (see the YouTube developer documentation) - something like the following could issue a prompt with the current timecode ready to be copied:

prompt('copy', document.location.href+'#t='+document.getElementById("movie_player").getCurrentTime());

This can be wrapped in a bookmarklet; drag the following to test it on YouTube.

bookmarklet

Images

Having a Dropbox client on the web server provides at least two additional benefits:

  1. any Dropbox item can be symlinked into the web server root
  2. collaborative photo folders can be used as a gallery source.

I am exploring options for quickly sharing photo sets on the go:

2014-04-05-17-41

I tried github/avillafiorita/gallery, which generates galleries automatically as part of the Jekyll build process, but it requires running ./create_gallery.rb [gallery-name] for each individual gallery. This in turn generates a set of thumbnails and an index.textile file, so the gallery displays like any other post.

This is redundant when Dropbox and Photo Stream do a perfectly adequate job of displaying photos in their respective lightboxes. However, those sites are sometimes blocked behind corporate proxies, and they do not permit gallery embedding either.

Jekyll Gallery plugin output - 2015-02-11-miniature-me

Dropbox Inline Images

You can generate a Dropbox share link that allows embedding images: just add ?raw=1 to the end of the URL when you copy the Dropbox link.
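Share links copied from the Dropbox client typically end in ?dl=0, so a small helper can swap that for raw=1 (the function name and example URL are purely illustrative):

```shell
# Turn a copied Dropbox share link into a directly embeddable image URL.
to_raw() {
  printf '%s\n' "$1" | sed 's/dl=0$/raw=1/'
}
```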

This also means it is really fast to go from screenshot to Jekyll post, as the Dropbox desktop client can optionally upload screenshots ready for embedding, with the URL on the clipboard.

Mobile workflow

Post creation works like this: TextExpander on the iPhone generates the Jekyll scaffold. Then comes a two-step Drafts action: the first step copies the draft to the clipboard, and the second uses the following syntax to send the text to Launch Center, which posts it to the specified file name in Dropbox:

launch://x-callback-url/dropbox/new?text=&path=&name=

Alternatives to Jekyll

I thought I would give Hugo a try as an alternative to Jekyll. It would save me from Ruby and gem environment concerns, as it only requires a single binary to run. It is also meant to be much faster and more efficient, which is attractive, since constant Jekyll compilation on file updates is costing me CPU time.

The adventure ended abruptly when I discovered that Hugo doesn’t support symbolic links.

The Dropbox publish flow relies on symbolic links to connect my Dropbox notes folder with the blog posts folder, so that’s a non-starter. Looks like Go’s file walking doesn’t follow symlinks at all…

  [1] https://developers.google.com/speed/docs/best-practices/rules_intro
