under the hood

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla eu dui tellus. Mauris nisi enim, posuere id laoreet eget, laoreet sagittis ante. Vivamus finibus arcu id metus molestie vehicula.

Just kidding. Always good to start off a page with a fresh batch of Lorem ipsum, especially on a tech page exuberantly bubbling with cheerful technobabble. Anyway, herein is even more technical stuff – everything you always wanted to know but not necessarily today. By the way, your basic desktop browser will be easier to use for this page, what with the hashes and copy-and-paste of code and whatnot.

all your hash are belong to us

For fun, we build a unique robot for each web page with information derived from that web page. The robot, then, is a simple image representing that web page. If the robot changes, the web page must have changed. If the web page changes, the robot changes. (Of course, the website is served with HTTPS, ensuring what is rendered in your browser is really what was intended.)

Robot graphics are generated by a local copy of the most excellent RoboHash code created by Colin Davis. See this write-up about RoboHash. The RoboHash code takes a string, any string, and constructs a robot based on that string. For our robots, the string is a hash (also called a “message digest”) generated with a cryptographic hash function from preprocessed web page markup (processed with our secret brew of interlocking code gears, pulleys, and steam pistons). So, before it is hashed, the web page markup (XHTML5 in this case) is flattened, reduced, and normalized down to a block of only alpha characters. There are several benefits of, and compelling arguments for, some sort of reduction before hashing which might be worth more paragraphs in the future, which will probably happen here thanks to the tight binding glue of several time machine algorithms.
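
This site spins its robots from a local copy of the code, but if you just want to see the string-to-robot trick in action, the public robohash.org service will render a robot for any string you hand it. A throwaway sketch, using one of the published hashes listed below:

    # ask the public RoboHash service for a robot built from an arbitrary string
    # (illustration only; this site's robots come from a local copy of the code)
    somehash="edcfc80ca8f2e10552f28774fa2cc8342e87279780bd0c24cf272ff7f09e3b2e"
    curl -s "https://robohash.org/${somehash}.png" -o robot.png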

So, suffice it to say, the web page hashes are spun up with our own vaguely secret prep-then-hash mechanisms and, yes, you, too, can do the same thing: generate a hash and see if it matches. After generating the web page markup in our vast, underground secret web factories, we hash it. For this work, hashing is done with SHA256. If we were only building the unique RoboHash robots, whose permutations are small relative to a hash’s permutations, we could get by with MD5, say, and not worry about collisions and whatnot. However, SHA256 moves us to a more useful place, versatile for later useful verifications we might want to do.
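
If you want to eyeball the difference, openssl(1) can produce both digests in one sitting; the point is simply that MD5 yields 128 bits (32 hex characters) while SHA256 yields 256 bits (64 hex characters). A quick sketch (openssl is used here only because it speaks both; this site itself hashes with sha256(1)):

    # digest size comparison; the input string is arbitrary
    print -n "example" | openssl dgst -md5       # 128-bit digest, 32 hex characters
    print -n "example" | openssl dgst -sha256    # 256-bit digest, 64 hex characters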

Note that this is not a comprehensive hash of everything that results in the final rendered web page. For example, cascading style sheets (CSS) and images (PNG, SVG, JPEG) are not hashed, just the markup, and not all of the markup at that. The preprocessing of the markup is an arbitrary compression scheme that tosses out data (more on that in a moment). The tossed data is not part of the hash, so to speak, so comparing hashes isn’t a complete validation. Even so, it is utilitarian enough at this level, weighing the practical risks of, say, just the punctuation being compromised.

Anyway, want to see the secret code? Keep reading, it’s up just ahead. Indeed, it’s not every day that such light lunch reading magically appears.

first, preprocess and then hash the web page’s markup

There are a million stories in Hash Prep City and this is just one of them. By the way, you can hash anything “as is” – it doesn’t have to be prepped. We prep to reduce and simplify what we are hashing, yet not so reduced that uniqueness is lost. Think of it as one variation on lossy compression. In this arbitrarily determined process, we first flatten the web page markup with the quite useful program tr(1) to ensure a predictable, reduced block of text for hashing, simply stripping out spaces, blanks, punctuation, digits, control characters, and finally anything left that is not printable. This gives a reduced, predictable, consistent block of “enough content that counts,” minimizing the chance of error from, say, retrieval or platform text differences. Then we pipe that reduced block of text to the program sha256(1), which outputs the SHA256 hash. Here’s the code snippet, written for ksh(1).

    function htmlhash
    {
        # keep only printable characters, then strip out spaces, punctuation,
        # and digits, leaving a block of alpha characters to pipe into sha256(1)
        tr -cd "[:print:]" | tr -d "[:space:]" | \
        tr -d  "[:punct:]" | tr -d "[:digit:]" | sha256
    }
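
To watch the flattening happen, run a scrap of markup through the same pipeline with the final sha256(1) stage left off (a toy input, obviously; the real input is a full page of markup):

    # the htmlhash pipeline minus the hashing stage, fed a scrap of markup
    print '<p>Hello, world! 42</p>' | \
    tr -cd "[:print:]" | tr -d "[:space:]" | tr -d "[:punct:]" | tr -d "[:digit:]"
    # leaves only the alpha characters: pHelloworldp

If your system lacks sha256(1), sha256sum(1) or openssl dgst -sha256 can stand in, though sha256sum tacks a trailing “ -” onto its stdin output that you would want to trim before diffing.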

get the web page

Now, pull down the website code with curl(1) and pipe it to the htmlhash function.

    curl -s [pageurl] | htmlhash > somehash

then, compare hashes

Then diff(1) it with the hash listed for the page.

    diff somehash publishedhash

As a practical matter for determining validity, there will be an off-line protected list of hashes to diff with hashes generated from retrieved web page stuff. In other words, obviously if the website is compromised, anything can be compromised, including the list of hashes on this page. And, of course, the robots. But this exercise using the published list suffices for trivial proof of concept and other cool buzzphrases.
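
A sketch of how that comparison might be automated, assuming a protected offline file of slug-and-hash pairs (the file name, URL pattern, and .html extension here are all made up for illustration) and the htmlhash function above:

    # hypothetical offline list: one "slug hash" pair per line
    while read -r slug published
    do
        fetched=$(curl -s "https://example.org/${slug}.html" | htmlhash)
        [[ "$fetched" == "$published" ]] && print "ok       $slug" || print "MISMATCH $slug"
    done < protected-hash-list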

Note that the program names mentioned above are written in Unix notation style of the program name followed by manual page number in parentheses, generalized as name(section). See man(1), Unix, and Unix-like.

page hashes and related spiffy robots

A motley crew, indeed. Notably, this page is the only page not listed. Wait, what? A mystery! The answer to which is left as an exercise for the reader.


creative-energy-and-hard-work
edcfc80ca8f2e10552f28774fa2cc8342e87279780bd0c24cf272ff7f09e3b2e


the-importance-of-art-education
151618b4d99e76fa4747680d51ef2de8dce811bcb16e60f5e2bd4f31b24b4e58


wordle
2dcca7a5b3e5f2e570cfe542907a273ca8fed839823d23666f6f50225e695fbc


finishing-touches
f6725f7556aeeb229883d27e1790d16bddd16ace05c01b9c53e46ed0a8f30381


ride-the-wave
034b2923cdd642320ab6daf83909533dd5d9c41ba64d6219d2483c8e6cec576e


living-in-the-digital-age
8d4fc9222fb49b2bb646b97ae6eb74da0e38bfe2d27b7337f99581b350430904


creative-juices
07b23636d7031a82f6656d8be09632a75796463c9250c87dbbe01a098164a764


mothers-day-gnomes
6c2599ceb00e4290a9c0b51c10b99579166255a01e54f904b1c7c854f98bcb4e


poppies-and-sketch-crawls-and-rain-oh-my
a6db1bf4ee2b5322f94f648c56042154610813641365286fed0975a4f8bcec90


22nd-world-wide-sketch-crawl-in-georgetown-texas
8f1813dd755b092071cec5c7c5dc7714e78c04b8064031ceab8cecefad308fb4


creative-energy-put-to-good-use
565785b6b2d40c37525e3d030756656fcf020317ffd0caee12c54e58712762c6


creativity-tastes-good
3f427acd36a38da9384e24082f7d82df084bafd46ae8233cf6523f305470b6ab


of-chairs-and-scale
a7f6b35519065fbbeed41ea240f1cc6780bc64d0348d709cb08026447d2a3733


convergence-bead-set–3
ec289196bdc1074dbc2908d6912da9873fca6d60330c57e3b0fb23d33d5e57bc


another-convergence-bead-set
7e0cafce14aa3ee0a88587e94598571e244f2d66a9d555c17c96cae0d6c98963


convergence
9fd4c833c95714adbd1cdebc493abd9e04517c9375148a526529102c09bc94f9


finished-the-shrines-for-now
343a5fa17946579af2b8f57ac9e2d895441b1c86e11abce93e2917ff3461e3e9


shrines-continued
50a5da4028240bbdf598a741c03ac17e88950e8b6dfa276915bc195b4e854275


something-new
9645ca9963c5988d2675b5fab6a1f42594b1ecfc5fac696d8844ff6d28b575da


charismacolors
b81f560ae1e1b0b9d83dead18645900e4e1f8b4703260ccaf5a5ccac4b43d999


creative-stirrings
bd52342830031ba6a51cb507ac9c3e747185120a3efc7bbad59666fb30c2c343


taking-advantage-of-down-time
215fe37ccc93e51d884fdb52ec36100574b087c674830f411e418a926715b6cc


okay-so-it-worked
9adcf77b52ace408db4495c128f7a05f3c1cb025f34cb0d14218119b39b18979


the-point-of-this-is
b17028be3424a6c5b7da3259c6f7f8130d2e4625542fcab8ee7d022d96005995


getting-started
ae19e5e9e67635cdd809ad303ec4f75d7c9dc722a44688dce1a1da5eea08ef99


index
3293ada32a683a81a1c1ef1e7bcff7de21d382f09e39b5a9c9fce3a53eb18177

typefaces

Typefaces are courtesy Google Fonts and Font Squirrel. Licenses are documented for each typeface on their respective web pages linked in the colophon.

Web font format is WOFF2 and is the only font format delivered by this site. Nothing but cutting edge here. Typeface data is embedded in CSS for performance. The CSS property font-variant-ligatures is in play. So is some kerning magic.
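
For the curious, the embedding amounts to inlining the WOFF2 bytes as a base64 data URI inside an @font-face rule. Something along these lines would cook up such a rule (the font file and family name are placeholders, and a base64(1) utility is assumed):

    # emit an @font-face rule with the WOFF2 data inlined as a base64 data URI
    # (font file and family name are placeholders)
    b64=$(base64 < somefont.woff2 | tr -d '\n')
    print "@font-face {"
    print "  font-family: 'SomeFont';"
    print "  src: url('data:font/woff2;base64,${b64}') format('woff2');"
    print "}"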

In theory, font rendering should only depend on your browser but, for example, in the case of Mac OS X, the OS version is also a factor; WOFF2 is not supported on Mac OS X Yosemite and lower, even with the very-latest new-fangled WOFF2-capable modern browsers. Peruse browser WOFF2 support.

If your browser of choice can’t handle WOFF2, the browser will roll back to generic defaults, which in any case will be far less of a typographical experience than designed and intended and maybe even cause your computing device to suddenly fold into a black hole, perhaps leaping time and space back to, say, 1874, with no electrical grid, internets or webtubulars to be found. Upgrading is no doubt prudent in that case.

Body fallbacks are typically Georgia and serif. Heading fallbacks are usually HelveticaNeue‑CondensedBold, Arial, and sans-serif.

modular scale

We use a modular scale to calculate heading sizes. Specifically, the ratio of choice here is 1.250, otherwise known as Major Third. H2 and H3 are the only headings used throughout, at least with this edition. Heading sizes are calculated with Jeremy Church’s magical type scale. The relative value rem is used throughout the CSS, with just a tiny part using px or em. Plenty of good articles on the web about all this and more, starting with this excellent treatise on rem and em.
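
As a back-of-the-napkin check, each step on the scale is just the previous size multiplied by 1.250. A throwaway one-liner shows the first few steps from a 1rem base (the base size and the number of steps are only for illustration):

    # first few steps of a 1.250 (Major Third) modular scale from a 1rem base
    awk 'BEGIN { size = 1; for (i = 0; i < 4; i++) { printf "%.3frem\n", size; size *= 1.25 } }'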

your browser’s secret life

For light lunch reading, check your browser’s ideally excellent support for HTML5 and CSS3.

the source and only the source

Coming soon! All the source, served by mercurial. Yep, no plans to use git around here. Source, no doubt, will definitely include the magical pumpkin code.