18 Oct 2019

Build a minimalist blog, powered by Emacs and Common Lisp

Look at this site.

So let's go over it.

Update: I found out that somebody had figured out I didn't validate empty names. Just fixed it. Guess how I fixed it? In the REPL! Yes, I run my server in a REPL. Most modern people have forgotten about this kind of ancient magic. XD

Update 2: Somebody actually discovered a bug in the reply # handling (though not an obvious one), many thanks! And finally an ad spam bot has visited this site. I'm doing some anti-spam work now, but don't be surprised if something funny shows up in the comments.

System Overview

The desired feature set:

  1. Post blogs.
  2. Comment board. IRC-like, no authorization required (following the idea described in "Why no HTTPS at this site: Morality instead of Barriers").
  3. Better not look bad.

Choosing a blog engine seems hard: some are bloated and others seem too elementary. But given the goals of the system, it's not that hard to choose our tools.

  1. We just need some HTML generator. org-static-blog seems like a good choice. It's a simple org-to-html generator and is super easy to use. One caveat is that as of <2019-10-18 Fri> the generated index page contains the whole body of each post. That's not a hard problem to fix.
  2. Write some Common Lisp. We always need an HTTP server to run a site anyway, so we just add a few more things on top of publish-directory. We use the awesome framework AllegroServe.
  3. Write some CSS.

It seems that no JavaScript is actually needed. That saves us lots of effort, battery and telecom fees.

Publishing Subsystem

M-x package-install org-static-blog, play around with it and look at its source. Use C-h f.

The function that generates the index page is org-static-blog-assemble-index, which calls org-static-blog-assemble-multipost-page to do most of the actual work. org-static-blog-assemble-multipost-page gets the content of the posts from org-static-blog-get-body.

Bingo! To alter the appearance of the index page we just need to replace this call with another function.

(defun org-static-blog-get-preview (post-filename)
  (with-temp-buffer
    (insert-file-contents (org-static-blog-matching-publish-filename post-filename))
    (let ((title-start)
          (paragraph-end)
          (post-start)
          (post-end))
      (goto-char (point-min))
      (setq title-start (search-forward "<div id=\"content\">"))
      (search-forward "<h1 class=\"post-title\">")
      (replace-match "<h2 class=\"post-title\">")
      (search-forward "</h1>")
      (replace-match "</h2>")
      (when (search-forward "<p>" nil t)
        (search-forward "</p>"))
      (setq paragraph-end (point))
      (goto-char (point-max))
      (search-backward "<div id=\"postamble\" class=\"status\">")
      (setq post-end (search-backward "</div>"))
      (search-backward "<div class=\"taglist\">")
      (search-backward ">") ;; eat the returns/white spaces
      (setq post-start (+ (point) 1))
      (concat (buffer-substring-no-properties
               title-start
               paragraph-end)
              (if (equal paragraph-end post-start)
                  ""
                "(...)")
              (buffer-substring-no-properties
               post-start
               post-end)))))

All that searching and replacing does not look elegant (org-static-blog itself does it everywhere anyway), but since this is just a small site-generating tool it's OK. The code snippet above basically searches a temporary Emacs buffer containing the full HTML of a post to find its first paragraph and its taglist. Then it concatenates them to generate a preview, with a "(...)" marker in between when the post continues past the first paragraph.

Comment Subsystem

Managing the data

OK, here comes the big part. Lots of things (like a DBMS) come to mind…

But is this that complicated? Let's look at the goal:

  • Comment board. IRC-like, no authorization required.

This is actually dead simple! No authorization means that users normally cannot delete or edit comments (this makes some sense on systems with authorization too, because it might make users feel more responsible for their comments). That naturally fits the intended use case of a plain file. You need a DB only when you need frequent edits or insertions in the middle of the data (or a high-performance system, which is not the case here), but if users only add comments, we can simply append to a log file, and for a single server this is close to the optimal implementation. Solving the problem by arguing the problem does not exist, yeah!

So now let's think about the implementation details.

Since each post is independent of the others, we can store their log files separately. For the in-memory cache, we can just make a hashtable mapping from the name of a post to its own data structure. Now the problem reduces to handling comments for one particular post.
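
To make the append-only idea concrete, here is a minimal sketch of a per-post log file. The function names and the (id parent content) record layout are assumptions made for illustration; the real format used by the site may well differ.

```lisp
;; Sketch only: function names and the (id parent content) record layout
;; are assumptions, not taken from the site's actual code.

(defun append-comment-record (log-path id parent content)
  "Append one comment to LOG-PATH as a single readable s-expression."
  (with-open-file (out log-path
                       :direction :output
                       :if-exists :append
                       :if-does-not-exist :create)
    (with-standard-io-syntax
      (let ((*print-readably* nil))
        (prin1 (list id parent content) out)
        (terpri out)))))

(defun read-comment-records (log-path)
  "Return the records stored in LOG-PATH, oldest first (NIL if no file)."
  (when (probe-file log-path)
    (with-open-file (in log-path :direction :input)
      (with-standard-io-syntax
        (let ((*read-eval* nil))      ; never evaluate #. forms from the log
          (loop for record = (read in nil :eof)
                until (eq record :eof)
                collect record))))))
```

Because records are only ever appended, reading the file from top to bottom yields the comments in the order they were posted.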

We'd like to make hierarchical comment threads, which means each comment can reply to another comment and they are displayed as a tree. This means

  1. each comment has an id and a parent
  2. we'd like to traverse the tree of the children of a comment when doing formatting

For 2, it is quite intuitive to store the list of direct children for each parent comment. We can then store all the lists of child comments in a hashtable keyed by parent ID. A schematic diagram of the data structure looks like:

                   +-----------------+
                   |    Hashtable    |
Key: parent ID     | 1  - 2  - 3  ...|
                   +-+----+----+---- +
                     |    |    |
                     |   ...  ...
                     |
                  +----+   +----+   +----+
List of children: |cons|---|cons|---|cons|-...
                  +----+   +----+   +----+
                     |       |        |
                     |      ...      ...
                     |
                  +----+   +------------+
Child content     |cons|---|HTML Content|
                  +----+   +------------+
                     |
                   +---+
Child ID           | 3 |
                   +---+

This is illustrated in the following code to insert a comment into our data structure:

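(defun comment-table-insert (comment-table
                             num-id
                             num-parent
                             content)
  ;; Only the second value (whether the parent exists) is needed here.
  (multiple-value-bind (child-list exist)
      (gethash num-parent comment-table)
    (declare (ignore child-list))
    (if exist
        (progn
          ;; The newest child goes to the front of the parent's child list.
          (push (cons num-id content) (gethash num-parent comment-table))
          ;; Register the new comment so that it can be replied to.
          (setf (gethash num-id comment-table) '()))
        (format t "ERROR: parent ~D does not exist.~%" num-parent))))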

This simple design actually has more going for it than one might think!

  1. Posting a new comment has time complexity O(1). Baseline. (not true if somebody uses Python dict to write a server)
  2. The traversal has time complexity O(n), which is already optimal.
  3. Creating the data structure from the log file has time complexity O(n), which is also optimal.
  4. When the server restarts (or after GC has cleaned up the in-memory data structure, if we add weak pointers in the future), loading from the log file is guaranteed to restore the system to the state it was in at shutdown. Just reuse the code for posting comments! Reading the log file replays the comments in time order and so repeats the process that built our data structure.
  5. When formatting HTML, if we just recursively traverse the list of children in order, we naturally get the newest comments at the top of the page. (Pushing to the child list preserves time order, newest first!)

Now there's a subtle problem: what to do with the "standalone comments" without parents? The solution is simple and beautiful: add an imaginary comment with ID 0 (the root comment), and make any "standalone comment" a child of the root. Then we can just reuse all of our code for formatting child comments.
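
The root-comment trick and the newest-first ordering can be checked with a small self-contained sketch. The names make-comment-table, insert-comment and display-order are hypothetical and only mirror the idea of comment-table-insert above; display-order returns IDs instead of writing HTML so the traversal order is easy to inspect.

```lisp
;; Sketch only: all three function names are made up for illustration.

(defun make-comment-table ()
  "Create a table that already contains the imaginary root comment 0."
  (let ((table (make-hash-table)))
    (setf (gethash 0 table) '())
    table))

(defun insert-comment (table id parent content)
  "Add comment ID under PARENT. Return T on success, NIL if PARENT is unknown."
  (multiple-value-bind (children exist) (gethash parent table)
    (declare (ignore children))
    (when exist
      (push (cons id content) (gethash parent table))
      (setf (gethash id table) '())
      t)))

(defun display-order (table &optional (parent 0))
  "Return comment IDs in the order a depth-first traversal visits them."
  (loop for child in (gethash parent table)
        collect (car child)
        append (display-order table (car child))))

;; Comments 1 and 3 are standalone, comment 2 replies to comment 1.
(defparameter *demo-table* (make-comment-table))
(insert-comment *demo-table* 1 0 "first")
(insert-comment *demo-table* 2 1 "reply to first")
(insert-comment *demo-table* 3 0 "second")
(display-order *demo-table*)   ; => (3 1 2)
```

Replaying a log file is then just a loop that calls the insert function once per record, in file order.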

Formatting HTML

As we've described, when serving GET requests, we just need to recursively traverse the comment tree and write to the HTML stream.

(defun format-comment-list (comment-table content child-list)
  (html ((:div class "comment") (:princ content)
              (mapc (lambda (child)
                     (format-comment-list comment-table
                                          (cdr child)
                                          (gethash (car child) comment-table)))
                    child-list))))

(defun format-comments (filename)
  (touch-comment-table filename)
  (let* ((comment-table
           (gethash filename *comment-table*))
         (root-list (gethash 0 comment-table)))
    (if (null root-list)
        (html "No comments yet.")
        (mapc (lambda (child)
                     (format-comment-list comment-table
                                          (cdr child)
                                          (gethash (car child) comment-table)))
                    root-list))))

Here we use htmlgen from the AllegroServe framework. Check its documentation. Be careful about the difference between a "list beginning with a keyword symbol" and a "list beginning with a list beginning with a keyword symbol". If you mysteriously get an UNBOUND-VARIABLE error, you've probably messed up those list (aka parenthesis) structures, because then the macro will try to evaluate some part of your markup as a Lisp expression.

In fact, evaluating part of the markup tree is a very powerful feature. The documentation says the values of such expressions are thrown away, which means that if you want to generate output from those inline Lisp expressions, you just use a nested html macro. A simple example with conditionals:

(let ((content-stream
                  (make-string-output-stream)))
            (html-stream content-stream
                         (:p (:b (:princ-safe
                                  (format nil "#~D"
                                          (gethash filename *next-comment-id-table*))))
                             (:princ-safe
                                 (format nil " by ~a<"
                                          nickname))
                             (if empty-contact
                                 (html "CIA top secret")
                                 (html ((:a href (concatenate 'string
                                                              "mailto:"
                                                              contact)) (:princ-safe contact))))
                             (:princ-safe ">"))
                         (:p
                          (unless empty-uri
                            (html ((:a href uri) (:princ-safe uri))))
                          (unless (or empty-uri empty-text)
                            (html :br))
                          (unless empty-text
                            (html (:princ-safe text)))))
            (get-output-stream-string content-stream))
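
The snippet above relies on flags such as empty-contact, empty-uri and empty-text, whose definitions are not shown here. One plausible way to compute them (and to reject an empty nickname) is a small whitespace-aware predicate. The name blank-string-p is an assumption, not the site's actual code.

```lisp
;; Sketch only: blank-string-p is a hypothetical helper.

(defun blank-string-p (string)
  "True when STRING is NIL, empty, or contains only whitespace."
  (or (null string)
      (zerop (length (string-trim '(#\Space #\Tab #\Newline #\Return)
                                  string)))))
```

With such a helper, the flags could be bound as (let ((empty-contact (blank-string-p contact)) ...) ...) before the markup is generated.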

Serving the clients

To generate the full HTML page, we just look for a special line and replace it. Simple task, no need for a template system.

(defun response-with-comments (req ent filename info)
  (with-open-file
      (src-stream filename)
    (loop for line = (read-line src-stream nil)
          while line
          do (if (string-equal line
                               (format nil "<!--%comments%-->"))
                 (format-comments filename)
                 (html (:princ line) :newline)))
    t))

There's a subtle problem with handling user POSTs. If we directly return the updated page, then when the user hits refresh or the back button or whatever, the form might be resubmitted. A common practice is the PRG (Post/Redirect/Get) pattern. We do it here.

(publish-directory :prefix "/" :destination *document-root*
                   :filter
                   (lambda (req ent filename info)
                     (if (string-equal "text/html" (gethash
                                                    (pathname-type (pathname filename))
                                                    *mime-types*))
                         (case (request-method req)
                           (:post (let ((nickname (request-query-value "nickname" req))
                                        (contact (request-query-value "contact" req))
                                        (parent (request-query-value "rep" req))
                                        (text (request-query-value "text" req))
                                        (uri (request-query-value "url" req)))
                                    (if (and nickname contact
                                             parent text uri)
                                        (post-new-comment
                                         req
                                         filename
                                         nickname
                                         contact
                                         parent
                                         text
                                         uri)
                                        (failed-request req)))
                            (with-http-response (req ent :response *response-found*)
                              (setf (reply-header-slot-value req :location)
                                                         (request-uri req)) ;;redirect to same page
                             (with-http-body (req ent)))
                            t)
                           (:get
                            (with-http-response (req ent)
                              (with-http-body (req ent)
                                  (response-with-comments req ent filename info)))
                            t)
                           (otherwise (failed-request req)))
                         nil)))

That's basically how we make a comment system! The full code is at https://github.com/BlueFlo0d/site-server. There are some caveats, e.g., there is no cache for formatted comments and a file is opened each time a comment is posted. I might fix them, though it's not very likely because none of this affects asymptotic complexity. Who cares about constants? BTW, the file opening problem is the fault of modern OSs, which impose too much overhead on file opening, not my fault. (I'm kidding lol)

Make it look better

Write some CSS. See CSS for this site.

Tags: meta Emacs Common-Lisp web org-mode

Comments

#8 by casouri<CIA top secret>

Very cool! I like the simple comment system. Old-school and no JavaScript-shit.

#6 by Lenard<lenardbarkly@yahoo.com>

https://funnypictures.photos/
Keep on working, great job!

#7 by admin<CIA top secret>

This looks like spam but I'm keeping it

#5 by tst<CIA top secret>

cool

#4 by sam<CIA top secret>

test

#3 by huzi<CIA top secret>

cool.

#1 by admin<qhong@mit.edu>

Let me do a test in production environment (you should never do that!)

#2 by admin<qhong@mit.edu>

https://stallman.org
And I can reply with the link to St IGNUcious.


cat-v.mit.edu by Q. Hong is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.