How I track my Chinese reading speed

The evolution of how I track my reading speed and how I settled on using Org-Mode and Beorg for a seamless tracking experience

📅 31 Oct 2022 | ~7 min read
Tags: #chinese

Earlier this year, I read several interesting posts on Chinese Forums in which users recorded their Chinese reading speed. It was fascinating to see the progress that the users had made, and I was curious to try it for myself.

Most of the users had a workflow that consisted of recording their time in a time-tracking application and then manually entering all of the times into a spreadsheet. I have come up with my own method which, after some tweaking, I now believe to be vastly superior and essentially friction-free.

The key to all of this, rather surprisingly, is Org-Mode.

You may be wondering what Org-Mode is. This is a difficult question to answer, as it is such a powerful tool. However, it is essentially a plain-text system built into Emacs for managing whatever you may need it to.

Now before you go clicking off, my technique does not rely on Emacs, and I still believe that this system can work well for people who don’t use Emacs. In fact, it was important for me that my way of recording could be portable. I don’t want to have to keep my laptop at hand when I read.

I am a big fan of Beorg, a very useful Org-Mode app for iOS. There is a paid extension for tracking time which is essential to how I track many things in my life.

Even though my way of going about things has changed, the tools that I am using have been a great fit for the job since the very beginning. I now want to show you how my technique has evolved and highlight just how powerful and flexible Org-Mode and this technique can be.

The first attempt

My initial technique was admittedly not great. I made a simple Org heading and just kept clocking every reading session as I went.

* 活着

Once I finished the book, I worked out how many characters were in the book and then divided that by the time spent reading to get my reading speed. In this case I read ~76000 characters in 472 minutes, giving me a reading speed of 161 characters per minute.

* DONE 活着
CLOSED: [2022-02-23 Wed 13:41]
:LOGBOOK:
CLOCK: [2022-02-23 Wed 12:59]--[2022-02-23 Wed 13:37] =>  0:38
CLOCK: [2022-02-22 Tue 22:00]--[2022-02-22 Tue 22:08] =>  0:08
CLOCK: [2022-02-22 Tue 21:28]--[2022-02-22 Tue 21:52] =>  0:24
CLOCK: [2022-02-22 Tue 21:08]--[2022-02-22 Tue 21:28] =>  0:20
CLOCK: [2022-02-22 Tue 13:55]--[2022-02-22 Tue 14:21] =>  0:26
CLOCK: [2022-02-22 Tue 12:31]--[2022-02-22 Tue 13:12] =>  0:41
CLOCK: [2022-02-18 Fri 07:55]--[2022-02-18 Fri 08:22] =>  0:27
CLOCK: [2022-02-12 Sat 13:47]--[2022-02-12 Sat 15:04] =>  1:17
CLOCK: [2022-02-12 Sat 13:29]--[2022-02-12 Sat 13:42] =>  0:13
CLOCK: [2022-02-12 Sat 11:41]--[2022-02-12 Sat 11:59] =>  0:18
CLOCK: [2022-02-11 Fri 15:33]--[2022-02-11 Fri 15:34] =>  0:01
CLOCK: [2022-02-09 Wed 20:00]--[2022-02-09 Wed 20:09] =>  0:09
CLOCK: [2022-02-07 Mon 21:00]--[2022-02-07 Mon 21:39] =>  0:39
CLOCK: [2022-01-29 Sat 15:19]--[2022-01-29 Sat 15:46] =>  0:27
CLOCK: [2022-01-29 Sat 14:29]--[2022-01-29 Sat 14:36] =>  0:07
CLOCK: [2022-01-29 Sat 13:33]--[2022-01-29 Sat 13:42] =>  0:09
CLOCK: [2022-01-28 Fri 21:30]--[2022-01-28 Fri 21:54] =>  0:24
CLOCK: [2022-01-28 Fri 20:45]--[2022-01-28 Fri 21:23] =>  0:38
CLOCK: [2022-01-28 Fri 16:20]--[2022-01-28 Fri 16:30] =>  0:10
CLOCK: [2022-01-28 Fri 16:13]--[2022-01-28 Fri 16:16] =>  0:03
CLOCK: [2022-01-28 Fri 15:59]--[2022-01-28 Fri 16:12] =>  0:13
:END:
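
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It is not the code used for this post: the function names `total_minutes` and `chars_per_minute` are made up for illustration, and the sample only contains two of the CLOCK lines above. It simply adds up the `=> H:MM` field that Org appends to each closed clock entry.

```python
import re

# Matches the duration Org appends to a closed clock entry, e.g. "=>  0:38".
DURATION = re.compile(r"=>\s+(\d+):(\d{2})\s*$")


def total_minutes(logbook: str) -> int:
    """Sum the H:MM durations of all CLOCK lines in a LOGBOOK drawer."""
    total = 0
    for line in logbook.splitlines():
        if not line.strip().startswith("CLOCK:"):
            continue  # skip :LOGBOOK:, :END: and anything else
        match = DURATION.search(line)
        if match:
            hours, minutes = map(int, match.groups())
            total += hours * 60 + minutes
    return total


def chars_per_minute(characters: int, minutes: int) -> int:
    """Reading speed, rounded to the nearest whole character per minute."""
    return round(characters / minutes)


# Two entries copied from the logbook above (hypothetical shortened sample).
sample = """\
CLOCK: [2022-02-23 Wed 12:59]--[2022-02-23 Wed 13:37] =>  0:38
CLOCK: [2022-02-12 Sat 13:47]--[2022-02-12 Sat 15:04] =>  1:17
"""

print(total_minutes(sample))         # 115
print(chars_per_minute(76000, 472))  # 161
```

Feeding the whole LOGBOOK drawer to `total_minutes` should give the 472 minutes quoted above.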

This technique was okay, but there were a couple of problems I had with it.

My main issue was that I had to wait until the end of a book to get a single useful data point. It was impossible to know whether I was making progress while in the middle of the book. I also used this technique for the second book I read this year, and it took me the best part of a month to realise that I had essentially made no progress.

The second technique

To combat this issue, I decided to record my data more regularly, and I was quite happy that I managed to generate 10 data points for this book. I continued using Beorg to record the times and manually copied them over to the following spreadsheet. (Yes, Org-Mode can deal with plain-text spreadsheets. And yes, that is a spreadsheet formula at the bottom.)

** DONE 许三观卖血记
| Chapter | Time (m) | Characters | Reading speed (cpm) |
|---------+----------+------------+---------------------|
|       1 |       30 |       5491 |                 183 |
|       3 |       54 |      10241 |                 189 |
|       7 |       99 |      19298 |                 194 |
|      10 |      126 |      24441 |                 193 |
|      11 |      139 |      26591 |                 191 |
|      15 |      173 |      33558 |                 193 |
|      20 |      226 |      45048 |                 199 |
|      25 |      295 |      67591 |                 229 |
|      27 |      348 |      83303 |                 239 |
|     end |      443 |     105101 |                 237 |
#+TBLFM: $4='(/ $3 $2);N

This new way of doing things wasn’t without its own problems. The main one was that it required a lot more manual input, which meant that I was much more likely to make mistakes while entering the data.

Another problem is that the data is slightly misleading. As the time and character counts are cumulative totals for the whole book up to that point, the reading speed is also the average for the whole book so far and is not representative of each reading session. For example, my speed for the stretch ending at the penultimate entry (83303 - 67591 = 15712 characters in 348 - 295 = 53 minutes) would have been 296 cpm, but that is not obvious at a glance.
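
The difference between the cumulative speed and the speed for each stretch can be sketched in a few lines of Python. This is only an illustration and not part of my actual workflow: the name `interval_speeds` is made up, and integer division is used so that the results truncate in the same way as the table column.

```python
# (chapter, cumulative minutes, cumulative characters), copied from the table.
rows = [
    ("1", 30, 5491),
    ("3", 54, 10241),
    ("7", 99, 19298),
    ("10", 126, 24441),
    ("11", 139, 26591),
    ("15", 173, 33558),
    ("20", 226, 45048),
    ("25", 295, 67591),
    ("27", 348, 83303),
    ("end", 443, 105101),
]


def interval_speeds(rows):
    """Return (chapter, cpm) for each stretch between consecutive rows."""
    speeds = []
    prev_minutes, prev_chars = 0, 0
    for chapter, minutes, chars in rows:
        # Only the characters and minutes added since the previous row count.
        cpm = (chars - prev_chars) // (minutes - prev_minutes)
        speeds.append((chapter, cpm))
        prev_minutes, prev_chars = minutes, chars
    return speeds


for chapter, cpm in interval_speeds(rows):
    print(f"{chapter:>3}: {cpm} cpm")
```

The row for chapter 27 comes out at 296 cpm, compared with the cumulative 239 cpm shown in the table.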

The final technique

The key to my final technique was learning to take advantage of the :PROPERTIES: drawer in Org-Mode. The following is an example of a single chapter that I read recently.

* 紫川
** DONE 第1集 第1章
CLOSED: [2022-08-30 Tue 16:16]
:PROPERTIES:
:characters: 13510
:END:
- State "DONE"       from "TODO"       [2022-08-30 Tue 16:16]
:LOGBOOK:
CLOCK: [2022-08-30 Tue 15:56]--[2022-08-30 Tue 16:13] =>  0:17
CLOCK: [2022-08-30 Tue 15:33]--[2022-08-30 Tue 15:37] =>  0:04
CLOCK: [2022-08-30 Tue 15:13]--[2022-08-30 Tue 15:26] =>  0:13
CLOCK: [2022-08-30 Tue 12:36]--[2022-08-30 Tue 13:20] =>  0:44
:END:
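
With the character count stored next to the clock entries, the speed for a single chapter can be worked out from the entry alone. The following is a rough sketch using only the Python standard library, not one of the published scripts. The name `chapter_speed` and both regular expressions are assumptions for illustration, and it assumes that every clock entry has been closed.

```python
import re
from datetime import datetime

# A closed clock entry: CLOCK: [date day HH:MM]--[date day HH:MM]
CLOCK = re.compile(
    r"CLOCK: \[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]"
    r"--\[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]"
)
CHARACTERS = re.compile(r"^:characters:\s*(\d+)\s*$", re.MULTILINE)
STAMP = "%Y-%m-%d %H:%M"


def chapter_speed(entry: str) -> float:
    """Characters per minute for one Org entry with a :characters: property."""
    characters = int(CHARACTERS.search(entry).group(1))
    minutes = 0.0
    for start_date, start_time, end_date, end_time in CLOCK.findall(entry):
        start = datetime.strptime(f"{start_date} {start_time}", STAMP)
        end = datetime.strptime(f"{end_date} {end_time}", STAMP)
        minutes += (end - start).total_seconds() / 60
    return characters / minutes


entry = """\
** DONE 第1集 第1章
:PROPERTIES:
:characters: 13510
:END:
:LOGBOOK:
CLOCK: [2022-08-30 Tue 15:56]--[2022-08-30 Tue 16:13] =>  0:17
CLOCK: [2022-08-30 Tue 15:33]--[2022-08-30 Tue 15:37] =>  0:04
CLOCK: [2022-08-30 Tue 15:13]--[2022-08-30 Tue 15:26] =>  0:13
CLOCK: [2022-08-30 Tue 12:36]--[2022-08-30 Tue 13:20] =>  0:44
:END:
"""

print(round(chapter_speed(entry)))  # 173
```

The four sessions add up to 78 minutes, so this chapter works out at roughly 173 cpm.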

I have written two Python scripts for analysing my reading data.

The first script takes an .epub file as input and returns a skeleton of the book in which each chapter has a character count attached.

* 撒哈拉的故事
** 沙漠中的饭店
:PROPERTIES:
:characters: 2655
:END:
** 结婚记
:PROPERTIES:
:characters: 4239
:END:
** 悬壶济世
:PROPERTIES:
:characters: 4226
:END:
** 娃娃新娘
:PROPERTIES:
:characters: 3874
:END:
** 荒山之夜
:PROPERTIES:
:characters: 6238
:END:
** 沙漠观浴记
:PROPERTIES:
:characters: 4208
:END:
** 爱的寻求
:PROPERTIES:
:characters: 5055
:END:
** 芳 邻
:PROPERTIES:
:characters: 4457
:END:
** 素人渔夫
:PROPERTIES:
:characters: 5586
:END:
** 死果
:PROPERTIES:
:characters: 7161
:END:
** 天 梯
:PROPERTIES:
:characters: 6983
:END:
** 白手成家
:PROPERTIES:
:characters: 14393
:END:
** 收魂记
:PROPERTIES:
:characters: 5054
:END:
** 沙巴军曹
:PROPERTIES:
:characters: 6980
:END:
** 搭车客
:PROPERTIES:
:characters: 8138
:END:
** 哑 奴
:PROPERTIES:
:characters: 8054
:END:
** 哭泣的骆驼
:PROPERTIES:
:characters: 18123
:END:
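
To give an idea of how such a skeleton can be produced, here is a minimal sketch. It is not the published script: it skips the .epub extraction entirely and starts from chapter titles and text that have already been pulled out. The names `count_hanzi` and `org_skeleton` are made up, and counting only CJK Unified Ideographs (so punctuation and Latin letters are ignored) is an assumption about what should count as a character.

```python
import re

# Assumption: a "character" is a CJK Unified Ideograph (U+4E00 to U+9FFF).
HANZI = re.compile(r"[\u4e00-\u9fff]")


def count_hanzi(text: str) -> int:
    """Count Chinese characters, ignoring punctuation, digits and Latin text."""
    return len(HANZI.findall(text))


def org_skeleton(book_title: str, chapters) -> str:
    """Build an Org outline with one sub-heading and character count per chapter."""
    lines = [f"* {book_title}"]
    for title, text in chapters:
        lines += [
            f"** {title}",
            ":PROPERTIES:",
            f":characters: {count_hanzi(text)}",
            ":END:",
        ]
    return "\n".join(lines)


# Hypothetical sample data standing in for chapters extracted from an .epub.
chapters = [("第一章", "我读中文书。"), ("第二章", "Hello, 世界!")]
print(org_skeleton("示例", chapters))
```

The output has the same shape as the skeleton above, ready to be pasted into an Org file and clocked against.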

The second script uses orgparse to read the Org document and pull lots of useful information out of it.

I plan to cover this in a separate post once my graphs are in a state that I am happier with. But for now, I would like to share some simple statistics, as I just passed one million characters for the year.

Chars tracked: 1006996
Time recorded: 3 days, 9 hours, and 59 minutes.
Average speed: 205 cpm
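
A summary like this is straightforward to produce once each chapter has a character count and a time. The sketch below is an illustration rather than the published script: `format_duration` and `summarise` are made-up names, and the single pair passed in at the end is just the totals above expressed in minutes (3 days, 9 hours and 59 minutes is 4919 minutes).

```python
def format_duration(total_minutes: int) -> str:
    """Render a number of minutes as days, hours and minutes."""
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days} days, {hours} hours, and {minutes} minutes."


def summarise(chapters) -> str:
    """chapters is an iterable of (characters, minutes) pairs."""
    chapters = list(chapters)
    chars = sum(c for c, _ in chapters)
    minutes = sum(m for _, m in chapters)
    return "\n".join(
        [
            f"Chars tracked: {chars}",
            f"Time recorded: {format_duration(minutes)}",
            f"Average speed: {round(chars / minutes)} cpm",
        ]
    )


print(summarise([(1006996, 4919)]))
```

In practice the list would contain one pair per chapter rather than a single aggregate.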

Conclusion

I have published the code for both of my scripts on GitHub. Hopefully there is at least one other person in the world who is interested in such a niche topic.

Next time, I will talk about my data visualisation techniques for this project. Closer to the end of the year, I will write summaries of all the Chinese books that I’ve read.
