Back in school, if anybody had told me that I would some day become a published author, I would probably have declared them crazy. What's more, if they had told me that most of my written texts would be in anything but German, I would have lost all faith in them. But here we are. 15 years later, I have already published two papers, and ideas for the next few are already forming. What do you know…?
Aside from finding the right time to get into a creative state of mind for writing, one of my biggest problems remains spelling and grammar. Also, as a non-native speaker, style is sometimes an issue in my English texts. Especially in scientific writing, one tends to over-complicate things or use too many unnecessary conjunctions, and some false friends have left their scars on my language intuition.
Modern text processors do a decent job of pointing out various flaws in your texts. As a Free and Open Source Software (FOSS) aficionado, though, I thought it was just a law of nature that I had to miss out on the goodies baked into modern proprietary solutions. LibreOffice's default spell checker, for example, would only detect spelling mistakes but would not highlight grammar issues or overly convoluted style. But, as is often the case, after some digging I found a brilliant software package a few years back that solved all my issues: LanguageTool.
LanguageTool to the rescue!
LanguageTool is an Open Source proofreading software for English, French, German, Polish, Russian, and more than 20 other languages. It finds many errors that a simple spell checker cannot detect. (source: LanguageTool GitHub)
LanguageTool has been developed as an open source project since 2003, and the code is licensed under the Free Software Foundation's GNU LGPL-2.1 license. To fund development, a Germany-based company provides solutions for teams and businesses. That said, the website is, in my opinion, tailored a bit too much toward attracting paying customers, but that is my only personal pet peeve with the project. After all, well-funded devs are (hopefully) happy devs.
Besides the usual spell checking, LanguageTool also helps with your style. To this day, I remember the very first style hint it gave me as if it were yesterday: I had mixed "both A and B" and "A as well as B" into "both A as well as B", and LanguageTool kindly pointed out that it would be better style to use one or the other (even while checking this very blog post, it does not get tired of reminding me 🧐). It is seemingly small things like this that keep surprising me and that highlight the power of LanguageTool.
And the best thing is that none of this is hidden inside a walled garden, behind a paywall, or in a proprietary software package. It is free. Free as in free beer but also, even better, free as in free speech! And you can run it on your own hardware in a matter of minutes if you like. This is just what I did over the last few days and what I want to briefly discuss in this blog post.
Local LanguageTool Server
LanguageTool is written in Java. For the last couple of years, I just downloaded the server
and ran it locally. This, however, came with a delay for starting the server. Also, it can be a
nightmare to get the correct Java runtime set up.
This is why I was recently happy to find that there are up-to-date Docker images available. With
that, a setup is as easy as writing a
docker-compose file and making sure it starts with my
computer. The following
docker-compose.yml sets up a local LanguageTool server that can be reached at
localhost:8081, the default for most applications and add-ons (see below):
version: "3"

services:
  languagetool:
    image: erikvl87/languagetool
    container_name: languagetool
    restart: unless-stopped
    ports:
      - 8081:8010  # Using default port from the image
    environment:
      - langtool_languageModel=/ngrams  # OPTIONAL: Using ngrams data
      - Java_Xms=512m  # OPTIONAL: Setting a minimal Java heap size of 512 MiB
      - Java_Xmx=1g  # OPTIONAL: Setting a maximum Java heap size of 1 GiB
    volumes:
      - ./ngrams:/ngrams
If you want even better language checking, you can download more complex (but also more
voluminous in terms of required disk space) "n-grams" data. At the time of writing, these
language models are available in English, German, French, Spanish and Dutch. Just download the
ones you need and extract the zip files into a folder
ngrams next to your
docker-compose.yml. The only thing that is now necessary is to run
sudo docker-compose up -d
from within the directory and everything should work just fine. If you want to check the logs of
the container, you can use
sudo docker-compose logs -f (again, from within the directory).
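To see whether the server actually answers before wiring up any add-on, a small script can send a sentence to LanguageTool's HTTP API. The following is only a minimal sketch: it assumes the container from the docker-compose.yml above is reachable at http://localhost:8081 and uses the /v2/check endpoint with the form fields text and language. The function names (build_check_request, summarize_matches, check_text) are made up for this example and are not part of LanguageTool.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server from the docker-compose.yml above listens here.
BASE_URL = "http://localhost:8081"


def build_check_request(text, language="en-US", base_url=BASE_URL):
    """Build a POST request for the /v2/check endpoint (form-encoded body)."""
    data = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        base_url + "/v2/check",
        data=data.encode("utf-8"),
        method="POST",
    )


def summarize_matches(result):
    """Turn the decoded JSON response into one readable line per match."""
    lines = []
    for match in result.get("matches", []):
        # Show at most three suggested replacements per match.
        suggestions = [r["value"] for r in match.get("replacements", [])[:3]]
        lines.append(
            "offset {}: {} (suggestions: {})".format(
                match["offset"],
                match["message"],
                ", ".join(suggestions) or "none",
            )
        )
    return lines


def check_text(text, language="en-US", base_url=BASE_URL):
    """Send text to the server and return the summary lines."""
    request = build_check_request(text, language, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return summarize_matches(json.load(response))
```

With the container running, calling check_text("This are a test.") should return one line per issue the server finds. If the call fails with a connection error, the logs command above is a good place to start looking.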
To spell-check your text inside your browser, you can install the add-on from the respective store, e.g. for Firefox. By default, the add-on uses the LanguageTool servers. You can, however, change the server address to use your freshly installed local server, so that no data leaves your machine. For that, just click on the LanguageTool icon in your browser's toolbar, click the Settings icon and go to "Experimental Settings (only for advanced users)". Here you can select to use a local server and click "Save" to confirm:
While the official LanguageTool LibreOffice extension ships with a small language model itself,
you can make use of your new local server with the
ngrams language models. Just go to
Extension → LanguageTool → Options…, enable
Use remote server to check
text and enter
http://localhost:8081 as the server address.
TeXstudio is, in my opinion, one of the best LaTeX editors out there. This is in large parts due to the easy integration with LanguageTool. In fact, it was through TeXstudio that I discovered LanguageTool in the first place.
Once you have a local LT server running, you just have to add it to TeXstudio: go to
Options → Configure TeXstudio → Language Checking and enter the address of your local server there.
Even from within Emacs, the beloved text editor from the 70s, it is possible to check
spelling. Thanks to the awesome langtool package by Masahiro Hayashi, it is as easy as adding
the following use-package lines to your configuration:
(use-package langtool
  :ensure t
  :config
  (setq langtool-http-server-host "localhost"
        langtool-http-server-port 8081))
With my setup using EVIL-mode and EVIL-leader, it is as easy as pressing
SPC l l to start
LanguageTool and check my text, and
SPC l c to enter spell checking. Here you can see how
suggestions are presented:
In short, LanguageTool is a powerful automated proofreading tool that can run locally, so none of your secret love letters, business ideas or scientific breakthroughs will be leaving your computer. Maybe you would even consider setting up a dedicated server in your home or office network so that everybody can benefit from this awesome tool. I will most definitely use LanguageTool even more from now on.
Unfortunately, I could not find a way to contribute a few bucks to fund development, either on the website or on GitHub. This might be linked to the legal setup of LanguageTool as a company. Perhaps in the future I'll find a way to thank them.
…And of course, this text was proofread by LanguageTool. ✅