Built-in AI: how Mastodon uses translation APIs
Mastodon posts mislabeled with the wrong language as well as costly translations that depend on a server: Those are common problems for many Mastodon instances. In this talk, Thomas Steiner from Chrome's Built-in AI team details how he integrated the Language Detector and Translator APIs into the popular Mastodon client Elk.zone. This isn't a textbook success story—you'll hear about what initially went wrong, the technical hurdles encountered, and how they were overcome to build a feature that now runs reliably in production.
- Published: Nov 26, 2025
- Uploaded: Jun 13, 2026
- File type: YouTube
- Queried: 00
- Source: youtube.com
Full transcript
AI-generated transcript with timestamped sections.
[00:00] - [00:06] Thank you. [00:07] Good afternoon. [00:08] It's the end of a long day. [00:11] Who here has enjoyed the WebAI Summit so far? Woo-hoo. I'm going to end the block of presentations [00:20] with my talk, [00:21] Built-in AI [00:22] in the wild: [00:24] "A Mastodon Translation Success Story." [00:27] Thank you. [00:29] So let's start. [00:30] Hello, my name is Thomas Steiner. [00:32] And I have a weird hobby. [00:35] I love finding the German word for [00:39] dot, dot, dot. [00:41] So, let me give you an example. [00:44] This is a Mastodon post by a Ukrainian musician called Cat Lady Kidia Music. [00:49] Posts on Mastodon are called toots. [00:53] And so in this toot, [00:54] Cat Lady asks, "What's the German word for [00:58] when you [00:59] wave at someone, [01:01] but the someone wasn't waving at you?" [01:06] So... [01:07] I'm not going to be able to do it. [01:08] What's the German word for the situation? [01:11] Any takers? [01:14] Thank you. [01:15] Well, of course, in German, we call it [01:19] Fremdwinkzurückwinkungs- [01:22] peinlichkeit. [01:23] Fremdwinkzurückwinkungspeinlichkeit. [01:27] You don't believe me? [01:28] Here's a screenshot of Google Translate. And I had to nudge it just a little bit in the right direction by adding a dash.
[01:36] But here's the translation. "Fremdwink-Zurückwinkungspeinlichkeit": [01:40] embarrassment of waving back [01:42] at someone else. [01:45] By default, [01:46] I use Mastodon with the English user interface. [01:49] Watch me type [01:51] as a reply to-- sorry. Watch me type as a reply to the toot. [01:57] Thank you. [01:58] So let me zoom in a bit so you can better understand what's happening. [02:02] Notice how after typing 20 toot characters, [02:06] the language selector [02:07] lights up in yellow. [02:09] This indicates that the Mastodon software has detected [02:12] that likely what I'm typing isn't English. It's very much not English. But it's the default [02:19] when you have the user interface [02:21] in English. [02:24] Many people on Mastodon don't even realize that this language selector exists. [02:29] The result is that many toots are created with [02:32] the wrong language. [02:34] When I inspect my toot with DevTools, [02:37] you can see that the language of the toot is [02:39] English. [02:41] So luckily on Mastodon, [02:43] there's a way out. You can edit a toot. [02:46] So you can go back [02:47] to the toot [02:48] and correct the language to German. [02:51] And when you do that [02:53] and inspect with DevTools again, you can then see that the language is now correctly set to German. [02:59] And because I have my Mastodon user interface set to English, [03:03] the software now also shows me a translate button [03:06] on all non-English toots, including my own.
[03:09] But what happens when I click it? [03:12] When I click the translate button, [03:14] the Mastodon software makes a request to a translation API that is running on the server. [03:19] You can see the translation result in the UI now. [03:22] The API responds with "foreign waving back embarrassment," [03:27] somewhat correct. [03:30] In this case, the response comes from deepl.com. [03:33] And while DeepL offers a free tier, [03:36] which is what I'm using on my single-user demo Mastodon instance here, [03:40] you can imagine that larger Mastodon instances with hundreds or even thousands of users would quickly burn through [03:47] the free-tier quota, [03:49] which is, of course, then causing them hefty API costs [03:53] that typically, in a community kind of supported, [03:57] yeah, world, those community Mastodon server admins [04:01] are not really super happy to-- yeah, [04:04] host and pay for. [04:06] So let's recap the two challenges here that we have. [04:11] Language detection exists [04:13] but kicks in kind of late, [04:14] and the UI change [04:16] is relatively subtle. [04:18] And so many toots [04:20] end up being mislabeled with the wrong language. [04:23] And then second, translation happens on the server, which can become really expensive. [04:28] And it also means [04:29] you can't really translate private toots like DMs [04:32] without sending the data off to a remote server. [04:36] Thank you. [04:37] I decided to tackle those challenges on a Mastodon client called Elk.
[04:42] You can find Elk at the URL elk.zone. [04:46] Thank you. [04:46] It's completely open source, [04:48] and it's available on GitHub. [04:50] With more than 600 forks and almost 6,000 stars, [04:54] it's also a quite popular project. [04:56] Elk is a progressive web app, a PWA, written in Vue. [05:01] Having to detect the language of unknown texts is a common task. [05:06] It's actually such a common task [05:08] that on the Chrome team, we have decided to make language detection what Kenji earlier today [05:13] called a well-paved [05:15] road. [05:16] The API is so simple that I can demo it with two lines of JavaScript in the Chrome DevTools. [05:23] I first create the language detector [05:25] and then run the language detector's detect function, which returns an array of possible language candidates [05:32] with their confidence. [05:36] And now that you know how the API works, [05:38] let me show you the PR that I opened to add support for language detection. [05:42] This is what I wrote: [05:43] This PR adds support for automatically detecting the composition language [05:48] based on the Language Detector API. [05:50] The detected language is updated as the user types, [05:53] and the more text [05:55] (I went with a minimum number of six graphemes), [05:58] the more reliable the detection. [06:00] This [06:01] should greatly improve the lives of people who compose toots in different languages, [06:05] like many people do, and always forget to update the language picker. [06:10] On browsers that don't support the API,
[06:12] Simply nothing happens. [06:14] Thank you. [06:15] The code consists of about 40 lines of Vue code. [06:18] First, there's the API setup [06:20] and feature detection. [06:21] And next, [06:23] I do letter counting, or rather grapheme counting. So language detection only starts working when there are at least five graphemes. [06:30] A grapheme is the smallest functional unit [06:33] of a writing system. [06:34] The actual implementation then just hooks up the API to the Vue front-end code, [06:39] so the language selector in Elk automatically changes [06:42] when the detected language changes. [06:44] My initial implementation had one small, [06:48] yeah, flaw. [06:50] You can see on this slide where I hook up the detect language function to the keyup event. [06:56] The problem, of course, is the bubble that I live in. [07:00] More specifically, it's the laptop with the apple with a bite taken out of it [07:04] that I coded the extension or the API on. [07:08] Elk is a PWA that aims to be accessible to all sorts of hardware, [07:12] from the highest end, which my laptop is part of, to the lowest end, which means, [07:18] for example, that 11-year-old Windows laptop that my neighbor keeps sticking to. [07:25] Thank you. [07:26] The Elk community was quick to file a follow-up PR that fixed my initial flaw [07:33] by adding language detection with low-level [07:37] debouncing. [07:38] So [07:39] it would not always run on keyup, but be debounced.
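The detection flow described above (feature detection, a minimum grapheme count, and debouncing) can be sketched roughly as follows. This is not the code from the Elk PRs: the helper names and the 0.5 confidence cut-off are assumptions for illustration, and the five-grapheme minimum follows the spoken description. `LanguageDetector.create()`, `detect()`, and `Intl.Segmenter` are the platform APIs being relied on.

```javascript
// Illustrative sketch only; helper names and thresholds are assumptions,
// not taken from the Elk pull requests.
const MIN_GRAPHEMES = 5; // minimum mentioned in the talk; exact value assumed
let detectorPromise = null; // reuse one detector instead of creating one per keystroke

// Count user-perceived characters (graphemes), not UTF-16 code units.
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...segmenter.segment(text)].length;
}

// Resolves to a language tag such as "de", or null when the API is missing,
// the text is too short, or the top candidate is not confident enough.
async function detectLanguage(text, minConfidence = 0.5) {
  if (!('LanguageDetector' in globalThis)) return null; // feature detection
  if (countGraphemes(text) < MIN_GRAPHEMES) return null;
  detectorPromise ??= globalThis.LanguageDetector.create();
  const detector = await detectorPromise;
  // detect() resolves to candidates ordered from most to least likely.
  const [best] = await detector.detect(text);
  if (!best || best.confidence < minConfidence) return null;
  return best.detectedLanguage;
}

// Minimal trailing-edge debounce: fn runs once, waitMs after the last call.
function debounce(fn, waitMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

In a composer component, a handler wrapped as `debounce(handler, 300)` could call `detectLanguage` on the current text and update the language picker only when the result is not null. The 300 ms delay is an arbitrary example value.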
[07:42] Thank you. [07:43] Elk even has a setting to optimize the app for low-performance devices. [07:47] So when this preference is set, [07:49] language detection is turned off, [07:52] like really for low-end devices, and for all other situations, [07:56] the follow-up PR just wrapped my code in a debounce function. [08:02] One challenge down, one to go. [08:04] I tackle translation. [08:06] The Translator API is likewise a paved road. [08:09] And the road is so well paved that I can demonstrate the API in DevTools again. [08:14] I first create a translator by calling the create function, passing it the source language, [08:19] German, [08:20] and the target language, [08:22] English. [08:23] Here I'm translating from German to [08:25] English. [08:26] And next, all that remains is calling the translate function. [08:30] "Guten Morgen" in German means "good morning" in English. [08:34] So, here's my PR, in which I added support for the Translator API to Elk. [08:39] Elk doesn't use the original Mastodon software's translation stack, [08:43] but their own infrastructure, powered by an open source translation library called LibreTranslate. [08:49] While there's no API quota cost to pay, [08:51] there's still server maintenance cost, [08:53] just operation cost, [08:54] that they need to pay. [08:56] The implementation consists first of feature detection to add the Translator API as a progressive [09:03] enhancement. [09:03] The actual implementation then is relatively straightforward [09:07] and mostly just hooks up the Vue front-end code [09:09] to the translation logic. [09:11] An interesting detail is that I chose to add language detection before the translation step is run, because as you may remember,
[09:18] many toots are mislabeled using the wrong language. [09:22] So, [09:23] by using language detection, I catch those cases. [09:27] Both PRs were merged a while ago, [09:30] and while the last core release of Elk from January doesn't include the changes yet, [09:34] you can always switch to the "I'm feeling lucky" release [09:39] at main.elk.zone [09:41] that's always deployed straight from the main branch. [09:45] All right, with that, thank you very much [09:47] for-- oh, wait, wait, wait, wait, there's one more thing. [09:51] Thank you. [09:52] On the Chrome team, [09:53] we are also experimenting with the free-form Prompt API. [09:57] That's a paved road again [09:58] from an API point of view, [10:00] but the challenge is knowing where the road goes. [10:03] Purely for the fun of it, [10:05] I played with the Prompt API in the context of my weird hobby. [10:09] First, [10:09] I create a session with the language model, telling it about the expected inputs and outputs, [10:15] which are both of type text, [10:17] and the languages [10:18] are set to English. [10:20] I don't set the output to German, [10:22] because the language model, [10:23] while capable and knowledgeable of German, [10:26] isn't approved for German yet for security reasons. [10:29] Thank you. [10:30] But now with the session created, I can finally ask the model, [10:33] "What is the German word for when you wave back at someone [10:36] who wasn't waving at you?" And you can see that I used some prompt engineering here to nudge the model in the right direction: [10:44] "Respond with just one word. [10:46] It can be absurdly long and ridiculous."
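The session setup described here, with text input and output both declared as English, might look roughly like the sketch below. The helper names `buildPrompt` and `askForGermanWord` are hypothetical, and the prompt wording follows what is quoted in the talk. `LanguageModel.create()` with `expectedInputs` and `expectedOutputs`, `session.prompt()`, and `session.destroy()` are assumed to match the Prompt API as currently documented by Chrome, which is still experimental and may change.

```javascript
// Illustrative sketch only; helper names are hypothetical.

// Assemble the prompt from the situation plus the nudges quoted in the talk.
function buildPrompt(situation) {
  return [
    `What is the German word for ${situation}?`,
    'Respond with just one word.',
    'It can be absurdly long and ridiculous.',
    'No dashes.',
  ].join(' ');
}

// Resolves to the model's answer, or null when the Prompt API is unavailable.
async function askForGermanWord(situation) {
  if (!('LanguageModel' in globalThis)) return null; // feature detection
  const session = await globalThis.LanguageModel.create({
    // Both input and output are declared as English text, as in the talk.
    expectedInputs: [{ type: 'text', languages: ['en'] }],
    expectedOutputs: [{ type: 'text', languages: ['en'] }],
  });
  try {
    return await session.prompt(buildPrompt(situation));
  } finally {
    session.destroy(); // release the on-device model resources
  }
}
```

A call such as `askForGermanWord("when you wave back at someone who wasn't waving at you")` would then resolve to whatever single word the on-device model produces.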
[10:49] "No dashes." So the model was first adding dashes. The idea is to make fun of long German compound words. [10:58] Well, I'm biased, [10:59] but I'd say my response was better. [11:03] "over-messing hand-winkle" [11:06] doesn't quite pass for a [11:08] "What's the German word for" [11:10] joke. [11:11] Thank you. [11:12] And maybe that's the important message to take home here. [11:16] Let's totally use [11:18] artificial intelligence for what it's good at, [11:20] but also, [11:21] let's not forget to use [11:23] human intelligence for, [11:25] well, [11:25] the human touch. [11:28] I encourage you to learn more about built-in AI [11:30] on developer.chrome.com/docs/ai/built-in. [11:37] And be sure to also sign up [11:39] for our early preview program, so you hear [11:42] first when we release new APIs [11:44] or features. [11:46] And with that, now it's really time [11:49] to say thank you for listening. [11:51] My name is Thomas Steiner, and if you ever [11:53] need the German word for something, [11:55] be sure [11:57] to reach out. [11:58] Thank you very much.
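Looking back at the translation step from earlier in the talk, the idea of running language detection before translating, so that a mislabeled toot is still translated from the right source language, can be sketched as follows. This is not the code from the Elk PR: `resolveSourceLanguage`, `translateToot`, and the 0.5 confidence cut-off are assumptions for illustration. `Translator.create({ sourceLanguage, targetLanguage })` and `translate()` are the calls demonstrated in the talk.

```javascript
// Illustrative sketch only; helper names and the cut-off are assumptions.

// Prefer the detected language over the toot's label when detection is
// available and confident; otherwise trust the label.
async function resolveSourceLanguage(text, labeledLanguage, minConfidence = 0.5) {
  if (!('LanguageDetector' in globalThis)) return labeledLanguage;
  const detector = await globalThis.LanguageDetector.create();
  const [best] = await detector.detect(text);
  return best && best.confidence >= minConfidence
    ? best.detectedLanguage
    : labeledLanguage;
}

// Resolves to the translated text, or null when the Translator API is not
// available, so the caller can fall back to server-side translation.
async function translateToot(text, labeledLanguage, targetLanguage) {
  if (!('Translator' in globalThis)) return null; // progressive enhancement
  const sourceLanguage = await resolveSourceLanguage(text, labeledLanguage);
  if (sourceLanguage === targetLanguage) return text; // nothing to translate
  const translator = await globalThis.Translator.create({
    sourceLanguage,
    targetLanguage,
  });
  return translator.translate(text);
}
```

With this shape, a toot labeled English but written in German, passed as `translateToot('Guten Morgen', 'en', 'en')`, would be translated from German rather than skipped, provided detection is confident enough.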