Monday, January 4, 2016

A simple Telegram bot with Node.js


Recently I found a simple real-life task to allow me to experiment with Node.js (I've been peering at it and the surrounding full-stack JavaScript tools). Basically, I wrote a Telegram bot, which periodically posts links to Instagram photos with a specific hashtag.

This was a learning experience for me, so a) I can't claim this is the best approach, and b) I'd appreciate constructive feedback.

The problem

My friends from “Alpha and Omega” Christian Youth Centre started using Telegram messenger (it is nice, by the way, as messengers go, clean and tidy) for their conversations. Also, there is a hashtag on Instagram (#AlphaTopic) which is used by the same group of people for weekly picture contests.

So a friend suggested (oh well, not my idea) that, since Telegram boasts a simple Bot API, we could implement a mechanism to post hash-tagged Instagram photos into the Telegram chat. This would increase the visibility of the ongoing contests, and could serve as a conversation starter on quieter days.

Challenge accepted!

Solution approach

A Telegram bot is a bit of server-side code, which I need to host myself, and which communicates with the API (Application Programming Interface) via simple HTTP calls. In order to be active in a given chat, it needs to be added manually by the chat's members. Then the bot can respond to commands directed at it (if you implement this functionality) and post messages.

There are two ways to trigger a Telegram bot, as you can check in the Telegram Bot API docs.

One is via webhooks. These are URLs on the bot's side, which Telegram will call once the bot receives a command from the chat where it is registered. Not a suitable method for me, since I want the bot to periodically post Instagram images, regardless of whether it was asked directly or not.

The second method is for the bot to run by itself, for example on a Cron task, and perform whatever actions it needs. The complication is that we need to know which chats the bot is registered in. This can be achieved by asking the Telegram API, via a getUpdates call, whether there are any updates in the chats the bot has been added to. The reply tells you which chats were recently active, and then you can post whatever you need to them.
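
As a rough illustration of that step, here is a minimal sketch of pulling chat IDs out of a parsed getUpdates reply. It assumes the reply has the shape described in the Bot API docs (an `ok` flag and a `result` array of updates, each possibly carrying a `message` with a `chat`). The helper name `extractChatIds` is made up for this example and is not the function used in the repository.

```javascript
// Collect the unique chat IDs found in a parsed getUpdates reply.
// `extractChatIds` is a hypothetical helper, not the code from the repo.
function extractChatIds(response) {
  if (!response || !response.ok || !Array.isArray(response.result)) {
    return [];
  }
  const ids = new Set();
  for (const update of response.result) {
    // Only plain message updates are considered in this sketch.
    if (update.message && update.message.chat) {
      ids.add(update.message.chat.id);
    }
  }
  return Array.from(ids);
}

// Hand-written sample reply, used only to show the expected shape.
const sample = {
  ok: true,
  result: [
    { update_id: 1, message: { message_id: 10, chat: { id: -100, type: 'group' }, text: 'hi' } },
    { update_id: 2, message: { message_id: 11, chat: { id: -100, type: 'group' }, text: 'again' } },
    { update_id: 3, message: { message_id: 12, chat: { id: 42, type: 'private' }, text: '/start' } }
  ]
};

console.log(extractChatIds(sample)); // [ -100, 42 ]
```

Using a Set keeps each chat only once, even when several messages arrived from the same chat.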

Then it is just a matter of coding a request to Instagram to get media by a hashtag, check which ones haven't been posted yet, and post them to the chat. Easy.

Tools

I tried to use free tools and services, and generally minimize the bot's footprint.

For hosting I chose Heroku: it supports Node.js (and lots of other addons, if needed later), and allows running a simple free web-server if it is active less than 6 hours per day. This is another reason I don't want my bot to be triggered by someone's commands—it runs when it needs to, and sleeps otherwise.

Heroku has several Cron addons, yet I don't want to overload it for now, so I employed a separate free service, SetCronJob, which allows 50 executions per day. Way more than enough, and for several months it has served me reliably.

On the Node.js side I only use built-in modules: HTTPS for communication, QueryString to deal with POST parameters, and Process to get environment variables for server-specific configuration.
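
To give an idea of how far the built-in modules go, here is a minimal sketch of a form-encoded POST to Telegram's sendMessage method using only `https` and `querystring`. The helper names `buildSendMessageRequest` and `sendMessage`, and the `TELEGRAM_TOKEN` variable in the usage comment, are assumptions for illustration; the repository's own telegramPost() may look different.

```javascript
const https = require('https');
const querystring = require('querystring');

// Build the request options and form body for a sendMessage POST.
// Hypothetical helper, kept separate so it can be checked without a network call.
function buildSendMessageRequest(token, chatId, text) {
  const body = querystring.stringify({ chat_id: chatId, text: text });
  const options = {
    hostname: 'api.telegram.org',
    path: '/bot' + token + '/sendMessage',
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Content-Length': Buffer.byteLength(body)
    }
  };
  return { options: options, body: body };
}

// Send the message and hand the parsed JSON reply to callback(error, reply).
function sendMessage(token, chatId, text, callback) {
  const req = buildSendMessageRequest(token, chatId, text);
  const request = https.request(req.options, function (response) {
    let data = '';
    response.setEncoding('utf8');
    response.on('data', function (chunk) { data += chunk; });
    response.on('end', function () {
      let parsed;
      try {
        parsed = JSON.parse(data);
      } catch (error) {
        return callback(error);
      }
      callback(null, parsed);
    });
  });
  request.on('error', callback);
  request.end(req.body);
}

// Usage (not run here; TELEGRAM_TOKEN is an assumed variable name):
// sendMessage(process.env.TELEGRAM_TOKEN, 42, 'Hello!', function (err, reply) {
//   console.log(err || reply.ok);
// });
```

Splitting the request building from the sending keeps the network part tiny and makes the encoding easy to inspect.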

It turned out possible to make the bot stateless, so no DB-like storage is required.

Source code

You can check the code on GitHub: https://github.com/npc/ant-lab

I may use it for other toolbelt-type tasks, hence the naming. For now, if you check index.js, you'll see it performs a single meaningful task: if called on a certain URL (configured as “bot_invocation_path” in config.js), it executes the method telegram.runInstagramBot() from lib/telegram.js:
app.get('/' + config.bot_invocation_path, function(request, response) {
  console.log('\n\n--=== Telegram Instagram Bot triggered at ' 
              + new Date().toISOString() + ' ===--');
  telegram.runInstagramBot();
  response.render('pages/trigger'); // TODO: Render something useful
});

The response tells the Cron service that everything is OK (200); otherwise Cron will notify me. It doesn't do anything else for now (as you can tell from my TODO comment).

Moving to lib/telegram.js, you'll see that runInstagramBot() calls telegramGetUpdatedChats() to get the updated chats; it returns an Array of chat IDs in the callback.

Then, if the chat count is above zero, we retrieve Instagram media via the instagramGet() function, keep only the recent media (posted less than config.instagram.max_time_difference milliseconds ago), and post up to process.env.INSTAGRAM_MAX_MEDIA_COUNT (default 5) of these images as links to each Telegram chat by calling our telegramPost() function.
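
The filtering step can be sketched roughly as follows. This assumes each media item carries a `created_time` field (a Unix timestamp in seconds, as a string) and a `link` field; the helper name `selectRecentLinks` and the example.com links are invented for the example and do not come from the repository.

```javascript
// Keep media newer than maxAgeMs and return at most maxCount links.
// `selectRecentLinks` is a hypothetical helper; field names are assumptions.
function selectRecentLinks(media, nowMs, maxAgeMs, maxCount) {
  return media
    .filter(function (item) {
      // created_time is assumed to be a Unix timestamp in seconds.
      const createdMs = Number(item.created_time) * 1000;
      return nowMs - createdMs < maxAgeMs;
    })
    .slice(0, maxCount)
    .map(function (item) { return item.link; });
}

const now = Date.UTC(2016, 0, 4, 12, 0, 0); // fixed "now" so the example is repeatable
const twelveHours = 12 * 60 * 60 * 1000;
const media = [
  { created_time: String(now / 1000 - 3600), link: 'https://example.com/p/1' },      // 1 hour old
  { created_time: String(now / 1000 - 13 * 3600), link: 'https://example.com/p/2' }  // 13 hours old
];

console.log(selectRecentLinks(media, now, twelveHours, 5)); // [ 'https://example.com/p/1' ]
```

Passing the current time in as an argument, rather than calling Date.now() inside, makes the function easy to check with fixed data.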

If no suitable media was found, the bot posts a random quip to the chat using the getNotFoundMessage() function (currently in Russian; I guess it would be nice to move these to config.js as well). The main point of these is for me to see that the bot is doing something even if there is nothing to post.

All interactions with Telegram and Instagram are done via GET and POST calls to their open APIs; you can see which ones by following the code. Ping me if you think more explanations are in order.
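
Picking a random fallback message needs very little code. The sketch below is only an illustration: the message texts and the name `pickNotFoundMessage` are placeholders, not the quips or the function from lib/telegram.js.

```javascript
// Placeholder messages for the example; the real quips live in the repo.
const NOT_FOUND_MESSAGES = [
  'No new photos this time.',
  'Nothing to show yet, check back later.',
  'Quiet day on the hashtag.'
];

// Pick one message. `random` is optional and injectable so the choice can be checked.
function pickNotFoundMessage(random) {
  const roll = (random || Math.random)();
  return NOT_FOUND_MESSAGES[Math.floor(roll * NOT_FOUND_MESSAGES.length)];
}

console.log(pickNotFoundMessage());
```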

You'll note that there are debug mode statements, which ignore the list of Telegram chats, and make all posting to a specified chat ID—for testing I started a private chat with the bot (yes, it is that kind of bleak future where I talk to bots privately), and configured this chat ID for debug mode:
if (config.bot_debug_mode) {
  chatIDs = [ config.telegram.debug_chat ]; // Private chat for testing
}

And that's mostly it!

Configuration

For my purposes I decided that the bot should run twice a day, to post updates in the mornings and evenings. So I've set the Cron to run it every 12 hours, and set config.instagram.max_time_difference to 12*60*60*1000 ms. Thus the bot only posts images which were posted since it was last run.

All other meaningful configuration, including the URL that gets called (to sidestep potential DoS attacks, since that would simply stop my free Heroku host from working) is set in config.js.

You'll get a Telegram API token when creating your bot on their site (very easy, just talk to the BotFather), and I got my Instagram API token by looking it up in their API console.

I found it very convenient to use process.env.* (Process module from Node.js) to retrieve configuration from the environment. This allows me to share my source code, without risk to sensitive data, such as Telegram and Instagram API tokens, as these values are set on the server side (this is how Heroku does it).
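
A config.js along these lines could look roughly like the sketch below. Only INSTAGRAM_MAX_MEDIA_COUNT and the config key names come from this post; the other environment variable names (BOT_INVOCATION_PATH, BOT_DEBUG_MODE, TELEGRAM_TOKEN, INSTAGRAM_TOKEN), the 'trigger' fallback, and the `loadConfig` function are assumptions made up for the example.

```javascript
// Read settings from an environment object, falling back to defaults.
// Hypothetical sketch; variable names other than INSTAGRAM_MAX_MEDIA_COUNT are assumed.
function loadConfig(env) {
  return {
    bot_invocation_path: env.BOT_INVOCATION_PATH || 'trigger',
    bot_debug_mode: env.BOT_DEBUG_MODE === 'true',
    telegram: {
      token: env.TELEGRAM_TOKEN,
      debug_chat: env.TELEGRAM_DEBUG_CHAT
    },
    instagram: {
      token: env.INSTAGRAM_TOKEN,
      max_media_count: parseInt(env.INSTAGRAM_MAX_MEDIA_COUNT, 10) || 5,
      max_time_difference: 12 * 60 * 60 * 1000 // 12 hours in milliseconds
    }
  };
}

const config = loadConfig(process.env);
console.log(config.instagram.max_time_difference); // 43200000
```

Environment variables are always strings, hence the parseInt() and the explicit comparison with 'true'.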

Stateless Caveat

One last note is that for Telegram's getUpdates call to return anything, there should be recent messages in each chat. Otherwise the bot won't know the chat exists. This is a side effect of the bot's statelessness.

This is not ideal. First of all, it means the bot should be allowed to see chat messages (see Privacy mode), or it will not get any updates at all. Second, if there are no active discussions, Telegram won't report anything older than 3 days, so the bot will quietly “forget” about the chat. For my purposes this is perfectly fine, yet if you develop your own bot you may need to address this. This will probably require some storage bot-side, to remember which chats it was ever added to, and watching the response to Telegram's sendMessage call, to see whether all those chats are still valid.
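
For what it's worth, the bookkeeping for such a storage could be sketched like this. Everything here is hypothetical: the registry lives in memory only (a real bot would persist it to a file or a database), and treating an `error_code` of 403 in the sendMessage reply as "the bot is no longer welcome in this chat" is an assumption you should verify against the Bot API docs.

```javascript
// Sketch of bot-side chat memory. All names here are hypothetical.
function createChatRegistry(initialIds) {
  const known = new Set(initialIds || []);
  return {
    // Remember every chat seen in a getUpdates result.
    remember: function (chatIds) {
      chatIds.forEach(function (id) { known.add(id); });
    },
    // Inspect a sendMessage reply and forget chats that reject the bot.
    handleSendResult: function (chatId, reply) {
      // Assumption: 403 means the bot was removed from the chat or blocked.
      if (reply && reply.ok === false && reply.error_code === 403) {
        known.delete(chatId);
      }
    },
    // Return the chats the bot currently knows about.
    list: function () { return Array.from(known); }
  };
}

const registry = createChatRegistry([-100]);
registry.remember([-100, 42]);
registry.handleSendResult(42, { ok: false, error_code: 403, description: 'Forbidden' });
console.log(registry.list()); // [ -100 ]
```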

Conclusion

This was a great exercise for me: the actual coding was fairly simple, yet I learnt a lot about setting up the tools and configuring their interaction. Plus, there is nothing like the pleasure of seeing your little bot, your baby, chirping in the chat twice a day, sharing links to photos that your friends posted :)

Let me know if you have any questions, or if you notice any glaring mistakes in my source code.
