Automating 350+ WhatsApp Messages Using Playwright.

Exploring automation with Playwright and Google AI Studio, where a seemingly simple task turned into a fun challenge.

Saroj Regmi - Tue, Mar 25, 2025

Background, what and why?

One day I received a request from a friend asking me to help her send 350+ WhatsApp messages individually, for an upcoming student election promotion. Without a second thought I said yes and started working on this seemingly simple task. This is where a journey full of excitement and frustration began. It had to be done within two days, from Friday evening until midnight on Saturday, since all promotion after Saturday was prohibited.

I always wanted to have a personal blog but never had time to code one. This time, when I wanted to share my experience, I got slapped with a “you exceeded the max character limit” message on LinkedIn. After searching for solutions, thinking about getting Premium (the trial, that is), and talking to one of my friends, I decided to invest some time, set up a blog, and properly document this experience.

I have open sourced the code for both this blog and the project. If you are curious, you can check them out here: Project, Blog.

What did I use?

After understanding the problem, I began thinking about the possible solutions that would:

  1. Work flawlessly, since I had to send exactly one message to each phone number, whose owner may or may not be connected to my friend.
  2. Work automatically, with minimal to no human intervention.
  3. Be doable within ~1.25 days (an evening and a day).

With all these requirements in mind, I remembered a tool that I was familiar with, though it was not made specifically for this task. In theory it could be used for automating tasks, but it was designed as a testing tool for the web, for integration and E2E testing.

You may have guessed it by now. It’s none other than Playwright.

Looking at it from a different angle, it had a slight advantage over other tools: its codegen tool is pretty amazing. Besides, I wanted to try out the tool itself (talk about having a good option).

With that decided, I quickly began setting up Playwright and everything needed to launch its codegen tool. This is where my luck ran out. All the time the codegen tool would have saved me was eaten up by getting that same tool to run.

The issue with the tool I chose.

Good place to start, right? As programmers, it’s always the case. Regardless, here is what happened. I got greeted with a friendly warning saying:

BEWARE: your OS is not officially supported by Playwright; installing dependencies for ubuntu20.04-x64 as a fallback.

What a great time to say that I use Arch, by the way.

It was just a warning, or so I thought, until I got hit with a missing dependency error. I checked the AUR for those dependencies and found some, but others were hard to find, and the ones I did find were broken or required manual intervention. So, after trying to find and fix them for a while, I gave up.

But something within me was not ready to accept defeat. So I touched some grass, stretched my fingers, typed yay -Syu, and patiently waited for half an hour or so (yeah, I actually kind of did it).

After that I tried again, following every instruction I could find on this topic. First I found a Reddit thread, which led me to a couple of GitHub issues. [1] [2] At last I got it partially working (I got the Chrome instance working and skipped the other browsers). Besides, I was not trying to test the code I was about to write. The test was the code that was going to do all the work.



Finally started to code.

After all this hassle, I was finally starting to code. It was getting late, so I called it a day. The next day I went straight into coding. I had to complete it that day or else there was no point in doing it. Why, you may ask? Just go back and read the background more carefully.

While this part was not complicated at all, I had fun doing it. Thinking I might need it in the future, I tried to make it as modular and generic as I could think of at the time. The code turned out OK and was working fine. After informing my friend about it, I asked her to send the phone numbers and the message that was to be automated. But man, was I unprepared.
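To give a rough idea of what “modular and generic” can mean here, below is a minimal sketch of two small helpers: one that cleans up a phone number and one that builds a WhatsApp click-to-chat link with the message pre-filled. This is not the project’s actual code. The function names, the countryCode and localLength parameters, and the choice of a wa.me link as the way to open a chat are all assumptions made for illustration.

```typescript
// Hypothetical helpers; names and parameters are illustrative, not from the project.

// Strip formatting and return the number in full international form (digits only),
// or null when the digit count does not match what we expect.
function normalizePhone(raw: string, countryCode: string, localLength: number): string | null {
  const digits = raw.replace(/\D/g, "");
  if (digits.length === localLength) {
    return countryCode + digits; // local number: prepend the country code
  }
  if (digits.length === countryCode.length + localLength && digits.startsWith(countryCode)) {
    return digits; // already in international form
  }
  return null; // too short, too long, or wrong prefix
}

// Build a click-to-chat link with the message pre-filled.
function buildChatLink(phone: string, message: string): string {
  return `https://wa.me/${phone}?text=${encodeURIComponent(message)}`;
}
```

Keeping these as pure functions means they can be checked without opening a browser, and the automation itself only has to loop over the links and press send.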

The twist.

The numbers were on a printed sheet of paper, which had been photographed with a mobile phone and converted into a PDF (simply put, the “PDF” was just a collection of pictures).

At first I thought, let’s just use Google Lens to copy the mobile numbers, but it turned out to be pretty unreliable. Some numbers were recognized fine; others were only half recognized. In the end I could not use it due to reliability issues.

So I began searching for a way to do this without potentially exposing the personal details of 350+ people. I tried someone else’s OCR script and came to the same conclusion: it was half good and half bad.

There was no way I could write my own OCR script, since I had only that day to complete the task. After searching for a couple of hours and testing different solutions, I came across a tool that amazed me. If you have been doing OCR recently, you may have heard of it, since it is quite popular and good at what it does.

And the turn.

It’s none other than Unstract. I tried setting it up locally but was not able to make it work, and since it is open source and had some good metrics, I tried the cloud version. Man, was I impressed. It’s a good idea, isn’t it?

“Taking your documents and performing OCR and then giving the extracted data to an LLM to further process it”.

I signed up for the free trial of their cloud version, went to their Prompt Studio, created a prompt by attaching a document, and wrote:

Can you extract the name and phone numbers from this document?
In the below mentioned format:
name:number

It took around 30-40 seconds and the results looked just perfect. I manually checked some of the results at random, and they were correct. Not only that, it also told me the token count and how much it would cost to run.
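Output in the name:number format is easy to check mechanically before trusting it. Here is a minimal sketch of a parser that keeps well-formed lines and sets aside anything suspicious for a manual look. The parseContacts name, the Contact shape, and the 7 to 15 digit rule are my assumptions for illustration, not something the tool provides.

```typescript
// Hypothetical parser for "name:number" lines; the validation rule is an assumption.
interface Contact {
  name: string;
  phone: string;
}

function parseContacts(text: string): { contacts: Contact[]; rejected: string[] } {
  const contacts: Contact[] = [];
  const rejected: string[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    // Split on the last colon so a colon inside a name does not break parsing.
    const idx = line.lastIndexOf(":");
    const name = idx === -1 ? "" : line.slice(0, idx).trim();
    // Drop spaces and dashes only; any other stray character makes the line invalid.
    const phone = idx === -1 ? "" : line.slice(idx + 1).replace(/[\s-]/g, "");
    if (name !== "" && /^\+?\d{7,15}$/.test(phone)) {
      contacts.push({ name, phone });
    } else {
      rejected.push(line); // keep for manual review against the paper sheet
    }
  }
  return { contacts, rejected };
}
```

Anything that lands in rejected can be compared by hand with the original sheet, which is far less work than checking every line.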

“No, this is not an ad for Unstract. It was my first time seeing the tool and it really impressed me.”

Nothing’s perfect.

But as I scrolled past the 150th result, the results ended there, and the last few numbers were wrong (classic LLM behavior, making up fake facts and presenting them proudly).

Luckily for me, I had the ability to tweak the prompt. I copied all the correct data and then told it to generate only the missing entries. It was giving correct data, but for some reason the last page’s data was not being parsed correctly.

It made me question the reliability of the entire data set.

At that point I suddenly remembered the Google AI Competition. Why, you may ask? Because it introduced me to Google AI Studio, and I remembered it supporting images as well as documents while generating good responses.

So I gave it a shot. It generated the data perfectly, and it also had a structured data option. Unlike Unstract, it completely ignored my requested structure and returned all of the table’s columns as JSON. It was a valid JSON response, so that was fine for me. Sadly enough, it too got stuck at the 150th result. I was starting to get frustrated and bored, so I told it to continue.

Funnily enough, it did, and then got stuck at around the 250th result. I told it to keep going, and it did just that. It feels good when LLMs do what you want them to do.
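Since the data arrived in several pieces, it is worth stitching them together carefully, because a model asked to continue may repeat a few rows. A minimal sketch, assuming each response is a JSON array of objects with name and phone fields (those field names and the mergeChunks helper are hypothetical; the real response had whatever columns the table had):

```typescript
// Hypothetical merge step; the Row field names are assumed for illustration.
interface Row {
  name: string;
  phone: string;
}

function mergeChunks(jsonChunks: string[]): Row[] {
  const seen = new Set<string>();
  const merged: Row[] = [];
  for (const chunk of jsonChunks) {
    const rows = JSON.parse(chunk) as Row[]; // throws if a chunk is not valid JSON
    for (const row of rows) {
      if (seen.has(row.phone)) continue; // skip rows repeated across chunks
      seen.add(row.phone);
      merged.push(row);
    }
  }
  return merged;
}
```

Comparing the length of the merged list with the number of rows on the paper sheet is a cheap way to notice when something was dropped along the way.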

It was a funny and interesting experience which taught me a thing or two and helped me discover some new tools.

After collecting the data, I put it in a file, made some changes to the code, and called my friend to start sending messages. (I did not want to send them from my personal phone number, so I told her to scan the QR code and link her WhatsApp with my computer, which was about to run the Playwright test.)

It worked flawlessly. Within around 10 minutes, the messages were sent. Responses started coming in, thanking her for sending a personal message. I was happy to see that it was working as expected.

She told another friend, who was also participating in the elections, to scan the QR code, and as soon as they did, the messages started going out. The other friend’s reaction was unexpectedly shocked, as if they had just seen a magic trick.

Conclusion.

Regardless, it was a fun experience. I would not have set up a blog if it weren’t for the urge to share what it felt like. I mean, automating some WhatsApp messages is not that big of a deal.

But as developers, we often overlook the fact that what we do on a daily basis may be beyond some people’s imagination and their list of possible things, often making the things we think are simple seem magical to others.

Thank you for making it this far. It means a lot. For those who made it this far, I am planning something.