Asking questions of past me has been fun, but what I wanted to do next was see what future me might conceivably tweet, so a little machine learning fun was my next stop.
This post, like my first post in this series on having fun with my Twitter feed, is about showing you how easy it is to use GCP to have fun with your tweets in the simplest way possible (OK, the laziest way!). So I won’t be spending much time talking through machine learning concepts, but if you’re interested (it’s fascinating) you really can’t go far wrong by starting with Cassie’s (@quaesita) Medium posts and watching some of her talks on YouTube. She really does make it simple to comprehend! In my case I am using the microwave, not learning how to build it! (This is my favourite analogy from Cassie.)
Anyway, on with my adventure.
I knew I wanted to use GCP, and I also knew I wanted to keep it as simple as possible.
Basically I wanted to do the following:
Input my tweets -> train a model -> model generates new tweets.
So I started from https://aihub.cloud.google.com/, which is a central hub for all things AI on GCP. It’s a repository of plug-and-play AI components, including end-to-end AI pipelines and out-of-the-box algorithms. Read the intro here for a far more comprehensive introduction than I have space for.
I entered “text generation” in the AI Hub search box as my starting point, which returned a number of choices (16 when I did this). The char RNN notebook looked the most likely to achieve what I wanted.
Char RNN is a neural net, specifically a character-level recurrent neural net (RNN), and this one is implemented in TensorFlow.
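To give a feel for what "character-level" means without opening up the microwave, here is a minimal sketch of how training text is typically prepared for this kind of model. It is not the notebook's code: the function names `build_vocab` and `make_pairs` and the `seq_len` parameter are made up for illustration, and no TensorFlow is involved.

```python
# Illustrative only: how a character-level model typically sees its training text.
# build_vocab, make_pairs and seq_len are hypothetical names, not the notebook's.

def build_vocab(text):
    """Map each distinct character to an integer id (sorted for stable ids)."""
    chars = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(chars)}
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char


def make_pairs(text, char_to_id, seq_len):
    """Slice the text into (input, target) id sequences.

    The target is the input shifted one character to the right, so at every
    position the model is asked to predict the next character.
    """
    ids = [char_to_id[ch] for ch in text]
    pairs = []
    for start in range(0, len(ids) - seq_len, seq_len):
        inputs = ids[start:start + seq_len]
        targets = ids[start + 1:start + seq_len + 1]
        pairs.append((inputs, targets))
    return pairs


sample = "hello tweets"
char_to_id, id_to_char = build_vocab(sample)
pairs = make_pairs(sample, char_to_id, seq_len=4)
first_in, first_out = pairs[0]
print("".join(id_to_char[i] for i in first_in))   # the input window
print("".join(id_to_char[i] for i in first_out))  # the same window shifted by one
```

The model only ever learns "given these characters, which character comes next?", which goes some way to explaining why its early output looks like words without quite being words.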
The first thing I needed was some training data. If you read the previous post, you know I have loaded my Twitter archive into a BigQuery dataset. So I had that, but looking at the route I was going down, I needed to get my tweets extracted into a text file I could upload to the VM the Colab notebook was running on.
From the BigQuery console I ran the following query:
SELECT full_text FROM `myproject.mytwtitterdata.gracetweets`
and saved the results as a CSV.
Saving the results as CSV only allows you to export 16,000 rows, which I felt was more than enough for my purposes. If you want to use the full dataset, you would have to save the results to another table and export that.
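If you would rather hand the notebook plain text with one tweet per line instead of the raw CSV, something like the sketch below would do it. This is an assumption about what you might want, not a step from the walkthrough; `csv_to_text` is a made-up helper, and the only thing taken from the query above is the `full_text` column name.

```python
import csv
import io


def csv_to_text(csv_file, text_file, column="full_text"):
    """Copy one column of a CSV export into a plain-text stream, one row per line."""
    reader = csv.DictReader(csv_file)
    count = 0
    for row in reader:
        # Tweets can contain line breaks; flatten them so each tweet stays on one line.
        tweet = " ".join(row[column].split())
        if tweet:
            text_file.write(tweet + "\n")
            count += 1
    return count


# Tiny in-memory stand-in for the exported file, so the sketch runs on its own.
fake_export = io.StringIO('full_text\n"first tweet"\n"second\ntweet, with a comma"\n')
out = io.StringIO()
written = csv_to_text(fake_export, out)
print(written)
print(out.getvalue())
```

With real files you would pass `open("exported-tweets.csv", newline="", encoding="utf-8")` and an output file opened for writing in place of the two `StringIO` objects.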
I read what AI Hub told me about my chosen notebook
and then I opened the Colab notebook. If you’re not sure what a notebook is when used in ML workflows, read this great Colaboratory getting started page.
As with any notebook, you run the cells in order. In this case I uploaded my own data when given the option, so I got 👇🏽
“exported-tweets.csv(text/csv) - 1742950 bytes, last modified: 25/08/2019 - 100% done”
But I had problems running the notebook with my data, probably because I had emojis in some of my tweets, and got this error:
'ascii' codec can't encode character u'\U0001f9d0' in position 1017: ordinal not in range(128)
So I figured I needed to convert the file from UTF-8 to a simpler encoding (I went for ISO-8859-1 rather than strict ASCII), so I ran this against my source file:
iconv -f utf8 -t ISO-8859-1 exported-tweets.csv > sep-8-tweets.csv
I tried running the notebook again and it worked this time!
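A different route to a similar end, sketched in Python rather than with iconv: if you are happy to simply throw away anything that is not plain ASCII (emojis, accented letters, curly quotes), the built-in `str.encode` error handler can do it. `to_ascii` is a made-up helper name and the sample tweet is invented; note that, unlike the ISO-8859-1 conversion, this drops accented characters too.

```python
def to_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


# Invented sample containing the same emoji that tripped up the notebook.
tweet = "Hmm \U0001f9d0 caf\u00e9 time"

try:
    tweet.encode("ascii")  # strict mode: the same failure as the error above
except UnicodeEncodeError as err:
    print("strict encoding fails:", err.reason)

cleaned = to_ascii(tweet)
print(cleaned)
```

Dropping characters is lossy, so it suits a just-for-fun experiment better than anything where the emojis matter.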
The first set of tweets it produced were interestingly odd, to say the least. I reckon it needed more time to learn.
Here’s a snippet of its early efforts:
"@annnwallace @miolisolu @TLobinger @NanTelarter better think I am reading the #googlecloud :-( "https://t.co/ySnjTMYZ1o "RT @grapesfrog Airline with my travelled like quoting out to care which from it is it all why! Woatle close for what I am his fun pile to be forward this is this time that! http://t.co/mGrRDVDvVF and ?? @ythos I have so not #openingcerr #googlecloud Platfack & annocesses worth #aws http://t.co/a31uzp6EWu @pelarcamber thanks like the names book outsing for team fights Jenson I guess! I can't be it will need to feel I actually think I didn't got that out to do this touth at question Day shels & life the just happen? @mndoci Brain @GoogleNext19 I did the nister with my catch the cutel as a socks for me the faces is a time to! him indied chence will a whore we doodle discovery. It was in experience exanced but the read you will be & folks ! I far person suppised to geek is very single extenning in compation friend to start although around @grapesfrog ?? Opting for a gior @A_chyfigici just pressing ! @grapesfrog @lusis: Yes 2 greeting actual will be the good trange .I. Annoying rather of but The chocolate? ""when is a Jas Katashor 4 it's been the scanney. Head and the Luking till I love the finces to start as I know someone half has next week with more though! @n0rm it at one .. so I will be missen in ""Sad @SmartSOS was all apparently meeting and find one is some time "
The generated web links that lead nowhere... the odd turn of phrase which isn’t really my odd turn of phrase. 🤔
I left it training for a bit longer and it was still tweeting strangely:
"Senter in same favourite ""no mathematical pants to end of best 2 box"" http://t.co/g6fB4aKI21" "RT @theurotic: :-)" @darkflib Have you know they prefer to do I came one of these :-) Vote maybe sure that you've put for anyone taken benefit. :-) @mndoci also maybe he was trying to getting than a song!" I can't get this though "https://t.co/xFQzBZFcmq by @Weemanderlid https://t.co/K07MXCHF6y (""I would create you for the day of train but that expecting "" we has a good experience "" but the mire sale singing I have chilip than out of your pointless better it's a whole coffee & grown up with our blog post when that is pretty funny !" @lasombra_br I have no idea she's membering but all thinks you should jump a very cool "" Certain problems"" :-) The more safely good fair I have some time @chibichibibr @petermark Too sure her colleague & sex mates are in normal doc . 2 summer-downtead connect my world The Audio 3 engosters managed to leave this because I am still being the apps I even even not notice the use of my youtube :-( I was a nice photo shop ! On Me teams on #AWS SPE - and The BigQuery Single http://t.co/aZwS8q7mil via @jonamounny https://t.co/ezu92a7k1 "Me "" It work "" No idea what sort of great doughnut"" https://t.co/FH8NjWrDu0" RT @petermark: @grapesfrog Yeah we actually feel David Subural running on what despite for time but I guess the country socks and explaining them already ? Had a colleague and a fav evil Cerepting Beanst-fore summary days terrestrial by normal bod 2011 - a photos http://t.co/YMCJS6xyMC" "RT @accaPcomputry: Terraform we also better thing they are underloomfully must apparently there was the next week to be asked to using JFXT fert with ""Oh Cerf to chilly Register"" Him"" Can't be bothered to ask it was a
This was all for fun, and I know I should have done the encoding in BigQuery and exported to GCS, but this was done in an idle few hours on a Sunday afternoon, so I just used the most direct way to get to the fun bit.
I left it training for a longer period and, scarily, it began to produce tweets that were surreal but almost made sense in a disturbing kind of way. I saved some of them as a gist here.
So it seems future me will apparently be tweeting very strangely, or maybe that’s how we will all tweet in the future 🤷🏽‍♀️