r-tastic

Weird and wonderful exploration of data using R

End of Year thoughts

Sometimes it’s worth making New Year resolutions… A year ago I made one for 2017: to start an R blog using RMarkdown and Jekyll static sites. At the time, I didn’t even know git that well, had no clue what static sites were and was mostly oblivious to the rich and vibrant R community on Twitter. Fast-forward one year and… the picture couldn’t be any more different! I’d like to share my thoughts on writing this blog (and data science blogs in general) and what it taught me about getting stuff done.

Star Wars Vs Star Trek Word Battle

It goes without saying that lots of people are super excited about the premiere of another Star Wars movie, and I’m no exception. This, together with Piotr Migdal’s challenge posted in the Data Science PL group on Facebook, where he suggested comparing word frequencies between two different sources, inspired this post. It didn’t take me long to decide which sources to choose! So in this short and sweet blog post I’m comparing word frequencies between two movie scripts: “Star Wars: A New Hope” (1977) and “Star Trek: The Motion Picture” (1979).

Automated and Unmysterious Machine Learning in Cancer Detection

I get bored doing two things: i) spot-checking and optimising the parameters of my predictive models and ii) reading about how much of a ‘black box’ machine learning (particularly deep learning) models are and how little we can do to better understand how they learn (or fail to learn, for example when they mistake a panda bear for a vulture!). In this post I’ll test a) H2O’s function h2o.automl(), which may help me automate the former, and b) Thomas Lin Pedersen’s library(lime), which may help clarify the latter.

Friendships among top R-twitterers

Have you ever wondered whether the most active/popular R-twitterers are virtual friends? :) And by friends here I simply mean mutual followers on Twitter. In this post, I score and pick the top 30 #rstats Twitter users and analyse their network of Twitter friends. You’ll see a lot of applications of the rtweet and ggraph packages, as well as a very useful twist using the purrr package, so let’s begin! BEFORE I START: OFF - TOPIC ON PERFECTIONISM After weeks and months (!

Animated Plots As Part Of Exploratory Data Analysis

The internet seems to be booming with blog posts on animated graphs, whether for serious purposes or not so much. I used to think of them as nothing more than a gimmick or a cool way of spicing up a conference talk. However, I’m a total convert now, and in this post I want to show the real value that such graphs can add to your (absolutely serious!) exploratory analysis.