
Simon Walkowiak

A few words about me:

I’m an entrepreneur, data scientist and programmer of interactive data visualisations with a background in cognitive neuroscience. I provide Big Data and predictive analytics services as well as statistical computing training to a wide range of clients, from government agencies and public research institutes to FTSE companies.

I run Mind Project Ltd – a data analytics and visualisations start-up with offices in Colchester and London (Hoxton area). Recently I also worked as a Data Curator at the UK Data Service / UK Data Archive (University of Essex) – a leading social and economic data preservation and dissemination centre in Europe.

My skills: 

A selection of Big Data, analytics and visualisation tools I use.
  • Expert knowledge of the R language – including parallel computing, multivariate analysis, data crunching and transformations using core R and third-party packages, machine learning algorithms, and static and interactive data visualisations (e.g. ggplot2, ggvis, Google Charts and Shiny dashboards),
  • Proficient in Python for data analysis including SciPy, NumPy, matplotlib and pandas libraries,
  • SQL and NoSQL queries – e.g. the MongoDB aggregation pipeline,
  • Strong knowledge and experience of the Hadoop ecosystem (HBase, Cassandra, Hive, Pig etc.), the MapReduce framework and HDFS,
  • A strong track record in the deployment and integration of numerous Big Data tools (R + Python + Hadoop + Spark + MongoDB) on large clusters of commodity servers, e.g. through Amazon Web Services, Microsoft Azure or Google Cloud Platform,
  • (Near) real-time data processing and analysis (e.g. marketing/retail data, financial data, social media sentiment analysis) using MongoDB, Storm and the Lambda architecture,
  • Interactive data visualisations and data products: Shiny, D3.js and other JavaScript interactive graphics libraries, JSON/BSON/HTTP APIs,
  • Proficiency in data extraction and data mining techniques,
  • Strong experience in Tableau, SPSS, Stata, Microsoft Excel and Access.

A selection of organisations I’ve worked with/for and consulted at so far:

[Image: clients]

About this blog:

Whenever possible I will update this blog with:

  • short articles on my personal and research interests,
  • reviews of data science books of interest,
  • how-to scripts in R, Python, SQL and other languages,
  • data visualisations and infographics,
  • data analyses and research reports which, for various reasons, may not meet the requirements for peer-reviewed publication.