Welcome to my blog! Here I publish tutorials related to data science, which also serve as convenient cheatsheets and references for myself. I learn best when I organize and summarize material for presentation, so this blog doubles as a learning vehicle for me.
This list contains "mother" posts for my larger undertakings, each spanning multiple blog posts.
All posts, newest first
Date | Title | Category | Tags |
---|---|---|---|
10 October 2020 | Starting a new Python Package | Programming | Python |
27 August 2019 | Interpretable Machine Learning / Explainable Artificial Intelligence | DataScience | CaseStudy, MachineLearning |
22 July 2018 | An auto-scaling Shiny Server on AWS | R | AWS, Shiny |
21 July 2018 | Restoring a Wordpress site from a manual backup | Misc | Wordpress, AWS |
13 July 2018 | A Data Science Case Study With R and mlr | R | DataScience, CaseStudy |
04 April 2018 | An LSTM-based Startup Name Generator | Python | DeepLearning, NLP |
22 March 2018 | How to set up a Jupyter Notebook server for Deep Learning on AWS | DataScience | MachineLearning, Python, DeepLearning, AWS |
06 December 2017 | Personal Extreme Programming | Programming | Agile |
15 November 2017 | Find and delete unused R functions | R | codestyle |
09 November 2017 | My git cheatsheet | Programming | Git |
08 November 2017 | Why become an Open Source Developer? | Programming | Python, OpenSource |
17 October 2017 | What is Data Science? | DataScience | |
08 October 2017 | The differences when using Spark with Scala | CCA175 | Spark, Scala |
07 October 2017 | Spark SQL with Python | CCA175 | Spark, SQL |
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | CCA175 | Python, Spark |
07 September 2017 | Reading and writing data with Spark and Python | CCA175 | Spark, Python |
08 August 2017 | A basic Spark/Python script | CCA175 | Spark |
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | Misc | LaTeX, MathJaX, Wordpress |
30 July 2017 | Scala introduction and cheatsheet | CCA175 | Scala, Spark, Cheatsheet |
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | Linux | IPv6, VPN |
25 July 2017 | Command-line options for spark-submit | CCA175 | Spark |
23 July 2017 | My Python Cheatsheet | Python | Cheatsheet |
22 July 2017 | How to design a Hadoop architecture | BigData | Architecture, Hadoop |
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | CCA175 | MySQL, SQL, Sqoop |
21 July 2017 | Spark Streaming | BigData | Hadoop, Spark, Streaming |
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | CCA175 | Hadoop |
18 July 2017 | Getting streaming data with Kafka and Flume | BigData | Flume, Hadoop, Kafka, Streaming |
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | CCA175 | Hadoop, Spark, Cloudera |
11 July 2017 | MongoDB | BigData | MongoDB, NoSQL |
09 July 2017 | NoSQL: non-relational databases | BigData | Hadoop, NoSQL, SQL |
09 July 2017 | Cassandra | BigData | NoSQL |
08 July 2017 | Hive | BigData | Hadoop |
04 July 2017 | Spark | BigData | Hadoop, Spark |
30 June 2017 | The Hadoop core: HDFS and MapReduce | BigData | Hadoop, MapReduce |
29 June 2017 | The Hadoop ecosystem: An overview | BigData | Hadoop |
28 June 2017 | Connect R with Access2007 via RODBC | R | ODBC, Access |
17 June 2017 | Dear Recruiters: Please send e-mails | Work | Freelancing |
13 June 2017 | Sharing confidential data with nginx and htaccess | Linux | VPS |
11 June 2017 | Administrating your own git server | Linux | |
12 April 2017 | lFTP usage | Linux | FTP, Linux |
12 April 2017 | SSH and scp | Linux | |
16 December 2016 | diff tips and tricks | Linux | Diff, Linux |
14 December 2016 | grep - Tips and Tricks | Linux | |
14 August 2015 | Cluster computing on the Sun Grid Engine | Programming | Cluster computing, Sun Grid Engine |
02 May 2014 | Awk tips and tricks and Bioinformatics applications | Programming | awk |
08 January 2014 | Data analysis Hadley Wickham style | R | |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 | Linux | |
Linux
Date | Title | Tags |
---|---|---|
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | IPv6, VPN |
13 June 2017 | Sharing confidential data with nginx and htaccess | VPS |
11 June 2017 | Administrating your own git server | |
12 April 2017 | lFTP usage | FTP, Linux |
12 April 2017 | SSH and scp | |
16 December 2016 | diff tips and tricks | Diff, Linux |
14 December 2016 | grep - Tips and Tricks | |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 | |
R
Date | Title | Tags |
---|---|---|
22 July 2018 | An auto-scaling Shiny Server on AWS | AWS, Shiny |
13 July 2018 | A Data Science Case Study With R and mlr | DataScience, CaseStudy |
15 November 2017 | Find and delete unused R functions | codestyle |
28 June 2017 | Connect R with Access2007 via RODBC | ODBC, Access |
08 January 2014 | Data analysis Hadley Wickham style | |
Programming
Date | Title | Tags |
---|---|---|
10 October 2020 | Starting a new Python Package | Python |
06 December 2017 | Personal Extreme Programming | Agile |
09 November 2017 | My git cheatsheet | Git |
08 November 2017 | Why become an Open Source Developer? | Python, OpenSource |
14 August 2015 | Cluster computing on the Sun Grid Engine | Cluster computing, Sun Grid Engine |
02 May 2014 | Awk tips and tricks and Bioinformatics applications | awk |
Work
Date | Title | Tags |
---|---|---|
17 June 2017 | Dear Recruiters: Please send e-mails | Freelancing |
BigData
Date | Title | Tags |
---|---|---|
22 July 2017 | How to design a Hadoop architecture | Architecture, Hadoop |
21 July 2017 | Spark Streaming | Hadoop, Spark, Streaming |
18 July 2017 | Getting streaming data with Kafka and Flume | Flume, Hadoop, Kafka, Streaming |
11 July 2017 | MongoDB | MongoDB, NoSQL |
09 July 2017 | NoSQL: non-relational databases | Hadoop, NoSQL, SQL |
09 July 2017 | Cassandra | NoSQL |
08 July 2017 | Hive | Hadoop |
04 July 2017 | Spark | Hadoop, Spark |
30 June 2017 | The Hadoop core: HDFS and MapReduce | Hadoop, MapReduce |
29 June 2017 | The Hadoop ecosystem: An overview | Hadoop |
CCA175
Date | Title | Tags |
---|---|---|
08 October 2017 | The differences when using Spark with Scala | Spark, Scala |
07 October 2017 | Spark SQL with Python | Spark, SQL |
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | Python, Spark |
07 September 2017 | Reading and writing data with Spark and Python | Spark, Python |
08 August 2017 | A basic Spark/Python script | Spark |
30 July 2017 | Scala introduction and cheatsheet | Scala, Spark, Cheatsheet |
25 July 2017 | Command-line options for spark-submit | Spark |
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | MySQL, SQL, Sqoop |
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | Hadoop |
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | Hadoop, Spark, Cloudera |
Python
Date | Title | Tags |
---|---|---|
04 April 2018 | An LSTM-based Startup Name Generator | DeepLearning, NLP |
23 July 2017 | My Python Cheatsheet | Cheatsheet |
Misc
Date | Title | Tags |
---|---|---|
21 July 2018 | Restoring a Wordpress site from a manual backup | Wordpress, AWS |
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | LaTeX, MathJaX, Wordpress |
DataScience
Date | Title | Tags |
---|---|---|
27 August 2019 | Interpretable Machine Learning / Explainable Artificial Intelligence | CaseStudy, MachineLearning |
22 March 2018 | How to set up a Jupyter Notebook server for Deep Learning on AWS | MachineLearning, Python, DeepLearning, AWS |
17 October 2017 | What is Data Science? | |