    Dr. Alexander Engelhardt
 engelhardt@alpha-epsilon.de
 0176 5690 6728

The Alpha Epsilon Blog

All things data science

Welcome to my blog! Here, I publish tutorials related to data science, which also serve as convenient cheatsheets and references for myself. I learn best when I organize and summarize material for presentation, so this blog doubles as a learning vehicle for me.

My projects

This list contains "mother" posts for my larger undertakings, each spanning multiple blog posts.

All posts

All posts ordered by newest

Date | Title | Category | Tags
10 October 2020 | Starting a new Python Package | Programming | Python
27 August 2019 | Interpretable Machine Learning / Explainable Artificial Intelligence | DataScience | CaseStudy, MachineLearning
22 July 2018 | An auto-scaling Shiny Server on AWS | R | AWS, Shiny
21 July 2018 | Restoring a Wordpress site from a manual backup | Misc | Wordpress, AWS
13 July 2018 | A Data Science Case Study With R and mlr | R | DataScience, CaseStudy
04 April 2018 | An LSTM-based Startup Name Generator | Python | DeepLearning, NLP
22 March 2018 | How to set up a Jupyter Notebook server for Deep Learning on AWS | DataScience | MachineLearning, Python, DeepLearning, AWS
06 December 2017 | Personal Extreme Programming | Programming | Agile
15 November 2017 | Find and delete unused R functions | R | codestyle
09 November 2017 | My git cheatsheet | Programming | Git
08 November 2017 | Why become an Open Source Developer? | Programming | Python, OpenSource
17 October 2017 | What is Data Science? | DataScience |
08 October 2017 | The differences when using Spark with Scala | CCA175 | Spark, Scala
07 October 2017 | Spark SQL with Python | CCA175 | Spark, SQL
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | CCA175 | Python, Spark
07 September 2017 | Reading and writing data with Spark and Python | CCA175 | Spark, Python
08 August 2017 | A basic Spark/Python script | CCA175 | Spark
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | Misc | LaTeX, MathJaX, Wordpress
30 July 2017 | Scala introduction and cheatsheet | CCA175 | Scala, Spark, Cheatsheet
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | Linux | IPv6, VPN
25 July 2017 | Command-line options for spark-submit | CCA175 | Spark
23 July 2017 | My Python Cheatsheet | Python | Cheatsheet
22 July 2017 | How to design a Hadoop architecture | BigData | Architecture, Hadoop
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | CCA175 | MySQL, SQL, Sqoop
21 July 2017 | Spark Streaming | BigData | Hadoop, Spark, Streaming
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | CCA175 | Hadoop
18 July 2017 | Sharing Python notebooks on Jekyll | Python | GitHub, Notebook
18 July 2017 | Getting streaming data with Kafka and Flume | BigData | Flume, Hadoop, Kafka, Streaming
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | CCA175 | Hadoop, Spark, Cloudera
11 July 2017 | MongoDB | BigData | MongoDB, NoSQL
09 July 2017 | NoSQL: non-relational databases | BigData | Hadoop, NoSQL, SQL
09 July 2017 | Cassandra | BigData | NoSQL
08 July 2017 | Hive | BigData | Hadoop
04 July 2017 | Spark | BigData | Hadoop, Spark
30 June 2017 | The Hadoop core: HDFS and MapReduce | BigData | Hadoop, MapReduce
29 June 2017 | The Hadoop ecosystem: An overview | BigData | Hadoop
28 June 2017 | Connect R with Access2007 via RODBC | R | ODBC, Access
17 June 2017 | Dear Recruiters: Please send e-mails | Work | Freelancing
13 June 2017 | Sharing confidential data with nginx and htaccess | Linux | VPS
11 June 2017 | Administrating your own git server | Linux |
12 April 2017 | lFTP usage | Linux | FTP, Linux
12 April 2017 | SSH and scp | Linux |
16 December 2016 | diff tips and tricks | Linux | Diff, Linux
14 December 2016 | grep - Tips and Tricks | Linux |
14 August 2015 | Cluster computing on the Sun Grid Engine | Programming | Cluster computing, Sun Grid Engine
02 May 2014 | Awk tips and tricks and Bioinformatics applications | Programming | awk
08 January 2014 | Data analysis Hadley Wickham style | R |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 | Linux |

All posts by category

Posts in Linux
Date | Title | Tags
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | IPv6, VPN
13 June 2017 | Sharing confidential data with nginx and htaccess | VPS
11 June 2017 | Administrating your own git server |
12 April 2017 | lFTP usage | FTP, Linux
12 April 2017 | SSH and scp |
16 December 2016 | diff tips and tricks | Diff, Linux
14 December 2016 | grep - Tips and Tricks |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 |
Posts in R
Date | Title | Tags
22 July 2018 | An auto-scaling Shiny Server on AWS | AWS, Shiny
13 July 2018 | A Data Science Case Study With R and mlr | DataScience, CaseStudy
15 November 2017 | Find and delete unused R functions | codestyle
28 June 2017 | Connect R with Access2007 via RODBC | ODBC, Access
08 January 2014 | Data analysis Hadley Wickham style |
Posts in Programming
Date | Title | Tags
10 October 2020 | Starting a new Python Package | Python
06 December 2017 | Personal Extreme Programming | Agile
09 November 2017 | My git cheatsheet | Git
08 November 2017 | Why become an Open Source Developer? | Python, OpenSource
14 August 2015 | Cluster computing on the Sun Grid Engine | Cluster computing, Sun Grid Engine
02 May 2014 | Awk tips and tricks and Bioinformatics applications | awk
Posts in Work
Date | Title | Tags
17 June 2017 | Dear Recruiters: Please send e-mails | Freelancing
Posts in BigData
Date | Title | Tags
22 July 2017 | How to design a Hadoop architecture | Architecture, Hadoop
21 July 2017 | Spark Streaming | Hadoop, Spark, Streaming
18 July 2017 | Getting streaming data with Kafka and Flume | Flume, Hadoop, Kafka, Streaming
11 July 2017 | MongoDB | MongoDB, NoSQL
09 July 2017 | NoSQL: non-relational databases | Hadoop, NoSQL, SQL
09 July 2017 | Cassandra | NoSQL
08 July 2017 | Hive | Hadoop
04 July 2017 | Spark | Hadoop, Spark
30 June 2017 | The Hadoop core: HDFS and MapReduce | Hadoop, MapReduce
29 June 2017 | The Hadoop ecosystem: An overview | Hadoop
Posts in CCA175
Date | Title | Tags
08 October 2017 | The differences when using Spark with Scala | Spark, Scala
07 October 2017 | Spark SQL with Python | Spark, SQL
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | Python, Spark
07 September 2017 | Reading and writing data with Spark and Python | Spark, Python
08 August 2017 | A basic Spark/Python script | Spark
30 July 2017 | Scala introduction and cheatsheet | Scala, Spark, Cheatsheet
25 July 2017 | Command-line options for spark-submit | Spark
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | MySQL, SQL, Sqoop
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | Hadoop
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | Hadoop, Spark, Cloudera
Posts in Python
Date | Title | Tags
04 April 2018 | An LSTM-based Startup Name Generator | DeepLearning, NLP
23 July 2017 | My Python Cheatsheet | Cheatsheet
18 July 2017 | Sharing Python notebooks on Jekyll | GitHub, Notebook
Posts in Misc
Date | Title | Tags
21 July 2018 | Restoring a Wordpress site from a manual backup | Wordpress, AWS
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | LaTeX, MathJaX, Wordpress
Posts in DataScience
Date | Title | Tags
27 August 2019 | Interpretable Machine Learning / Explainable Artificial Intelligence | CaseStudy, MachineLearning
22 March 2018 | How to set up a Jupyter Notebook server for Deep Learning on AWS | MachineLearning, Python, DeepLearning, AWS
17 October 2017 | What is Data Science? |