Studying Statistics with R – 2
In this second post on R and statistics, I describe the online classes I have registered for. See here for Part 1.
First, I registered on Coursera.org for a class named “R Programming”. It is part of the Data Science Specialization offered by JHU and begins on August 3rd. The other online course, from edX.org, is self-paced and is called “Explore Statistics with R”. That class already started on July 7th, but I only joined it today. It ends on August 31st. Hopefully, I will catch up.
Between these two classes and the R study group that I mentioned in my last post, I think I will gain a broad and reasonably deep understanding of how to use statistics with R to solve real-life problems. I will also be attending complementary user groups such as ChicagoCityData, which explores the city’s datasets at https://data.cityofchicago.org/. The portal has datasets such as the landlord’s list, Police stations, crimes – 2001 to present, and other interesting ones.
Now let me explain more about the courses that I am taking and how they are going to help me reach my goal. But wait! What is my goal? What do I want to achieve by going through all this trouble of learning R and statistics and attending user groups?
As a Senior SQL Server Database Administrator, I am very familiar with data, datatypes, storage, performance, optimization, etc. But 20 years ago I graduated as an Electrical Engineer, and Statistics and Mathematics were my two favorite subjects. Since becoming a DBA 14 years ago, I have not had much chance to use either of them. Learning R is just like learning any other programming language, such as T-SQL, which I am pretty familiar with. I am also somewhat familiar with the scripting language PowerShell. But learning R and statistics together, and applying them to solve practical issues, is my dream-come-true scenario. Currently I am working with SQL Server 2012. SQL Server 2014 is out there, and SQL Server 2016 will be out next year. With the 2016 version, Microsoft is tightly integrating R functionality. So data scientists will no longer have to wait to get their big datasets before starting the analysis. They can do it right on the SQL Server. How much of a performance hit that will cause is yet to be seen. So you can see my motivation here. I am not going to leave the world of SQL Server, because I fell in love with it, and I would like to fall in love with R and statistics too. I would like to find new meaning in the data that I have at my fingertips, bring new insight to my company, and become successful myself at the same time.
Enough about love! Back to the class descriptions.
1. R Programming:
I took this class online exactly a year ago, but it became harder for me after week 3 and I could not finish the last project. A year later, the “Study group for An Introduction to Statistical Learning with Applications in R” has reignited my interest in this course and in the Data Science Specialization. This is a 4-week course starting on Aug 3rd, 2015, as mentioned before. It requires about 10 hours of your time per week. It uses a cool self-teaching tool for R called “Swirl” and a book. The course will cover the following material each week:
- Week 1: Overview of R, R data types and objects, reading and writing data
- Week 2: Control structures, functions, scoping rules, dates and times
- Week 3: Loop functions, debugging tools
- Week 4: Simulation, code profiling
2. Explore Statistics with R:
I joined this class today and, as mentioned earlier, it began on July 7th, 2015. The course runs for 8 weeks and requires a time commitment of about 5 hours per week. It uses the materials from here and here. This is a self-paced course, meaning all 5 weeks of material were posted on July 7th. So you take your time and finish by August 31st, when a project is due. The main outline of this course is:
- Week 1: Get to know R
- Week 2: How to import and clean data in R
- Week 3: Statistics under the hood: distributions and tests.
- Week 4: Non-parametric tests
- Week 5: Visit the research frontier
I can see some overlap between the two courses, and that is why I think I will be able to finish the statistics course by Aug 31st even after starting late. At the meetups, I would also like to help others, because I know how it feels when you are struggling and unsure whether you can do it or whether it is even for you. I would like to encourage people to keep learning and not give up. The fruits are right within your reach; you just have to go a little closer.
See you in the next post!