Hi Gapan
I used to carry out various statistical analyses for my research with several apps such as SPSS, Minitab, Xlstat and Mstatc on my Windows machine. While I can use this software on Linux via either Wine or VirtualBox, I want to focus on native Linux packages for that purpose. Unfortunately, I couldn't find any powerful graphical package for statistical analysis on Linux. Hence, I am going to focus on learning a programming language like R or Python. However, I don't know which is more suitable for my needs. R seems to have a wider community and more packages for all kinds of statistical analysis, while Python has a simpler syntax and is easier to understand. As a statistics teacher, what do you recommend, especially for procedures like experimental design, GLM and multivariate analysis (both exploratory and confirmatory)?
Thanks
Gapan, what do you recommend for statistical analysis?

Re: Gapan, what do you recommend for statistical analysis?
Hi Hosein,
I gave the link to your post to my son Matthieu and asked him for an answer, because he has a background in applied mathematics and statistical analysis, and also develops applications in this field.
Here is his answer (translated from French): "I would go with R even though (or because) I know Python better. Even though Python is an awesome language with everything needed for statistical analysis and even more, it is still a new language to learn, and thus overkill, as they say. And R is really the reference in the statistics field."
But still, wait for Gapan's opinion.
Didier
Re: Gapan, what do you recommend for statistical analysis?
I suggest going with R. It is the language for statistical analysis, just as Matlab is the language for plotting differential equations in mathematics.
Re: Gapan, what do you recommend for statistical analysis?
It depends. If you're more familiar with one, go with that. If you're not familiar with either of them, then I would say that if your focus is on data and statistical analysis, go with R. If your focus is on writing code and you also want to do stuff with your data, go with Python.
Here's a nice infographic for the comparison: https://www.datacamp.com/community/tuto ... gs.Om1ZHyY
I suggest you choose R. It has every statistical test you've ever imagined implemented in one of its thousands of packages. It's not presented in the same way you're used to with the other statistical packages you mention, but it can do anything they can and a lot more.
Some advice on how to start:
1. Install RStudio. It's not mandatory, but it makes R a lot nicer. You'll love how easy it is to make reports with RMarkdown.
2. Once you have started R for the first time, install swirl. It includes lots of tutorials which you run with guidance inside R itself. http://swirlstats.com/
Re: Gapan, what do you recommend for statistical analysis?
@Didier
Thanks, my friend, and sorry for taking up your time. Please give my best regards to your son.
@djemos
Thanks for the suggestion.
@Gapan
Thanks. One of my main problems with R is its scattered documentation and the many free books and online courses that confuse newbies. In that regard swirl looks promising. Many thanks.
Re: Gapan, what do you recommend for statistical analysis?
travis82 wrote: Thanks. One of my main problems with R is its scattered documentation and the many free books and online courses that confuse newbies. In that regard swirl looks promising. Many thanks.
The problem is that R is so powerful and versatile, and there is so much information out there, that it cannot be contained within a single book or course. Swirl is only a nice hands-on introduction, but it's a good place to start. Expect to be doing a lot of web searching on how to do things at first.
Re: Gapan, what do you recommend for statistical analysis?
gapan wrote: Expect to be doing a lot of web searching on how to do things at first.
I do. But I didn't mean documentation for specific analyses. I have many general books with titles like "introduction to R for statistical analysis" and I have no idea which one is the better starting point. "R for SAS and SPSS users" by Robert Muenchen seems a good candidate, as I am familiar with SPSS and will be able to compare the output of various statistical analyses between R and SPSS. However, any recommendation for good books and tutorials would be appreciated.
Re: Gapan, what do you recommend for statistical analysis?
It also depends on where you retrieve your information.
If you retrieve your information from websites that do not provide an API, I would retrieve and process it with the following:
* PhantomJS (if the webpage requires JavaScript in order to load)
* Python BeautifulSoup or Python lxml (lxml is much faster)
* XPath (query language for (X)HTML/XML)
* Python pandas (to analyze the data)
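As a rough sketch of that parse-then-analyze workflow, here is a minimal example that uses only Python's standard library as a stand-in: `xml.etree.ElementTree` (which supports only a limited XPath subset) in place of lxml or BeautifulSoup, and the `statistics` module in place of pandas. The page content, the `yields` table id and the `extract_yields` helper are all made up for illustration; a real page would first have to be downloaded (and rendered, if it needs JavaScript) and would usually not be well-formed XHTML.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML standing in for a downloaded page.
PAGE = """
<html><body>
  <table id="yields">
    <tr><th>plot</th><th>yield</th></tr>
    <tr><td>A</td><td>4.0</td></tr>
    <tr><td>B</td><td>5.0</td></tr>
    <tr><td>C</td><td>6.0</td></tr>
  </table>
</body></html>
"""


def extract_yields(xhtml):
    """Return {plot name: yield} from the table with id 'yields'."""
    root = ET.fromstring(xhtml.strip())
    # XPath-style query: every row of the table whose id is 'yields'.
    rows = root.findall(".//table[@id='yields']/tr")
    result = {}
    for row in rows:
        cells = row.findall("td")
        if len(cells) != 2:
            # The header row contains <th> cells only, so skip it.
            continue
        result[cells[0].text] = float(cells[1].text)
    return result


yields = extract_yields(PAGE)
mean_yield = statistics.mean(list(yields.values()))
print(yields)
print("mean yield:", mean_yield)
```

With lxml the same query could be written as a full XPath expression, and the extracted rows could be loaded into a pandas DataFrame for the actual analysis.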