Gapan, what do you recommend for statistical analysis?

Posted: 10. Feb 2017, 10:43
by travis82
Hi Gapan

I used to carry out various statistical analyses for my research with several apps such as SPSS, Minitab, XLSTAT and MSTAT-C on my Windows machine. While I can run this software on Linux via Wine or VirtualBox, I want to focus on native Linux packages for that purpose. Unfortunately, I couldn't find any powerful graphical package for statistical analysis on Linux. Hence, I am going to focus on learning a programming language like R or Python. However, I don't know which is more suitable for my needs. R seems to have a wider community and more packages for all kinds of statistical analysis, while Python has a simpler syntax and is easier to understand. As a statistics teacher, what do you recommend, especially for procedures like experimental design, GLM and multivariate analysis (both exploratory and confirmatory)?

Thanks

Re: Gapan, what do you recommend for statistical analysis?

Posted: 10. Feb 2017, 13:32
by DidierSpaier
Hi Hosein,

I gave the link to your post to my son Matthieu and asked him for an answer, because he has a background in applied mathematics and statistical analysis, and also develops applications in this field.

Here is his answer, translated from French: "I would go with R even though (or because) I know Python better. Python is an awesome language with everything you need for statistical analysis and even more, but it is still a new language to learn, and thus overkill, as they say. And R really is the reference in the statistics field."

But still, wait for Gapan's opinion.

Didier

Re: Gapan, what do you recommend for statistical analysis?

Posted: 10. Feb 2017, 13:55
by djemos
I suggest going with R. It is the language for statistical analysis, just as Matlab is the language for plotting differential equations in mathematics.

Re: Gapan, what do you recommend for statistical analysis?

Posted: 10. Feb 2017, 14:26
by gapan
It depends. If you're more familiar with one, go with that. If you're not familiar with either of them, then I would say that if your focus is on data and statistical analysis, go with R. If your focus is writing code and you also want to do stuff with your data, go with Python.

Here's a nice infographic for the comparison: https://www.datacamp.com/community/tuto ... gs.Om1ZHyY

I suggest you choose R. It has every statistical test you've ever imagined implemented in one of its thousands of packages. It doesn't present things the way the other statistical packages you mention do, but it can do anything they can and a lot more.

Some advice on how to start:
1. Install RStudio. It's not mandatory, but it makes R a lot nicer. You'll love how easy it is to make reports with RMarkdown.
2. Once you have started R for the first time, install swirl. It includes lots of tutorials that you run with guidance inside R itself. http://swirlstats.com/

Re: Gapan, what do you recommend for statistical analysis?

Posted: 10. Feb 2017, 20:32
by travis82
@Didier
Thanks, my friend, and sorry for taking your time. Please give my best regards to your son.

@djemos
Thanks for the suggestion.

@Gapan
Thanks. One of my main problems with R is its scattered documentation and the many free books and online courses that confuse newbies. In that regard swirl looks promising. Many thanks.

Re: Gapan, what do you recommend for statistical analysis?

Posted: 11. Feb 2017, 10:52
by gapan
travis82 wrote: Thanks. One of my main problems with R is its scattered documentation and the many free books and online courses that confuse newbies. In that regard swirl looks promising. Many thanks.
The "problem" is that R is so powerful and versatile, and there is so much information out there, that it cannot be contained within a single book or course. Swirl is only a nice hands-on introduction, but it's a good place to start. Expect to be doing a lot of web searching on how to do things at first.

Re: Gapan, what do you recommend for statistical analysis?

Posted: 11. Feb 2017, 19:26
by travis82
gapan wrote: Expect to be doing a lot of web searching on how to do things at first.
I do. But I didn't mean documentation for specific analyses. I have many general books with titles like "Introduction to R for statistical analysis" and I have no idea which one is the best starting point. "R for SAS and SPSS Users" by Robert Muenchen seems a good candidate, as I am familiar with SPSS and will be able to compare the output of various statistical analyses between R and SPSS. However, any recommendations for good books and tutorials would be appreciated.

Re: Gapan, what do you recommend for statistical analysis?

Posted: 14. Feb 2017, 05:10
by zAchAry
It also depends on where you retrieve your information.

If you retrieve your information from websites that do not provide an API, I would retrieve and process it with the following:

PhantomJS (if webpage requires JavaScript in order for it to be loaded)
Python BeautifulSoup or Python lxml (lxml is much faster)
* XPath (query language for (x)HTML/XML)
Python pandas (to analyze data)
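
To give a rough idea of what the "query with XPath, then analyze" part of that list looks like, here is a minimal sketch. It is only an illustration under several assumptions: the page snippet, the table id "yields" and the function names are made up, and the page is assumed to be well-formed XHTML. To keep it runnable without installing anything, it uses the standard library's xml.etree.ElementTree (which supports only a limited XPath subset) as a stand-in for lxml, and the statistics module as a stand-in for pandas.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML standing in for a page you downloaded.
PAGE = """
<html>
  <body>
    <table id="yields">
      <tr><th>treatment</th><th>yield</th></tr>
      <tr><td>control</td><td>4.2</td></tr>
      <tr><td>control</td><td>3.8</td></tr>
      <tr><td>fertilizer</td><td>5.1</td></tr>
      <tr><td>fertilizer</td><td>5.5</td></tr>
    </table>
  </body>
</html>
"""


def extract_rows(xhtml):
    """Return (treatment, yield) pairs from the table with id 'yields'."""
    root = ET.fromstring(xhtml.strip())
    rows = []
    # XPath-style query: every <tr> directly under the table whose id is "yields".
    for tr in root.findall(".//table[@id='yields']/tr"):
        cells = [td.text for td in tr.findall("td")]
        # The header row contains only <th> cells, so it yields no <td> and is skipped.
        if len(cells) == 2:
            rows.append((cells[0], float(cells[1])))
    return rows


def mean_by_group(rows):
    """Group the values by treatment name and return the mean of each group."""
    groups = {}
    for name, value in rows:
        groups.setdefault(name, []).append(value)
    return {name: statistics.mean(values) for name, values in groups.items()}


rows = extract_rows(PAGE)
means = mean_by_group(rows)
for name, mean in means.items():
    print(name, round(mean, 2))
```

Real-world pages are often not well-formed, which is where a forgiving parser such as BeautifulSoup or lxml's HTML parser comes in, and for anything beyond a simple group mean you would load the extracted rows into pandas instead.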