Thursday, April 4, 2019

One trick on big data analytics

I once worked on big data projects, analyzing 5,000,000,000 rows of data each day with Hadoop and Hive. Running the analysis scripts over that much data took a long time. When a script had an error, the job would break partway through and I had to start over, which cost even more time. As a result, projects sometimes took a relatively long time to finish.

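So when you have this problem, start with a small sample of the data. The programs run faster on a sample, so errors in the scripts show up in minutes instead of hours, and you only run the full data once the scripts work. You get the job done sooner and save time.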
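Here is a minimal sketch of the idea in Python, not the Hadoop/Hive setup I actually used. It assumes the rows arrive as lines of comma-separated text. The names `sample_rows` and `count_by_first_field` are made up for illustration, and `count_by_first_field` is only a stand-in for whatever analysis script you are debugging.

```python
import random
from typing import Dict, Iterable, Iterator


def sample_rows(rows: Iterable[str], fraction: float, seed: int = 0) -> Iterator[str]:
    """Yield roughly `fraction` of the rows, chosen at random."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed, so a rerun picks the same rows
    for row in rows:
        if rng.random() < fraction:
            yield row


def count_by_first_field(rows: Iterable[str]) -> Dict[str, int]:
    """Hypothetical stand-in for the real analysis: count rows per first field."""
    counts: Dict[str, int] = {}
    for row in rows:
        key = row.split(",", 1)[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Synthetic rows standing in for one day of data.
    data = (f"user{i % 3},{i}" for i in range(100_000))
    # Debug the analysis on about 1% of the rows before running it on everything.
    sample = sample_rows(data, fraction=0.01)
    print(count_by_first_field(sample))
```

The rows are streamed one at a time, so the sample never needs the whole input in memory. Once the script runs cleanly on the sample, set `fraction=1.0` (or drop the sampling step) for the full run.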

R is not a simple programming language, and it does better at reading Excel files than Python

R is not a simple programming language, and it does better at reading Excel files than Python. I tried to read Excel files into Python and R. i...