Google Refine

Ever worked with messy data? Whenever I get xls or csv files, I know better than to assume the data is clean. Very often there are trailing spaces or, even worse, newline characters at the end of certain cells, not to mention typos and the like. I have to go through every cell by hand to clean things up before getting the data into the database.

It’s such a pain, and I probably won’t notice a missed problem until errors occur later on. Chances are I missed a few cleanups, and then I have to re-import the dataset into the database all over again.

But guess what? That’s where Google Refine comes to the rescue. It visualizes your data and lets you filter it to spot the discrepancies. Then you can correct every error of the same kind in one step. Check out the following videos to see things in action!
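To get a feel for the filter-then-fix idea without the tool itself, here is a minimal Python sketch. It is not Google Refine's code or API, just a rough imitation using the standard library. The sample data and the helper names (`facet`, `strip_cells`, `replace_value`) are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample data with the problems described above:
# a trailing space, a newline inside a cell, and a typo.
RAW = 'name,city\nAlice,Chicago \nBob,"Chicago\n"\nCarol,Chicgo\nDave,Chicago\n'


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def facet(rows, column):
    """Count each distinct raw value in a column to expose discrepancies."""
    return Counter(row[column] for row in rows)


def strip_cells(rows):
    """Trim leading/trailing whitespace, including newlines, from every cell."""
    return [{key: value.strip() for key, value in row.items()} for row in rows]


def replace_value(rows, column, old, new):
    """Fix every occurrence of one bad value in a column in a single pass."""
    return [{**row, column: new} if row[column] == old else row for row in rows]


rows = load_rows(RAW)
print(facet(rows, "city"))      # four "different" cities before cleanup

cleaned = replace_value(strip_cells(rows), "city", "Chicgo", "Chicago")
print(facet(cleaned, "city"))   # a single city after cleanup
```

Counting the distinct values first is what makes the stray space, the newline and the typo visible. After that, each kind of error takes one call to fix instead of a cell-by-cell hunt.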

Wolfram Alpha

Recently, WolframAlpha has been making some noise, saying it can perform better searches than Google. From what I understand, WolframAlpha claims to understand the words you type into the search bar and so return better results, whereas Google only matches your words literally and ranks the pages with its PageRank algorithm. For example, if you type in “short CEO”, WolframAlpha will give you info on the short CEO such as his name, his business, and so forth, while Google will give you anything related to your search words.

I don’t think WolframAlpha will come anywhere close to being a threat to Google. If anything, Google can just buy them out.

WolframAlpha will be holding a webcast soon, on May 15th at 7pm CST.

Google Flu Trends

This tool uses the Google searches made by people around America to predict flu outbreaks. The idea behind it is that there is a correlation between people getting sick and people searching for “flu symptoms”. It is not meant to be a precise tool, though, because some people, like the elderly, do not go online. Nonetheless, at the cost of some accuracy, it gives faster flu-level “reports” than traditional tools. CNN has a post on this too. Take a look!
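To make the correlation idea concrete, here is a small Python sketch that computes the Pearson correlation between weekly search counts and reported flu cases. This is not how Google Flu Trends is actually implemented, which I don't know in detail. The numbers are invented purely for illustration and are not real Google or CDC data.

```python
from math import sqrt

# Invented weekly numbers, for illustration only (not real data).
search_counts = [120, 150, 310, 480, 390, 200]   # "flu symptoms" searches
reported_cases = [10, 14, 30, 45, 41, 18]        # cases reported later on


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(search_counts, reported_cases)
print(f"correlation between searches and cases: {r:.2f}")
```

A value close to 1 means the two series rise and fall together. That is what would let search volume, which is available right away, stand in for case reports that take longer to collect.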