Following are some of my favorite sites and blogs. This is a "live" list and will change over time. Readers are welcome to send their suggestions.
Recommended Reading
- “Data, Data, Everywhere.” The Economist. February 25, 2010. http://www.economist.com/node/15557443 – Reports on the shift from data scarcity to overabundance and the benefits and headaches that result.
- Graham, Mark. “Big Data and the End of Theory.” The Guardian. March 9, 2012. http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory - A measured response to big data hype.
- Press, Gil. “A Very Short History of Big Data.” Forbes. May 9, 2013. http://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-... - Seventy years of big data history.
Big Data and the Academy
- Bell, Steven. “Promise and Problems of Big Data.” Library Journal. March 13, 2013. http://lj.libraryjournal.com/2013/03/opinion/steven-bell/promise-and-pro... - Cautionary article on big data ‘solutionism.’
- Parry, Marc. “Big Data on Campus.” The New York Times. July 18, 2012. http://www.nytimes.com/2012/07/22/education/edlife/colleges-awakening-to... - How colleges are using big data to help students choose classes, improve retention, and counsel those in need.
- Schwartz, Meredith. “What Governmental Big Data May Mean For Libraries.” Library Journal. May 30, 2013. http://lj.libraryjournal.com/2013/05/oa/what-governmental-big-data-may-m... - Government open data initiatives and how they affect libraries and data collection and retention.
Case Studies
- Howard, Alex. “Predictive Data Analytics is Saving Lives and Taxpayer Dollars in New York City.” O’Reilly RADAR. June 26, 2012. http://strata.oreilly.com/2012/06/predictive-data-analytics-big-data-nyc... - How big data is helping city government be more effective and efficient.
- Madrigal, Alexis. “The Perfect Milk Machine: How Big Data Transformed the Dairy Industry.” The Atlantic Monthly. May 1, 2012. http://www.theatlantic.com/technology/archive/2012/05/the-perfect-milk-m... - The impact of big data on cattle breeding.
- Scherer, Michael. “How Obama’s Data Crunchers Helped Him Win.” Time. November 8, 2012. http://www.cnn.com/2012/11/07/tech/web/obama-campaign-tech-team - Covers how big data analytics helped Obama win the 2012 election.
Privacy and Criticism
- boyd, danah and Kate Crawford. “Critical Questions for Big Data.” Information, Communication & Society. May 10, 2012. http://www.tandfonline.com/doi/abs/10.1080/1369118X.2012.678878#preview – Microsoft researchers ask provocative questions about the use of big data.
- Crawford, Kate. “Think Again: Big Data.” Foreign Policy. May 9, 2013. http://www.foreignpolicy.com/articles/2013/05/09/think_again_big_data - Discusses the limitations and potential downsides of data-driven decision making using big data sets.
- Croll, Alistair. “Big Data is Our Generation’s Civil Rights Issue and We Don’t Know It.” O’Reilly RADAR. August 2, 2012. http://radar.oreilly.com/2012/08/big-data-is-our-generations-civil-right... - Examines how web ‘personalization’ might be another form of redlining or racial profiling.
- Duhigg, Charles. “How Companies Learn Your Secrets.” New York Times. February 16, 2012. http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewant... - How Target uses big data to determine when its customers are pregnant.
Tutorials
- “AMP Camp Big Data Bootcamp.” AMPLab. Accessed May 18, 2013. http://ampcamp.berkeley.edu/big-data-mini-course-home/ - Hands-on mini course on big data from Berkeley’s AMPLab. Requires an Amazon EC2 account, and some technical expertise.
- “Big Data Tutorial: Everything You Need to Know.” SearchStorage. Accessed May 20, 2013. http://searchstorage.techtarget.com/guides/Big-data-tutorial-Everything-... - From the basics to a deeper dive into more technical issues of big data.
- Tariq, Mohammed. “Hadoop Toolbox: When to Use What.” SmartData Collective. April 27, 2013. http://smartdatacollective.com/mtariq/120791/hadoop-toolbox-when-use-wha... – Reports on the set of software tools used for big data.
Sandboxes
- A big data “sandbox” is a free way to try out Hadoop and other big data tools. Sandboxes require downloading specific software.
- “Cloudera QuickStart VM.” Cloudera. Accessed May 22, 2013. https://ccp.cloudera.com/display/SUPPORT/Cloudera+QuickStart+VM - This sandbox requires a 64-bit host OS and 4 GB of total RAM.
- “Get Started with Hadoop & Hortonworks Data Platform.” Hortonworks. Accessed March 15, 2013. http://hortonworks.com/get-started/ - This sandbox requires a 64-bit host OS and 4 GB of total RAM.
Articles and Blogs
- Cloudera Blog
- Datarella™
- Hortonworks Blog
- O'Reilly Radar Data
- Smarter Computing Blog
- Twitter Engineering Blog
- Dataspora
- Duarte
- Excel Charts
- Fast Company
- Fathom
- Guardian Data Store
- Infosthetics
- Junk Charts
- Just Plain Data Analysis
- Stamen Design
- Visual Business Intelligence
- Visual Complexity
- Visualizing.org
- Viz Wiz
Tools
- Wordle
- Tableau
- Many Eyes
- visual.ly
- Google Trends
- Google Public Data Explorer
- Google Refine
- Google Ngram