Tuesday, August 21, 2012

Database Landscape - SQL & NOSQL Databases


http://www.alberton.info/nosql_databases_what_when_why_phpuk2011.html#.UNniIOTO0VA


Cassandra - A Decentralized Structured Storage System, Lakshman and Malik, LADIS 2009

http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf





Towards Robust Distributed Systems, Eric Brewer

http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf





Cassandra Explained, Eric Evans

http://www.slideshare.net/jericevans/cassandra-explained





Introduction to Cassandra, Gary Dusbabek

http://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010





HBase, Ryan Rawson

http://www.slideshare.net/adorepump/hbase-nosql





CouchDB vs. MongoDB, Gabriele Lana

http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288





NoSQL Databases, Marin Dimitrov

http://www.slideshare.net/marin_dimitrov/nosql-databases-3584443



NoSQL for Dummies, Tobias Ivarsson

http://www.slideshare.net/thobe/nosql-for-dummies



NoSQL Databases - Part 1 - Landscape, Vineet Gupta

http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape/





Introduction to NoSQL Databases by Derek Stainer

http://www.allthingsdistributed.com/





---------------------------------------------------

BASE: An ACID Alternative, Dan Pritchett

http://delivery.acm.org/10.1145/1400000/1394128/p48-pritchett.pdf?ip=171.159.64.10&acc=OPEN&CFID=155770146&CFTOKEN=16853736&__acm__=1355249710_c2c1438f6a022a378edebaaad8253a4b



HBase Architecture 101

http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html



Dynamo: Amazon’s Highly Available Key-value Store

http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf



Bigtable: A Distributed Storage System for Structured Data

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf




In-Memory Database

Book
In-Memory Data Management: An Inflection Point for Enterprise Applications
by Hasso Plattner, Alexander Zeier

---------------------------------------------------------------------

To Learn MongoDB

https://education.10gen.com/

-----------------------------------------------------------------------

Good blog about Hadoop

http://hadoopblog.blogspot.com/

--------------------------------------------------------------------

Microsoft related
https://www.microsoftvirtualacademy.com/Studies/SearchResult.aspx
http://www.microsoft.com/bigdata
http://msbiacademy.com/
http://bradmcgehee.com



-----------------------------------------------------------------------

http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/



The following is a collection from LinkedIn

Google's MapReduce Paper:
http://research.google.com/archive/mapreduce.html 
* Google File System (GFS) paper:

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/gfs-sosp2003.pdf 
* Google Bigtable:

http://research.google.com/archive/bigtable.html


------------
Jiri Kaplan • Hi,

Some useful links:
http://www.cloudera.com/resources/training/ (basics of hadoop - video presentations)
http://hadoop.apache.org/common/docs/r1.0.3/

If you prefer video and just want use cases, hints, news and general talk about Hadoop, some presentations could be useful:
http://www.hadoopworld.com/agenda/
http://hadoopsummit.org/

Best way to learn it? Download it and run it in local mode or pseudo-distributed mode imo. See http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html or https://ccp.cloudera.com/display/DOC/Documentation (look for the quick start guide)

Learn Hadoop by examples:
The hadoop-examples and hadoop-test jars are good for a first touch with Hadoop jobs.
Under $HADOOP_HOME/src/test* there are sources of the examples (package org.apache.hadoop.examples in the API)

Good luck.

-------------

Lou Dasaro •

Here are some videos I collected.
The last one does a pretty good job of explaining
the Hadoop ecosystem.

See http://www.youtube.com/playlist?list=PLF82F6499E89E1BAE&feature=mh_lolz

------------
Tudor Lapusan • Hi,
I think the best way to start learning Hadoop is to:
1. - set up a single-node or multi-node cluster and run the wordcount example.

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
2. - read more about Hadoop core functionalities.

http://developer.yahoo.com/hadoop/tutorial/
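
Before running the real wordcount job on a cluster, it can help to see the idea in miniature. The following is only a rough sketch in plain Python, not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and the "shuffle" step stands in for the grouping that the framework would normally do between the map and reduce stages.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read files from HDFS.
    sample = ["hadoop runs mapreduce jobs", "mapreduce jobs scale out"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the mapper and reducer run on many machines at once and the grouping happens over the network; here everything runs in one process just to show the data flow.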

bye!

------------


For a complete roundup of study materials for Hadoop MapReduce and HDFS, please go to http://lnkd.in/jycmQd

Copying the content from the above link -
Studying Hadoop or MapReduce can be a daunting task if you get your hands dirty right at the start.
One prerequisite for learning Hadoop is good experience in Java. Good analytical skills help a lot as well, and the final secret sauce for being successful is that you need to be motivated to self-learn a lot of things in the big data arena.

To learn Hadoop, I followed this schedule:

Start with the very basics of MR with http://code.google.com/edu/parallel/dsd-tutorial.html and http://code.google.com/edu/parallel/mapreduce-tutorial.html
Then go for the first two lectures in http://www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm (a very good course introduction to MapReduce and Hadoop).
Read the seminal paper http://labs.google.com/papers/mapreduce.html and its improvements in the updated version http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf
Then go for all the other videos in the U. Washington link given above.
Try searching YouTube for the terms MapReduce and Hadoop to find videos by O'Reilly and Google RoundTable for a good overview of the future of Hadoop and MapReduce.
Then off to the most important videos -
Cloudera Videos
http://www.cloudera.com/resources/?media=Video
and
Google MiniLecture Series
http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html

Along with all the multimedia above, we need good written material.
Documents:

Architecture diagrams at http://hadooper.blogspot.com are good to have on your wall
Hadoop: The Definitive Guide goes more into the nuts and bolts of the whole system, whereas Hadoop in Action is a good read with lots of teaching examples for learning the concepts of Hadoop. Pro Hadoop is not for beginners.
PDFs of the documentation from the Apache Foundation
http://hadoop.apache.org/common/docs/current/
and http://hadoop.apache.org/common/docs/stable/
will help you learn how to model your problem as an MR solution in order to gain the full advantages of Hadoop.
The HDFS paper by Yahoo! Research is also a good read for gaining in-depth knowledge of Hadoop.
Subscribe to the user mailing lists of Common, MapReduce and HDFS in order to know about problems, solutions and future solutions.
Try the http://developer.yahoo.com/hadoop/tutorial/module1.html link for a beginner-to-expert path to Hadoop.

In addition, the following 3 books are good resources:

Hadoop: The Definitive Guide : Good info on internals
Hadoop in Action : Good programming guide
Pro Hadoop : Overall good book with very good explanation of the advanced concepts.

For Any Queries …
Contact Apache, Google, Bing, Yahoo!

And for setting up a single-node Hadoop installation and setting up Eclipse for Hadoop programming, please go to http://orzota.com/blog/single-node-hadoop-setup-2/ and http://orzota.com/blog/eclipse-setup-for-hadoop-development/

--------------------


Joey Calca • I gave a speech at DefCon 17 that was 40 minutes
long about how Hadoop works as well as a look at some sample code
and what it does.

You can find the video of the speech in this playlist (you can skip the first video, it's just a teaser from the end of the speech)

https://www.youtube.com/watch?v=JYACdhxsUNs&list=PL64E697915C5C24FB&feature=plcp

The source code from the presentation and a simple breakdown of what it does can be found on the project page here:

http://hackedexistence.com/project-netflix.html

Hope this helps



1 comment:

Unknown said...

Hadoop is a powerful framework that allows for automatic parallelization of computing tasks.