Wednesday, December 12, 2012

Scale Up (Vertical Scaling) - Scale Out (Horizontal Scaling)

Scale Up (Vertical Scaling): buy a bigger system, i.e., increase the hardware (CPU, memory, storage) of a single server.
Scale Out (Horizontal Scaling) is done along two vectors:
               a. Functional scaling (divide the tables by functionality into different databases)
               b. Sharding (the same functional group is divided across different databases by key, i.e., adding one more dimension to the functional split above)

Both horizontal scaling approaches can be applied at once, as the sketch below illustrates.
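As a rough illustration (all database and column names here are hypothetical, not from the original post): functional scaling would put, say, customer tables and order tables into separate databases, while sharding would split the order tables themselves across several databases by a key such as CustomerId. A minimal T-SQL sketch of modulo-based shard routing:

-- Hypothetical sketch: route a customer to one of four order shards.
-- OrdersDB_Shard0 .. OrdersDB_Shard3 are assumed shard databases.
DECLARE @CustomerId INT = 107;
DECLARE @ShardCount INT = 4;
DECLARE @ShardId INT = @CustomerId % @ShardCount; -- simple modulo sharding
SELECT ShardDatabase = 'OrdersDB_Shard' + CONVERT(VARCHAR(10), @ShardId);
go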

Tuesday, November 6, 2012

Changing Data Compression for a List of Partitions When a Switch-Out Fails

Background: I need to move partitioned data from my production server to an archive server.
I follow these steps:
a. Switch out data from the main table (partitioned) to a temp table (partitioned).
b. Move the temp table to a temporary database (ex: DBArchive).
c. Back up the temporary database (ex: DBArchive).
d. Restore the temporary database (ex: DBArchive) on the archive server.
e. Merge the partition data into the main tables.

When I am switching out to a temp table, the switch-out sometimes fails because the compression of a source table (main table) partition differs from that of the target table (temp table).
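For context, a minimal sketch of the switch-out in step (a), using the table names from the queries below; SQL Server raises an error on this statement when the data compression of the source and target partitions does not match:

-- Switch partition 609 of the main table out to the dump (staging) table.
-- This fails if MainTable and dumpTable use different DATA_COMPRESSION settings.
ALTER TABLE MainTable SWITCH PARTITION 609 TO dumpTable PARTITION 609;
go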

-- Changing data compression for a list of partitions when a switch-out fails:
-- find out whether the compression differs between the main table and the dump table;
-- if it does, change the compression on the target table.

Check the compression of the main table against the dump table:

select distinct data_compression_desc
from sys.partitions with (nolock)
where OBJECT_NAME(object_id) = 'MainTable'
and partition_number >= 609
and partition_number <= 670
go

select distinct data_compression_desc
from sys.partitions with (nolock)
where OBJECT_NAME(object_id) = 'dumpTable'
and partition_number >= 609
and partition_number <= 670
go
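The two checks above can also be combined into one query that lists only the partitions whose compression differs between the two tables (a sketch using the same table names and partition range):

-- List the partitions where MainTable and dumpTable compression settings differ.
select mp.partition_number,
       MainCompression = mp.data_compression_desc,
       DumpCompression = dp.data_compression_desc
from sys.partitions as mp with (nolock)
join sys.partitions as dp with (nolock)
     on dp.partition_number = mp.partition_number
where mp.object_id = OBJECT_ID('MainTable')
  and dp.object_id = OBJECT_ID('dumpTable')
  and mp.index_id in (0, 1)
  and dp.index_id in (0, 1)
  and mp.partition_number between 609 and 670
  and mp.data_compression_desc <> dp.data_compression_desc
go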

-------------------


ALTER TABLE dumpTable REBUILD PARTITION = <n> WITH (DATA_COMPRESSION = NONE);

The partition number <n> above should be generated dynamically for each affected partition, as follows.



-- Grab 500 rows into a temp table purely as a row source
-- for generating a sequence of partition numbers.
SELECT TOP 500 ColumnName
INTO #1
FROM dbo.SampleTable (NOLOCK)
go

-- Number the rows, then turn each number in the target range
-- into a REBUILD statement for that partition.
WITH RowNumber as
(
SELECT ROW_NUMBER() OVER (ORDER BY ColumnName) as rn
FROM #1
)
select 'ALTER TABLE dumpTable REBUILD PARTITION = ' + CONVERT(VARCHAR, rn) + ' WITH (DATA_COMPRESSION = NONE);'
From RowNumber
where rn between 62 and 426
go
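A shorter alternative (a sketch, not part of the original script) is to generate the statements directly from sys.partitions, so the list covers exactly the partitions that are not yet uncompressed instead of a hand-picked number range:

-- Generate a REBUILD statement for every dumpTable partition
-- that is not already uncompressed.
select 'ALTER TABLE dumpTable REBUILD PARTITION = '
       + CONVERT(VARCHAR(10), partition_number)
       + ' WITH (DATA_COMPRESSION = NONE);'
from sys.partitions with (nolock)
where object_id = OBJECT_ID('dumpTable')
  and index_id in (0, 1)
  and data_compression_desc <> 'NONE'
go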

Wednesday, October 31, 2012

Different Databases, and Where Each One Fits

Databases: one size does not fit all.
Credit goes to Tim Gasper, Product Manager at Infochimps.
From TechCrunch.com

Tuesday, August 21, 2012

Database Landscape - SQL & NoSQL Databases


http://www.alberton.info/nosql_databases_what_when_why_phpuk2011.html#.UNniIOTO0VA

Cassandra - A Decentralized Structured Storage System, Lakshman, LADIS 2009
http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf

Towards Robust Distributed Systems, Eric Brewer
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf

Cassandra Explained, Eric Evans
http://www.slideshare.net/jericevans/cassandra-explained

Introduction to Cassandra, Gary Dusbabek
http://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010

HBase, Ryan Rawson
http://www.slideshare.net/adorepump/hbase-nosql

CouchDB vs. MongoDB, Gabriele Lana
http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288

NoSQL Databases, Marin Dimitrov
http://www.slideshare.net/marin_dimitrov/nosql-databases-3584443

NoSQL for Dummies, Tobias Ivarsson
http://www.slideshare.net/thobe/nosql-for-dummies

NoSQL Database - Part 1 - Landscape, Vineet Gupta
http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape/

Introduction to NoSQL Databases, Derek Stainer
http://www.allthingsdistributed.com/

---------------------------------------------------

BASE: An ACID Alternative, Dan Pritchett
http://delivery.acm.org/10.1145/1400000/1394128/p48-pritchett.pdf?ip=171.159.64.10&acc=OPEN&CFID=155770146&CFTOKEN=16853736&__acm__=1355249710_c2c1438f6a022a378edebaaad8253a4b

HBase Architecture 101
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html

Dynamo: Amazon’s Highly Available Key-value Store
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

Bigtable: A Distributed Storage System for Structured Data
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf

In-Memory Database

Book: In-Memory Data Management: An Inflection Point for Enterprise Applications, by Hasso Plattner and Alexander Zeier

---------------------------------------------------------------------

To Learn MongoDB

https://education.10gen.com/

-----------------------------------------------------------------------

Good Blog about HADOOP

http://hadoopblog.blogspot.com/

--------------------------------------------------------------------

Microsoft related
https://www.microsoftvirtualacademy.com/Studies/SearchResult.aspx
http://www.microsoft.com/bigdata
http://msbiacademy.com/
http://bradmcgehee.com



-----------------------------------------------------------------------

http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/



Following is a collection from LinkedIn.

Google's MapReduce paper:
http://research.google.com/archive/mapreduce.html

Google File System (GFS) paper:
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/gfs-sosp2003.pdf

Google Bigtable paper:
http://research.google.com/archive/bigtable.html


------------
Jiri Kaplan • Hi,

Some useful links:
http://www.cloudera.com/resources/training/ (basics of hadoop - video presentations)
http://hadoop.apache.org/common/docs/r1.0.3/

If you prefer video and just use cases, hints, news and general talk about hadoop some presentations could be useful:
http://www.hadoopworld.com/agenda/
http://hadoopsummit.org/

Best way to learn it? Download it and run it in local mode or pseudo-cluster mode imo. See http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html or https://ccp.cloudera.com/display/DOC/Documentation (look for quick start guide)

Learn Hadoop by examples:
The hadoop-examples and hadoop-test jars are good for a first touch with Hadoop jobs.
Under $HADOOP_HOME/src/test* there are sources of examples (package org.apache.hadoop.examples in the API).

Good luck.

-------------

Lou Dasaro •

Here are some videos I collected.
The last one does a pretty good job of explaining
the hadoop ecosystem.

See http://www.youtube.com/playlist?list=PLF82F6499E89E1BAE&feature=mh_lolz

------------
Tudor Lapusan • Hi,
I think the best way to start learning Hadoop is to:
1. Set up a single/multi-node cluster and run the WordCount example.

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
2. Read more about Hadoop's core functionalities.

http://developer.yahoo.com/hadoop/tutorial/

bye!

------------


For a complete round-up of study materials for Hadoop MapReduce and HDFS, please go to http://lnkd.in/jycmQd

Copying the content from the above link -
Studying Hadoop or MapReduce can be a daunting task if you try to get your hands dirty right at the start.
Among the prerequisites for learning Hadoop is good experience in Java. Good analytical skills help a lot as well, and the final secret sauce for being successful is that you need to be motivated to self-learn a lot of things in the big data arena.

For learning Hadoop, I followed this schedule:

Start with the very basics of MapReduce:
http://code.google.com/edu/parallel/dsd-tutorial.html
http://code.google.com/edu/parallel/mapreduce-tutorial.html
Then go for the first two lectures in http://www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm - a very good course introduction to MapReduce and Hadoop.
Read the seminal paper http://labs.google.com/papers/mapreduce.html and its improvements in the updated version http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf
Then go for all the other videos in the U. Washington link given above.
Try searching YouTube for the terms MapReduce and Hadoop to find videos by O'Reilly and the Google Round Table for a good overview of the future of Hadoop and MapReduce.
Then off to the most important videos -
Cloudera Videos
http://www.cloudera.com/resources/?media=Video
and
Google MiniLecture Series
http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html

Along with all the multimedia above, we need good written material.
Documents:

Architecture diagrams at http://hadooper.blogspot.com are good to have on your wall.
Hadoop: The Definitive Guide goes more into the nuts and bolts of the whole system, whereas Hadoop in Action is a good read with lots of teaching examples for learning the concepts of Hadoop. Pro Hadoop is not for beginners.
PDFs of the documentation from the Apache Foundation,
http://hadoop.apache.org/common/docs/current/
and http://hadoop.apache.org/common/docs/stable/,
will help you learn how to model your problem as a MapReduce solution in order to gain the advantages of Hadoop in full.
The HDFS paper by Yahoo! Research is also a good read for gaining in-depth knowledge of Hadoop.
Subscribe to the user mailing lists of Commons, MapReduce and HDFS to keep up with problems, solutions and future plans.
Try the http://developer.yahoo.com/hadoop/tutorial/module1.html link for a beginner-to-expert path to Hadoop.

In addition, the following 3 books are good resources:

Hadoop: The Definitive Guide - good info on internals
Hadoop in Action - good programming guide
Pro Hadoop - overall a good book, with very good explanations
of the advanced concepts

For Any Queries …
Contact Apache, Google, Bing, Yahoo!

And for setting up a single-node Hadoop installation and setting up Eclipse for Hadoop programming, please go to http://orzota.com/blog/single-node-hadoop-setup-2/ and http://orzota.com/blog/eclipse-setup-for-hadoop-development/

--------------------


Joey Calca • I gave a 40-minute talk at DefCon 17 about how
Hadoop works, along with a look at some sample code
and what it does.

You can find the video of the talk in this playlist (you can skip the first video; it's just a teaser from the end of the talk):

https://www.youtube.com/watch?v=JYACdhxsUNs&list=PL64E697915C5C24FB&feature=plcp

The Source Code from the presentation and a simple breakdown of what it does can be found on the project page here:

http://hackedexistence.com/project-netflix.html

Hope this helps



Thursday, August 9, 2012

Sync Up a Production SSAS Cube to TFS

Steps to sync up a production SSAS cube to TFS:


1. In Visual Studio: File -> New -> Project -> Import Analysis Services 2008 Project -> select a temporary folder path and generate the project file.

2. In TFS, get the latest version and check out all the files within the cube folder.

3. Now copy the contents of the temporary folder (step 1 above) to the TFS folder path.

4. In Visual Studio: File -> Source Control -> Change Source Control -> bind the TFS path.

5. Check in the solution file.

6. If you still see a few files that have not been bound to TFS, go to the TFS folder path, click Add File, select the unbound files, and then check in.

7. If you see that a couple of files were deleted, delete those files in the TFS folder path and then check in.

Wednesday, May 23, 2012

Partitioned Table / Non-Partitioned Table with Logical and Physical File Names

select distinct
    PartitionedTable = OBJECT_NAME(SI.object_id),
    sysindexes_data_spaceId = SI.data_space_id,
    destination_data_spaces_data_spaceId = DDS.data_space_id,
    PhysicalFileName = DF.physical_name,
    LogicalFileName = DF.name
from sys.indexes AS SI with (nolock)
    -- A partitioned table carries the partition scheme's data_space_id;
    -- a non-partitioned table carries the filegroup's data_space_id.
join sys.data_spaces AS DS with (nolock)
    -- data_space_id info for partition schemes and filegroups; determines PS or FG.
    on DS.data_space_id = SI.data_space_id
join sys.destination_data_spaces AS DDS
    -- The partition scheme (data_space_id) is in turn linked to filegroups (data_space_id).
    on DS.data_space_id = DDS.partition_scheme_id
join sys.database_files AS DF
    -- Filegroup data_space_id with the physical file name.
    on DF.data_space_id = DDS.data_space_id AND DF.type = 0
where DS.type = 'PS'
    and OBJECTPROPERTYEX(SI.object_id, 'BaseType') = 'U'
    and SI.index_id IN (0, 1)
order by DDS.data_space_id
go

select distinct
    NonPartitionedTable = OBJECT_NAME(SI.object_id),
    DS.data_space_id,
    DF.physical_name
from sys.indexes AS SI with (nolock)
join sys.data_spaces AS DS with (nolock)
    on DS.data_space_id = SI.data_space_id
join sys.database_files AS DF
    on DF.data_space_id = DS.data_space_id AND DF.type = 0
where DS.type = 'FG'
    and OBJECTPROPERTYEX(SI.object_id, 'BaseType') = 'U'
    and SI.index_id IN (0, 1)
    --and DS.data_space_id IN (65606, 65607)
order by DS.data_space_id
go

Tuesday, March 20, 2012

In SharePoint 2010 Integrated Mode, Reporting Services Does Not Display a Hyperlink for a Text Box with a Go to URL Action

When using Reporting Services in SharePoint integrated mode, a text box label with a Go to URL action will not render as a hyperlink, even for VALID internal links.

Notice that the links work in Visual Studio preview mode for the report. It is only after you deploy the report that all valid internal links are stripped.

Here is a solution (someone else's idea).

While designing the report:

Open up the text box properties window
Go to Action
Enable Jump to URL
In the URL, enter ="javascript:void(window.location.href='http://xxx/xxx/xxxxx.docx')"

Friday, March 9, 2012

Using an HTTP Module to assist in adjusting the value of aspnet:MaxHttpCollectionKeys imposed by MS11-100

After SharePoint was upgraded to 2010, one of the SSRS reports stopped working because the number of parameter values is limited to 1000. To resolve this, follow the steps below.
http://blogs.msdn.com/b/paulking/archive/2012/01/16/using-an-http-module-to-assist-in-adjusting-the-value-of-aspnet-maxhttpcollectionkeys-imposed-by-ms11-100.aspx

Main Points.

1. Build MS11-100-Helper.dll (download from the link above) and place it in your web application’s Bin directory.

2. Add the following to your web.config file to enable the HTTP Module

<system.web>
  <httpModules>
    <add type="MS11_100_Helper.CounterModule, MS11-100-Helper" name="CounterModule" />
  </httpModules>
</system.web>

3. Add the following to the appSettings section of your web.config file:


<appSettings>
  <add key="aspnet:MaxHttpCollectionKeys" value="9999" />
</appSettings>