
Friday, October 28, 2011

Easy performance burst for large data

This week I started importing data from a new source. That means I take in around 5-6 times as much data as before, going from around 10 million (10M) records a day to almost 60M. Lucky me, this is still in development, so it's possible to use all the tips and tricks I learned at SQL PASS this year.

I found a couple of things that gave me a performance boost quite cheaply. Data types! I have always just used int, datetime and float regardless of the actual size of the data. I never gave it much thought - we just bought bigger disks.

But at PASS I attended several sessions and discovered two main issues with that approach: memory spills (I think that was the buzzword) and pages. Never mind, I will explain.

Most operations on SQL Server start out in memory, but when the data gets too large to keep there, it spills down to the hard drive in tempdb or the OS paging file. That makes a huge difference in speed, as hard drive performance is drastically reduced compared to RAM. The lesson learned: keep data as small as possible, which postpones or reduces the times when data has to go to disk to be sorted, joined or whatever.

I also realized another thing which I actually knew but didn't put into this context (shame on me). Table data is stored on disk in 8KB pages, and the server always reads a full page at a time. So if you have a record below 8KB, the server only needs to read that one page to access the record. But if the record were 8.1KB (well, the max allowed for a row is 8060 bytes, but stay with me) it would need to read two pages. And as IO to disk is the slowest thing we have on the server, the more pages it needs to read/write every time, the slower it is. It's very easy to see how much IO you actually use by turning on statistics io. Let me give you a simple example.

Create a table with a char(8000) column just to show the point:


create table test(c1 char(8000))

Then turn on statistics: set statistics io on
Run the following: insert test values('horse')
And then: select * from test
In the message window you should see:

Table 'test'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The thing to look for in this case is "logical reads 1", which means the server had to read one page.
Insert another row: insert test values('cow')
And then: select * from test
This time you will see that logical reads says 2! It had to read two pages.

Drop the table (drop table test) and then let's reduce the record size by half.

create table test(c1 char(4000))

Run the two inserts again and do a select between each of them. You should now see that logical reads stays at 1 even after two inserts. If you insert one more record it will go to two reads.

Well, it is not often we can reduce a record by this much. But if you have an int field which is only ever used for storing 1 to 10, then a tinyint would do just fine and give you 3 bytes back per row. In one of my tables I found that by using tinyint, smallint and smalldatetime instead of int and datetime I could reduce the record by 54 bytes. In round numbers that means I save a page for every 150 records. And in a table with almost 1,000,000,000 records that is quite a lot :) Almost 50GB less data.
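
Here is a minimal sketch of the kind of change I mean - the table and column names are made up for illustration, not my real schema:

/* Hypothetical wide version - types picked out of habit */
create table PriceFactsWide
(
    ExchangeId  int,       -- only ever holds 1 to 10
    PriceTypeId int,       -- only ever holds 1 to a few hundred
    TradeDate   datetime,  -- the time part is never used
    Price       float
)

/* Same data with types that match the actual content */
create table PriceFactsNarrow
(
    ExchangeId  tinyint,       -- 1 byte instead of 4
    PriceTypeId smallint,      -- 2 bytes instead of 4
    TradeDate   smalldatetime, -- 4 bytes instead of 8
    Price       real           -- 4 bytes instead of 8, if the lower precision is acceptable
)

You can check what you actually gained afterwards with sp_spaceused on the table.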

Of course I never get all those records at once. But I always get way more than 150 records at a time, and this happens a lot. So that's a lot of IO reduction - for free.

I found another table which had 9 ValidSomething columns, each 0 if that condition was not valid and 1 if it was. They were all int. That is 9x4 bytes (36), which can be reduced to 2 bytes. If a column only holds 0 and 1 (what a developer would call a boolean) then it can be stored as a bit column. If you only have one bit column in a record, tinyint and bit would be the same, namely 1 byte. But the server stores up to 8 bit columns in one byte, so with 9 bit columns only 2 bytes are needed. And the smart thing is that adding 7 extra bit fields later will not cost me anything - I like that.
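
A small sketch of that change too - again the column names are made up for the example:

/* 9 flags as int = 36 bytes per row just for the flags */
create table ValidationFlagsInt
(
    RecordId   int,
    ValidPrice int, ValidDate int, ValidIsin int,
    ValidVenue int, ValidCcy  int, ValidTime int,
    ValidSize  int, ValidSide int, ValidFeed int
)

/* Same 9 flags as bit columns - SQL Server packs up to 8 bit columns per byte,
   so these take only 2 bytes per row */
create table ValidationFlagsBit
(
    RecordId   int,
    ValidPrice bit, ValidDate bit, ValidIsin bit,
    ValidVenue bit, ValidCcy  bit, ValidTime bit,
    ValidSize  bit, ValidSide bit, ValidFeed bit
)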

So when working with large data, every byte counts. I am currently redesigning my data collection flow. I tried to load the new daily data burst the old way, all at once. It took 13 hours. By reducing the data size, and doing some T-SQL redesign as well, I got it down to around half an hour. Not bad for a couple of days' work.

Maybe I should optimize myself - at that rate I would only need to go to work half of Monday and my 40+ hour week would be over :) But where is the fun in that, right? :D

I will return with a few query "not to do" tips I discovered when rewriting some of the code.
Happy optimizing,

Thursday, October 20, 2011

Do it yourself perfmon

At SQL PASS I attended Adam Machanic's one-day pre-con class.


I really liked his well-structured way of storing the recorded samples. At the risk of being #sqlsue'd, I will try to lay out his idea for the database structure.

He has two time dimension tables. Dates holds all dates from today until 2020, one record for each date with a unique dateid (int) and then all the components of the date as separate columns: month, day, year, quarter, week, day number in week and year, and so on. The same goes for the time table, which holds a record for every 15 seconds of a day. And the smart thing is that every time has been marked with a segment of the day, like morning, busy period and so on, as well as a 15-minute segment.

All recorded data is then stored with a dateid and a timeid. This means that you can easily see how all parameters have performed at exactly this time of day over the last month, to see if today or any other day behaves abnormally. Or you can see how the last day of the month behaves compared to all the other last days of the month. And so on.
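
A rough sketch of how I picture that layout - these are my own table and column names, not Adam's exact schema:

/* Date dimension - one row per calendar date */
create table Dates
(
    DateId       int primary key,
    CalendarDate date,
    [Year]       smallint,
    [Quarter]    tinyint,
    [Month]      tinyint,
    [Day]        tinyint,
    WeekOfYear   tinyint,
    DayOfWeek    tinyint
)

/* Time dimension - one row per 15-second slot of the day */
create table Times
(
    TimeId        int primary key,
    TimeOfDay     time(0),
    QuarterOfHour smallint,    -- the 15-minute segment, 0 to 95
    DaySegment    varchar(20)  -- 'morning', 'busy period' and so on
)

/* Every recorded counter value just references the two dimensions */
create table CounterSamples
(
    DateId    int,
    TimeId    int,
    CounterId smallint,
    Value     float
)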

I have done a lot of PerfMon recording, but that time layout never struck me. I always had to deal with slow datetime extraction and BETWEEN filters. So that I really liked.

He also had a beta version of his data collector, which can monitor and store data in that format. It works very well, but it wasn't that good with a lot of instances on the same server - which I have. You could set it up to use instances, but that meant adding a counter for each server instance instead of just adding the instances and having the collector pick them up. Adam will add this in an upcoming version.

But I thought it would be fun to make the data collector myself. That is actually my main line of work: programming data collectors for everything from radars to stock exchanges. So this should be fun.

Actually it's quite easy to get perfmon data from the server using C#. 

// Create the counter (read-only) for the given category, counter, instance and machine
PerformanceCounter Counter = new PerformanceCounter(CategoryName, CounterName, InstanceName, MachineName);
// "Prime the pump" by getting the first value
Counter.NextValue();

PerformanceCounter is part of System.Diagnostics and you just give it:
  • Category like "PhysicalDisk"
  • Counter like "Avg. Disk sec/Read"
  • Instance like "1 C:"
  • Machine like "YourServer"
Note that the instance here has nothing to do with a SQL Server instance.


You can just use PerfMon to see the categories, counters and instances.


Then every time you call Counter.NextValue() you get the current value.

Create as many counters as you like. I have 192 counters in total for all my instances (4), and it takes about 0.2 seconds to sample them. Sending the data to the database brings it to around 0.4 seconds all in all, every 15 seconds, for the counters plus a few wait stats, file stats and query stats queries.


If you want to get all the instances for the counters for a given category use


PerformanceCounterCategory cat = new PerformanceCounterCategory(CategoryName, MachineName);
string[] instanceNames = cat.GetInstanceNames();


Note that all counters in the given category always have the same instances.

If you don't want to go into more complex thread programming, you could just use a Timer with a callback function that fires every 15 seconds, and you are on your way.

No matter how you do it - it's a really good idea to have a baseline. Because when things go bad, you can actually see where they go wrong just by comparing against your baseline. PerfMon doesn't tell you how things usually are, only how they are now. And you need at least a month of good sampling data to establish a good baseline. So better get started.
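
With the dimension layout sketched above, a baseline comparison can be as simple as something like this (again just a sketch against my made-up tables, with a made-up CounterId):

/* Compare today's values with the average for the same time slots over the last month */
select t.TimeOfDay,
       avg(case when d.CalendarDate =  cast(getdate() as date) then s.Value end) as Today,
       avg(case when d.CalendarDate <> cast(getdate() as date) then s.Value end) as BaselineAvg
from CounterSamples s
join Dates d on d.DateId = s.DateId
join Times t on t.TimeId = s.TimeId
where s.CounterId = 42  -- whichever counter you are looking at
  and d.CalendarDate >= dateadd(month, -1, cast(getdate() as date))
group by t.TimeOfDay
order by t.TimeOfDay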

But the real genius is in storing the records with that well-segmented date and time breakdown. That really gives you a heads-up on exactly how everything usually looks at this time of day.

So thanks to Adam for sharing his database layout and his hints on baselining.

Saturday, October 15, 2011

CTE Christmas tree

I just went to Audrey's T-SQL Awesomeness session at SQL PASS for some enthusiastic pointers on writing awesome and cool code :) I really loved seeing how excited she was about writing readable, functional, but smart-looking code.

She talked mostly about CTEs (Common Table Expressions) and showed how to make them recursive. She had a simple example which produced 25 records, each with one more + than the last, using a recursive CTE. She said she wanted to make a Christmas tree but couldn't figure out how. So I thought I would give it a go :)




with ChristmasTree (RowNumber, TreeData) as
(
    /* Select the top - this is the anchor */
    select 0 as RowNumber,
           cast(replicate(' ', 25) + '*' + replicate(' ', 25) as varchar(51)) as TreeData
    union all
    /* Select the rest from itself until we have 25 rows (plus the top).
       Each iteration adds one more star and drops one space on each side.
       This is the recursive member */
    select RowNumber + 1 as RowNumber,
           cast(replicate(' ', 25-RowNumber) + replicate('*', RowNumber+1) + replicate(' ', 25-RowNumber) as varchar(51)) as TreeData
    from ChristmasTree
    where RowNumber < 25
)
/* Statement to fire it all up */
select TreeData from ChristmasTree



The last statement is the one that fires up the recursive call. It selects from ChristmasTree (the CTE).

The first statement in the CTE is the anchor. It is needed to lay the foundation for the recursion.

The statement below it is the recursive member. What it really does is send RowNumber and TreeData on to the next recursive call, so each call has a RowNumber one higher than the last. This keeps happening until RowNumber hits 25.

If you remove the WHERE part it will not keep going forever - it stops with an error after 100 iterations, which is SQL Server's default recursion depth. You can change that limit with OPTION (MAXRECURSION n) up to a maximum of 32767, or use 0 for no limit at all.

But this is a truly powerful tool for getting, for instance, all children of a parent through multiple levels - see the sketch below.
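
For example, here is a sketch of walking an imaginary employee hierarchy - the table, names and data are all made up:

/* Hypothetical self-referencing table */
declare @Employees table (EmployeeId int, ManagerId int, Name varchar(50))
insert @Employees values
    (1, null, 'CEO'),
    (2, 1, 'Manager A'), (3, 1, 'Manager B'),
    (4, 2, 'Dev 1'), (5, 2, 'Dev 2'), (6, 3, 'Dev 3')

;with Reports (EmployeeId, Name, Level) as
(
    /* Anchor: the parent we start from */
    select EmployeeId, Name, 0 from @Employees where EmployeeId = 2
    union all
    /* Recursive member: everyone reporting to someone already in the result */
    select e.EmployeeId, e.Name, r.Level + 1
    from @Employees e
    join Reports r on e.ManagerId = r.EmployeeId
)
select EmployeeId, Name, Level from Reports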

As Audrey would say: that is awesome :)

Sunday, October 9, 2011

Ready for PASS but who to meet?

We arrived in Seattle yesterday after traveling for 19 hours straight. After checking into The Paramount Hotel we went to Fox Sports Grill. They had everything a guy is looking for: football on TV, steaks on white toasted bread and a beer :) Great evening.

Normally I feel very uncomfortable meeting new people, so I try to avoid it. But I have been reading up on all the great tips for the SQL PASS Summit, and what I see over and over again are the words networking and meeting new people. So I decided that to get the most out of this, I will try to meet some new people each day.

And to ensure that I at least meet some new people, I will put it in writing that I will at least try to meet the following people:

Brent Ozar
This guy really kicked off my interest in diving deeper into SQL Server. I attended one of his online webcasts on how to take over a SQL Server, and I was so excited after hearing his enthusiasm about it. I think they were three guys, and they really had a lot of fun. I was sold, and he might be the reason why (besides my company, who sent me) I'm here today. He kick-started things for us. So this guy I want to meet.

Jes Borland
I joined Twitter to figure out more about this Summit and to try to get to know some people beforehand. And this girl is hard to miss on Twitter :) She is a sheer energy bomb. Even though she is quite young, she has already done so much. Actually, I think she is running her first half marathon today. I can relate to that - I remember every inch of my first half marathon a couple of years ago. This girl I just have to meet.

Pinal Dave
For many years I have used a script to find unused indexes. That blog post has been the first bookmark in my browser's bookmark bar, so it gets used quite a lot. I found out that the post - well, the entire blog - is written by Dave, and he is coming to the Summit as well. So it would be kind of fun to get a chance to say thanks and let him know what an important part of my DBA life that one post has played.

Jason Strate
Well, have you seen his pre-con videos? :) I'm still laughing just thinking about them :D He and his wife are very dedicated to the SQL Server community. I like meeting people like that, who burn with such passion for their "beliefs" :) And then he is from Minnesota. I lived there for a year (go Vikings :) - I even brought my Vikings Brett Favre jersey). I have looked back on that time in MN with a lot of joy for the last 26 years. So him and his wife I would like to meet.

And many more...
There are lots of other people I would like to meet, but let's start here. And if you see me at the Summit and are not afraid of new people like I am - please come and say hi. You can find me on Twitter as well.

Now it's time for a hot bath and some breakfast before we go out to see Seattle. I need to get a converter for my electronic equipment, and then I would like to see if it's possible to buy a small mobile data package just for this week. The iPhone is best with data, but on our Danish mobile provider it costs the tip of a fighter jet (so my boss says) when we are over here.

Friday, September 30, 2011

SQL PASS Summit plans

I think I have my execution plan for the Summit ready. My co-worker and I will leave Denmark on Saturday the 8th of October around noon - yes, that's only a week from now :) Yeahhhhhhh
We fly to Amsterdam and from there to Seattle, where we will arrive a couple of hours later - not considering the time difference :) So it will only be around 5pm when we arrive at SEA airport.

We are staying at The Paramount, within walking distance of the Summit - even after a few drinks.

I was hoping to catch a Seahawks game that weekend, but unfortunately they play away. I'm a big fan of American football (called that here since we call soccer football :)), so it would have been nice to see a live game. Even though I'm a Vikings fan - that would have been great. Anybody know if there is a college football game nearby that weekend? The Washington Huskies aren't playing the weekend we arrive.

Well, I guess we will just take care of the jet lag on Saturday and maybe get a steak :) I love steaks. But on Sunday it will be time to see Seattle. We both know another great Dane who is coming to the Summit, and I would love to do some stuff with him on Sunday. Maybe a look at one of my troubled SQL Servers over drinks :) See, that would be a nice day!

As this is my first trip to the Summit, I don't really know what I (we) will be doing in the evenings. But I like to mingle, so I will figure out something to do. Hopefully I'll find someone who wants to talk about SQL stuff over a couple of whiskeys - my two top choices. You too? Then we have a play date lined up.

From Monday it's all about the summit. My current plan looks like this:


Monday, October 10, 2011
8:30 AM - 4:30 PM  [AD-498-P] Fixing Queries With Advanced T-SQL Constructs  [RM: 609-610]
Tuesday, October 11, 2011
8:30 AM - 4:30 PM  [DBA-399-P] No More Guessing! An Enlightened Approach to Performance Troubleshooting  [RM: 4C1-2]
Wednesday, October 12, 2011
10:15 AM - 11:45 AM  [DBA-311-S] Bad plan! Sit!  [RM: 6E]
1:30 PM - 2:45 PM  [DBA-406] Tips for handling large data volumes  [RM: 4C1-2]
4:45 PM - 6:00 PM  [DBA-320] Why are we Waiting..  [RM: 606-607]
Thursday, October 13, 2011
10:15 AM - 11:30 AM  [DBA-403] Advanced SQL Server 2008 Troubleshooting  [RM: 6E]
1:30 PM - 2:45 PM  [DBA-307] 100% Availability without 100% Uptime  [RM: 2AB]
3:00 PM - 4:15 PM  [DBA-319-C] All the Magic Knobs  [RM: 2AB]
5:00 PM - 6:15 PM  [DBA-317] Discovering the Plan Cache  [RM: 602-604]
Friday, October 14, 2011
10:15 AM - 11:30 AM  [DBA-305] Inside the SQL Server Transaction Log  [RM: 602-604]
1:00 PM - 2:15 PM  [AD-305] Rewrite Your T-SQL for Great Good!  [RM: 615-617]
2:30 PM - 3:45 PM  [DBA-207] Baseline Basics: Where to Start, What You Need  [RM: 4C1-2]
4:15 PM - 5:45 PM  [AD-308-S] T-SQL Awesomeness - 3 Ways to Write Cool SQL  [RM: 6E]


This is my priority right at this moment, anyway.

And on Saturday at lunchtime we go back :( I guess they plan on moving Europe during that week, because the flight back to Denmark takes almost 24 hours - we won't be home before Sunday afternoon :) Well, I might have to learn something about time zones.

Thursday, September 29, 2011

Let the little ones join up first

The order of tables in a join - third blog in my series "From newbie to DBA"


In my last blog I promised to look at a simple way to use the execution plan. That will have to wait, as I just remembered another important tip I found some time ago that easily boosts performance when handling large amounts of data. And it leads up to one of the things I learned to look for in the execution plan.

It's kind of in the same ballpark as my last post. It's all about giving SQL Server the best information to work with.

When you join a large table and a smaller one, the order of the tables does matter. If I for instance write this query:

select * from BigTable a
inner join SmallTable b on a.Key1 = b.Key1 and a.Key2 = b.Key2

then everything is run based on BigTable. Back to the selectivity I talked about in the last post: you start out with a large amount of data and then narrow it down based on the small amount, so you have a large amount of data to match up.

But if you instead use SmallTable as the base:

select * from SmallTable a
inner join BigTable b on a.Key1 = b.Key1 and a.Key2 = b.Key2

In this case you only have a small amount of data to match up, and you have already reduced the amount of data SQL Server has to go through in order to give you the result.

It's all about high selectivity. You can see right away in the execution plan whether your query uses small over large - more on that later. But you should always keep in mind that the fastest solution is high selectivity. You can also compare the two versions yourself with statistics io, as in the sketch below.
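
A quick way to compare the two versions on your own data is to run both with statistics io turned on and look at the logical reads for each table (BigTable and SmallTable are just placeholder names):

set statistics io on

select * from BigTable a
inner join SmallTable b on a.Key1 = b.Key1 and a.Key2 = b.Key2

select * from SmallTable a
inner join BigTable b on a.Key1 = b.Key1 and a.Key2 = b.Key2

set statistics io off

/* Check the logical reads reported in the Messages window for each table,
   and compare the two actual execution plans as well */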

Decimal reduction with ease

A co-worker of mine asked me today about the best way to get some percentages output with just two decimals - like 0.00 up to 100.00.

Well, the simple answer is just casting the output to decimal(5,2). This decimal type can hold five digits in total, two of which are after the decimal point.

So select cast(100 as decimal(5,2)) will output 100.00

A simple example is here:


set nocount on
/* Make a demo table holding the numbers */
declare @numbers table(num float)
/* Counter and a stop number */
declare @num float = 1, @maxnum float = 10000
/* Insert the counter and increase it by something until we reach the stop number */
while @num < @maxnum
begin
    insert @numbers select @num
    set @num = @num + 12.3456789
    print @num
end
/* Select the output and show the casted decimal(5,2) number */
select num as Number,
       num/@maxnum*100 as FloatNumber,
       cast(num/@maxnum*100 as decimal(5,2)) as DecimalNumber
from @numbers
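
One thing to watch out for (my addition, not part of my co-worker's question): decimal(5,2) tops out at 999.99 and the cast rounds, so make sure the values really stay in the expected range.

select cast(99.996 as decimal(5,2))  -- rounds up to 100.00
select cast(100 as decimal(5,2))     -- 100.00, the top of the expected range
-- select cast(1000 as decimal(5,2)) would fail with an arithmetic overflow error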

Wednesday, September 28, 2011

Select me first please

The order of columns in an index - second blog in my series "From newbie to DBA"


Writing this new blog really gets me thinking "oh, there is a long way to go before I can call myself a real DBA". There are still a lot of things I need to learn. But as long as I keep learning, I'm happy :)

I started this blog as inspiration in the run-up to the SQL PASS Summit 2011 in Seattle, which I'm really looking forward to. My goal is to share lots of tips and tricks from one newbie to another, trying to use terms and descriptions that are easy to understand and implement. So if anyone finds this useful, please comment.

Yesterday I talked about clustered indexes and how the order of the cluster has an impact on inserts/deletes on heavy tables. Back when I did this myself, I found out another thing which I wasn't aware of but which really is quite logical when you think about it: selectivity, which can be categorized as high or low.

Just a little background info on indexes before I get to the real point.

Imagine you have a group of 1000 people standing in front of you, and from that crowd you have to select all women with brown hair at the age of 40 - like 3 fields: Sex, HairColor and Age. You could run around in the crowd and ask each person about the 3 parameters, but that takes time. So you put up an index, which is just like having the crowd stand in a certain order. If I made my index Sex, HairColor and Age, then I could go to the middle of the crowd and ask "are you a woman?" (and then I would get slapped in the face :)). If she was, I could then ask for the hair color; if not, I could go to the middle of the remaining half and keep doing that until I found a woman. I know they are ordered Man then Woman, and Black, Blond, Brown for hair, and so on. That way, using the index, I don't need to ask everybody in order to select the people I want.

But I can do even better. With the order I have made in the above "index" I only de-select around half, namely the men. So when I get to hair color I still have to search among 500 women. The selectivity of the Sex field is low. But if I started with the hair color, and let's say we distinguish 5 different colors, then I only have to find the women among 200 people, assuming there are equally many with each color.

And I can do better yet, as it turns out that the selectivity is high for Age. Potentially only 10 of them (ages 1-100) will be 40 years old, and of those 10 only 2 will have brown hair and only 1 will be a woman :) Well, I know this is unlikely, but the point is that by using the high-selectivity field Age first in my index and HairColor second, SQL Server will find its way to the rows I want faster. So the index for this crowd should be Age, HairColor and Sex - uhm, I guess my wife is pretty effective then - she always has sex as the last choice as well :)

BTW: I think the statistics that SQL Server gathers are used to figure this out as well. I have to look into that. See, it is helpful to write down your knowledge - it gets you thinking.

When I first learned about selectivity and the order of fields in an index, I wondered if I had to switch the order of the fields in my WHERE clause as well. But no - as long as your WHERE part only uses equality (the = sign), SQL Server figures this out.

But if you use a non-equality operator like <>, then it's another story, because it has to check all the remaining rows to see whether they differ from the <> value.

Let's go back to our example. If the index is Age, HairColor and Sex, then I would be fine writing:
select * from crowd where Age = 40 and HairColor = 'Brown' and Sex <> 'Man'. Using our index and our simple assumption about the distribution of the crowd, when we have found the age and hair color there are only two people left to ask.

But if, with the same index, I ask for select * from crowd where Age <> 40 and HairColor = 'Brown' and Sex = 'Woman', then the index is not very helpful and SQL Server has to ask (scan) everybody what their age is, and then ask the remainder about color and sex.

So if this is a select statement you use a lot, then a HairColor, Sex and Age index would be better - something like the sketch below.
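
Here is a small sketch of the crowd example - the table and index names are made up for illustration:

/* Hypothetical crowd table */
create table Crowd
(
    PersonId  int identity primary key,
    Sex       varchar(10),
    HairColor varchar(10),
    Age       tinyint
)

/* Most selective column first - good for the query that uses = on Age */
create index IX_Crowd_Age_Hair_Sex on Crowd (Age, HairColor, Sex)

/* Alternative order if the Age <> 40 style query is the frequent one */
create index IX_Crowd_Hair_Sex_Age on Crowd (HairColor, Sex, Age)

select * from Crowd where Age = 40 and HairColor = 'Brown' and Sex <> 'Man'
select * from Crowd where Age <> 40 and HairColor = 'Brown' and Sex = 'Woman'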

So the lesson I learned was: when designing clusters and indexes, consider three things first. How does the data arrive - is it like the prices for today in the previous blog? How do I use the data - which selects do our systems use to get at it? And finally, what is the selectivity of the different fields I'm considering for the cluster and/or indexes? These are actually very simple steps towards better performance.

I think in my next post I might tell you a little about how I found out that the execution plan can do wonders in showing where the bottlenecks are. It will also give you a hint if your indexes need to be looked at.

BTW, only 10 days until I leave for the SQL PASS Summit :)

What a clusterfuck

Clustered index on large tables - first blog in my series "from newbie to DBA"

Well, I guess it's not nice to start your blog by swearing :) But as a former soldier, clusterfuck is the correct term for my use of clusters on SQL Server.

Up until a year ago I didn't really know what a clustered index did or was. But I was about to find out.

As mentioned in my first blog, our data has grown fast during the last couple of years. I'm managing a stack of SQL Servers used to collect data from stock exchanges. My main problem was the table holding the prices.

This table had close to 900,000,000 records. Eight of the columns in the table are key fields. The order of those keys was nearly random, as they were set up in logical reading order. And I couldn't understand why inserting around 1,000,000 new records every day took so long - until I understood the concept of the clustered index.

So if you have the same problem, this is where you put the smile back on your face. If you don't have that problem, then the smile should already be there :)

When you create a table in SQL Server Management Studio (SSMS) using the design editor, you don't really pay attention to what happens behind the scenes. And as you can see from my CV - in most cases you actually don't need to. I made a good living letting SSMS do the hard work. But when the complexity increases and you need to find that DBA hat, you are a bit lost.

This was when I first learned, the hard way, to use scripts for everything. I still often use SSMS, also for designing my tables, but instead of pressing the OK button I always click the little script icon. And I always make sure I totally understand what the script does.

But back to the cluster. So what happens when you design a table, select some keys and press the OK button? Let's take a simple version of my price table. For those prices I need an id (called ISIN), a trade date and a value. So for me the logical order is like I just described. Then I select ISIN and TradeDate as keys and I'm done. The table is created just like that, and because I have selected the two keys the table gets a clustered index in the order ISIN, TradeDate. See the scripts at the bottom of the post.

So when I do select * from Prices where ISIN = 'DK00000111' and TradeDate = '20110927', it matches right up with the index, and because it's a clustered index the data is physically stored in order of ISIN, TradeDate, so to speak. That's all very nice and quick.

Until.... you have a lot of data. See what happens when I store today's prices each day. I have around 500,000 different ISINs and, let's say, one TradeDate. And because the cluster says the data should be ordered by ISIN, TradeDate, SQL Server does a lot of moving around to get those 500K prices into place.

The solution is simple: I should decide the order of the cluster myself. And if I switch the two keys around and use the order TradeDate, ISIN instead, what would that mean? Well, then I don't move data around but insert today's prices directly at the end, ordered by ISIN. And now, instead of spending an hour on a few million records, I spend minutes. And it does not affect my select * from Prices where ISIN = 'DK00000111' and TradeDate = '20110927', as it can still use the clustered index. See the changed script at the bottom of the post.

So what's important when designing tables that will hold large amounts of data is to design the clustered index to hit the best trade-off between inserts and selects for your workload. Especially if it's a detail table, which often has more than one key - the natural order of the columns is almost never the natural order of the cluster. As for the selectivity of the index, I'll write a blog about that soon - I just found out a few weeks ago that this can be optimized by using a certain order of the columns in the index :) More about that later.

As I mentioned, in my real live table for prices I have 8 columns as key fields. And sadly enough my prices don't come in for just one day, as I get prices from all over the world. Some prices may be 5-10 days old when I get them. So there was still a lot of moving going on.

That led me to think about why I had the keys on at all - and why the cluster? For this table we rarely do selects, as all the prices we use for calculations are in an up-front table which only holds the last week of prices for quick processing. So I only used the clustered index to ensure that records were unique, and so that a rare select would not take ages.

That means I didn't really need the cluster. I could ensure the uniqueness and the selects with a normal non-clustered index. So I removed the cluster and just made a non-clustered index. And my insertion of new prices is faster than ever, with no data moving or ordering going on. Straight onto the heap and away we go. And the selects were still fast, as the cluster didn't matter much for them anyway. The many columns in the cluster and the nature of the data meant that SQL Server couldn't just stream data in from the order of the cluster; it had to go digging all the same.

I guess I could have made a non-clustered unique constraint to ensure the uniqueness, but I figured that a unique non-clustered index did the same trick and gave me an index all in one.
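
For the simplified Prices table in the scripts below, that change would look roughly like this (a sketch, not the exact script I ran on the real table):

/* Replace the clustered primary key with a unique non-clustered index,
   so the table becomes a heap and new rows are simply appended */
ALTER TABLE dbo.Prices DROP CONSTRAINT PK_Prices
GO
CREATE UNIQUE NONCLUSTERED INDEX IX_Prices_TradeDate_ISIN
ON dbo.Prices (TradeDate, ISIN)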

This is a very simple explanation of clustered indexes and what to do about them. But the best lesson this gave me was: always go the script way. Below you see the sample table from above, the SSMS way and the way I would create it.

The SSMS way


CREATE TABLE dbo.Prices
(
    ISIN varchar(12) NOT NULL,
    TradeDate datetime NOT NULL,
    Value float(53) NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.Prices ADD CONSTRAINT
PK_Prices PRIMARY KEY CLUSTERED
(
    ISIN,
    TradeDate
) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]


My way 


CREATE TABLE dbo.Prices
(
    ISIN varchar(12) NOT NULL,
    TradeDate datetime NOT NULL,
    Value float(53) NULL
)
GO
ALTER TABLE dbo.Prices ADD CONSTRAINT
PK_Prices PRIMARY KEY CLUSTERED
(
    TradeDate,
    ISIN
)

Tuesday, September 27, 2011

From newbie to DBA


I was originally a software engineer specializing in data acquisition. I have written programs for collecting data from everything from military radar systems to stock exchange prices.
When capturing large quantities of live data you need fast and safe storage, and for that I have always - since 1994, anyway - used MS SQL Server. A SQL server (of any brand) lets you throw a lot of data at it without having to worry about the rest.
When the databases got bigger, we just bought a bigger server. It was cheaper than me spending time optimizing and understanding how to really utilize the true power of SQL Server.
But about a year ago the data caught up with me. On the project I have been working on since 2006, the amount of data is now so big that my time is cheaper than a new server. I figured a raise would be in order to even things out, but my boss didn't agree :)
So now I needed to figure out how to really use SQL Server - after having abused it for 16-17 years. I needed to become a DBA - I think that's the right term :)
In this blog I will try to describe the steps I went through, and am still going through, in order to get better performance and to gain the knowledge to become a DBA.

In 11 days I'm going to the SQL PASS Summit in Seattle, and this trip really put action behind an old dream of writing a blog. I need a place to collect my ideas and troubles anyway, so why not share them with you :)