
Author Topic: Workflow server  (Read 4572 times)


sdweim85

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 98
  • Last login: September 16, 2014, 09:30:03 am
Workflow server
« on: March 18, 2013, 07:21:00 pm »
The company I work for asked my opinion on an ongoing project that is bogging the server down tremendously.  The way our network is set up, everyone has their own desktop and pulls their work from a file share on a server that houses all the projects.  They basically open up 1,000 thumbnail JPEGs and look for mistakes.  The thing is, once more than 6 people start viewing thumbnails, everything slows to a crawl.  Everyone is on a gigabit connection with i3 desktops, and the server is a 3.40 GHz Xeon with 16 GB of RAM running Server 2008 R2.

A few thousand dollars' worth of equipment and it can't handle more than 6 people viewing thumbnails over the network at once?  How do mega corporations get by?  Is it all hardware, or is there something I'm missing, like having users log into a virtualized server through Hyper-V to do their work?  That way they'd be using the server's hardware instead of relying on network bandwidth to view the images.

kahlid74

  • Trade Count: (+1)
  • Full Member
  • ***
  • Offline
  • Posts: 1366
  • Last login: January 01, 2021, 12:42:56 pm
  • Gaming for a better future!
    • GamersAnon
Re: Workflow server
« Reply #1 on: March 19, 2013, 10:19:51 am »
You've got a lot going on here, but you're still limited to gigabit Ethernet (1000 Mb/s), which translates into roughly 120 MB/s.  Realistically, 6 computers can max that out if they're all transferring at the same time.

Can you give more details, specifically about your application and how it accesses the thumbnails?  Or are users just opening a folder and letting the thumbnails populate?
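To put rough numbers on that, here's a quick back-of-the-envelope sketch in Python.  The thumbnail size and protocol-overhead figures are assumptions; only the 1,000-thumbnails-per-user figure comes from the OP.

Code: [Select]
# Assumed numbers, not measurements: a gigabit link tops out around
# 125 MB/s raw, or roughly 110-120 MB/s of real payload after overhead.
LINK_MBPS = 1000          # server uplink, megabits per second
EFFICIENCY = 0.94         # assumed TCP/SMB overhead factor
THUMB_KB = 150            # assumed average thumbnail size
THUMBS_PER_USER = 1000    # from the original post

usable_mb_per_s = LINK_MBPS / 8 * EFFICIENCY
for users in (1, 3, 6, 10):
    per_user = usable_mb_per_s / users
    seconds = (THUMBS_PER_USER * THUMB_KB / 1024) / per_user
    print(f"{users:2d} users: ~{per_user:5.1f} MB/s each, "
          f"~{seconds:5.1f} s to pull {THUMBS_PER_USER} thumbnails")

At 6 simultaneous users that's under 20 MB/s each if the wire were the only limit; whether that alone explains the crawl is exactly what the rest of the thread digs into.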

kegga

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 21
  • Last login: December 10, 2024, 08:43:43 am
Re: Workflow server
« Reply #2 on: March 19, 2013, 10:48:33 am »
There could be a lot of potential bottlenecks here:
Storage: are the files on a SAN or on disk local to the server? What RAID level, and what speed are the disks? If they're on a SAN, are you seeing any read latency issues?
Network: how are the switches configured? Is it true 1 Gb or a shared backplane? Are there any speed/duplex mismatches between the clients and the switch ports?
As you can see, there are a lot of things to check before you go making major changes.
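One cheap way to check the read-latency side of that list is to time a sample of thumbnail opens over the share from a client.  A minimal Python sketch, assuming the share is reachable at a UNC path; the path and sample size below are placeholders, not details from the thread.

Code: [Select]
import os, random, time

SHARE_PATH = r"\\server\projects\job_0001"   # hypothetical UNC path
SAMPLE = 50                                  # how many files to spot-check

files = [os.path.join(SHARE_PATH, f) for f in os.listdir(SHARE_PATH)
         if f.lower().endswith(".jpg")]
for path in random.sample(files, min(SAMPLE, len(files))):
    start = time.perf_counter()
    with open(path, "rb") as fh:
        data = fh.read()
    ms = (time.perf_counter() - start) * 1000
    print(f"{len(data)/1024:7.0f} KB in {ms:6.1f} ms  {os.path.basename(path)}")

Run it once while the office is idle and again while everyone is browsing; a big jump in per-file times with low network utilisation points at the disk rather than the wire.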

MonMotha

  • Trade Count: (+2)
  • Full Member
  • ***
  • Offline
  • Posts: 2378
  • Last login: February 19, 2018, 05:45:54 pm
Re: Workflow server
« Reply #3 on: March 19, 2013, 11:22:37 am »
If I had to guess, you're bottlenecked on disk IO on the server.  I've got 6 Xen domains running on a server nowhere near that capable CPU-wise (and less memory, and probably less memory bandwidth, too), and my bottleneck is ALWAYS the darned spinning metal.  Lots of simultaneous requests for distinct files like you describe will easily get you well into seek hell.  Throw an SSD at the thing if you can, and your problems will probably get better.

Failing that, upgrade the server's link to the LAN to 10Gb or 4xGbE.  Make sure you've got a real switch, not some crummy Linksys thing.

Of course, real stats are always king.  Instrument the heck out of the setup and see where your problems actually are.

Malenko

  • KNEEL BEFORE ZODlenko!
  • Trade Count: (+58)
  • Full Member
  • ***
  • Offline
  • Posts: 14019
  • Last login: July 02, 2025, 09:03:11 pm
  • Have you played with my GingerBalls?
    • forum.arcadecontrols.com/index.php/topic,142404.msg1475162.html
Re: Workflow server
« Reply #4 on: March 19, 2013, 12:14:14 pm »
Have you done any monitoring to see what the actual slowdown is? Check the logs or set up a resource monitor.
Don't guess at what will fix it; find out what's broken first.
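If you'd rather log it than eyeball Resource Monitor, a rough stand-in is a few lines of Python with the psutil package (an assumption on my part that installing it on the server is acceptable).  Run it on the server while people browse thumbnails and watch which number saturates first.

Code: [Select]
# Logs disk-read and network-send deltas once per second for a minute.
import time
import psutil

prev_disk = psutil.disk_io_counters()
prev_net = psutil.net_io_counters()
for _ in range(60):
    time.sleep(1)
    disk = psutil.disk_io_counters()
    net = psutil.net_io_counters()
    read_mb = (disk.read_bytes - prev_disk.read_bytes) / 1e6
    sent_mb = (net.bytes_sent - prev_net.bytes_sent) / 1e6
    print(f"disk reads: {read_mb:6.1f} MB/s   net sent: {sent_mb:6.1f} MB/s   "
          f"cpu: {psutil.cpu_percent():4.1f}%")
    prev_disk, prev_net = disk, net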
If you're replying to a troll you are part of the problem.
I also need to follow this advice. Ignore or report, don't reply.

lilshawn

  • Trade Count: (+3)
  • Full Member
  • ***
  • Offline
  • Posts: 7513
  • Last login: July 20, 2025, 04:01:19 pm
  • I break stuff...then fix it...sometimes
Re: Workflow server
« Reply #5 on: March 19, 2013, 03:24:19 pm »
Quote from: MonMotha on March 19, 2013, 11:22:37 am
If I had to guess, you're bottlenecked on disk IO on the server.  I've got 6 Xen domains running on a server nowhere near that capable CPU-wise (and less memory, and probably less memory bandwidth, too), and my bottleneck is ALWAYS the darned spinning metal.  Lots of simultaneous requests for distinct files like you describe will easily get you well into seek hell.  Throw an SSD at the thing if you can, and your problems will probably get better.

Failing that, upgrade the server's link to the LAN to 10Gb or 4xGbE.  Make sure you've got a real switch, not some crummy Linksys thing.

Of course, real stats are always king.  Instrument the heck out of the setup and see where your problems actually are.

I would have to agree: "the herd is only as fast as its slowest member."

Depending on the setup, your disk is only going to service one IO operation at a time. Multiply that by 6 computers each making a thousand requests at the same time and that's 6,000 IOs that have to be served; the server spends most of its time backlogged with IO requests.

I'm thinking spreading the load across multiple controllers and disks would reduce this, since the data would be coming from multiple sources.
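Rough math on that backlog, using an assumed random-read rate of about 120 IOPS for a single 7200 rpm spindle (a ballpark figure, not a measurement):

Code: [Select]
REQUESTS = 6 * 1000        # 6 users x 1000 thumbnails
IOPS_PER_SPINDLE = 120     # assumed random-read IOPS for one 7200 rpm disk

for spindles in (1, 2, 4):
    seconds = REQUESTS / (IOPS_PER_SPINDLE * spindles)
    print(f"{spindles} spindle(s): ~{seconds:5.0f} s to clear {REQUESTS} random reads")

Even spread perfectly across four spindles it's a noticeable wait, which is why SSDs keep coming up later in the thread.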

sdweim85

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 98
  • Last login: September 16, 2014, 09:30:03 am
Re: Workflow server
« Reply #6 on: March 19, 2013, 04:02:05 pm »
Everyone is running at Gigabit speeds. 

But yes, everyone's requests are coming off a single HDD. 

The resource monitor shows the strain is in network bandwidth alone.  The CPU and memory are barely touched when everyone is viewing thumbnails over the network at the same time.

That's expected, though, since when they view thumbnails over the network the files are just being pulled across to their desktop PCs.  So the server hardware isn't doing much beyond the HDD itself and the NIC.

We also tested Hyper-V virtual machines, with 3 VMs loading thumbnails at the same time directly on the server.  It was faster than doing it over the network, but it's still a lot of stress on one HDD, which probably leads to the same issue.  The network method at least puts the load on the desktop PCs rather than the server.
« Last Edit: March 19, 2013, 05:10:35 pm by sdweim85 »

kahlid74

  • Trade Count: (+1)
  • Full Member
  • ***
  • Offline
  • Posts: 1366
  • Last login: January 01, 2021, 12:42:56 pm
  • Gaming for a better future!
    • GamersAnon
Re: Workflow server
« Reply #7 on: March 20, 2013, 10:03:11 am »
Quote from: sdweim85 on March 19, 2013, 04:02:05 pm
But yes, everyone's requests are coming off a single HDD.

Only one HDD?  Ummm, yeah, the red light is going off.  You need to go past bandwidth monitors and look at disk metrics: compare the disk's busy time (% Disk Time) against Avg. Disk Queue Length and Current Disk Queue Length.
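For reference, those counters can be captured on Server 2008 R2 with the built-in typeperf tool (wrapped in Python here just to keep one language in the thread).  The counter names are the stock English ones and may differ by locale; treat this as a sketch, not a drop-in script.

Code: [Select]
import subprocess

counters = [
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Current Disk Queue Length",
]
# Sample once per second for five minutes and dump to CSV for review.
# This blocks until typeperf finishes collecting.
subprocess.run(["typeperf", *counters,
                "-si", "1", "-sc", "300",
                "-f", "CSV", "-o", "disk_counters.csv"],
               check=True)

A sustained average queue length well above the number of spindles is the classic sign that the disk, not the network, is the bottleneck.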


lilshawn

  • Trade Count: (+3)
  • Full Member
  • ***
  • Offline
  • Posts: 7513
  • Last login: July 20, 2025, 04:01:19 pm
  • I break stuff...then fix it...sometimes
Re: Workflow server
« Reply #8 on: March 20, 2013, 10:39:19 am »
Milliseconds of access time per disk request, times thousands and thousands of requests, equals forever to get your info. Sure, a couple of milliseconds doesn't sound like much, but multiply it by a billion operations and it REALLY adds up.

I'd consider a RAID 0 or RAID 1 setup. You would see a HUGE gain in your output speeds, and it's pretty easy to set up.
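To put numbers on the access-time point, here's the same arithmetic with typical ballpark assumptions: roughly 10 ms per random read for a 7200 rpm disk versus roughly 0.1 ms for a SATA SSD of that era.

Code: [Select]
# Both per-request figures are assumptions, not measurements.
HDD_MS, SSD_MS = 10.0, 0.1
requests = 6 * 1000        # 6 users x 1000 thumbnails

for label, ms in (("single HDD", HDD_MS), ("single SSD", SSD_MS)):
    total_s = requests * ms / 1000
    print(f"{label}: {requests} random reads ~= {total_s:6.1f} s of pure access time")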

MonMotha

  • Trade Count: (+2)
  • Full Member
  • ***
  • Offline
  • Posts: 2378
  • Last login: February 19, 2018, 05:45:54 pm
Re: Workflow server
« Reply #9 on: March 20, 2013, 04:30:30 pm »
...or an SSD.  I realize you may have a large amount of data, but the IOPS on even a mid-range SSD will blow a fairly fancy RAID0 array out of the water and probably be more reliable, too (though a big RAID6 array is probably even better in the reliability department, it'll take quite a large number of disks to match the SSD's performance on random access).

lilshawn

  • Trade Count: (+3)
  • Full Member
  • ***
  • Offline
  • Posts: 7513
  • Last login: July 20, 2025, 04:01:19 pm
  • I break stuff...then fix it...sometimes
Re: Workflow server
« Reply #10 on: March 20, 2013, 06:56:59 pm »
...or an SSD RAID0...

 :dunno

mystic96

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 15
  • Last login: July 11, 2013, 08:21:46 am
  • I want to build my own arcade controls!
Re: Workflow server
« Reply #11 on: March 21, 2013, 03:09:51 pm »
I would stay away from SSDs unless you're talking enterprise SSDs... but if the company is worried about a few grand for a server, I doubt you'll get approval to spend 2-3x that just for a couple of disks, so it's a moot point :)

I would imagine a RAID 1 for just 6 users would be fine, or maybe a new RAID 1 and migrate the images over to it to separate the IO. Stay away from RAID 0 unless it's throwaway data, or you don't mind recovering from <insert media type> backups. Even over a dedicated gigabit backup network, if you're talking 100+ GB, be prepared to be down for at least half a day.

Kahlid is spot on, though. You can't really fix an issue unless you properly diagnose it. Do a few days' worth of Perfmon for disk IO and review it. It may be possible to do a single RAID 1 for both OS and data, or you may find it prudent to separate the OS and data onto different spindles. As far as servers are concerned, you should pretty much never, ever build one without RAID redundancy. It may save a few hundred bucks up front (I believe current MSRP for an HP-branded 300 GB dual-port 6 Gb SAS drive is right at $300 - but that's MSRP), but the potential loss of data and/or employee productivity could cost many times that.

Server talk on an arcade forum, I'm in heaven  :applaud:

MonMotha

  • Trade Count: (+2)
  • Full Member
  • ***
  • Offline
  • Posts: 2378
  • Last login: February 19, 2018, 05:45:54 pm
Re: Workflow server
« Reply #12 on: March 21, 2013, 11:44:05 pm »
RAID 1 generally does not improve performance.  It can improve READ performance, at the expense of not catching errors until you do a scrub, but most implementations do simultaneous reads of all members to verify integrity prior to passing the data on.

You need striping to get improved performance.  This means RAID 0 (which will have reliability issues), RAID 5 (not recommended for more than 3-4 disks with modern, high capacity drives, as unrecoverable read errors have gotten relatively common even on "enterprise" kit), or RAID 6 (minimum 4 drives).  A 4-5 element RAID 6 array of relatively fast drives will have OK sequential throughput, comparable to a mid-range consumer SSD, but still somewhat poor random access performance, which seems to be a consideration here.
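For anyone following along, here's the usable-capacity and fault-tolerance trade-off being described, for an illustrative set of 5 x 2 TB drives (RAID 1 here meaning an N-way mirror; the drive count and size are just examples):

Code: [Select]
N, CAP_TB = 5, 2   # illustrative: five identical 2 TB drives

layouts = {
    "RAID 0": (N * CAP_TB,       0),      # stripe, no redundancy
    "RAID 1": (CAP_TB,           N - 1),  # N-way mirror
    "RAID 5": ((N - 1) * CAP_TB, 1),      # single parity
    "RAID 6": ((N - 2) * CAP_TB, 2),      # double parity
}
for name, (usable, failures) in layouts.items():
    print(f"{name}: {usable:4.0f} TB usable, survives {failures} drive failure(s)")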

My experience has been that the reliability of decent consumer grade SSDs (e.g. modern Intel) has been drastically understated.  They seem to do at least as well as consumer spinning metal, if not better.  They do tend to inexplicably "just up and completely die" relatively randomly, whereas spinning metal tends to exhibit a more recognizable and incremental failure in many cases, but the frequency of failure seems no worse than spinning metal except on POS low end models.

You can also of course get "enterprise" SSDs.  I've seen 2TB models for ~$2000, which is what a decent array of 4-5 "enterprise" HDDs will run you, albeit for less capacity, and the performance will totally kill any spinning metal array you can make for that price.  There's a reason they come as 8x PCIe cards.  I'm not sure that the reliability is any better than consumer models, though, but that's generally true of "enterprise" HDDs too (they just have better error reporting since the consumer ones are crippled for marketing reasons).

All this said, yes, do some actual diagnostics and figure out what the issue is.  From the description, it's very likely disk IO, but it's not overly hard to figure out for sure.

mystic96

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 15
  • Last login: July 11, 2013, 08:21:46 am
  • I want to build my own arcade controls!
Re: Workflow server
« Reply #13 on: March 22, 2013, 08:56:29 am »
Quote from: MonMotha on March 21, 2013, 11:44:05 pm
You can also of course get "enterprise" SSDs.  I've seen 2TB models for ~$2000, which is what a decent array of 4-5 "enterprise" HDDs will run you, albeit for less capacity, and the performance will totally kill any spinning metal array you can make for that price.  There's a reason they come as 8x PCIe cards.  I'm not sure that the reliability is any better than consumer models, though, but that's generally true of "enterprise" HDDs too (they just have better error reporting since the consumer ones are crippled for marketing reasons).

I can't say with 100% assurance that this is true of all makes, but EMC's enterprise SSD drives actually have double the stated capacity. The second set of chips is left in a low-power state, and when an in-use chip hits its max write count that data is moved to a standby chip, the original is disabled, pointers are updated, etc., etc. The big thing about enterprise-class drives is that they are meant to be spun 95%+ of the time, and their MTBFs are rated accordingly.

Any particular article you can point me to regarding the RAID 5 issue you speak of? I'm having a really hard time digesting that, having never experienced it myself.

kahlid74

  • Trade Count: (+1)
  • Full Member
  • ***
  • Offline
  • Posts: 1366
  • Last login: January 01, 2021, 12:42:56 pm
  • Gaming for a better future!
    • GamersAnon
Re: Workflow server
« Reply #14 on: March 22, 2013, 09:15:02 am »
Quote from: mystic96 on March 22, 2013, 08:56:29 am
Quote from: MonMotha on March 21, 2013, 11:44:05 pm
You can also of course get "enterprise" SSDs.  I've seen 2TB models for ~$2000, which is what a decent array of 4-5 "enterprise" HDDs will run you, albeit for less capacity, and the performance will totally kill any spinning metal array you can make for that price.  There's a reason they come as 8x PCIe cards.  I'm not sure that the reliability is any better than consumer models, though, but that's generally true of "enterprise" HDDs too (they just have better error reporting since the consumer ones are crippled for marketing reasons).

I can't say with 100% assurance that this is true of all makes, but EMC's ent SSD drives actually have double the stated capacity. The second set of chips are left in a low power state and when an in-use chip hits it's max write count then that data is moved to a standby chip and the original is disabled, pointers are updated, etc, etc. The big thing about enterprise class drives is that they are meant to be spun 95%+ of the time, and their MTBFs are rated accordingly.

Any particular article you can point me to regarding the raid 5 issue you speak of? I'm having a really hard time digesting that having never experienced it myself.

Basically what MonMotha is saying is that RAID 5 has had flaws since its origin, which were masked by small drive sizes.  With 1-4 TB drives now, a RAID 5 of 8 disks at 4 TB each is a time bomb waiting to go off.  The likelihood of data on a second drive failing during a rebuild is considerably higher simply because of how much space the drives hold.

By comparison, if you had EMC walk in the front door to design your new system, they would use a max of 6 drives per group at RAID 6 and then make super-groups of RAID 60 for your data.

mystic96

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 15
  • Last login: July 11, 2013, 08:21:46 am
  • I want to build my own arcade controls!
Re: Workflow server
« Reply #15 on: March 22, 2013, 04:35:06 pm »
Quote from: kahlid74 on March 22, 2013, 09:15:02 am
You can also of course get "enterprise" SSDs.  I've seen 2TB models for ~$2000, which is what a decent array of 4-5 "enterprise" HDDs will run you, albeit for less capacity, and the performance will totally kill any spinning metal array you can make for that price.  There's a reason they come as 8x PCIe cards.  I'm not sure that the reliability is any better than consumer models, though, but that's generally true of "enterprise" HDDs too (they just have better error reporting since the consumer ones are crippled for marketing reasons).

I can't say with 100% assurance that this is true of all makes, but EMC's ent SSD drives actually have double the stated capacity. The second set of chips are left in a low power state and when an in-use chip hits it's max write count then that data is moved to a standby chip and the original is disabled, pointers are updated, etc, etc. The big thing about enterprise class drives is that they are meant to be spun 95%+ of the time, and their MTBFs are rated accordingly.

Any particular article you can point me to regarding the raid 5 issue you speak of? I'm having a really hard time digesting that having never experienced it myself.

Basically what Mon Motha is saying is that RAID5 since it's origin, has flaws, which were masked by small drive sizes.  With 1-4 TB drives now, a RAID5 of 8 disks each with 4TB is a time bomb waiting to go off.  The likely hood of a data set on two drives failing is considerably high because of how much space they hold.

Oh I got what he said, he's just clearly been misinformed.

The whole point of RAID 5 is the parity, i.e. the redundancy. The "data set" (a block, in IT speak) isn't mirrored across two drives like RAID 1 - it's calculated. That's why it sucks at write speed: every write I/O on the application side causes 4 I/Os on the controller side (read old data, read old parity, write new data, write new parity). The great thing about the parity is that it isn't just stored on a single drive. That's why you can remove an entire drive from the array and not lose ANY data... because parity (I'm going to meme that later). A block of data is a block of data; it doesn't matter the capacity of the spindle it resides on.

Since it hasn't been posted yet, I took a break from this response to look for this misinformation on the web, but was only able to find the opposite (no surprise). Not that I know this site, but it was the first response and yet still factually correct while having a title that sounds quite the contrary! Here's my source: http://www.standalone-sysadmin.com/blog/2012/08/i-come-not-to-praise-raid-5/ , and here was my Google search query: Raid 5 on large drives losing data

Quote
Now, lets move on to RAID-5. You need at least 3 drives in a RAID-5, because unlike the exact copy of RAID-1, RAID-5 has a parity, so that any individual pieces of data can be lost, and instead of recovering the data by copying it, it's recalculated by examining the remaining data bits.

So when we encounter a URE during normal RAID operations, the array calculates what the missing data was, the data is re-written so we'll have it next time, and the array carries on business as usual.


URE = unrecoverable read error; the more data you read (i.e. the bigger the drives), the more likely you are to run into one.
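To make the parity mechanics above concrete, here's a toy demonstration in Python - not how a real controller lays anything out, just the XOR arithmetic that lets a missing block be recomputed from the rest:

Code: [Select]
from functools import reduce

data_blocks = [b"AAAA", b"BBBB", b"CCCC"]      # blocks on three data drives
# Parity block is the byte-wise XOR of the data blocks.
parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*data_blocks))

# Pretend drive 1 died; rebuild its block from the survivors plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))
assert rebuilt == data_blocks[1]
print("rebuilt block:", rebuilt)

Real RAID 5 rotates the parity block across the drives so no single disk becomes a parity hot spot.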

Quote from: kahlid74 on March 22, 2013, 09:15:02 am
By comparison, if you had EMC walk in the front door to design your new system, they would use a max of 6 drives per group at RAID6 and then make super groups of RAID 60 for your data.


:lol Come on man... be honest. How many meetings have you had with EMC about proper solution design? Because we've got PBs of DMX and VMAX here (need I mention the other manufacturers?), and I can tell you exactly how many times I've heard raid 6/60 as the answer to anything other than a joke - zero.

MonMotha

  • Trade Count: (+2)
  • Full Member
  • ***
  • Offline
  • Posts: 2378
  • Last login: February 19, 2018, 05:45:54 pm
Re: Workflow server
« Reply #16 on: March 22, 2013, 07:06:37 pm »
Unrecoverable read errors happen.  RAID 5 protects against one, which is great.  However, suppose you lose an entire drive.  It happens.  You now have no redundancy.  You go to rebuild the array and, lo and behold, you get an unrecoverable read error on another drive.  It happens amazingly often, and, as you point out, the frequency is essentially a function of the amount of data you have, not the number of drives you have.  Hence, with modern, really high capacity drives you're more likely to have it happen than you were on the old 320 GB things.  The thing is, you've probably got a lot of infrequently accessed data, so you may not notice that there's an unreadable sector until you go to do a full rebuild after that failed drive.
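The argument is easy to put numbers on.  Taking the commonly quoted consumer spec of one URE per 10^14 bits read at face value (exactly the figure disputed further down the thread), the odds of hitting at least one during a full rebuild look like this:

Code: [Select]
# Spec-sheet assumption, not a measurement: 1 URE per 1e14 bits read.
URE_PER_BIT = 1 / 1e14

def p_at_least_one_ure(data_read_tb: float) -> float:
    bits = data_read_tb * 1e12 * 8
    return 1 - (1 - URE_PER_BIT) ** bits

for drives, size_tb in ((4, 1), (4, 4), (8, 4)):
    read_tb = (drives - 1) * size_tb      # surviving members read in full
    print(f"{drives} x {size_tb} TB RAID 5 rebuild: "
          f"~{p_at_least_one_ure(read_tb) * 100:4.1f}% chance of a URE")

Whether that spec reflects reality is the crux of the disagreement below; with drives rated at 10^15 or 10^16 the same math gives far smaller numbers.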

RAID6 gives you a second layer of redundancy so that you can actually have a really good chance at a successful rebuild.

You can also routinely "scrub" a RAID5 array to try and find those errors while you still have the N+1 redundancy from the parity available, but that has pretty nasty overhead.  If you've got a lot of downtime on your server, it can be OK.  If it's a 24/7 active server, it can be a real problem.  I do this on my 2-element RAID 1 arrays since there's a similar problem.  The IO hit during the scrub sucks bigtime, but it's acceptable in most of my situations, and the tradeoff of going to a 3 element RAID1 or 4 element level 1+0 or 6 could not be justified.

And yes, I'm well aware of the extra "unused" capacity set aside on SSDs for wear leveling.  I've done a fair bit of work with bare MTD devices on Linux going back to the early days of JFFS2, before NAND was popular.  Even if you do "wear out" the flash (which is amazingly hard to do except in pathological cases), it "should" fail read-only.  Sadly, controller bugs seem to cause other modes of failure most of the time, seemingly well before the erase-cycle limits have been reached, assuming proper wear leveling.

As to your 60PB system, they're probably doing things well beyond simple RAID to make that work.
« Last Edit: March 22, 2013, 07:08:08 pm by MonMotha »

ark_ader

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 5645
  • Last login: March 02, 2019, 07:35:34 pm
  • I glow in the dark.
Re: Workflow server
« Reply #17 on: March 22, 2013, 08:06:34 pm »
Mirrored drives, an image database that is denormalized, added cache and...this always makes me laugh:

[embedded video]
 :notworthy:
If I had only one wish, it would be for three more wishes.

mystic96

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 15
  • Last login: July 11, 2013, 08:21:46 am
  • I want to build my own arcade controls!
Re: Workflow server
« Reply #18 on: March 22, 2013, 08:35:20 pm »
Quote from: ark_ader on March 22, 2013, 08:06:34 pm
Mirrored drives, an image database that is denormalized, added cache and...this always makes me laugh:

[embedded video]

 :notworthy:

OMFG, that is by far the baddest-assed thing I have ever seen!!!! Thanks for sharing, that quickly got forwarded to work buddies :lol

MonMotha,

Sorry dude - I wrote a response and hit Post - lost it due to timeout for my account  :banghead:

Quick & dirty because I have to pick up the lady here in a few -- UREs happen very infrequently, certainly not even close to frequently enough to worry about losing an array. I think that article that I cited above said you would have to read a 3TB drive from sector 1 to end 3 times over before experiencing a single URE... and said URE would be fixed nearly instantly via the raid controller. I'm sorry, but you're just absolutely mistaken about raid 5's inability to protect data using larger drives. Multiple drive failures are statistically impossible for all but the worst-case scenarios using improper gear. I've never seen it in prod, though I did see it once on a server that got power spiked. Again - worst case scenario for being on unclean power. Even a mom & pop shop can afford a $50 UPS.

As for RAID 6, again, sorry, but at best it's a niche solution - much the same as RAID 50 is a niche deal. And if you're talking about a small array (say 4 drives, the bare minimum required), you'll never find a single storage person who would recommend it over a RAID 10. Your R6 still gives up 2 drives' worth of capacity, has nearly double the parity overhead of R5 (two parities, remember?), and way less performance than an R10.

Because I want to stay here for a while and not be "that guy," I want to close with -- we all have to run our own environments our own way. My initial response/hammer throwing regarding raid 5 may have been construed as over the top and I apologize if it was taken as an attack. But I felt it necessary to point out that it was incorrect due to the fact it was directed towards OP who was looking for advice. I wish you all as few failures and lowest queue lengths as possible with whatever solution you choose :cheers:

P.S. I didn't say 60 PB (I intentionally left the number blank - even 1 PB is enough to show my employer takes storage freakishly seriously).

MonMotha

  • Trade Count: (+2)
  • Full Member
  • ***
  • Offline
  • Posts: 2378
  • Last login: February 19, 2018, 05:45:54 pm
Re: Workflow server
« Reply #19 on: March 22, 2013, 08:57:21 pm »
Everybody I've talked to has expressed very serious concern over getting an unexpected URE during RAID5 rebuild.  I've seen this sentiment expressed pretty widely.  Either it's a total myth that's propagated widely, or it has merit.  The logic certainly makes sense.  If you don't scrub your RAID5 with some regularity, it could bite you hard, and RAID6 would save your butt at the expense of just needing one more drive (consider it's only a real issue on largeish arrays - >6 disks or so - where the incremental cost of one more disk isn't generally a huge deal). There may also be a small performance issue due to the double parity, but it seems like there's probably a striping pattern that mostly nulls that back out.  At this point, it seems like your goal is probably capacity over performance - see below regarding my experiences with non-SAN scale RAID arrays vs. SSDs.

You may be right; I don't normally build storage systems bigger than a few TB, which is still easily done with RAID 1 or 5, especially since I can live with scrubbing once a month or so.  I just know I've seen it mentioned a lot over the past couple of years, and everybody says it's of specific importance on drives over 1 TB or so.

As to performance, it depends on your strategy.  If you can do the parity calcs at full speed (easily done these days in software, and with minimal overhead) and have "pessimistic" reading (where you always check parity/mirror on read, even if the drive indicates no error), you've got more potential bandwidth from a 6 disk RAID6 than a 3+3 RAID 1+0, and you get the capacity of 4 drives, not just 3.  If you're willing to treat reads "optimistically", you can get higher read BW out of the 0+1 (since you can stripe the read across both halves of the mirror, or schedule separate IO ops), but you may miss an unsignaled read error (and this is where having "enterprise" drives makes a difference - "consumer" drives are frequently silent upon read error, whereas "enterprise" ones complain).

I'd suspect that a single high end consumer SSD will still blow away a 5-6 element RAID5, and probably even a RAID0 on random operations.  Sequential may be more of a shootout.  Reliability is tough to guess at.  The single SSD has no redundancy, whereas a RAID5 will have N+1, but you've also got 5-6 times the devices to experience a failure in.  RAID 1 on SSDs can unfortunately be of limited use due to controller glitches.  Of course, for a given number of $$$, you'll get way more capacity out of the revolving metal in any case.

mystic96

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline
  • Posts: 15
  • Last login: July 11, 2013, 08:21:46 am
  • I want to build my own arcade controls!
Re: Workflow server
« Reply #20 on: March 23, 2013, 11:31:55 am »
I think everybody you are talking to is stuck in 2007. Back then, URE rates were spec'd at one error per 10^13 or 10^14 bits read; nowadays it's 10^15 and 10^16. In our terms, that works out to roughly:
10^13 bits = ~1.1 TiB read per error
10^14 bits = ~11.4 TiB
10^15 bits = ~114 TiB
10^16 bits = ~1.1 PiB

This is per drive, not per array. Your assertion also assumes that the drives are 100% full. Maybe it's an east-coast/west-coast thing, but again I've never heard a single person bring this up as a legitimate concern. The numbers speak for themselves and the only legit article I was able to find to back up your side was from 2007 talking about 2009+. That is to say, it was basically some dude in North Scottsdale being a Nostradumbass.
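For anyone sanity-checking those conversions, one URE per 10^N bits works out as follows in both decimal terabytes and binary tebibytes (the list above quotes the binary figures):

Code: [Select]
for exp in (13, 14, 15, 16):
    bits = 10 ** exp
    tb = bits / 8 / 1e12       # decimal terabytes
    tib = bits / 8 / 2**40     # binary tebibytes
    print(f"10^{exp} bits = {tb:8.2f} TB = {tib:8.2f} TiB read per URE")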

You bring up a good point about performance differences. Raid5/6 are better for writes and 1/10 are better for reads. The problem I have with 6 is that DP is worse than r5 and P+Q is only marginally better (we're talking typically less than a 5% increase in throughput) for double the overhead when it matters most (rebuilds). Taking URE-related failures off the table here... I would take an increase in capacity over a marginal increase in write performance.

Enterprise SSDs are truly nuts in terms of throughput, but I still question their longevity. And as you point out, in some cases they are no faster than spindled drives. I'm just not sold on them as an enterprise solution yet, although that video that was posted in the IT history thread is going to raise some eyebrows on Monday :)

Crap, time to go camping. Have a good weekend man, to be continued?